A web scraping tool built with Python and BeautifulSoup.
Data is extracted from https://outpost-daria-reborn.info. Thanks to Kevin Bess (a.k.a. NeonHomer) for all his effort and dedication to the Daria community.
This project uses Poetry for dependency management.
- Python 3.7 or higher
- Poetry (installation guide)
- Clone the repository:
git clone https://github.com/RodrighoNS/daria-scraper.git
cd daria-scraper
- Install dependencies with Poetry:
poetry install
- Run the scraper:
poetry run daria-scraper

The scraper can be configured in config.py:
- USER_AGENT: Custom user agent for requests
- REQUEST_DELAY: Time between requests (in seconds)
- TIMEOUT: Request timeout
- OUTPUT_FORMAT: Data output format (csv/json)
- DATA_DIR: Directory for saved data
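
As a rough illustration, the five options listed above might be defined and consumed along the following lines. This is only a sketch: the values are placeholder assumptions rather than the project's real defaults, and `build_request`, `fetch_pages`, and `save_records` are hypothetical helpers written with the standard library for self-containment, not functions taken from this repository.

```python
# Hypothetical sketch of config.py and how its options could be used.
# All values and helper names here are illustrative assumptions.
import csv
import json
import time
import urllib.request
from pathlib import Path

USER_AGENT = "daria-scraper (example user agent)"  # sent as the User-Agent header
REQUEST_DELAY = 1.0     # seconds to wait between requests
TIMEOUT = 10            # seconds before a request is abandoned
OUTPUT_FORMAT = "json"  # "csv" or "json"
DATA_DIR = "data"       # directory for saved data


def build_request(url):
    """Create a request that carries the configured user agent."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch_pages(urls):
    """Yield (url, raw_bytes) pairs, pausing REQUEST_DELAY between requests."""
    for index, url in enumerate(urls):
        if index:
            time.sleep(REQUEST_DELAY)  # no delay before the first request
        with urllib.request.urlopen(build_request(url), timeout=TIMEOUT) as resp:
            yield url, resp.read()


def save_records(records, name, output_format=OUTPUT_FORMAT, data_dir=DATA_DIR):
    """Write a list of dicts to <data_dir>/<name>.<output_format>."""
    if output_format not in ("csv", "json"):
        raise ValueError(f"Unsupported OUTPUT_FORMAT: {output_format!r}")
    out_dir = Path(data_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{name}.{output_format}"
    if output_format == "json":
        path.write_text(json.dumps(records, indent=2), encoding="utf-8")
    else:
        # The keys of the first record become the CSV header row.
        fieldnames = list(records[0].keys()) if records else []
        with path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(records)
    return path
```

With a layout like this, switching OUTPUT_FORMAT between "csv" and "json" changes only the file that `save_records` writes, and raising REQUEST_DELAY slows the crawl without touching the fetching code.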
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.