Daria Scraper

A web scraping tool built with Python and BeautifulSoup.

Data is extracted from https://outpost-daria-reborn.info. Thanks to Kevin Bess (a.k.a. NeonHomer) for all his effort and dedication to the Daria community.

Installation

This project uses Poetry for dependency management.

Prerequisites

Python 3.7 or higher
Poetry (installation guide)

Setup

Clone the repository:

git clone https://github.com/RodrighoNS/daria-scraper.git
cd daria-scraper

Install dependencies with Poetry:

poetry install

Usage

Basic Scraping

poetry run daria-scraper

Configuration Options

The scraper can be configured in config.py:

USER_AGENT: Custom user agent for requests
REQUEST_DELAY: Time between requests (in seconds)
TIMEOUT: Request timeout
OUTPUT_FORMAT: Data output format (csv/json)
DATA_DIR: Directory for saved data
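
As a rough illustration of how these options might fit together, here is a minimal sketch. The option names come from the list above, but every value is a placeholder, the unit for TIMEOUT is assumed to be seconds, and the helpers fetch_all and save_records are hypothetical names invented for this example; the project's actual config.py and internals may look different.

```python
import csv
import json
import time
from pathlib import Path

# Placeholder values for illustration; the real defaults in config.py may differ.
USER_AGENT = "daria-scraper-example/0.1"
REQUEST_DELAY = 1.0     # seconds to wait between requests
TIMEOUT = 10            # request timeout (seconds is an assumption)
OUTPUT_FORMAT = "json"  # "csv" or "json"
DATA_DIR = "data"


def fetch_all(urls, fetch, delay=REQUEST_DELAY, sleep=time.sleep):
    """Hypothetical helper: call fetch(url) per URL, pausing between calls."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay)  # no pause before the first request
        results.append(fetch(url))
    return results


def save_records(records, name, output_format=OUTPUT_FORMAT, data_dir=DATA_DIR):
    """Hypothetical helper: write a list of dicts as <name>.json or <name>.csv."""
    out_dir = Path(data_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{name}.{output_format}"
    if output_format == "json":
        path.write_text(json.dumps(records, indent=2), encoding="utf-8")
    elif output_format == "csv":
        # Column order follows the keys of the first record.
        fieldnames = list(records[0].keys()) if records else []
        with path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(records)
    else:
        raise ValueError(f"Unsupported OUTPUT_FORMAT: {output_format!r}")
    return path
```

In this sketch the fetch argument is any callable that takes a URL, so a real HTTP call (passing USER_AGENT and TIMEOUT along) could be swapped in without changing the delay logic.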

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

