RodrighoNS/daria_scraper

Daria Scraper

A web scraping tool built with Python and BeautifulSoup.

Data is extracted from https://outpost-daria-reborn.info. Thanks to Kevin Bess (a.k.a. NeonHomer) for all his effort and dedication to the Daria community.

Installation

This project uses Poetry for dependency management.

Prerequisites

Setup

  1. Clone the repository:
git clone https://github.com/RodrighoNS/daria_scraper.git
cd daria_scraper
  2. Install dependencies with Poetry:
poetry install

Usage

Basic Scraping

poetry run daria-scraper

Configuration Options

The scraper can be configured in config.py:

  • USER_AGENT: Custom user agent for requests
  • REQUEST_DELAY: Time between requests (in seconds)
  • TIMEOUT: Request timeout
  • OUTPUT_FORMAT: Data output format (csv/json)
  • DATA_DIR: Directory for saved data
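
As a rough illustration of how the options above might fit together, here is a minimal sketch using only the Python standard library. The option names come from the list above; every value, and the helper names `build_request`, `fetch`, and `save_rows`, are hypothetical and not taken from the project's code. The project itself may use a different HTTP client, and the unit of TIMEOUT is assumed to be seconds.

```python
# Hypothetical sketch: values and helper names are assumptions for illustration,
# not the project's actual config.py or scraper code.
import csv
import json
import time
import urllib.request
from pathlib import Path

# What config.py might contain (illustrative values only)
USER_AGENT = "daria-scraper-example/0.1"
REQUEST_DELAY = 1.0      # seconds to wait between requests
TIMEOUT = 10             # assumed to be in seconds
OUTPUT_FORMAT = "json"   # "csv" or "json"
DATA_DIR = "data"


def build_request(url):
    """Attach the configured user agent to an outgoing request."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch(url):
    """Fetch one page, then pause for REQUEST_DELAY before returning."""
    with urllib.request.urlopen(build_request(url), timeout=TIMEOUT) as resp:
        body = resp.read().decode("utf-8", errors="replace")
    time.sleep(REQUEST_DELAY)  # keeps consecutive calls spaced out
    return body


def save_rows(rows, name, output_format=OUTPUT_FORMAT, data_dir=DATA_DIR):
    """Write a list of dicts to data_dir as CSV or JSON and return the path."""
    if output_format not in ("csv", "json"):
        raise ValueError(f"unsupported output format: {output_format!r}")
    out_dir = Path(data_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{name}.{output_format}"
    if output_format == "json":
        path.write_text(json.dumps(rows, indent=2), encoding="utf-8")
    else:
        # Column order follows the keys of the first row
        fieldnames = list(rows[0]) if rows else []
        with path.open("w", newline="", encoding="utf-8") as fh:
            writer = csv.DictWriter(fh, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
    return path
```

With these assumed defaults, `save_rows([{"title": "Example"}], "episodes")` would write `data/episodes.json`; passing `output_format="csv"` would write `data/episodes.csv` instead.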

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.
