This Python project scrapes data from the Pokemon Database (pokemondb.net) and stores it in a SQLite database.
The project consists of two main files:
- pokemondbscraper: Contains the PokemonDB class responsible for scraping, processing, and storing Pokemon data.
- app.py: Uses the PokemonDB class to initiate the scraping process.
To run this project, ensure you have Python, Poetry, and Git installed, then follow these steps:
- Clone the Repository:
git clone git@github.com:imjbmkz/PokemonDBScraper.git
cd PokemonDBScraper
- Setup: Run the following command to install the required dependencies:
poetry install
- Running the Scraper:
  - Modify url in app.py if you want to scrape a different page from Pokemon Database.
  - Run app.py using the following command:
    poetry run python app.py
    This will initiate the scraping process, which involves downloading HTML, processing data, and storing it in the SQLite database.
- Logs: INFO level logs are stored in logs/std_out.log, and ERROR level logs are stored in logs/std_err.log.
- Database: The scraped data is stored in pokemondb.db in a table named pokedex.
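
The run described above downloads HTML, processes it, and stores the result in SQLite. As a rough illustration of the last two stages only, here is a minimal sketch that uses nothing but the Python standard library. It is not the project's actual implementation: TableRowParser, html_to_sqlite, quote_ident, and SAMPLE_HTML are hypothetical names, the two-column layout is invented for the example, and the real PokemonDB class may parse and store data quite differently. Only the table name pokedex comes from this README.

```python
import sqlite3
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def quote_ident(name):
    """Quote a SQLite identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def html_to_sqlite(html, conn, table="pokedex"):
    """Treat the first row as the header, store the rest; return rows stored."""
    parser = TableRowParser()
    parser.feed(html)
    if not parser.rows:
        return 0
    header, *records = parser.rows
    # Skip malformed rows whose cell count does not match the header.
    records = [row for row in records if len(row) == len(header)]
    columns = ", ".join(quote_ident(name) for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {quote_ident(table)} ({columns})")
    conn.executemany(
        f"INSERT INTO {quote_ident(table)} VALUES ({placeholders})", records
    )
    conn.commit()
    return len(records)


# Hypothetical stand-in for a downloaded page; the real markup is not shown here.
SAMPLE_HTML = """
<table>
  <tr><th>number</th><th>name</th></tr>
  <tr><td>001</td><td>Bulbasaur</td></tr>
  <tr><td>004</td><td>Charmander</td></tr>
</table>
"""

conn = sqlite3.connect(":memory:")  # the project itself writes to pokemondb.db
html_to_sqlite(SAMPLE_HTML, conn)
print(conn.execute("SELECT number, name FROM pokedex ORDER BY number").fetchall())
# [('001', 'Bulbasaur'), ('004', 'Charmander')]
```

The same kind of SELECT can be pointed at pokemondb.db after a real run to check what was stored. The actual column names depend on the page that was scraped.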
The repository is organized as follows:
- scraper.py: Contains the PokemonDB class for scraping and processing logic.
- app.py: Entry point for running the scraper. Initializes PokemonDB with the URL and compiles the data.
- logs/:
  - logs/std_out.log: INFO level logs.
  - logs/std_err.log: ERROR level logs.
- pokemondb.db: SQLite database file where the scraped data is stored.
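
This README states only which log file receives which level. One way to get that split with Python's standard logging module is sketched below. The names build_logger and MaxLevelFilter, the logger name, and the message format are all hypothetical. Keeping ERROR records out of std_out.log is also an assumption, as the project may configure its handlers differently.

```python
import logging
from pathlib import Path


class MaxLevelFilter(logging.Filter):
    """Lets through only records at or below max_level."""

    def __init__(self, max_level):
        super().__init__()
        self.max_level = max_level

    def filter(self, record):
        return record.levelno <= self.max_level


def build_logger(log_dir="logs", name="pokemondb_sketch"):
    """Return a logger writing INFO/WARNING to std_out.log, ERROR+ to std_err.log."""
    log_path = Path(log_dir)
    log_path.mkdir(parents=True, exist_ok=True)

    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # do not duplicate records on the root logger

    # Remove and close handlers left over from an earlier call.
    for handler in list(logger.handlers):
        logger.removeHandler(handler)
        handler.close()

    fmt = logging.Formatter("%(asctime)s %(levelname)s %(message)s")

    out = logging.FileHandler(log_path / "std_out.log", encoding="utf-8")
    out.setLevel(logging.INFO)
    out.addFilter(MaxLevelFilter(logging.WARNING))  # assumption: no ERROR here
    out.setFormatter(fmt)

    err = logging.FileHandler(log_path / "std_err.log", encoding="utf-8")
    err.setLevel(logging.ERROR)
    err.setFormatter(fmt)

    logger.addHandler(out)
    logger.addHandler(err)
    return logger


if __name__ == "__main__":
    log = build_logger()
    log.info("scrape started")      # goes to logs/std_out.log
    log.error("could not parse")    # goes to logs/std_err.log
```

Dropping the MaxLevelFilter line would make std_out.log a full record of the run, with std_err.log holding the errors alone.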
Feel free to fork this repository, enhance the scraper, or fix any bugs you encounter. Pull requests are welcome.
This project is licensed under the MIT License - see the LICENSE file for details.