This project was built for a task from Tuwaiq Academy. It scrapes book data from the Books to Scrape website, collecting each book's title, price, stock availability, rating, and description, and saves the results to a CSV file for further analysis and visualization.
- Scrapes book data from multiple pages.
- Extracts details such as title, price, stock availability, rating, and description.
- Saves the data into a CSV file (Scraped_data.csv).
- Provides basic visualizations using Seaborn.
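Scraped fields usually arrive as raw strings, so a small cleaning step helps before saving or plotting. The sketch below is one possible way to normalize them, not the notebook's actual code. It assumes prices look like `£51.77`, ratings appear as a word in a CSS class such as `star-rating Three`, and availability text contains "In stock"; the function names are hypothetical.

```python
import re

# Assumption: the rating is encoded as a word in the element's CSS classes,
# e.g. class="star-rating Three".
RATING_WORDS = {"One": 1, "Two": 2, "Three": 3, "Four": 4, "Five": 5}


def parse_price(text):
    """Return the numeric part of a price string such as '£51.77' as a float."""
    match = re.search(r"\d+(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no price found in {text!r}")
    return float(match.group())


def parse_rating(css_classes):
    """Map a list of CSS class names to a 1-5 rating, or None if absent."""
    for name in css_classes:
        if name in RATING_WORDS:
            return RATING_WORDS[name]
    return None


def parse_stock(text):
    """Return True when the availability text says the book is in stock."""
    return "in stock" in text.lower()
```

With BeautifulSoup, `tag["class"]` returns a list of class names, so it could be passed straight to `parse_rating`. Extracting the price with a regular expression also tolerates stray encoding characters in front of the currency symbol.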
The project requires the following Python packages:
- pandas
- beautifulsoup4
- requests
- seaborn
Install the dependencies using:
pip install -r requirements.txt
- Clone this repository or download the project files.
- Ensure you have Python installed on your system.
- Install the required packages using the command above.
- Run the Jupyter Notebook (webScraping.ipynb) to scrape the data and generate the CSV file.
- A CSV file named Scraped_data.csv containing the scraped book data.
- Visualizations of book ratings and prices.
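To illustrate the shape of the CSV output, here is a minimal sketch that writes and re-reads one row. It deliberately uses the standard-library `csv` module rather than pandas so that it runs without extra dependencies. The column names and the sample row are hypothetical, and the notebook's real columns may differ.

```python
import csv
import io

# Hypothetical column names, chosen to mirror the fields listed above.
FIELDS = ["title", "price", "in_stock", "rating", "description"]


def write_books(rows, handle):
    """Write a list of book dicts to an open text handle as CSV."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def read_books(handle):
    """Read the CSV back as a list of dicts (all values come back as strings)."""
    return list(csv.DictReader(handle))


# A made-up row, for illustration only.
sample = [{
    "title": "Example Book",
    "price": 51.77,
    "in_stock": True,
    "rating": 3,
    "description": "A made-up row, for illustration.",
}]

buffer = io.StringIO()  # stands in for a file on disk
write_books(sample, buffer)
buffer.seek(0)
loaded = read_books(buffer)
```

To write to disk instead of an in-memory buffer, the same function can be given a handle from `open("Scraped_data.csv", "w", newline="", encoding="utf-8")`. Note that values read back from a CSV are strings, so numeric columns need converting before plotting.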
- Ensure you have an active internet connection while running the notebook.
- The scraping process may take some time depending on the number of pages and books.
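A rough sketch of why run time grows with the number of pages: descriptions are typically found on each book's own page, so every listing page triggers one extra request per book. The code below builds the listing URLs and estimates the request count. The URL pattern and the figure of 20 books per page are assumptions about the site, and both function names are hypothetical.

```python
# Assumption: listing pages follow this URL pattern.
BASE_URL = "https://books.toscrape.com/catalogue/page-{}.html"


def page_urls(first=1, last=3):
    """Return the listing-page URLs for pages first..last, inclusive."""
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    return [BASE_URL.format(n) for n in range(first, last + 1)]


def estimate_requests(pages, books_per_page=20):
    """Estimate HTTP requests: one per listing page plus one per book page."""
    return pages * (1 + books_per_page)
```

Under these assumptions, scraping 50 pages would take about 1,050 requests, so starting with a small page range is a reasonable way to test the notebook.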
This project is for educational purposes only. Please ensure compliance with the website's terms of service when scraping data.