This project is a web scraper designed to extract data from Amazon product pages. It consists of a Bun backend that uses Puppeteer to scrape the data and an Express server to expose it, and a simple vanilla JavaScript frontend to display the scraped data.
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.
You will need to have the following software installed on your machine:
- bun: The backend of this project uses bun. You can install it by following the instructions on their website.
Installation instructions for bun:
- macOS / Linux:
  curl -fsSL https://bun.sh/install | bash
- Windows:
  powershell -c "irm https://bun.sh/install.ps1 | iex"
- Clone the repository:
  git clone <repository-url>
- Install backend dependencies: Navigate to the backend directory and install the dependencies using bun.
  cd backend
  bun install
- Install frontend dependencies: Navigate to the frontend directory and install the dependencies using npm.
  cd ../frontend
  npm install
To run the application, you need to start both the backend and the frontend servers.
- Start the backend server: Navigate to the backend directory and run the server.
  cd ../backend
  bun run index.js
  The server will start on the default port (usually 3000).
- Start the frontend development server: Navigate to the frontend directory and run the development server.
  cd ../frontend
  npm run dev
  The frontend will be available at http://localhost:5173 (or another port if 5173 is busy; check the output of the command).
The backend is an Express.js server that has an endpoint (e.g., /scrape). When this endpoint is called, it uses Puppeteer to launch a headless browser, navigate to an Amazon product URL, and scrape the desired information from the page. The scraped data is then returned as a JSON response.
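The flow described above could look roughly like the following. This is a minimal sketch, not the project's actual index.js: the `url` query parameter, the `#productTitle` selector, the `{ url, title }` response shape, and the `createScrapeHandler` helper are all assumptions made for illustration. The handler takes the launch function as an argument so it can be exercised without a real browser.

```javascript
// Sketch only: the query parameter, selector, and response shape are assumptions.

// Builds an Express-style handler. `launch` is expected to behave like
// puppeteer.launch; injecting it keeps the handler testable.
function createScrapeHandler(launch) {
  return async function scrapeHandler(req, res) {
    const url = req.query.url; // assumed: product URL passed as ?url=...
    if (!url) {
      res.status(400).json({ error: "Missing 'url' query parameter" });
      return;
    }

    let browser;
    try {
      browser = await launch({ headless: true });
      const page = await browser.newPage();
      await page.goto(url, { waitUntil: "domcontentloaded" });

      // "#productTitle" is an assumed selector; Amazon's markup can change.
      const title = await page.$eval("#productTitle", (el) =>
        el.textContent.trim()
      );

      res.json({ url, title });
    } catch (err) {
      res.status(500).json({ error: err.message });
    } finally {
      // Always close the browser so failed requests do not leak processes.
      if (browser) await browser.close();
    }
  };
}

// Possible wiring, assuming express and puppeteer are installed:
//   const express = require("express");
//   const puppeteer = require("puppeteer");
//   const app = express();
//   app.get("/scrape", createScrapeHandler((opts) => puppeteer.launch(opts)));
//   app.listen(3000);
```

Closing the browser in a `finally` block matters here because each request launches its own headless browser, and a failed navigation would otherwise leave it running.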
The frontend is a simple web page that makes a request to the backend's /scrape endpoint and displays the returned data.
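On the frontend side, the request might be made along these lines. This is a hedged sketch rather than the project's actual code: the `fetchProduct` helper, the `url` query parameter, the `http://localhost:3000` base URL, and the `result` element id are assumptions.

```javascript
// Sketch only: helper name, query parameter, and base URL are assumptions.

// Requests scraped data for one product URL from the backend.
// `options.fetchImpl` lets callers swap in a fake fetch for testing.
async function fetchProduct(productUrl, options = {}) {
  const baseUrl = options.baseUrl || "http://localhost:3000";
  const fetchImpl = options.fetchImpl || globalThis.fetch;

  // Encode the product URL so its own "?" and "&" do not break the query string.
  const endpoint = `${baseUrl}/scrape?url=${encodeURIComponent(productUrl)}`;

  const response = await fetchImpl(endpoint);
  if (!response.ok) {
    throw new Error(`Scrape request failed with status ${response.status}`);
  }
  return response.json();
}

// Possible browser usage (the "result" element id is hypothetical):
//   fetchProduct("<amazon-product-url>")
//     .then((data) => {
//       document.getElementById("result").textContent = JSON.stringify(data, null, 2);
//     })
//     .catch((err) => console.error(err));
```

Because the frontend (port 5173) and the backend (port 3000) are different origins, the browser will only allow this request if the backend sends CORS headers or the development server proxies the call.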