A powerful Chrome extension for web scraping with multiple pagination modes, smart field detection, and real-time visual feedback.
- 📄 Single Page - Scrape current page only
- 🔘 Button Pagination - Auto-click "Next" buttons with smart detection
- 🔗 Query Parameter Pagination - Background fetch for URL-based pagination (e.g., ?page=2)
- ♾️ Infinite Scroll - Auto-scroll with configurable delays and limits
- 🔄 Load More - Auto-click "Load More" buttons
- Smart Picker - Click any item to auto-detect fields and selectors
- Smart Detect - Auto-detect Next/Load More buttons by clicking them
- Real-time Progress - Visual overlay showing current item, page, and total count
- Beautiful Toggle UI - iOS-style switch between Button and Query Param pagination
- CSV export with UTF-8 BOM support
- JSON export
- Configurable field mapping
- Automatic filename generation
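The BOM-prefixed CSV export is easy to get wrong; here is a minimal sketch of how such an export can be generated. The function name and escaping rules are illustrative, not the extension's exact implementation:

```javascript
// Build a CSV string from scraped rows, prefixed with a UTF-8 BOM
// so spreadsheet apps detect the encoding. Values containing quotes,
// commas, or newlines are escaped per RFC 4180.
function toCsv(rows, fields) {
  const escape = (v) => {
    const s = String(v ?? '');
    return /[",\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
  };
  const lines = [fields.join(',')];
  for (const row of rows) {
    lines.push(fields.map((f) => escape(row[f])).join(','));
  }
  return '\uFEFF' + lines.join('\n');
}

const csv = toCsv(
  [{ title: 'Café "Deluxe"', price: '9,99' }],
  ['title', 'price']
);
console.log(csv.startsWith('\uFEFF')); // true
```

The BOM (`\uFEFF`) is what makes Excel open accented characters correctly; without it, UTF-8 CSVs often render as mojibake.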
- Download the latest release or clone this repository
- Open Chrome and go to chrome://extensions
- Enable "Developer mode" (toggle in top right)
- Click "Load unpacked"
- Select the Vibe-scraper folder
- Open the sidebar - Click the extension icon
- Create a job:
- Enter a Job ID (e.g., "products")
- Select scraping mode
- Use Smart Picker:
- Click "🎯 Smart Picker"
- Click any item on the page
- Fields are auto-detected!
- Start scraping - Click "▶️ Start Scraping"
- Download results - CSV downloads automatically when complete
The extension works entirely in the browser by default. For JS-heavy sites, large datasets, or scraping that needs to run in the background, an optional local Python server is included.
| Scenario | Extension only | Extension + Server |
|---|---|---|
| Static HTML sites | ✅ | ✅ |
| JS-rendered SPAs | ❌ | ✅ (Playwright) |
| Background scraping | ❌ | ✅ |
```shell
cd vibe-scraper-server
pip install -r requirements.txt
playwright install chromium  # one-time, ~300 MB
python server.py
```

Open the extension — a green "Local server connected" badge will appear. If it shows orange, the server is not running; click "How to start →" for instructions.
Then use "🖥️ Scrape via Server" alongside the usual Start Scraping button.
See vibe-scraper-server/README.md for full details.
Scrapes only the current page you're viewing.
Use when:
- You need data from one specific page
- No pagination required
Example:
```
https://example.com/products
→ Scrapes 24 items from this page only
```
Automatically clicks "Next" buttons to navigate through pages.
Use when:
- Site has visible Next/Previous buttons
- Content loads via traditional pagination
- Single Page Apps (SPAs) with dynamic content
Configuration:
- Next Button Selector: CSS selector for the button (or use 🎯 Detect)
- Max Pages: Maximum pages to scrape (default: 10)
Smart Detection: Click the 🎯 button, then click the "Next" button on the page - the selector is auto-filled!
Example:
```
Page 1: Scrape items → Click "Next"
Page 2: Scrape items → Click "Next"
Page 3: Scrape items → Done
```
Fetches pages via URL parameters without page reload.
Use when:
- URLs have ?page=X format
- Server-side rendered HTML
- No JavaScript required for content
- Want faster scraping (no DOM waits)
Configuration:
- Parameter Name: URL parameter (e.g., "page")
- Max Pages: Maximum pages to fetch
How it works:
- Content script scrapes page 1
- Background script fetches page 2 HTML via fetch()
- Content script parses HTML with DOMParser
- Repeat for all pages
Example:
```
Page 1: https://example.com?page=1 (scraped from DOM)
Page 2: https://example.com?page=2 (fetched in background)
Page 3: https://example.com?page=3 (fetched in background)
→ No page reload, no state loss!
```
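Building each page's URL can be done with the standard URL API; a small sketch (the function name is illustrative):

```javascript
// Return the URL for a given page number, preserving any existing
// query parameters on the base URL and overwriting the page param.
function pageUrl(base, paramName, pageNumber) {
  const url = new URL(base);
  url.searchParams.set(paramName, String(pageNumber));
  return url.toString();
}

console.log(pageUrl('https://example.com/products?sort=asc', 'page', 2));
// https://example.com/products?sort=asc&page=2
```

Using `URL`/`searchParams` instead of string concatenation avoids broken URLs when the base already contains a `?` or the target parameter.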
Limitations:
- Only works with static HTML
- Won't work on SPAs requiring JavaScript
- For SPAs, use Button Pagination instead
Automatically scrolls down to load more content.
Use when:
- Content loads on scroll (e.g., social media feeds)
- No "Load More" button
- Continuous scrolling behavior
Configuration:
- Max Scrolls: Maximum scroll attempts (default: 10)
- Delay: Milliseconds between scrolls (default: 3000ms)
Example:
```
Scroll 1 → Wait 3s → Scrape new items
Scroll 2 → Wait 3s → Scrape new items
Scroll 3 → Done
```
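The loop above amounts to scroll, wait, re-scrape until nothing new appears or the limit is hit. A minimal sketch with the scroll step injected so the stopping logic is visible (names are illustrative, not the extension's actual code):

```javascript
const sleep = (ms) => new Promise((r) => setTimeout(r, ms));

// Repeatedly invoke scrollStep (which scrolls and returns the current
// item count), waiting `delay` ms between attempts. Stops after
// maxScrolls or when a scroll yields no new items.
async function autoScroll(scrollStep, { maxScrolls = 10, delay = 3000 } = {}) {
  let lastCount = -1;
  for (let i = 0; i < maxScrolls; i++) {
    const count = await scrollStep();
    if (count === lastCount) break; // no new content loaded: stop early
    lastCount = count;
    await sleep(delay);
  }
  return lastCount;
}
```

In the extension, the scroll step would be a `window.scrollTo(0, document.body.scrollHeight)` followed by counting matches of the item selector.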
Automatically clicks "Load More" buttons.
Use when:
- Site has a "Load More", "Show More", or "View More" button
- Content loads dynamically after clicking
Configuration:
- Button Selector: CSS selector for the button (or use 🎯 Detect)
- Max Clicks: Maximum clicks (default: 10)
Smart Detection: Click the 🎯 button, then click the "Load More" button - selector auto-filled!
Example:
```
Click "Load More" → Wait → Scrape 12 new items
Click "Load More" → Wait → Scrape 12 new items
Button disappears → Done
```
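Like infinite scroll, Load More reduces to a click-wait-scrape loop that ends when the button is gone. A sketch with the button lookup injected so the exit conditions are easy to see (illustrative names, not the extension's actual code):

```javascript
const wait = (ms) => new Promise((r) => setTimeout(r, ms));

// Click the "Load More" button until it disappears or maxClicks is
// reached. `findButton` returns the button element or null.
async function clickLoadMore(findButton, { maxClicks = 10, delay = 1000 } = {}) {
  let clicks = 0;
  while (clicks < maxClicks) {
    const btn = findButton();
    if (!btn) break; // button removed from the DOM: all content loaded
    btn.click();
    clicks++;
    await wait(delay); // give the new items time to render
  }
  return clicks;
}
```

In the extension, `findButton` would be `document.querySelector(buttonSelector)`.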
The Smart Picker automatically detects:
- Item selector - Finds all similar items on the page
- Field selectors - Extracts text, links, images from each item
- Field names - Auto-generates names (title, price, url, etc.)
How to use:
- Click "🎯 Smart Picker"
- Click any item on the page (e.g., a product card)
- Fields are detected automatically
- Review in the preview panel
- Start scraping!
What it detects:
- Text content (product titles, descriptions)
- Links (href attributes)
- Images (src attributes)
- Prices (numerical text)
- And more...
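The auto-naming heuristics can be sketched as a classifier over an extracted value. This is an illustrative approximation, not the extension's actual rules:

```javascript
// Guess a field name for a raw value pulled out of an item element.
function guessFieldName(value) {
  const s = String(value).trim();
  // Image URLs: http(s) link ending in a common image extension
  if (/^https?:\/\/\S+\.(png|jpe?g|gif|webp|svg)(\?|$)/i.test(s)) return 'image';
  // Any other absolute link
  if (/^https?:\/\//i.test(s)) return 'url';
  // Currency symbol and/or decimal amount
  if (/^[$€£]?\s*\d+([.,]\d{2})?\s*(USD|EUR|GBP)?$/i.test(s)) return 'price';
  // Short text reads like a title, long text like a description
  if (s.length <= 80) return 'title';
  return 'description';
}
```

Real pages need more signals (element tag, `href`/`src` attributes, repeated structure across siblings), which is why the picker keys off the DOM rather than text alone.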
- Fast job creation
- Smart Picker integration
- All 5 scraping modes
- Toggle between Button/Query Param
- Real-time field preview
- Advanced job management
- Visual mode selector with icons
- Edit existing jobs
- Export/import configurations
- Job history
Browser-only mode (default — no setup required):
```
┌────────────────────────────────────┐
│ Sidebar (popup.html)               │
│ popup.js / ui.js / scraping.js     │
│ editor.js / state.js               │
└──────────────┬─────────────────────┘
               │ chrome.tabs.sendMessage
               ↓
┌────────────────────────────────────┐
│ Content Scripts                    │
│ scraper-runner.js ← main engine    │
│ smart-picker.js ← field detect     │
│ selector-utils.js ← shared util    │
│ pagination-detector.js             │
└──────────────┬─────────────────────┘
               │ chrome.runtime.sendMessage
               ↓
┌────────────────────────────────────┐
│ Background (service-worker.js)     │
│ - Job management & message routing │
│ - HTTP fetching for query-param    │
└────────────────────────────────────┘
```
With local server (optional — enables JS-heavy sites):
```
┌────────────────────────────────────┐
│ Sidebar (popup.html)               │
│ "🖥️ Scrape via Server" button      │
└──────────────┬─────────────────────┘
               │ POST /scrape (job config JSON)
               │ GET /status/{id} (polling)
               │ GET /download/{id}
               ↓
┌────────────────────────────────────┐
│ Local Python Server (port 7823)    │
│ FastAPI + uvicorn                  │
├────────────────────────────────────┤
│ httpx + BeautifulSoup (static HTML)│
│ Playwright (JS-rendered SPAs)      │
└────────────────────────────────────┘
```
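The server round-trip above (submit job, poll status, download results) could look like this from the extension side. This is a hedged sketch with `fetch` injected for testability; the response shapes (`id`, `state`, `message`) are assumptions, not taken from the actual server:

```javascript
const sleep = (ms) => new Promise((r) => setTimeout(r, ms));

// Submit a job, poll until the server reports it finished, then fetch
// the results. `fetchFn` defaults to the global fetch.
async function scrapeViaServer(job, { base = 'http://localhost:7823', fetchFn = fetch, interval = 1000 } = {}) {
  const res = await fetchFn(`${base}/scrape`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(job),
  });
  const { id } = await res.json();
  for (;;) {
    const status = await (await fetchFn(`${base}/status/${id}`)).json();
    if (status.state === 'done') break;
    if (status.state === 'error') throw new Error(status.message);
    await sleep(interval); // poll until the job completes
  }
  return (await fetchFn(`${base}/download/${id}`)).json();
}
```

Because the server runs on localhost, the extension only needs host permission for port 7823; all scraping traffic originates from the Python process.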
Problem: Traditional navigation causes page reload → state loss → can't continue scraping.
Solution: Split responsibilities:
- Background Script: fetches HTML via the fetch() API

  ```js
  const response = await fetch('https://example.com?page=2');
  const html = await response.text();
  return { html, url };
  ```

- Content Script: parses the HTML with DOMParser (it has DOM APIs; the service worker does not)

  ```js
  const parser = new DOMParser();
  const doc = parser.parseFromString(html, 'text/html');
  const items = doc.querySelectorAll(itemSelector);
  ```
Benefits:
- ✅ No page reload
- ✅ State preserved
- ✅ Faster scraping
- ✅ Works with any URL-based pagination
```
Vibe-scraper/
├── manifest.json                  # Extension configuration (MV3)
├── config/
│   └── example-config.json        # Example job config
├── src/
│   ├── popup/                     # Sidebar UI (ES modules)
│   │   ├── popup.html
│   │   ├── popup.js               # Entry point — wires events
│   │   ├── state.js               # Shared mutable state
│   │   ├── ui.js                  # DOM helpers
│   │   ├── scraping.js            # Start/stop/server scraping
│   │   ├── editor.js              # Job editor + Smart Picker
│   │   └── popup.css
│   ├── options/                   # Settings page
│   │   ├── options.html
│   │   ├── options.js
│   │   └── options.css
│   ├── content/                   # Content scripts
│   │   ├── scraper-runner.js      # Main scraping engine
│   │   ├── smart-picker.js        # Smart field detection
│   │   ├── selector-utils.js      # Shared CSS selector utility
│   │   └── pagination-detector.js # Auto-detect pagination type
│   └── background/
│       └── service-worker.js      # Background tasks & HTTP fetching
├── vibe-scraper-server/           # Optional local Python server
│   ├── server.py                  # FastAPI app (port 7823)
│   ├── scraper.py                 # httpx + Playwright scraping engine
│   ├── requirements.txt
│   └── README.md
└── public/
    └── icons/
```
- Chrome Extension APIs:
  - chrome.scripting - Script injection
  - chrome.tabs - Tab management
  - chrome.storage - Job storage
  - chrome.sidePanel - Sidebar UI
  - chrome.runtime - Messaging
- Built-in APIs:
  - DOMParser - HTML parsing (content script)
  - fetch() - HTTP requests (background)
  - querySelector() - DOM querying
No external libraries required! Pure vanilla JavaScript.
- Optimized for space efficiency
- All controls visible without scrolling
- Professional appearance
- Font sizes: 12px body, 11px labels, 10px small text
- iOS-style design
- Smooth 0.3s animations
- Blue active state (#2196F3)
- Clear visual feedback
- Shows current item being scraped
- Page number and progress (X/Y)
- Total items scraped
- Extracting/Complete status
```shell
# Clone the repository
git clone https://github.com/CreativeAcer/Vibe-scraper.git
cd Vibe-scraper

# Load in Chrome
# 1. Open chrome://extensions
# 2. Enable Developer mode
# 3. Click "Load unpacked"
# 4. Select the Vibe-scraper folder
```

The extension uses Manifest V3 with no build process required.
Edit and reload:
- Make changes to source files
- Go to chrome://extensions
- Click the reload button on Vibe Scraper
- Test changes
Test sites:
- https://scrapingtest.com/ecommerce/pagination - Button pagination
- https://scrapingtest.com/ecommerce/load-more - Load More buttons
- Any site with ?page=X URLs - Query param
Test each mode:
- Single Page
- Button Pagination (with Detect)
- Query Param Pagination
- Infinite Scroll
- Load More (with Detect)
Features:
- ✅ 5 scraping modes (Single, Button, Query Param, Infinite, Load More)
- ✅ Smart Picker with auto field detection
- ✅ Smart Detect for buttons
- ✅ Beautiful toggle UI (380px compact)
- ✅ Query param multi-page via background fetch
- ✅ Real-time progress overlay
- ✅ Stop button functionality
- ✅ CSV/JSON export
Technical:
- ✅ Background-orchestrated pagination for query params
- ✅ DOMParser in content script (service worker compatible)
- ✅ Relative URL resolution
- ✅ Stop flag for graceful termination
- ✅ Message channel timeout fix
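The stop flag mentioned above is a simple cooperative-cancellation pattern; a minimal sketch with illustrative names (not the extension's actual code):

```javascript
// Cooperative cancellation: the scraping loop checks a shared flag
// between pages rather than being killed mid-extraction, so partial
// results survive and the overlay can report a clean "stopped" state.
const state = { stopRequested: false };

async function scrapePages(scrapePage, maxPages) {
  const results = [];
  for (let page = 1; page <= maxPages; page++) {
    if (state.stopRequested) break; // graceful stop between pages
    results.push(...(await scrapePage(page)));
  }
  return results;
}
```

In the extension, the Stop button sends a message that flips the flag inside the content script's running loop.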
Contributions are welcome! Please feel free to submit a Pull Request.
Areas for contribution:
- Additional scraping modes
- More smart detection patterns
- UI improvements
- Bug fixes
- Documentation
MIT License - see LICENSE file for details
- Query Parameter Pagination (browser mode):
- Only works with static HTML
- Won't work on SPAs requiring JavaScript
- Solution: Use Button mode, or use the local server which handles JS rendering
- Smart Picker:
- Works best with consistent HTML structure
- May need manual adjustment for complex layouts
- CORS Restrictions (browser mode):
- Some sites block background fetch requests
- Solution: Use Button pagination, or use the local server (no CORS restrictions)
- No authentication support:
- Login automation and session handling are not implemented
- Start the scrape from a page where you're already logged in
- Start Small: Test with Max Pages = 2 first
- Use Smart Detect: Let the extension find selectors
- Check Console: Open DevTools to see debug logs
- Save Jobs: Use Settings page to manage configurations
- Export Early: Download CSV after successful test run
No items found?
- Check if the selector matches items: document.querySelectorAll('your-selector')
- Use Smart Picker to auto-detect
Pagination not working?
- Verify Next button selector
- Use Smart Detect (🎯)
- Check if it's a SPA (use Button mode)
Query param returns no data?
- Site might need JavaScript (use Button mode)
- Check if URL actually changes pages
- Verify parameter name is correct
For issues, questions, or feature requests:
- Open an issue on GitHub
- Check existing issues for solutions
- Include console logs for bug reports
Made with ❤️ for the web scraping community