Lynnux-useless-codes/Booru-Scraper

🌌 Booru-Scraper

License: Unlicense | Bash | Fast

A high-performance, modular, asynchronous scraper for Booru-style imageboards, written in Bash. Designed for speed, flexibility, and ease of use.

✨ Features

  • 🚀 Pipelined Execution: Metadata fetching and file downloading run asynchronously, so downloads start while later result pages are still being fetched.
  • 🧵 Multi-threaded: High-speed parallel downloads with configurable thread limits.
  • 🔄 Unlimited Downloading: Download all available results by setting amount to 0, -1, or *.
  • 🧩 Modular Drivers: Easily extendable with site-specific drivers (Rule34, Gelbooru, Safebooru, etc.).
  • 💎 Premium CLI UI: Real-time progress bars, animated loading spinners, and color-coded logging.
  • 💾 Smart Caching: Optional SHA-256 hash checking to prevent redundant downloads.
  • ⚙️ Flexible Config: Load settings from config.yaml or override them via command-line arguments.
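
To illustrate the Smart Caching idea, here is a minimal sketch of SHA-256 based deduplication. The function names (`is_cached`, `store_if_new`) and the one-digest-per-line index file are assumptions made for this example; the project's actual implementation may differ.

```bash
#!/usr/bin/env bash
# Hypothetical helpers: names and index-file layout are assumptions,
# not taken from the Booru-Scraper source.

# Print the SHA-256 digest of a file (digest only, no file name).
file_digest() {
    sha256sum < "$1" | cut -d' ' -f1
}

# Return 0 if the file's digest is already recorded in the index.
is_cached() {
    local file=$1 index=$2
    [ -f "$index" ] && grep -qxF "$(file_digest "$file")" "$index"
}

# Keep a freshly downloaded file only if its content is new.
# Duplicates are deleted and the function returns 1.
store_if_new() {
    local file=$1 index=$2
    if is_cached "$file" "$index"; then
        rm -f -- "$file"
        return 1
    fi
    file_digest "$file" >> "$index"
}
```

Calling `store_if_new path/to/file path/to/index` after each download would keep the first copy of any given content and discard later byte-identical copies, even when they arrive under different file names.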

🛠️ Installation

# Clone the repository
git clone https://github.com/lynnux-useless-codes/Booru-Scraper.git

# Enter the directory
cd Booru-Scraper

# Grant execution permissions
chmod +x downloader.sh

🚀 Quick Start

Basic Download

Download 20 images matching a tag:

./downloader.sh --amount 20 "sombra"

Download All Available

Download everything matching a tag:

./downloader.sh --amount 0 "tracer"
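
As a rough sketch of how the three "download everything" values (0, -1, and *) could be folded into a single internal value, consider the function below. The name `normalize_amount` and the `unlimited` sentinel are assumptions for illustration only and are not taken from the project.

```bash
#!/usr/bin/env bash
# Hypothetical helper: maps the user-supplied amount to either a positive
# count or the sentinel "unlimited". Not the project's actual code.
normalize_amount() {
    case "$1" in
        0|-1|'*')                      # all three mean "no limit"
            echo "unlimited" ;;
        ''|*[!0-9]*)                   # empty or contains a non-digit
            echo "invalid amount: $1" >&2
            return 1 ;;
        *)                             # plain positive integer
            echo "$1" ;;
    esac
}
```

When passing `*` on the command line, quote it (for example `--amount '*'`) so the shell does not expand it to the file names in the current directory.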

Using Site Drivers

Specify a different imageboard (e.g., Safebooru):

./downloader.sh --site safebooru --amount 10 "solo"
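
One way a `--site` value could be mapped to a driver file is sketched below. It assumes, purely for illustration, that each driver lives at `src/sites/<site>.sh` and is loaded by sourcing it; the real file naming and driver interface are not documented here, and `load_driver` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical loader: the <dir>/<site>.sh layout is an assumption,
# not confirmed by the project.
load_driver() {
    local site=$1 dir=${2:-src/sites}
    # Reject names that could escape the drivers directory.
    case "$site" in
        ''|*[!a-z0-9_-]*)
            echo "invalid site name: $site" >&2
            return 1 ;;
    esac
    local path="$dir/$site.sh"
    if [ ! -f "$path" ]; then
        echo "no driver for site: $site" >&2
        return 1
    fi
    # Source the driver so its functions become available to the caller.
    . "$path"
}
```

Restricting the site name to lowercase letters, digits, `_`, and `-` keeps a value such as `../something` from sourcing a file outside the drivers directory.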

📂 Documentation

Detailed guides for various aspects of the scraper can be found in the docs/ directory.

📋 Prerequisites

Ensure your system has the following utilities installed:

  • bash (v4.0+)
  • curl
  • jq
  • sha256sum
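
The prerequisites above can be verified before the first run with a short check like the following. This is a standalone sketch, not part of the project; `check_deps` and `bash_is_v4` are names chosen for this example.

```bash
#!/usr/bin/env bash
# Report every missing command on stderr; return non-zero if any is absent.
check_deps() {
    local missing=0 cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing dependency: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Succeed only when the running shell is Bash 4.0 or newer.
# BASH_VERSINFO[0] holds the major version number.
bash_is_v4() {
    [ "${BASH_VERSINFO[0]:-0}" -ge 4 ]
}

# Example usage:
#   check_deps curl jq sha256sum && bash_is_v4 || echo "please install the missing tools" >&2
```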

🏗️ Project Structure

.
├── downloader.sh       # Main entry point
├── config.yaml         # User configuration
├── docs/               # Advanced documentation
└── src/
    ├── core/           # Engine and config modules
    ├── sites/          # Site-specific drivers
    └── utils/          # Progress, logging, and helpers

📜 License

This project is released into the public domain under the Unlicense. Feel free to use, modify, and distribute without any restrictions.

About

A CLI tool to scrape and download content from Booru sites using their API.