Sggin1/crawl

Web Crawler GUI

A simple Windows GUI application for crawling websites and generating AI-optimized markdown documentation.

Features

  • 🌐 Web Crawling: Crawl websites using Crawl4AI
  • 📝 Markdown Generation: Convert web pages to clean, AI-optimized markdown
  • 🗂️ Flat File Structure: All output files in one directory with clear prefixes
  • 🎯 Simple UI: Easy-to-use tkinter interface
  • 🔄 Fallback Support: Option to use Docling if Crawl4AI fails
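The fallback behaviour listed above can be sketched roughly as follows. This is only an illustration, not the project's actual code: `convert_with_fallback` and the `Converter` type are hypothetical names, and the real Crawl4AI and Docling calls are assumed to be wrapped in plain functions that take a URL and return markdown.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical interface: a converter takes a URL and returns markdown text.
Converter = Callable[[str], str]


def convert_with_fallback(
    url: str, converters: Sequence[Tuple[str, Converter]]
) -> Tuple[str, str]:
    """Try each (name, converter) in order; return (name, markdown) of the first success."""
    errors = []
    for name, convert in converters:
        try:
            markdown = convert(url)
        except Exception as exc:  # a real app might catch narrower exception types
            errors.append(f"{name}: {exc}")
            continue
        # Treat empty or whitespace-only output as a failure as well.
        if markdown and markdown.strip():
            return name, markdown
        errors.append(f"{name}: empty output")
    raise RuntimeError("all converters failed: " + "; ".join(errors))
```

Passing the converters as an ordered list keeps the fallback order explicit, for example `[("crawl4ai", crawl4ai_fn), ("docling", docling_fn)]`, where both wrapper functions are assumed to exist elsewhere.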

Installation

# Install dependencies
pip install -r requirements.txt

# Run the application
python main.py

Usage

  1. Enter a URL in the input field
  2. Configure output options
  3. Click "Start Crawl"
  4. Wait for completion
  5. Find your markdown files in the output directory

Output File Prefixes

  • INDEX_* - Table of contents/index files
  • HOME_* - Homepage/landing pages
  • GUIDE_* - Tutorials and guides
  • API_* - API documentation
  • EXAMPLE_* - Code examples
  • FAQ_* - FAQ content
  • REF_* - Reference documentation
  • BLOG_* - Blog posts
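To make the flat naming scheme concrete, here is a minimal sketch of how a URL might be mapped to a prefixed filename. The keyword rules, the `REF` default, and the `output_filename` helper are assumptions made for illustration; this README does not describe how the application actually classifies pages.

```python
import re
from urllib.parse import urlparse

# Hypothetical keyword rules, checked in order; the first match wins.
PREFIX_RULES = [
    ("API", ("api",)),
    ("GUIDE", ("guide", "tutorial")),
    ("EXAMPLE", ("example",)),
    ("FAQ", ("faq",)),
    ("BLOG", ("blog",)),
    ("INDEX", ("index", "sitemap", "contents")),
]


def _slug(text: str) -> str:
    """Collapse every run of non-alphanumeric characters into one underscore."""
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")


def output_filename(url: str) -> str:
    """Build a flat '<PREFIX>_<slug>.md' filename from a page URL."""
    parsed = urlparse(url)
    path = parsed.path.strip("/").lower()
    if not path:
        # A bare domain is treated as the homepage.
        return f"HOME_{_slug(parsed.netloc)}.md"
    # Drop a common page extension so it does not end up in the slug.
    path = re.sub(r"\.(html?|php|aspx?)$", "", path)
    tokens = re.split(r"[^a-z0-9]+", path)
    for prefix, keywords in PREFIX_RULES:
        if any(tok.startswith(kw) for tok in tokens for kw in keywords):
            return f"{prefix}_{_slug(path)}.md"
    # Assumed default for pages that match no rule.
    return f"REF_{_slug(path)}.md"
```

Under these assumed rules, `https://example.com/docs/api/users` maps to `API_docs_api_users.md` and `https://example.com/` maps to `HOME_example_com.md`. Because the whole URL path is folded into the slug, files from different site sections stay distinct in a single output directory.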

Development

See CODING_STANDARDS.md for detailed coding standards and guidelines.

Project Structure

crawl-gui/
├── main.py                    # Entry point
├── gui/                       # GUI components (W_*, P_*, C_*)
├── crawler/                   # Crawling logic
├── markdown/                  # Markdown generation
├── storage/                   # File I/O
└── utils/                     # Utilities

License

MIT
