A comprehensive tool to download websites and YouTube videos for offline use.
OffInternet is a Python-based tool designed to reliably save complex web content—including complete websites with all assets or YouTube videos with related content—to your local machine for offline access.
- Complete Website Archival: Downloads entire websites including HTML, CSS, JavaScript, images, and other assets
- YouTube Video Downloader: Downloads YouTube videos and optionally fetches related/suggested videos
- Offline Access: Download once, view anytime—perfect for archiving, research, or areas with poor connectivity
- Cloudflare Bypass: Built-in support for websites protected by Cloudflare security
- Smart Crawling: Intelligent URL filtering with relaxed rules to capture more content
- Bundle Export: Saves all downloaded content as
.pagefiles (zipped archives with metadata) - Python-Powered: Built with 100% Python for transparency and easy customization
- Python 3.10+ installed on your system
- Chrome/Chromium browser (for Cloudflare bypass and JavaScript rendering)
- ChromeDriver (automatically managed by the script)
-
Clone the repository:
git clone https://github.com/lazerkatsweirdstuff/offinternet.git cd offinternet -
Install required dependencies:
pip install -r requirements.txt
-
Run the downloader:
python downloader.py
-
View downloaded content:
python page-browser.py
The browser will run on localhost:8000
python downloader.py -y https://youtube.com/watch?v=VIDEO_ID -m 5This downloads the video and up to 5 suggested videos. The -m flag sets the maximum number of videos to download.
python downloader.py -f urls.txt -m 10Downloads websites from a text file (urls.txt), with a limit of 10 pages per site.
python downloader.py https://example.com -m 20 --skip-assetsDownloads a website deeply (20 pages) but skips images and stylesheets to save space.
| Argument | Description |
|---|---|
URL |
Direct URL to download (website or YouTube) |
-y, --youtube |
Force YouTube downloader mode |
-w, --website |
Force website downloader mode |
-f FILE, --file FILE |
File containing list of URLs |
-o DIR, --output DIR |
Custom output directory |
-m N, --max-pages N |
Maximum pages/videos to download (default: 2) |
-d N, --depth N |
Crawl depth for websites (default: 1) |
-q, --quiet |
Quiet mode (no interactive prompts) |
-v, --verbose |
Verbose output |
--skip-assets |
Skip downloading CSS/images/assets |
--yt-quality QUALITY |
YouTube quality: best, 720p, 480p, 360p, worst (default: 720p) |
--yt-format FORMAT |
Custom YouTube download format string |
offinternet/
├── page-downloader.py # Main downloader
├── page-browser.py # Local web server for viewing .page files
├── README.md # This file
├── downloaded_sites/ # Default location for saved websites
| ├── youtube_videos/ # Default location for saved YouTube videos
└── executables/ # (Work in progress) Pre-built binaries
- Intelligent Crawling: Starts from a URL and follows internal links with configurable depth.
- Asset Capture: Downloads HTML, CSS, JavaScript, images, fonts, and other resources.
- Link Rewriting: Converts web URLs to local file paths for offline navigation.
- Bundle Creation: Packages everything into a
.pagefile (a ZIP archive with metadata).
- Video Download: Downloads the specified video using yt-dlp.
- Metadata Capture: Saves title, description, thumbnail, and video info.
- Related Content: Optionally downloads suggested videos.
- Embedded Player: Creates an HTML page with the video embedded for offline viewing.
- Format: Saved as
.pagefiles (ZIP archives containing HTML, assets, and metadata) - Viewing: Open
.pagefiles with the includedpage-browser.pyor any ZIP file utility - Compatibility: Works on Windows, macOS, and Linux
- Dependencies: See
requirements.txtfor complete list
The project is under active development. Pre-built executables for easier use might be available in the /executables directory (these are a work in progress, so the Python version is recommended for reliability).
Contributions are welcome! If you encounter a website that doesn't download correctly with OffInternet:
Submit an Issue: Open a GitHub Issue detailing the page you'd like to add support for
BSD-2-Clause license