✅ morphic.py - The main OCR tool with smart DPI handling
✅ requirements.txt - Python dependencies
✅ README.md - Full GitHub documentation
✅ UV_INSTALL.md - Fast installation with UV (optional)
Fast way (with UV - recommended):
# Install UV (one-time setup)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install dependencies (~5 seconds)
uv pip install -r requirements.txt
Traditional way (with pip):
pip3.11 install -r requirements.txt # Takes ~45 seconds
# macOS
brew install poppler
# Ubuntu/Debian
sudo apt-get install poppler-utils
Morphic needs your utilities.py file with:
Print(logType: str, message: str)
CPU_and_Mem_usage() -> str
Put it in the same folder as morphic.py.
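If you don't have a utilities.py at hand, a placeholder along these lines matches the two signatures listed above. It is a minimal standard-library sketch, not the original file: the log line format and the choice of load average plus peak memory as the "usage" figures are assumptions.

```python
# utilities.py - placeholder sketch (assumed behavior, not the original file)
import os
import resource  # Unix-only; available on macOS and Linux
import sys
from datetime import datetime


def Print(logType: str, message: str) -> None:
    """Write one timestamped log line; the format here is an assumption."""
    stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    print(f"[{stamp}] [{logType.upper()}] {message}", flush=True)


def CPU_and_Mem_usage() -> str:
    """Return a short usage summary for the current process."""
    load_1min = os.getloadavg()[0]
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    peak_mb = peak / (1024 * 1024) if sys.platform == "darwin" else peak / 1024
    return f"load(1m)={load_1min:.2f} peak_mem={peak_mb:.0f}MB"
```

Swap in your real utilities.py once you have it; Morphic only needs the two function names and signatures to match.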
python3.11 morphic.py
You'll see:
╔══════════════════════════════════════════════════════════╗
║ MORPHIC ║
║ Intelligent OCR with Downsampling ║
╚══════════════════════════════════════════════════════════╝
# Simple: PDF → Searchable PDF
python3.11 morphic.py \
--input-pdf-file your_scan.pdf \
--output-pdf-file searchable.pdf
# With downsampling: 600 DPI OCR → 300 DPI output
python3.11 morphic.py \
--input-pdf-file your_scan.pdf \
--output-pdf-file web.pdf \
--source-dpi 600 \
--output-pdf-dpi 300 \
--output-pdf-images-format jp2
# Master (600 DPI, ~800 MB for 200 pages)
python3.11 morphic.py \
--input-image-folder ~/scans/book/ \
--output-pdf-file master_600dpi.pdf \
--output-pdf-dpi 600 \
--output-pdf-images-format jp2
# Web (300 DPI, ~200 MB, same OCR quality!)
python3.11 morphic.py \
--input-image-folder ~/scans/book/ \
--output-pdf-file web_300dpi.pdf \
--output-pdf-dpi 300 \
--output-pdf-images-format jp2
# Email (150 DPI, ~50 MB, same OCR quality!)
python3.11 morphic.py \
--input-image-folder ~/scans/book/ \
--output-pdf-file email_150dpi.pdf \
--output-pdf-dpi 150 \
--output-pdf-images-format jpeg
All three PDFs have identical OCR text - only image resolution differs!
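The master, web, and email runs differ only in output file, DPI, and image format, so they are easy to script. The sketch below builds the same three command lines using only the flags shown above; PROFILES and build_command are hypothetical names, and the subprocess call is left commented out so the block has no side effects.

```python
import os

# (output file, output DPI, image format), taken from the three commands above
PROFILES = [
    ("master_600dpi.pdf", 600, "jp2"),
    ("web_300dpi.pdf", 300, "jp2"),
    ("email_150dpi.pdf", 150, "jpeg"),
]


def build_command(image_folder, output_file, dpi, image_format):
    """Return the argument list for one morphic.py run."""
    return [
        "python3.11", "morphic.py",
        "--input-image-folder", os.path.expanduser(image_folder),
        "--output-pdf-file", output_file,
        "--output-pdf-dpi", str(dpi),
        "--output-pdf-images-format", image_format,
    ]


commands = [build_command("~/scans/book/", *profile) for profile in PROFILES]
for cmd in commands:
    print(" ".join(cmd))
    # import subprocess; subprocess.run(cmd, check=True)  # uncomment to actually run
```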
✅ Auto-DPI Detection - Reads from image EXIF, no guessing
✅ Post-OCR Downsampling - OCR at max resolution, downsample after
✅ JPEG2000 Support - True JP2/JPX via PyMuPDF (not reportlab)
✅ No False Claims - WebP properly rejected (not supported in PDF)
✅ Clean Help - Running with no args shows usage, not hanging
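To make post-OCR downsampling concrete, here is the pixel arithmetic for a single page. This is an illustrative sketch, not code from morphic.py; the function name and the US Letter page size are assumptions.

```python
def downsampled_size(width_px, height_px, source_dpi, output_dpi):
    """Scale pixel dimensions so the physical page size stays the same."""
    scale = output_dpi / source_dpi
    return round(width_px * scale), round(height_px * scale)


# A US Letter page (8.5 x 11 in) scanned at 600 DPI is 5100 x 6600 pixels.
for dpi in (600, 300, 150):
    print(dpi, downsampled_size(5100, 6600, 600, dpi))
# 600 (5100, 6600)
# 300 (2550, 3300)
# 150 (1275, 1650)
```

Halving the DPI quarters the pixel count. The OCR text layer is produced before this step, so it is unaffected by the smaller images.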
| Flag | Purpose | Example |
|---|---|---|
| --input-pdf-file | OCR a PDF | scan.pdf |
| --input-image-folder | OCR image folder | ./scans/ |
| --output-pdf-file | Save result (required) | output.pdf |
| --source-dpi | OCR resolution | 600 (default) |
| --output-pdf-dpi | Output resolution | 300 (downsamples) |
| --output-pdf-images-format | Compression | jp2, png, jpeg |
| --debug | Verbose logging | (flag) |
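For orientation, the flags listed here map naturally onto an argparse parser. The following is a sketch of that mapping, not morphic.py's actual parser: the mutually exclusive input group, the help strings, and the no-argument handling are assumptions beyond what the flag list states.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="morphic.py", description="OCR with downsampling")
    source = parser.add_mutually_exclusive_group(required=True)  # assumption: exactly one input
    source.add_argument("--input-pdf-file", help="PDF to OCR")
    source.add_argument("--input-image-folder", help="folder of images to OCR")
    parser.add_argument("--output-pdf-file", required=True, help="where to save the result")
    parser.add_argument("--source-dpi", type=int, default=600, help="OCR resolution")
    parser.add_argument("--output-pdf-dpi", type=int, help="output resolution")
    parser.add_argument("--output-pdf-images-format", choices=["jp2", "png", "jpeg"])
    parser.add_argument("--debug", action="store_true", help="verbose logging")
    return parser


def parse(argv):
    parser = build_parser()
    if not argv:  # no arguments: show usage and exit instead of waiting for input
        parser.print_help()
        raise SystemExit(0)
    return parser.parse_args(argv)


args = parse(["--input-pdf-file", "scan.pdf", "--output-pdf-file", "output.pdf",
              "--output-pdf-dpi", "300"])
print(args.source_dpi, args.output_pdf_dpi, args.debug)  # 600 300 False
```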
Run: pip3.11 install -r requirements.txt
Copy your utilities.py to the morphic folder
Install poppler: brew install poppler (macOS) or sudo apt-get install poppler-utils (Linux)
- Check if utilities.py is in the same directory
- Make sure all dependencies are installed
- Try running with the --debug flag
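These checks can be automated before a run. The sketch below assumes that pdftoppm is the poppler binary that needs to be on your PATH (it ships with both the Homebrew poppler formula and the poppler-utils package); the preflight function name is hypothetical.

```python
import shutil
from pathlib import Path


def preflight(tool_dir="."):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    folder = Path(tool_dir)
    for name in ("morphic.py", "utilities.py"):
        if not (folder / name).is_file():
            problems.append(f"{name} not found in {folder.resolve()}")
    # Assumption: pdftoppm stands in for "poppler is installed".
    if shutil.which("pdftoppm") is None:
        problems.append("poppler not found on PATH "
                        "(brew install poppler / sudo apt-get install poppler-utils)")
    return problems


for problem in preflight():
    print("MISSING:", problem)
```

Run it from the morphic folder; no output means the files and poppler were found.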
✅ Fixed: JPEG2000 actually works (uses PyMuPDF not reportlab)
✅ Fixed: WebP explicitly rejected (was claiming support)
✅ Fixed: Text color is white (was black in Qwen3's v2)
✅ Added: Auto-DPI detection from EXIF
✅ Added: Nice help display when run with no args
✅ Added: UV installation support (10-100× faster)
morphic/
├── morphic.py # Main tool
├── utilities.py # Your logging (you provide this)
├── requirements.txt # Python dependencies
├── README.md # Full documentation
└── UV_INSTALL.md # Fast install guide
- ✅ Install dependencies
- ✅ Copy your utilities.py
- ✅ Test with: python3.11 morphic.py --help
- ✅ Run your first OCR
- 🚀 Push to GitHub!
You're ready to process your 600dpi scans! 🔮
- Full docs: README.md
- UV guide: UV_INSTALL.md
- Source code: morphic.py (well commented)
Happy OCR'ing!