DrAbdulmalek/OmniFile_Processor

⚠️ LEGACY NOTICE: This repository is in maintenance mode. Active development has moved to omni-medical-suite. See LEGACY_NOTICE.md for the full migration plan.



title: OmniFile AI Processor
emoji: 🧠
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
license: mit

🧠 OmniFile AI Processor v5.0.0

A Comprehensive AI System for File Processing, Text Analysis & Handwriting Recognition

Python 3.10+ License: MIT CI Tests HF Spaces GitHub

Version: v5.0.0 | Status: ✅ CI-Verified

🌐 Live Demo (HF Spaces) | 🧪 HF Lab Space | 📘 Documentation | 🧩 Dependency Profiles | 📓 Colab Debug Notebook | 🗂️ Prioritized Suggestions | 🐛 Report Bug | 💡 Suggestions


👨‍💻 About the Author

Dr Abdulmalek Tamer Al-husseini

📍 Location: Homs, Syria
📧 Email: Abdulmalek.husseini@gmail.com
🐙 GitHub: DrAbdulmalek
🤗 HuggingFace: DrAbdulmalek


📖 Description

OmniFile AI Processor is a production-ready, multimodal AI system that integrates six projects into a unified platform for document intelligence:

OmniFile_Processor + HandwrittenOCR + handwriting-ocr + arabic-ocr-pro + advanced-ocr + OCR-Enhancer

It supports Arabic, English, and German, with specialized modules for computer vision, language processing, security, and export.


✨ Features

🔍 Computer Vision & OCR

  1. Multi-Engine OCR: 4 engines (TrOCR, EasyOCR, Tesseract, PaddleOCR) with intelligent engine selection
  2. Result Fusion: 4 strategies (highest confidence, weighted average, voting, longest text)
  3. Advanced Preprocessing: CLAHE, deskew, denoise, Otsu thresholding, ONNX Runtime acceleration
  4. Layout Analysis: Automatic detection of tables, headers, footers, and document structure
  5. Table Extraction: Hough line detection + contour analysis for structured data extraction
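
As an illustration of the four fusion strategies named above, here is a minimal sketch. It is not the code in modules/vision/result_fusion.py: the `EngineResult` type, the `fuse` function, and the strategy names are assumptions made for this example, and "weighted average" is read here as a confidence-weighted vote over identical texts, which is only one possible interpretation.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class EngineResult:
    """One engine's reading of the same image region (hypothetical type)."""
    engine: str
    text: str
    confidence: float  # assumed to lie in [0, 1]


def fuse(results: list[EngineResult], strategy: str = "highest_confidence") -> str:
    """Pick one text from several engine outputs using the named strategy."""
    if not results:
        raise ValueError("no OCR results to fuse")
    if strategy == "highest_confidence":
        return max(results, key=lambda r: r.confidence).text
    if strategy == "longest_text":
        return max(results, key=lambda r: len(r.text)).text
    if strategy == "voting":
        # Majority vote on exact text; ties go to the text seen first.
        return Counter(r.text for r in results).most_common(1)[0][0]
    if strategy == "weighted_average":
        # Sum confidence per distinct text and keep the heaviest candidate.
        weights: dict[str, float] = {}
        for r in results:
            weights[r.text] = weights.get(r.text, 0.0) + r.confidence
        return max(weights, key=weights.get)
    raise ValueError(f"unknown strategy: {strategy}")
```

Voting and the weighted variant only help when engines agree character for character, so a real pipeline would probably normalize whitespace and Unicode before comparing.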

🗣️ Natural Language Processing

  1. Multilingual Spell Correction: Arabic, English, German with user-learning capability (186+ Arabic corrections)
  2. RTL Text Processing: Full Arabic reshaping + BiDi support with 40+ normalization mappings
  3. Mixed-Text Handling: Arabic/English/numbers with medical term protection
  4. Translation Engine: Helsinki-NLP/opus-mt supporting 6 language pairs
  5. AI Summarization: BART (facebook/bart-large-cnn) + Arabic (UAE-Code/mbart-summarization-ar)
  6. Entity Extraction & Text Classification: BERT-based NER with 6-category classification
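
To give a feel for what a normalization mapping does, the sketch below folds a few common Arabic variants before text is compared or spell-checked. The five mappings and the `normalize_arabic` name are assumptions chosen for illustration; the project's actual 40+ rules in modules/nlp/arabic_rtl.py are not reproduced here. Code points are written as escapes so the example reads the same in any editor.

```python
import re
import unicodedata

# Hypothetical subset of mappings; the project's real table is not shown here.
_CHAR_MAP = str.maketrans({
    "\u0623": "\u0627",  # alef with hamza above -> bare alef
    "\u0625": "\u0627",  # alef with hamza below -> bare alef
    "\u0622": "\u0627",  # alef with madda -> bare alef
    "\u0649": "\u064A",  # alef maksura -> yeh
    "\u0640": None,      # tatweel (kashida) is dropped
})
_DIACRITICS = re.compile("[\u064B-\u0652]")  # fathatan through sukun


def normalize_arabic(text: str) -> str:
    """Fold presentation forms, strip diacritics, and unify letter variants."""
    text = unicodedata.normalize("NFKC", text)
    text = _DIACRITICS.sub("", text)
    text = text.translate(_CHAR_MAP)
    return re.sub(r"\s+", " ", text).strip()
```

Folding alef maksura into yeh is aggressive and can merge distinct words, so whether to apply it depends on the downstream task.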

🤖 AI Enhancement

  1. GPT & Gemini Refinement: Context-aware OCR correction with block-type-specific prompts
  2. SSIM Pattern Matching: Self-learning from corrected word images with SQLite pattern database

🌐 AI Gateway

  1. Universal AI Model Proxy: Drop-in proxy server compatible with the Messages API protocol, routing to 8+ providers (DeepSeek, NVIDIA NIM, OpenRouter, Ollama, LM Studio, llama.cpp, Kimi, Wafer)
  2. Multi-Provider Routing: Intelligent model routing with per-tier selection (Opus/Sonnet/Haiku levels) and automatic fallback
  3. Streaming SSE: Full Server-Sent Events streaming with thinking block support and tool call conversion
  4. Request Optimization: Fast-path optimizations including quota mock, prefix detection, title/suggestion skip
  5. Rate Limiting & Concurrency: Configurable per-provider rate limits, concurrency control, and timeout management
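
Per-tier routing with automatic fallback can be pictured as an ordered candidate list per tier, tried in sequence until one backend answers. The sketch below assumes that shape; the `ROUTES` table, the model names, `ProviderError`, and the `call` signature are all hypothetical and do not describe the gateway's real configuration format.

```python
from typing import Callable

# Hypothetical tier -> ordered (provider, model) candidates; names are illustrative.
ROUTES: dict[str, list[tuple[str, str]]] = {
    "opus": [("deepseek", "large-model"), ("open_router", "large-model")],
    "sonnet": [("nvidia_nim", "medium-model"), ("ollama", "medium-model")],
    "haiku": [("ollama", "small-model")],
}


class ProviderError(RuntimeError):
    """Raised by a backend call that failed or was rate-limited."""


def route(tier: str, prompt: str,
          call: Callable[[str, str, str], str]) -> tuple[str, str]:
    """Try each candidate for the tier in order; return (provider, reply)."""
    errors = []
    for provider, model in ROUTES[tier]:
        try:
            return provider, call(provider, model, prompt)
        except ProviderError as exc:
            errors.append(f"{provider}: {exc}")  # remember why, then move on
    raise ProviderError("all providers failed: " + "; ".join(errors))
```

Passing the transport in as `call` keeps the routing logic testable without a network connection.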

📝 HTR Training System

  1. Arabic HTR Pipeline: End-to-end Arabic handwritten text recognition (LineSegmenter → TrOCR → DottedRecovery)
  2. LoRA Fine-Tuning: Efficient TrOCR fine-tuning with PEFT/LoRA (4-8 GB VRAM) for Arabic handwriting
  3. Active Learning: Uncertainty/Diversity/Hybrid sampling to select informative samples for annotation
  4. Synthetic Data Generation: Arabic handwriting synthesis with font variation and degradation simulation
  5. Mobile Review Integration: Bidirectional sync between training pipeline and mobile review interface
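
Of the three sampling modes, uncertainty sampling is the simplest to show: send the samples the model is least sure about to annotators first. This is a minimal least-confidence sketch, assuming each sample already has a single confidence score; the function name and input shape are hypothetical and unrelated to training/scripts/active_learning_pipeline.py.

```python
def select_for_annotation(confidences: dict[str, float], budget: int) -> list[str]:
    """Return up to `budget` sample ids, least confident first."""
    # Sort by confidence, breaking ties on the id so the result is deterministic.
    ranked = sorted(confidences, key=lambda sid: (confidences[sid], sid))
    return ranked[:max(budget, 0)]
```

For example, `select_for_annotation({"a": 0.9, "b": 0.2, "c": 0.5}, 2)` picks `b` and `c`. Diversity and hybrid modes would additionally need a feature representation per sample, which is outside this sketch.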

📤 Multi-Format Export

  1. 6 Export Formats: DOCX (RTL support), HTML, searchable PDF, Excel, JSON (with BBox), TXT (UTF-8 BOM)

🔒 Security & Privacy

  1. PII Detection: Presidio-based sensitive data scanning + detect-secrets
  2. File Encryption: Fernet (AES-128) with folder support
  3. Code Protection: Prevents spell correction inside code blocks
  4. Audit Logging: File + Redis audit trail with rate limiting (slowapi + Nginx)
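
Code protection amounts to splitting text into prose and code spans and running the corrector on the prose only. The sketch below assumes Markdown-style backtick delimiters and a caller-supplied `correct` function; both are assumptions for illustration and may differ from what modules/security/code_protector.py actually detects.

```python
import re
from typing import Callable

_FENCE = "`" * 3  # triple backtick, built this way to keep the example readable
# Match fenced blocks first, then single-line inline code spans.
_CODE = re.compile(_FENCE + r".*?" + _FENCE + r"|`[^`\n]+`", re.DOTALL)


def correct_outside_code(text: str, correct: Callable[[str], str]) -> str:
    """Apply `correct` to prose only, leaving code spans untouched."""
    out, pos = [], 0
    for match in _CODE.finditer(text):
        out.append(correct(text[pos:match.start()]))  # prose before the code span
        out.append(match.group(0))                    # code span, copied verbatim
        pos = match.end()
    out.append(correct(text[pos:]))                   # trailing prose
    return "".join(out)
```

Because the corrector sees each prose segment separately, it cannot use context that spans a code block, which is usually an acceptable trade-off.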

📊 Evaluation

  1. CER/WER Metrics: OCR accuracy evaluation with Arabic normalization + Levenshtein distance
  2. Quality Grading: A+ to F with actionable recommendations
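
CER and WER are both an edit distance divided by the reference length, counted in characters and words respectively. The following dependency-free sketch shows the standard computation; it is written for this README and is not copied from modules/evaluation/metrics.py, and it leaves out the Arabic normalization step and the grade thresholds.

```python
def levenshtein(ref, hyp) -> int:
    """Edit distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (r != h)))   # substitution or match
        prev = cur
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character edits divided by reference length."""
    return levenshtein(reference, hypothesis) / max(len(reference), 1)


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word edits divided by reference word count."""
    ref_words = reference.split()
    return levenshtein(ref_words, hypothesis.split()) / max(len(ref_words), 1)
```

Both rates can exceed 1.0 when the hypothesis is much longer than the reference, so any grading scale built on them should clamp or handle that case.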

🖥️ Multiple Interfaces

  1. 5 UIs: Streamlit (6 tabs), Gradio (7 tabs), React + shadcn/ui (dark/light), CLI, PyQt6 desktop
  2. FastAPI Backend: Full REST API with Swagger documentation

🚀 Scalability & Deployment

  1. Docker + Compose: One-command deployment with all services
  2. Kubernetes Ready: Complete K8s manifests with HPA (2-10 pods auto-scaling)
  3. Celery + Redis: Asynchronous task processing for heavy workloads

🚀 Quick Start

Option 1: HuggingFace Spaces (Recommended for Demo)

The project is deployed and available at: 👉 Production demo: https://huggingface.co/spaces/DrAbdulmalek/handwriting-ocr

👉 Experimental lab space: https://huggingface.co/spaces/DrAbdulmalek/omnifile-processor-lab

To deploy your own instance:

git clone https://github.com/DrAbdulmalek/OmniFile_Processor.git
cd OmniFile_Processor
pip install -r requirements-hf.txt
python -m src.gradio_ui

Option 2: Local Installation (Linux / macOS / Windows)

# Clone the repository
git clone https://github.com/DrAbdulmalek/OmniFile_Processor.git
cd OmniFile_Processor

# Install dependencies
pip install -r requirements-full.txt          # Everything (~6-8 GB, ~15-30 min)

# Or install in layers:
# pip install -r requirements-core.txt          # Minimum (~1.5 GB, ~3 min)
# pip install -r requirements-core.txt -r requirements-ocr.txt   # + OCR engines
# pip install -r requirements-core.txt -r requirements-nlp.txt   # + NLP

# Run with your preferred interface
streamlit run app.py          # Streamlit UI (6 tabs)
python -m src.gradio_ui       # Gradio UI (7 tabs)
python main.py                # CLI interface
cd frontend && npm install && npm run dev  # React Frontend

Option 3: Docker Compose (Full Stack)

git clone https://github.com/DrAbdulmalek/OmniFile_Processor.git
cd OmniFile_Processor
docker-compose up -d

# Access:
# API Docs:    http://localhost:5001/docs
# Streamlit:   http://localhost:7860
# React:       http://localhost:3000
# Nginx Proxy: http://localhost

Option 3.5: Mobile Review → Training Data

python mobile_review/server.py --host 0.0.0.0 --port 5000
# After reviewing and saving:
python mobile_review/server.py --export-dataset --export-output mobile_review/dataset/review_dataset

# Or use the tool directly:
python tools/build_training_data.py --corrections mobile_review/ocr_corrected.json --output dataset/review_dataset

Option 4: Google Colab

Using the ready-made notebook is recommended: 👉 notebooks/OmniFile_Processor_Colab_Debug.ipynb

!git clone https://github.com/DrAbdulmalek/OmniFile_Processor.git
%cd OmniFile_Processor
!apt-get update -qq
!apt-get install -y -qq poppler-utils tesseract-ocr tesseract-ocr-ara tesseract-ocr-eng libgl1
!pip install -r requirements-colab.txt
!python hf_app.py

For more on the differences between installation profiles, see docs/DEPENDENCY_PROFILES.md.

Option 5: AI Gateway (Universal Model Proxy)

The OmniFile AI Gateway provides a universal proxy for routing AI model requests to multiple providers:

# Clone and install gateway dependencies
git clone https://github.com/DrAbdulmalek/OmniFile_Processor.git
cd OmniFile_Processor
pip install -r requirements-gateway.txt

# Configure your provider (edit modules/ai/gateway/.env)
cp modules/ai/gateway/.env.example modules/ai/gateway/.env

# Start the gateway
bash scripts/start_gateway.sh

# Or start manually:
python -m uvicorn modules.ai.gateway.server:app --host 0.0.0.0 --port 8082

Supported Providers: DeepSeek, NVIDIA NIM, OpenRouter, Ollama, LM Studio, llama.cpp, Kimi, Wafer


🧪 Benchmark & Review Utilities

# Benchmark OCR engines on a folder of images
python tools/benchmark_ocr.py --images examples/ --output artifacts/ocr_benchmark

# Optional ground truth comparison
python tools/benchmark_ocr.py --images examples/ --ground-truth examples/ground_truth.json --output artifacts/ocr_benchmark

📁 Project Structure

Architecture Note: the project contains two parallel systems, plus a set of root files that tie them together:

  • modules/: the extended, theoretical architecture. Clearly organized modules (vision, nlp, security, export, ai, evaluation) with Pydantic v2 models. This is the intended future architecture of the project.
  • src/: the practical HandwrittenOCR engine. It contains the actual application used by the Gradio interface (src/gradio_ui.py) and HF Spaces, and includes TrOCR Batch, LoRA Fine-tuning, Active Learning, and Study Guide.
  • Root files (app.py, database.py, config.py): the integration layer that connects the architecture and the engines.

Currently adopted approach: src/ is the working code behind the Gradio interface and HF Spaces, while modules/ represents the organized, theoretical architecture of the extended project. The gradual migration from src/ to modules/ will happen in stages through separate Pull Requests.

OmniFile_Processor/
├── app.py                          # Main Streamlit UI
├── config.py                       # Central configuration v4.1.1
├── database.py                     # SQLite database layer
├── main.py                         # Local / CLI entry point
├── tasks.py                        # Celery async tasks
├── requirements.txt                # Full dependencies (legacy)
├── requirements-core.txt           # Core only (~1.5 GB)
├── requirements-ocr.txt            # OCR engines layer
├── requirements-nlp.txt            # NLP layer
├── requirements-full.txt           # Everything (~6-8 GB)
├── requirements-hf.txt             # HuggingFace Spaces (minimal)
├── Dockerfile                      # Docker image
├── docker-compose.yml              # Full stack orchestration
├── nginx.conf                      # Nginx load balancer
├── LICENSE                         # MIT License
│
├── modules/
│   ├── core/                       # Core data models
│   │   └── structure.py            #   Pydantic v2 models
│   │
│   ├── vision/                     # Computer Vision & OCR
│   │   ├── ocr_engine.py           #   4 OCR engines + ONNX + Quantization
│   │   ├── image_preprocessor.py   #   CLAHE + Denoise + Deskew + Otsu
│   │   ├── pdf_processor.py        #   Multi-format PDF processing
│   │   ├── text_reconstructor.py   #   RTL/LTR sentence reconstruction
│   │   ├── result_fusion.py        #   4 fusion strategies
│   │   ├── layout_analyzer.py      #   Layout analysis (tables, headers)
│   │   ├── table_extractor.py      #   Table extraction (Hough + contours)
│   │   └── htr/                    #   Arabic HTR Module
│   │       ├── arabic_htr.py       #     Unified HTR pipeline
│   │       ├── line_segmenter.py   #     Document→lines (projection/contour/U-Net)
│   │       ├── word_segmenter.py   #     Lines→words (connected components)
│   │       ├── dotted_recovery.py  #     Arabic dot correction
│   │       └── trocr_finetuned.py  #     Fine-tuned TrOCR wrapper
│   │
│   ├── nlp/                        # Natural Language Processing
│   │   ├── spell_corrector.py      #   3-language correction + learning
│   │   ├── translator.py           #   Helsinki-NLP translation
│   │   ├── summarizer.py           #   BART summarization
│   │   ├── entity_extractor.py     #   BERT-based NER
│   │   ├── language_detector.py    #   Language detection
│   │   ├── text_classifier.py      #   6-category classification
│   │   ├── arabic_rtl.py           #   Full RTL processing
│   │   ├── mixed_text.py           #   Arabic/English mixed text
│   │   ├── ai_corrector.py         #   GPT-based correction
│   │   ├── arabic_nlp_utils.py     #   Semantic similarity for Arabic OCR
│   │   └── correction_dict.json    #   186+ Arabic corrections
│   │
│   ├── ai/                         # AI Enhancement
│   │   ├── pattern_matcher.py      #   SSIM pattern matching
│   │   ├── pattern_db.py           #   SQLite pattern database
│   │   ├── gemini_refiner.py       #   Gemini AI refinement
│   │   └── gateway/                #   OmniFile AI Gateway (universal proxy)
│   │       ├── server.py           #     ASGI entry point
│   │       ├── api/                #     FastAPI routes & auth
│   │       ├── config/             #     Provider catalog & settings
│   │       ├── core/               #     SSE & protocol helpers
│   │       ├── providers/          #     8 provider backends
│   │       └── pool/               #     Account pool & health management
│   │
│   ├── security/                   # Security & Privacy
│   │   ├── file_scanner.py         #   Security scanning
│   │   ├── sensitive_data_scanner.py # PII detection (Presidio)
│   │   ├── encryption.py           #   Fernet encryption (AES-128)
│   │   ├── code_protector.py       #   Code block protection
│   │   ├── file_organizer.py       #   Auto file organization
│   │   ├── archive_handler.py      #   Archive management
│   │   ├── backup_manager.py       #   Backup management
│   │   ├── audit_logger.py         #   Audit logging
│   │   └── secure_file_handler.py  #   Safe file handling
│   │
│   ├── export/                     # Multi-Format Export
│   │   ├── exporter.py             #   DOCX/HTML/PDF/JSON/TXT/Excel
│   │   └── layout_preserving.py    #   DOCX export with visual layout preservation
│   │
│   └── evaluation/                 # Evaluation & Metrics
│       └── metrics.py              #   CER/WER + quality grading
│
├── frontend/                       # React + shadcn/ui Web App
│   ├── src/
│   │   ├── App.jsx                 #   Main application
│   │   ├── components/             #   UI components
│   │   │   ├── FileUpload.jsx
│   │   │   ├── ProcessingOptions.jsx
│   │   │   └── ResultsDisplay.jsx
│   │   └── services/api.js         #   API client
│   └── package.json
│
├── backend/                        # FastAPI Backend
│   └── main.py                     #   REST API endpoints
│
├── data/
│   └── arabic_fixes.json           # 186 Arabic corrections
├── data_seed/
│   └── correction_dict_seed.json   # Seed data
├── artifacts/
│   └── correction_dict.json        # Learned corrections
│
├── src/                            # HandwrittenOCR Engine
├── mobile/                         # Static PWA (offline review)
├── training/                       # HTR Training System
│   ├── README.md                  #   Training overview
│   ├── configs/                   #   Training configurations
│   │   └── trocr_lora_arabic.yaml #     TrOCR + LoRA settings
│   ├── scripts/                   #   Training scripts
│   │   ├── prepare_htr_dataset.py #     Dataset preparation
│   │   ├── train_trocr_lora.py    #     LoRA training
│   │   ├── evaluate_checkpoint.py #     Model evaluation
│   │   ├── active_learning_pipeline.py  # Active learning
│   │   └── generate_synthetic_data.py   # Synthetic data generation
│   ├── models/                    #   Training models
│   │   └── lora_htr_trainer.py    #     Custom LoRA trainer
│   └── data/                      #   Data connectors
│       └── mobile_review_connector.py  # Mobile review sync
├── Dockerfile.training             # Training Docker image
├── requirements-training.txt       # Training dependencies
├── Makefile                        # Build & training commands
├── justfile                        # Alternative build commands
│
├── mobile_review/                  # Flask server (remote team review)
│   ├── server.py                  #   REST API review server
│   ├── templates/review.html      #   Touch-friendly review UI
│   └── README.md                  #   mobile/ vs mobile_review/ guide
├── tests/                          # pytest test suite (13+ files)
├── .github/workflows/              # CI/CD
│   ├── ci.yml                     #   Tests on push/PR
│   └── release.yml                #   Auto-release on tags
├── notebooks/                      # Jupyter Notebooks
│   └── OmniFile_Gradio_Debugger.ipynb  #   Gradio interactive debugger (Colab-ready)
├── docs/                           # Documentation
│   ├── TESTING_GUIDE.md          #   Testing & development guide
│   ├── API_DOCS.md
│   ├── USER_GUIDE.md
│   └── DEVELOPER_GUIDE.md
├── k8s/                            # Kubernetes manifests
│   ├── namespace.yaml
│   ├── backend.yaml
│   ├── celery.yaml
│   ├── redis.yaml
│   ├── nginx.yaml
│   ├── hpa.yaml
│   └── storage.yaml
└── examples/                       # Usage examples
    ├── ocr_basic.py
    ├── nlp_pipeline.py
    └── evaluation_example.py

🧩 Module Descriptions

1. 🎯 modules/core/ - Core Data Models

The foundational layer defining all shared data structures using Pydantic v2. Provides type-safe models for OCR results, processing options, document metadata, and inter-module communication.

File Description
structure.py Pydantic v2 models for OCRResult, ProcessingOptions, Document, BoundingBox, and all shared types

2. 👁️ modules/vision/ - Computer Vision & OCR Engine

The heart of the system. Handles image preprocessing, OCR with 4 engines, result fusion, PDF processing, layout analysis, table extraction, and text reconstruction.

File Description
ocr_engine.py Multi-engine OCR (TrOCR, EasyOCR, Tesseract, PaddleOCR) with ONNX Runtime & INT8 quantization
image_preprocessor.py CLAHE, Gaussian denoise, deskew (Hough), Otsu thresholding, adaptive binarization
pdf_processor.py PyMuPDF-based PDF processing with pdf2image fallback
text_reconstructor.py RTL/LTR sentence reconstruction with language-aware ordering
result_fusion.py 4 fusion strategies: highest confidence, weighted average, voting, longest text
layout_analyzer.py Document layout analysis: tables, headers, footers, sections
table_extractor.py Table extraction using Hough lines + contour analysis

3. 🗣️ modules/nlp/ - Natural Language Processing

Multilingual text processing including spell correction, translation, summarization, entity extraction, text classification, and Arabic RTL handling.

File Description
spell_corrector.py 3-language spell correction (AR, EN, DE) with user learning & Python keyword protection
translator.py Helsinki-NLP/opus-mt machine translation (6 language pairs)
summarizer.py BART summarization (English + Arabic)
entity_extractor.py BERT-based Named Entity Recognition
language_detector.py Automatic language detection (AR, EN, DE)
text_classifier.py 6-category text classification
arabic_rtl.py Full RTL processing: arabic_reshaper + python-bidi + 40+ normalization rules
mixed_text.py Arabic/English/number mixed text handler with medical term protection
ai_corrector.py GPT-based OCR correction with context awareness
correction_dict.json 186+ common Arabic OCR error corrections

4. 🔒 modules/security/ - Security & Privacy

Comprehensive security module for PII detection, encryption, code protection, file organization, backup management, and audit logging.

File Description
file_scanner.py File security scanning and validation
sensitive_data_scanner.py PII detection using Microsoft Presidio + detect-secrets
encryption.py Fernet symmetric encryption (AES-128) with folder support
code_protector.py Prevents spell correction inside code blocks (Python, JS, etc.)
file_organizer.py Automatic file organization by type and content
archive_handler.py ZIP archive creation/extraction with integrity checks
backup_manager.py Automatic and manual backup management
audit_logger.py File + Redis audit trail with statistics
secure_file_handler.py Path traversal prevention + safe tempfile handling
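
Path traversal prevention usually comes down to resolving the requested path and checking that it still lies under an allowed base directory. The sketch below shows that check with the standard library; `resolve_inside` is a hypothetical helper written for this example, not the API of secure_file_handler.py.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Resolve `user_path` under `base_dir`, rejecting escapes such as '../'."""
    base = Path(base_dir).resolve()
    # Joining with an absolute user_path discards `base`, so that case is caught too.
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):  # Path.is_relative_to needs Python 3.9+
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return candidate
```

Resolving before comparing matters because symlinks and `..` segments can make a path look safe as a string while pointing elsewhere on disk.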

5. 🤖 modules/ai/ - AI Enhancement

Advanced AI capabilities including self-learning pattern matching, Gemini-based refinement, and a universal AI model gateway.

File Description
pattern_matcher.py SSIM-based visual pattern matching that learns from corrected word images
pattern_db.py SQLite database for storing and retrieving visual OCR patterns
gemini_refiner.py Google Gemini API integration for context-aware OCR refinement
gateway/ OmniFile AI Gateway: universal proxy for routing AI model requests to multiple providers (see below)

🌐 modules/ai/gateway/ - OmniFile AI Gateway

A universal AI model proxy that intercepts Messages API requests and routes them to alternative providers. Supports 8 provider backends with automatic format conversion and SSE streaming.

Subdirectory Description
api/ FastAPI application layer: routes, auth, model routing, request optimization
api/models/ Pydantic request/response models for the Messages API protocol
api/web_tools/ Local web search and fetch handling with SSRF protection
config/ Central settings, provider catalog, logging configuration
core/ Protocol helpers: SSE formatting, message conversion, token estimation
core/anthropic/ Native Messages API support: content blocks, thinking tags, tool parsing
providers/ Provider transport implementations: OpenAI-compatible and native protocols
providers/deepseek/ DeepSeek provider (native Messages API)
providers/nvidia_nim/ NVIDIA NIM provider (OpenAI-compatible)
providers/open_router/ OpenRouter provider (native Messages API)
providers/ollama/ Ollama local provider (native Messages API)
providers/lmstudio/ LM Studio local provider (native Messages API)
providers/llamacpp/ llama.cpp local provider (native Messages API)
providers/kimi/ Kimi/Moonshot provider (OpenAI-compatible)
providers/wafer/ Wafer provider (native Messages API)
server.py ASGI entry point for the gateway proxy
pool/ Advanced Pool Management: account pooling, rate limit fallback, conversation reuse, health scoring

🔄 modules/ai/gateway/pool/ - Advanced Pool Management

Intelligent multi-account management system with automatic failover, rate limit distribution, and conversation context reuse.

File Description
account_pool.py AccountPool: multi-account rotation with priority ordering, LRU selection, concurrency control, and automatic failover after consecutive failures
rate_limit_fallback.py RateLimitFallbackManager: smart model fallback when rate-limited. It auto-detects effort tiers (low/medium/high/xhigh/max) and picks same-provider alternatives first, then cross-provider ones
conversation_pool.py ConversationPool: multi-turn conversation reuse via stable fingerprinting. It normalizes dynamic tokens (dates, UUIDs, CWDs) and uses TTL-based LRU eviction (500 entries, 30 min default)
health_scorer.py ProviderHealthScorer: weighted health scoring (0.0-1.0) based on success rate, latency percentiles (p50/p95), error type tracking, and consecutive failure streaks
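
Stable fingerprinting means hashing a conversation's opening text after replacing the parts that change from run to run, so two sessions that differ only in date, UUID, or working directory map to the same key. The sketch below assumes ISO-style dates and a `cwd:` or `Working directory:` line in the prompt; those patterns and the `fingerprint` function are assumptions for illustration, not the rules used by conversation_pool.py.

```python
import hashlib
import re

# Patterns for tokens that change between otherwise identical requests.
_UUID = re.compile(
    r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
)
_DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
_CWD = re.compile(r"(?m)^(cwd|Working directory):.*$")  # hypothetical prompt line


def fingerprint(system_prompt: str, first_user_message: str) -> str:
    """Hash the opening of a conversation after masking dynamic tokens."""
    text = system_prompt + "\n" + first_user_message
    text = _UUID.sub("<uuid>", text)
    text = _DATE.sub("<date>", text)
    text = _CWD.sub(r"\1: <cwd>", text)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()
```

The resulting hex digest can serve as the key of an LRU cache with a TTL, which is where the 500-entry and 30-minute defaults mentioned above would apply.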

6. 📤 modules/export/ - Multi-Format Export

Export processed documents to 6 different formats with proper RTL support.

File Description
exporter.py Export to DOCX (RTL bidi), HTML (dir="rtl"), searchable PDF (invisible text), Excel (RTL alignment), JSON (full structure with BBox), TXT (UTF-8 BOM)
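
Two of the simpler targets can be sketched with the standard library alone: a TXT file with a UTF-8 byte order mark and an HTML page marked as right-to-left. The function names below are hypothetical and the output is deliberately bare; exporter.py itself is not reproduced here.

```python
import html
from pathlib import Path


def export_txt(text: str, path: str) -> None:
    """Write plain text as UTF-8 with a BOM."""
    # newline="" keeps whatever line endings are already present in `text`.
    with open(path, "w", encoding="utf-8-sig", newline="") as fh:
        fh.write(text)


def export_html(text: str, path: str, rtl: bool = True) -> None:
    """Write one <p> per input line inside a page with the given direction."""
    direction = "rtl" if rtl else "ltr"
    body = "".join(f"<p>{html.escape(line)}</p>\n" for line in text.splitlines())
    page = (
        f'<!DOCTYPE html>\n<html dir="{direction}">\n'
        '<head><meta charset="utf-8"></head>\n'
        f"<body>\n{body}</body>\n</html>\n"
    )
    Path(path).write_text(page, encoding="utf-8")
```

The BOM mainly helps older Windows tools detect UTF-8; escaping each line keeps OCR output containing `<` or `&` from breaking the page.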

7. 📊 modules/evaluation/ - Evaluation & Metrics

OCR accuracy evaluation with Arabic-aware normalization.

File Description
metrics.py CER/WER computation, Arabic text normalization, Levenshtein distance (zero dependencies), quality grading (A+ to F)

🔗 API Documentation

The FastAPI backend exposes the following REST endpoints. Full interactive documentation is available at /docs (Swagger UI) when the backend is running.

Base URL

http://localhost:5001/api/v1

Endpoints

Method Endpoint Description
POST /ocr/process Process an image/PDF file with OCR
POST /ocr/process-batch Batch process multiple files
GET /ocr/result/{task_id} Get OCR result by task ID
POST /nlp/correct Spell-correct text (AR/EN/DE)
POST /nlp/translate Translate text between languages
POST /nlp/summarize Summarize text
POST /nlp/entities Extract named entities
POST /nlp/classify Classify text into categories
POST /export/{format} Export results to DOCX/HTML/PDF/JSON/TXT/Excel
POST /security/scan Scan file for PII and security issues
POST /security/encrypt Encrypt a file
POST /security/decrypt Decrypt a file
GET /health Health check endpoint
GET /metrics System performance metrics

📖 For the complete API reference with request/response schemas, see docs/API_DOCS.md or access the Swagger UI at /docs.
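
As a rough illustration of calling one of these endpoints from Python, the sketch below builds a POST request for /nlp/correct using only the standard library. The JSON field names `text` and `language` are assumptions, since the request schemas live in docs/API_DOCS.md and are not shown in this README; check the Swagger UI before relying on them.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5001/api/v1"  # Base URL given above


def build_correct_request(text: str, language: str = "ar") -> urllib.request.Request:
    """Build (but do not send) a POST request to /nlp/correct."""
    # "text" and "language" are assumed field names, not the documented schema.
    payload = json.dumps({"text": text, "language": language}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/nlp/correct",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the backend running, passing the returned object to `urllib.request.urlopen` sends the request; the shape of the response is likewise defined by the API docs rather than by this sketch.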


📸 Screenshots

Streamlit UI: 6-tab interface
Gradio UI: 7-tab interface
React Frontend: Dark/Light mode
OCR Processing: Multi-engine OCR
Arabic RTL: Full RTL support
Table Extraction: Structured data

📝 Screenshots will be added in a future update. For now, try the live demo to see the application in action.

📋 See CHANGELOG.md for the complete version history.


📊 Project Statistics

Metric Value
Python Files 155+
Lines of Code ~40,000+
Total Files 230+
OCR Engines 4 (TrOCR, EasyOCR, Tesseract, PaddleOCR)
Fusion Strategies 4
Supported Languages 3 (EN, AR, DE)
Export Formats 6 (DOCX, HTML, PDF, JSON, TXT, Excel)
Test Files 13
Merged Projects 6
Security Modules 9
NLP Capabilities 10
AI Gateway Providers 8 (DeepSeek, NIM, OpenRouter, Ollama, LM Studio, llama.cpp, Kimi, Wafer)
API Endpoints 14+

🌍 Supported Languages

Language Code RTL Support OCR Spell Check Translation
🇸🇦 العربية (Arabic) ar ✅ ✅ ✅ ✅
🇬🇧 English en ❌ ✅ ✅ ✅
🇩🇪 Deutsch (German) de ❌ ✅ ✅ ✅

🤝 Contributing

Contributions are welcome! Please follow these steps:

How to Contribute

  1. Fork the repository
  2. Clone your fork locally
    git clone https://github.com/your-username/OmniFile_Processor.git
    cd OmniFile_Processor
  3. Create a feature branch
    git checkout -b feature/your-feature-name
  4. Make your changes and ensure tests pass
    pip install -r requirements-full.txt          # Everything (~6-8 GB, ~15-30 min)
    # Or install in layers:
    # pip install -r requirements-core.txt          # Minimum (~1.5 GB, ~3 min)
    # pip install -r requirements-core.txt -r requirements-ocr.txt   # + OCR engines
    # pip install -r requirements-core.txt -r requirements-nlp.txt   # + NLP
    pytest tests/ -v
  5. Commit with a descriptive message
    git commit -m "feat: add your feature description"
  6. Push to your fork
    git push origin feature/your-feature-name
  7. Open a Pull Request against the main branch

Development Guidelines

  • Follow PEP 8 style guidelines
  • Add docstrings to all new functions and classes
  • Write tests for new features (place in tests/)
  • Update the relevant documentation in docs/
  • Use type hints throughout your code
  • Ensure RTL handling is tested for any text-related changes

📜 License

This project is licensed under the MIT License.

MIT License

Copyright (c) 2026 Dr Abdulmalek Tamer Al-husseini, Homs, Syria

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.

See LICENSE for the full text.


🔗 Links

Resource Link
🐙 GitHub Repository https://github.com/DrAbdulmalek/OmniFile_Processor
🤗 HuggingFace Spaces https://huggingface.co/spaces/DrAbdulmalek/handwriting-ocr
📚 User Guide docs/USER_GUIDE.md
👨‍💻 Developer Guide docs/DEVELOPER_GUIDE.md
🧪 Testing Guide docs/TESTING_GUIDE.md
📡 API Documentation docs/API_DOCS.md
💡 Suggestions SUGGESTIONS.md
📋 License LICENSE

Built with ❤️ by Dr Abdulmalek Tamer Al-husseini 📍 Homs, Syria | 📧 Abdulmalek.husseini@gmail.com

⭐ If you find this project useful, please give it a star on GitHub!


Repository Status

Field Value
Role Legacy AI File Processing System
Status Legacy (Migration Source)
Layer Legacy / Archive
Priority Low
Migration Target omni-medical-suite

Migration Notice

OmniFile_Processor v5.0 has been merged into omni-medical-suite v2.0. This repository is preserved as a reference and migration source.

Migration Path

Component Source (This Repo) Destination
Multi-Engine OCR OmniFile_Processor omni-medical-suite (packages/omni-ocr)
NLP Pipeline OmniFile_Processor omni-medical-suite (packages/nlp)
AI Gateway OmniFile_Processor omni-medical-suite (packages/ai)
Export System OmniFile_Processor omni-medical-suite (packages/export)
Streamlit UI OmniFile_Processor omni-medical-suite (apps/)
Gradio UI OmniFile_Processor omni-medical-suite (services/)
React UI OmniFile_Processor omni-medical-suite (apps/web)

Who Should Use This

  • Developers referencing legacy implementations during migration
  • Teams needing to extract specific modules not yet migrated
  • Users of the HuggingFace Spaces deployments (still active)

When to Use This vs omni-medical-suite

Need Repository
New development / unified platform omni-medical-suite
Reference legacy implementations This repo (OmniFile_Processor)
Active HuggingFace demo HF Space

Related Repositories

Repo Role Status
omni-medical-suite Main Platform (Successor) Active
medical-ocr-postprocessor Core Correction Engine Active
medical-handwriting-ocr Production OCR Active

License: MIT, Dr. Abdulmalek Tamer Al-husseini
