⚠️ LEGACY NOTICE: This repository is in maintenance mode. Active development has moved to omni-medical-suite. See LEGACY_NOTICE.md for the full migration plan.
title: OmniFile AI Processor
emoji: 🧠
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
license: mit
A Comprehensive AI System for File Processing, Text Analysis & Handwriting Recognition
Version: v5.0.0 | Status: CI-Verified
Live Demo (HF Spaces) | HF Lab Space | Documentation | Dependency Profiles | Colab Debug Notebook | Prioritized Suggestions | Report Bug | Suggestions
Location: Homs, Syria
OmniFile AI Processor is a production-ready, multimodal AI system that integrates six projects into a unified platform for document intelligence:
OmniFile_Processor + HandwrittenOCR + handwriting-ocr + arabic-ocr-pro + advanced-ocr + OCR-Enhancer
An advanced AI system that combines six projects into a single platform for file and handwriting processing. It supports Arabic, English, and German, with specialized modules for computer vision, language processing, security, and export.
- Multi-Engine OCR: 4 engines (TrOCR, EasyOCR, Tesseract, PaddleOCR) with intelligent engine selection
- Result Fusion: 4 strategies (highest confidence, weighted average, voting, longest text)
- Advanced Preprocessing: CLAHE, deskew, denoise, Otsu thresholding, ONNX Runtime acceleration
- Layout Analysis: automatic detection of tables, headers, footers, and document structure
- Table Extraction: Hough line detection + contour analysis for structured data extraction
- Multilingual Spell Correction: Arabic, English, German with user-learning capability (186+ Arabic corrections)
- RTL Text Processing: full Arabic reshaping + BiDi support with 40+ normalization mappings
- Mixed-Text Handling: Arabic/English/numbers with medical term protection
- Translation Engine: Helsinki-NLP/opus-mt supporting 6 language pairs
- AI Summarization: BART (facebook/bart-large-cnn) + Arabic (UAE-Code/mbart-summarization-ar)
- Entity Extraction & Text Classification: BERT-based NER with 6-category classification
- GPT & Gemini Refinement: context-aware OCR correction with block-type-specific prompts
- SSIM Pattern Matching: self-learning from corrected word images with SQLite pattern database
- Universal AI Model Proxy: drop-in proxy server compatible with Messages API protocol, routes to 8+ providers (DeepSeek, NVIDIA NIM, OpenRouter, Ollama, LM Studio, llama.cpp, Kimi, Wafer)
- Multi-Provider Routing: intelligent model routing with per-tier selection (Opus/Sonnet/Haiku levels) and automatic fallback
- Streaming SSE: full Server-Sent Events streaming with thinking block support and tool call conversion
- Request Optimization: fast-path optimizations including quota mock, prefix detection, title/suggestion skip
- Rate Limiting & Concurrency: configurable per-provider rate limits, concurrency control, and timeout management
- Arabic HTR Pipeline: end-to-end Arabic handwritten text recognition (LineSegmenter → TrOCR → DottedRecovery)
- LoRA Fine-Tuning: efficient TrOCR fine-tuning with PEFT/LoRA (4-8 GB VRAM) for Arabic handwriting
- Active Learning: Uncertainty/Diversity/Hybrid sampling to select informative samples for annotation
- Synthetic Data Generation: Arabic handwriting synthesis with font variation and degradation simulation
- Mobile Review Integration: bidirectional sync between training pipeline and mobile review interface
- 6 Export Formats: DOCX (RTL support), HTML, searchable PDF, Excel, JSON (with BBox), TXT (UTF-8 BOM)
- PII Detection: Presidio-based sensitive data scanning + detect-secrets
- File Encryption: Fernet (AES-128) with folder support
- Code Protection: prevents spell correction inside code blocks
- Audit Logging: file + Redis audit trail with rate limiting (slowapi + Nginx)
- CER/WER Metrics: OCR accuracy evaluation with Arabic normalization + Levenshtein distance
- Quality Grading: A+ to F with actionable recommendations
- 4 UIs: Streamlit (6 tabs), Gradio (7 tabs), React + shadcn/ui (dark/light), CLI, PyQt6 desktop
- FastAPI Backend: full REST API with Swagger documentation
- Docker + Compose: one-command deployment with all services
- Kubernetes Ready: complete K8s manifests with HPA (2-10 pods auto-scaling)
- Celery + Redis: asynchronous task processing for heavy workloads
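The Active Learning item above names uncertainty, diversity, and hybrid sampling. As a minimal sketch of the uncertainty variant only, assume each unlabeled sample comes with a probability distribution over candidate outputs. The function names and sample IDs are illustrative and are not taken from the repository.

```python
import math
from typing import Dict, List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_most_uncertain(
    predictions: Dict[str, Sequence[float]], budget: int
) -> List[str]:
    """Return the IDs of the `budget` samples with the highest entropy."""
    ranked = sorted(
        predictions, key=lambda sid: entropy(predictions[sid]), reverse=True
    )
    return ranked[:budget]


# Hypothetical model outputs for three unlabeled line images.
predictions = {
    "line_001": [0.98, 0.01, 0.01],  # confident prediction
    "line_002": [0.34, 0.33, 0.33],  # nearly uniform, most uncertain
    "line_003": [0.70, 0.20, 0.10],
}
to_annotate = select_most_uncertain(predictions, budget=2)
```

Samples with flat distributions are sent to annotators first, since a label there is likely to change the model the most.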
The project is deployed and available at:
- Production demo: https://huggingface.co/spaces/DrAbdulmalek/handwriting-ocr
- Experimental lab space: https://huggingface.co/spaces/DrAbdulmalek/omnifile-processor-lab
To deploy your own instance:
git clone https://github.com/DrAbdulmalek/OmniFile_Processor.git
cd OmniFile_Processor
pip install -r requirements-hf.txt
python -m src.gradio_ui

# Clone the repository
git clone https://github.com/DrAbdulmalek/OmniFile_Processor.git
cd OmniFile_Processor
# Install dependencies
pip install -r requirements-full.txt # Everything (~6-8 GB, ~15-30 min)
# Or install in layers:
# pip install -r requirements-core.txt # Minimum (~1.5 GB, ~3 min)
# pip install -r requirements-core.txt -r requirements-ocr.txt # + OCR engines
# pip install -r requirements-core.txt -r requirements-nlp.txt # + NLP
# Run with your preferred interface
streamlit run app.py # Streamlit UI (6 tabs)
python -m src.gradio_ui # Gradio UI (7 tabs)
python main.py # CLI interface
cd frontend && npm install && npm run dev # React Frontend

git clone https://github.com/DrAbdulmalek/OmniFile_Processor.git
cd OmniFile_Processor
docker-compose up -d
# Access:
# API Docs: http://localhost:5001/docs
# Streamlit: http://localhost:7860
# React: http://localhost:3000
# Nginx Proxy: http://localhost

python mobile_review/server.py --host 0.0.0.0 --port 5000
# After reviewing and saving:
python mobile_review/server.py --export-dataset --export-output mobile_review/dataset/review_dataset
# Or directly via the tool:
python tools/build_training_data.py --corrections mobile_review/ocr_corrected.json --output dataset/review_dataset

Using the ready-made notebook is recommended: notebooks/OmniFile_Processor_Colab_Debug.ipynb
!git clone https://github.com/DrAbdulmalek/OmniFile_Processor.git
%cd OmniFile_Processor
!apt-get update -qq
!apt-get install -y -qq poppler-utils tesseract-ocr tesseract-ocr-ara tesseract-ocr-eng libgl1
!pip install -r requirements-colab.txt
!python hf_app.py

For more on the differences between installation profiles, see docs/DEPENDENCY_PROFILES.md.
The OmniFile AI Gateway provides a universal proxy for routing AI model requests to multiple providers:
# Clone and install gateway dependencies
git clone https://github.com/DrAbdulmalek/OmniFile_Processor.git
cd OmniFile_Processor
pip install -r requirements-gateway.txt
# Configure your provider (edit modules/ai/gateway/.env)
cp modules/ai/gateway/.env.example modules/ai/gateway/.env
# Start the gateway
bash scripts/start_gateway.sh
# Or start manually:
python -m uvicorn modules.ai.gateway.server:app --host 0.0.0.0 --port 8082

Supported Providers: DeepSeek, NVIDIA NIM, OpenRouter, Ollama, LM Studio, llama.cpp, Kimi, Wafer
# Benchmark OCR engines on a folder of images
python tools/benchmark_ocr.py --images examples/ --output artifacts/ocr_benchmark
# Optional ground truth comparison
python tools/benchmark_ocr.py --images examples/ --ground-truth examples/ground_truth.json --output artifacts/ocr_benchmark

Architecture Note: the project currently contains two parallel systems:
- modules/: the extended, theoretical architecture. Clearly organized modules (vision, nlp, security, export, ai, evaluation) with Pydantic v2 models. This is the intended future architecture of the project.
- src/: the working HandwrittenOCR engine. It contains the actual application used by the Gradio UI (src/gradio_ui.py) and HF Spaces, and includes TrOCR Batch, LoRA Fine-tuning, Active Learning, and Study Guide.
- Root files (app.py, database.py, config.py): the integration layer that connects the architecture to the engines.

Current choice: src/ is the working code behind the Gradio UI and HF Spaces, while modules/ represents the organized target architecture for the extended project. The gradual migration from src/ to modules/ will happen in stages through separate Pull Requests.
OmniFile_Processor/
├── app.py  # Main Streamlit UI
├── config.py  # Central configuration v4.1.1
├── database.py  # SQLite database layer
├── main.py  # Local / CLI entry point
├── tasks.py  # Celery async tasks
├── requirements.txt  # Full dependencies (legacy)
├── requirements-core.txt  # Core only (~1.5 GB)
├── requirements-ocr.txt  # OCR engines layer
├── requirements-nlp.txt  # NLP layer
├── requirements-full.txt  # Everything (~6-8 GB)
├── requirements-hf.txt  # HuggingFace Spaces (minimal)
├── Dockerfile  # Docker image
├── docker-compose.yml  # Full stack orchestration
├── nginx.conf  # Nginx load balancer
├── LICENSE  # MIT License
│
├── modules/
│   ├── core/  # Core data models
│   │   └── structure.py  # Pydantic v2 models
│   │
│   ├── vision/  # Computer Vision & OCR
│   │   ├── ocr_engine.py  # 4 OCR engines + ONNX + Quantization
│   │   ├── image_preprocessor.py  # CLAHE + Denoise + Deskew + Otsu
│   │   ├── pdf_processor.py  # Multi-format PDF processing
│   │   ├── text_reconstructor.py  # RTL/LTR sentence reconstruction
│   │   ├── result_fusion.py  # 4 fusion strategies
│   │   ├── layout_analyzer.py  # Layout analysis (tables, headers)
│   │   ├── table_extractor.py  # Table extraction (Hough + contours)
│   │   └── htr/  # Arabic HTR Module
│   │       ├── arabic_htr.py  # Unified HTR pipeline
│   │       ├── line_segmenter.py  # Document→lines (projection/contour/U-Net)
│   │       ├── word_segmenter.py  # Lines→words (connected components)
│   │       ├── dotted_recovery.py  # Arabic dot correction
│   │       └── trocr_finetuned.py  # Fine-tuned TrOCR wrapper
│   │
│   ├── nlp/  # Natural Language Processing
│   │   ├── spell_corrector.py  # 3-language correction + learning
│   │   ├── translator.py  # Helsinki-NLP translation
│   │   ├── summarizer.py  # BART summarization
│   │   ├── entity_extractor.py  # BERT-based NER
│   │   ├── language_detector.py  # Language detection
│   │   ├── text_classifier.py  # 6-category classification
│   │   ├── arabic_rtl.py  # Full RTL processing
│   │   ├── mixed_text.py  # Arabic/English mixed text
│   │   ├── ai_corrector.py  # GPT-based correction
│   │   ├── arabic_nlp_utils.py  # Semantic similarity for Arabic OCR
│   │   └── correction_dict.json  # 186+ Arabic corrections
│   │
│   ├── ai/  # AI Enhancement
│   │   ├── pattern_matcher.py  # SSIM pattern matching
│   │   ├── pattern_db.py  # SQLite pattern database
│   │   ├── gemini_refiner.py  # Gemini AI refinement
│   │   └── gateway/  # OmniFile AI Gateway (universal proxy)
│   │       ├── server.py  # ASGI entry point
│   │       ├── api/  # FastAPI routes & auth
│   │       ├── config/  # Provider catalog & settings
│   │       ├── core/  # SSE & protocol helpers
│   │       ├── providers/  # 8 provider backends
│   │       └── pool/  # Account pool & health management
│   │
│   ├── security/  # Security & Privacy
│   │   ├── file_scanner.py  # Security scanning
│   │   ├── sensitive_data_scanner.py  # PII detection (Presidio)
│   │   ├── encryption.py  # Fernet encryption (AES-128)
│   │   ├── code_protector.py  # Code block protection
│   │   ├── file_organizer.py  # Auto file organization
│   │   ├── archive_handler.py  # Archive management
│   │   ├── backup_manager.py  # Backup management
│   │   ├── audit_logger.py  # Audit logging
│   │   └── secure_file_handler.py  # Safe file handling
│   │
│   ├── export/  # Multi-Format Export
│   │   ├── exporter.py  # DOCX/HTML/PDF/JSON/TXT/Excel
│   │   └── layout_preserving.py  # DOCX export with visual layout preservation
│   │
│   └── evaluation/  # Evaluation & Metrics
│       └── metrics.py  # CER/WER + quality grading
│
├── frontend/  # React + shadcn/ui Web App
│   ├── src/
│   │   ├── App.jsx  # Main application
│   │   ├── components/  # UI components
│   │   │   ├── FileUpload.jsx
│   │   │   ├── ProcessingOptions.jsx
│   │   │   └── ResultsDisplay.jsx
│   │   └── services/api.js  # API client
│   └── package.json
│
├── backend/  # FastAPI Backend
│   └── main.py  # REST API endpoints
│
├── data/
│   └── arabic_fixes.json  # 186 Arabic corrections
├── data_seed/
│   └── correction_dict_seed.json  # Seed data
├── artifacts/
│   └── correction_dict.json  # Learned corrections
│
├── src/  # HandwrittenOCR Engine
├── mobile/  # Static PWA (offline review)
├── training/  # HTR Training System
│   ├── README.md  # Training overview
│   ├── configs/  # Training configurations
│   │   └── trocr_lora_arabic.yaml  # TrOCR + LoRA settings
│   ├── scripts/  # Training scripts
│   │   ├── prepare_htr_dataset.py  # Dataset preparation
│   │   ├── train_trocr_lora.py  # LoRA training
│   │   ├── evaluate_checkpoint.py  # Model evaluation
│   │   ├── active_learning_pipeline.py  # Active learning
│   │   └── generate_synthetic_data.py  # Synthetic data generation
│   ├── models/  # Training models
│   │   └── lora_htr_trainer.py  # Custom LoRA trainer
│   └── data/  # Data connectors
│       └── mobile_review_connector.py  # Mobile review sync
├── Dockerfile.training  # Training Docker image
├── requirements-training.txt  # Training dependencies
├── Makefile  # Build & training commands
├── justfile  # Alternative build commands
│
├── mobile_review/  # Flask server (remote team review)
│   ├── server.py  # REST API review server
│   ├── templates/review.html  # Touch-friendly review UI
│   └── README.md  # mobile/ vs mobile_review/ guide
├── tests/  # pytest test suite (13+ files)
├── .github/workflows/  # CI/CD
│   ├── ci.yml  # Tests on push/PR
│   └── release.yml  # Auto-release on tags
├── notebooks/  # Jupyter Notebooks
│   └── OmniFile_Gradio_Debugger.ipynb  # Gradio interactive debugger (Colab-ready)
├── docs/  # Documentation
│   ├── TESTING_GUIDE.md  # Testing & development guide
│   ├── API_DOCS.md
│   ├── USER_GUIDE.md
│   └── DEVELOPER_GUIDE.md
├── k8s/  # Kubernetes manifests
│   ├── namespace.yaml
│   ├── backend.yaml
│   ├── celery.yaml
│   ├── redis.yaml
│   ├── nginx.yaml
│   ├── hpa.yaml
│   └── storage.yaml
└── examples/  # Usage examples
    ├── ocr_basic.py
    ├── nlp_pipeline.py
    └── evaluation_example.py
modules/core/ is the foundational layer that defines all shared data structures using Pydantic v2. It provides type-safe models for OCR results, processing options, document metadata, and inter-module communication.
| File | Description |
|---|---|
| structure.py | Pydantic v2 models for OCRResult, ProcessingOptions, Document, BoundingBox, and all shared types |
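To make the shape of such shared types concrete, here is a dependency-free sketch that uses standard-library dataclasses as a stand-in for the Pydantic v2 models. Only the class names `BoundingBox` and `OCRResult` come from the table above; every field name and the validation rule are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned box in pixel coordinates (field names are assumed)."""
    x: int
    y: int
    width: int
    height: int

    def area(self) -> int:
        return self.width * self.height


@dataclass
class OCRResult:
    """One recognized text span (field names are assumed)."""
    text: str
    confidence: float
    engine: str
    bbox: Optional[BoundingBox] = None

    def __post_init__(self) -> None:
        # Pydantic would express this as a field constraint; here it is manual.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")


result = OCRResult(
    text="مرحبا",
    confidence=0.91,
    engine="trocr",
    bbox=BoundingBox(x=10, y=20, width=120, height=32),
)
```

Validating at construction time means downstream modules can rely on the value ranges without re-checking them.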
modules/vision/ is the heart of the system. It handles image preprocessing, OCR with 4 engines, result fusion, PDF processing, layout analysis, table extraction, and text reconstruction.
| File | Description |
|---|---|
| ocr_engine.py | Multi-engine OCR (TrOCR, EasyOCR, Tesseract, PaddleOCR) with ONNX Runtime & INT8 quantization |
| image_preprocessor.py | CLAHE, Gaussian denoise, deskew (Hough), Otsu thresholding, adaptive binarization |
| pdf_processor.py | PyMuPDF-based PDF processing with pdf2image fallback |
| text_reconstructor.py | RTL/LTR sentence reconstruction with language-aware ordering |
| result_fusion.py | 4 fusion strategies: highest confidence, weighted average, voting, longest text |
| layout_analyzer.py | Document layout analysis: tables, headers, footers, sections |
| table_extractor.py | Table extraction using Hough lines + contour analysis |
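The four fusion strategies named above can be sketched as follows. This is an illustration and not the code in result_fusion.py: the `Candidate` tuple layout and the function names are assumptions, and "weighted average" is read here as summing weight times confidence per distinct text, which is only one possible interpretation.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# Each candidate is (engine_name, recognized_text, confidence in [0, 1]).
Candidate = Tuple[str, str, float]


def fuse_highest_confidence(candidates: List[Candidate]) -> str:
    """Pick the text of the single most confident engine."""
    return max(candidates, key=lambda c: c[2])[1]


def fuse_voting(candidates: List[Candidate]) -> str:
    """Pick the text returned by the most engines (first seen wins ties)."""
    counts = Counter(text for _, text, _ in candidates)
    return counts.most_common(1)[0][0]


def fuse_longest_text(candidates: List[Candidate]) -> str:
    """Pick the longest text among the candidates."""
    return max(candidates, key=lambda c: len(c[1]))[1]


def fuse_weighted(candidates: List[Candidate], weights: Dict[str, float]) -> str:
    """Sum weight * confidence per distinct text and pick the highest total."""
    scores: Dict[str, float] = defaultdict(float)
    for engine, text, confidence in candidates:
        scores[text] += weights.get(engine, 1.0) * confidence
    return max(scores, key=scores.get)


# Hypothetical outputs from the four engines for one text line.
candidates = [
    ("trocr", "Hello world", 0.80),
    ("easyocr", "Hello world", 0.75),
    ("tesseract", "Hello vvorld!", 0.90),
    ("paddleocr", "Hello world", 0.60),
]
```

On this input, highest confidence and longest text both follow the single outlier engine, while voting and the weighted score follow the three engines that agree. That difference is the usual reason to offer more than one strategy.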
modules/nlp/ provides multilingual text processing, including spell correction, translation, summarization, entity extraction, text classification, and Arabic RTL handling.
| File | Description |
|---|---|
| spell_corrector.py | 3-language spell correction (AR, EN, DE) with user learning & Python keyword protection |
| translator.py | Helsinki-NLP/opus-mt machine translation (6 language pairs) |
| summarizer.py | BART summarization (English + Arabic) |
| entity_extractor.py | BERT-based Named Entity Recognition |
| language_detector.py | Automatic language detection (AR, EN, DE) |
| text_classifier.py | 6-category text classification |
| arabic_rtl.py | Full RTL processing: arabic_reshaper + python-bidi + 40+ normalization rules |
| mixed_text.py | Arabic/English/number mixed text handler with medical term protection |
| ai_corrector.py | GPT-based OCR correction with context awareness |
| correction_dict.json | 186+ common Arabic OCR error corrections |
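As a rough illustration of dictionary-based correction with user learning and protected terms, the sketch below uses a hypothetical `CorrectionDictionary` class. It is not the API of spell_corrector.py, and the sample words are invented.

```python
import re
from typing import Dict, Iterable, Set


class CorrectionDictionary:
    """Word-level corrections that can grow from user feedback."""

    def __init__(self, corrections: Dict[str, str], protected: Iterable[str] = ()) -> None:
        self._corrections = dict(corrections)
        self._protected: Set[str] = {term.lower() for term in protected}

    def learn(self, wrong: str, right: str) -> None:
        """Record a user-confirmed correction for later runs."""
        self._corrections[wrong] = right

    def correct(self, text: str) -> str:
        """Replace known misspellings, leaving protected terms untouched."""
        def fix(match: "re.Match[str]") -> str:
            word = match.group(0)
            if word.lower() in self._protected:
                return word
            return self._corrections.get(word, word)

        # \w+ matches Unicode letters and digits, so unvocalized Arabic
        # tokens are handled the same way as Latin ones.
        return re.sub(r"\w+", fix, text)


speller = CorrectionDictionary({"teh": "the", "recieve": "receive"}, protected=["Amoxil"])
speller.learn("pateint", "patient")
speller.learn("Amoxil", "Amoxicillin")  # has no effect: the term is protected
corrected = speller.correct("teh pateint will recieve Amoxil")
```

Checking the protected set before the dictionary lookup is what keeps a learned rule from rewriting a drug name or other term that must survive verbatim.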
modules/security/ is a comprehensive security module for PII detection, encryption, code protection, file organization, backup management, and audit logging.
| File | Description |
|---|---|
| file_scanner.py | File security scanning and validation |
| sensitive_data_scanner.py | PII detection using Microsoft Presidio + detect-secrets |
| encryption.py | Fernet symmetric encryption (AES-128) with folder support |
| code_protector.py | Prevents spell correction inside code blocks (Python, JS, etc.) |
| file_organizer.py | Automatic file organization by type and content |
| archive_handler.py | ZIP archive creation/extraction with integrity checks |
| backup_manager.py | Automatic and manual backup management |
| audit_logger.py | File + Redis audit trail with statistics |
| secure_file_handler.py | Path traversal prevention + safe tempfile handling |
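One way to keep a corrector away from code is to split the text on code spans and only rewrite the prose between them. The sketch below assumes Markdown-style fenced blocks and inline backtick spans; `correct_outside_code` is a hypothetical helper, and the detection rules in code_protector.py may differ.

```python
import re
from typing import Callable

FENCE = "`" * 3  # Markdown code fence, built this way to keep the example readable

# Fenced blocks are tried first, then inline spans between single backticks.
CODE_PATTERN = re.compile(f"({FENCE}.*?{FENCE}|`[^`\\n]+`)", re.DOTALL)


def correct_outside_code(text: str, corrector: Callable[[str], str]) -> str:
    """Apply `corrector` to prose segments only; code segments pass through."""
    parts = CODE_PATTERN.split(text)
    # With one capturing group, re.split places the code spans at odd indexes.
    return "".join(
        part if index % 2 == 1 else corrector(part)
        for index, part in enumerate(parts)
    )


def toy_corrector(segment: str) -> str:
    """Stand-in for a real spell corrector."""
    return segment.replace("Teh", "The").replace("teh", "the")


fixed = correct_outside_code("Teh loop `for teh in items` is fine.", toy_corrector)
```

Because the split keeps the delimiters, joining the parts reproduces the input exactly wherever the corrector makes no change.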
modules/ai/ provides advanced AI capabilities, including self-learning pattern matching, Gemini-based refinement, and a universal AI model gateway.
| File | Description |
|---|---|
| pattern_matcher.py | SSIM-based visual pattern matching that learns from corrected word images |
| pattern_db.py | SQLite database for storing and retrieving visual OCR patterns |
| gemini_refiner.py | Google Gemini API integration for context-aware OCR refinement |
| gateway/ | OmniFile AI Gateway: universal proxy for routing AI model requests to multiple providers (see below) |
modules/ai/gateway/ is a universal AI model proxy that intercepts Messages API requests and routes them to alternative providers. It supports 8 provider backends with automatic format conversion and SSE streaming.
| Subdirectory | Description |
|---|---|
| api/ | FastAPI application layer: routes, auth, model routing, request optimization |
| api/models/ | Pydantic request/response models for the Messages API protocol |
| api/web_tools/ | Local web search and fetch handling with SSRF protection |
| config/ | Central settings, provider catalog, logging configuration |
| core/ | Protocol helpers: SSE formatting, message conversion, token estimation |
| core/anthropic/ | Native Messages API support: content blocks, thinking tags, tool parsing |
| providers/ | Provider transport implementations: OpenAI-compatible and native protocols |
| providers/deepseek/ | DeepSeek provider (native Messages API) |
| providers/nvidia_nim/ | NVIDIA NIM provider (OpenAI-compatible) |
| providers/open_router/ | OpenRouter provider (native Messages API) |
| providers/ollama/ | Ollama local provider (native Messages API) |
| providers/lmstudio/ | LM Studio local provider (native Messages API) |
| providers/llamacpp/ | llama.cpp local provider (native Messages API) |
| providers/kimi/ | Kimi/Moonshot provider (OpenAI-compatible) |
| providers/wafer/ | Wafer provider (native Messages API) |
| server.py | ASGI entry point for the gateway proxy |
| pool/ | Advanced Pool Management: account pooling, rate limit fallback, conversation reuse, health scoring |
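For readers unfamiliar with Server-Sent Events, the wire format behind the "SSE formatting" helper is small: an `event:` line, a `data:` line, and a blank line that ends the frame. The sketch below shows that framing only. The event names `delta` and `done` and the payload keys are placeholders, not the gateway's actual event vocabulary.

```python
import json
from typing import Any, Dict, Iterator, List


def format_sse(event: str, data: Dict[str, Any]) -> str:
    """Serialize one SSE frame: event line, data line, terminating blank line."""
    payload = json.dumps(data, ensure_ascii=False)
    return f"event: {event}\ndata: {payload}\n\n"


def stream_text(chunks: List[str]) -> Iterator[str]:
    """Yield one frame per text chunk, then a final stop frame."""
    for index, chunk in enumerate(chunks):
        yield format_sse("delta", {"index": index, "text": chunk})
    yield format_sse("done", {})


frames = list(stream_text(["Hel", "lo"]))
```

Serializing the payload as single-line JSON matters because a raw newline inside `data:` would otherwise be read as the start of a new field.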
modules/ai/gateway/pool/ is an intelligent multi-account management system with automatic failover, rate limit distribution, and conversation context reuse.
| File | Description |
|---|---|
| account_pool.py | AccountPool: multi-account rotation with priority ordering, LRU selection, concurrency control, and automatic failover after consecutive failures |
| rate_limit_fallback.py | RateLimitFallbackManager: smart model fallback when rate-limited. Auto-detects effort tiers (low/medium/high/xhigh/max), picks same-provider alternatives first, then cross-provider |
| conversation_pool.py | ConversationPool: multi-turn conversation reuse via stable fingerprinting. Normalizes dynamic tokens (dates, UUIDs, CWDs), with TTL-based LRU eviction (500 entries, 30 min default) |
| health_scorer.py | ProviderHealthScorer: weighted health scoring (0.0-1.0) based on success rate, latency percentiles (p50/p95), error type tracking, and consecutive failure streaks |
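The conversation-reuse idea above combines two pieces: a fingerprint that ignores volatile tokens, and a bounded cache with expiry. The following is a simplified sketch under stated assumptions. It normalizes only UUIDs and ISO dates (not working directories), and `fingerprint` and `TTLCache` are illustrative names rather than the ConversationPool API. The defaults mirror the 500 entries and 30 minutes quoted in the table.

```python
import hashlib
import re
import time
from collections import OrderedDict
from typing import Callable, Optional, Tuple

UUID_RE = re.compile(r"\b[0-9a-fA-F]{8}-(?:[0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}\b")
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def fingerprint(prompt: str) -> str:
    """Hash a prompt after replacing volatile tokens with placeholders."""
    stable = UUID_RE.sub("<uuid>", prompt)
    stable = DATE_RE.sub("<date>", stable)
    return hashlib.sha256(stable.encode("utf-8")).hexdigest()


class TTLCache:
    """LRU cache whose entries also expire after `ttl_seconds`."""

    def __init__(self, max_entries: int = 500, ttl_seconds: float = 30 * 60,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._items: "OrderedDict[str, Tuple[float, str]]" = OrderedDict()
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock

    def put(self, key: str, value: str) -> None:
        self._items[key] = (self._clock(), value)
        self._items.move_to_end(key)
        while len(self._items) > self._max_entries:
            self._items.popitem(last=False)  # evict the least recently used entry

    def get(self, key: str) -> Optional[str]:
        entry = self._items.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at > self._ttl:
            del self._items[key]  # entry has expired
            return None
        self._items.move_to_end(key)  # mark as recently used
        return value
```

Injecting the clock keeps the expiry logic testable without sleeping. In use, the cache key would be `fingerprint(prompt)` and the value a conversation identifier.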
modules/export/ exports processed documents to 6 different formats with proper RTL support.
| File | Description |
|---|---|
| exporter.py | Export to DOCX (RTL bidi), HTML (dir="rtl"), searchable PDF (invisible text), Excel (RTL alignment), JSON (full structure with BBox), TXT (UTF-8 BOM) |
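Two of the export details above are easy to show with the standard library alone: setting `dir="rtl"` on the HTML root and writing TXT with a UTF-8 byte order mark. The helper names below are hypothetical and are not the functions in exporter.py.

```python
import codecs
import html
import tempfile
from pathlib import Path
from typing import List


def to_rtl_html(paragraphs: List[str], lang: str = "ar") -> str:
    """Wrap escaped paragraphs in a document whose base direction is RTL."""
    body = "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)
    return (
        f'<!DOCTYPE html>\n<html lang="{lang}" dir="rtl">\n'
        f'<head><meta charset="utf-8"></head>\n<body>\n{body}\n</body>\n</html>\n'
    )


def write_txt_with_bom(path: Path, text: str) -> None:
    """Write UTF-8 text preceded by a byte order mark (codec "utf-8-sig")."""
    path.write_text(text, encoding="utf-8-sig")


page = to_rtl_html(["<مرحبا>"])

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "result.txt"
    write_txt_with_bom(target, "نص تجريبي")
    has_bom = target.read_bytes().startswith(codecs.BOM_UTF8)
```

The BOM is optional in UTF-8, but some Windows editors rely on it to detect the encoding of Arabic text files, which is a common reason to emit it.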
modules/evaluation/ provides OCR accuracy evaluation with Arabic-aware normalization.
| File | Description |
|---|---|
| metrics.py | CER/WER computation, Arabic text normalization, Levenshtein distance (zero dependencies), quality grading (A+ to F) |
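CER and WER are both edit distance divided by reference length, computed over characters and words respectively. The sketch below shows a dependency-free version. The normalization step (stripping diacritics and tatweel, unifying alef variants) is an assumed, commonly used rule set and may not match the rules in metrics.py.

```python
import re
from typing import Sequence

# Assumed normalization: strip tashkeel and tatweel, unify alef variants.
_DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")
_ALEF_VARIANTS = re.compile(r"[\u0622\u0623\u0625]")


def normalize_arabic(text: str) -> str:
    """Remove diacritics and tatweel, and map alef variants to bare alef."""
    text = _DIACRITICS.sub("", text)
    return _ALEF_VARIANTS.sub("\u0627", text)


def levenshtein(ref: Sequence, hyp: Sequence) -> int:
    """Edit distance using the two-row dynamic-programming recurrence."""
    previous = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        current = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character edits / reference length."""
    ref, hyp = normalize_arabic(reference), normalize_arabic(hypothesis)
    return levenshtein(ref, hyp) / max(len(ref), 1)


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word edits / reference word count."""
    ref = normalize_arabic(reference).split()
    hyp = normalize_arabic(hypothesis).split()
    return levenshtein(ref, hyp) / max(len(ref), 1)
```

Normalizing both strings first keeps purely orthographic differences, such as a missing fatha, from being counted as recognition errors.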
The FastAPI backend exposes the following REST endpoints. Full interactive documentation is available at /docs (Swagger UI) when the backend is running.
Base URL: http://localhost:5001/api/v1
| Method | Endpoint | Description |
|---|---|---|
| POST | /ocr/process | Process an image/PDF file with OCR |
| POST | /ocr/process-batch | Batch process multiple files |
| GET | /ocr/result/{task_id} | Get OCR result by task ID |
| POST | /nlp/correct | Spell-correct text (AR/EN/DE) |
| POST | /nlp/translate | Translate text between languages |
| POST | /nlp/summarize | Summarize text |
| POST | /nlp/entities | Extract named entities |
| POST | /nlp/classify | Classify text into categories |
| POST | /export/{format} | Export results to DOCX/HTML/PDF/JSON/TXT/Excel |
| POST | /security/scan | Scan file for PII and security issues |
| POST | /security/encrypt | Encrypt a file |
| POST | /security/decrypt | Decrypt a file |
| GET | /health | Health check endpoint |
| GET | /metrics | System performance metrics |
For the complete API reference with request/response schemas, see docs/API_DOCS.md or access the Swagger UI at /docs.
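A client call against one of these endpoints could look like the sketch below, which uses only the standard library. The base URL and the /nlp/correct path come from the table above. The payload keys `text` and `language` are assumptions, so check docs/API_DOCS.md for the real request schema before relying on them.

```python
import json
import urllib.request
from typing import Any, Dict

BASE_URL = "http://localhost:5001/api/v1"


def build_json_request(endpoint: str, payload: Dict[str, Any]) -> urllib.request.Request:
    """Prepare a JSON POST request for one of the listed endpoints."""
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}{endpoint}",
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> Dict[str, Any]:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical payload; the field names are not confirmed by the source.
request = build_json_request("/nlp/correct", {"text": "teh report", "language": "en"})
# result = send(request)  # requires the backend to be running locally
```

Building the request separately from sending it makes the URL and body easy to inspect before any network traffic happens.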
Screenshots will be added in a future update. Planned views: Streamlit UI (6-tab interface), Gradio UI (7-tab interface), React Frontend (dark/light mode), OCR Processing (multi-engine OCR), Arabic RTL (full RTL support), and Table Extraction (structured data). For now, try the live demo to see the application in action.
See CHANGELOG.md for the complete version history.
| Metric | Value |
|---|---|
| Python Files | 155+ |
| Lines of Code | ~40,000+ |
| Total Files | 230+ |
| OCR Engines | 4 (TrOCR, EasyOCR, Tesseract, PaddleOCR) |
| Fusion Strategies | 4 |
| Supported Languages | 3 (EN, AR, DE) |
| Export Formats | 6 (DOCX, HTML, PDF, JSON, TXT, Excel) |
| Test Files | 13 |
| Merged Projects | 6 |
| Security Modules | 9 |
| NLP Capabilities | 10 |
| AI Gateway Providers | 8 (DeepSeek, NIM, OpenRouter, Ollama, LM Studio, llama.cpp, Kimi, Wafer) |
| API Endpoints | 14+ |
| Language | Code | RTL Support | OCR | Spell Check | Translation |
|---|---|---|---|---|---|
| 🇸🇦 العربية (Arabic) | ar | Yes | Yes | Yes | Yes |
| 🇬🇧 English | en | N/A | Yes | Yes | Yes |
| 🇩🇪 Deutsch (German) | de | N/A | Yes | Yes | Yes |
Contributions are welcome! Please follow these steps:
1. **Fork** the repository
2. **Clone** your fork locally: `git clone https://github.com/your-username/OmniFile_Processor.git`, then `cd OmniFile_Processor`
3. **Create** a feature branch: `git checkout -b feature/your-feature-name`
4. **Make** your changes and ensure tests pass: `pip install -r requirements-full.txt` (everything, ~6-8 GB, ~15-30 min), then `pytest tests/ -v`
5. **Commit** with a descriptive message: `git commit -m "feat: add your feature description"`
6. **Push** to your fork: `git push origin feature/your-feature-name`
7. **Open** a Pull Request against the `main` branch

- Follow PEP 8 style guidelines
- Add docstrings to all new functions and classes
- Write tests for new features (place in `tests/`)
- Update the relevant documentation in `docs/`
- Use type hints throughout your code
- Ensure RTL handling is tested for any text-related changes
This project is licensed under the MIT License.
MIT License
Copyright (c) 2026 Dr Abdulmalek Tamer Al-husseini, Homs, Syria
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
See LICENSE for the full text.
| Resource | Link |
|---|---|
| GitHub Repository | https://github.com/DrAbdulmalek/OmniFile_Processor |
| HuggingFace Spaces | https://huggingface.co/spaces/DrAbdulmalek/handwriting-ocr |
| User Guide | docs/USER_GUIDE.md |
| Developer Guide | docs/DEVELOPER_GUIDE.md |
| Testing Guide | docs/TESTING_GUIDE.md |
| API Documentation | docs/API_DOCS.md |
| Suggestions | SUGGESTIONS.md |
| License | LICENSE |
Built with ❤️ by Dr Abdulmalek Tamer Al-husseini | Homs, Syria | Abdulmalek.husseini@gmail.com
If you find this project useful, please give it a star on GitHub!
| Field | Value |
|---|---|
| Role | Legacy AI File Processing System |
| Status | Legacy (Migration Source) |
| Layer | Legacy / Archive |
| Priority | Low |
| Migration Target | omni-medical-suite |
OmniFile_Processor v5.0 has been merged into omni-medical-suite v2.0. This repository is preserved as a reference and migration source.
| Component | Source (This Repo) | Destination |
|---|---|---|
| Multi-Engine OCR | OmniFile_Processor | omni-medical-suite (packages/omni-ocr) |
| NLP Pipeline | OmniFile_Processor | omni-medical-suite (packages/nlp) |
| AI Gateway | OmniFile_Processor | omni-medical-suite (packages/ai) |
| Export System | OmniFile_Processor | omni-medical-suite (packages/export) |
| Streamlit UI | OmniFile_Processor | omni-medical-suite (apps/) |
| Gradio UI | OmniFile_Processor | omni-medical-suite (services/) |
| React UI | OmniFile_Processor | omni-medical-suite (apps/web) |
- Developers referencing legacy implementations during migration
- Teams needing to extract specific modules not yet migrated
- Users of the HuggingFace Spaces deployments (still active)
| Need | Repository |
|---|---|
| New development / unified platform | omni-medical-suite |
| Reference legacy implementations | This repo (OmniFile_Processor) |
| Active HuggingFace demo | HF Space |
| Repo | Role | Status |
|---|---|---|
| omni-medical-suite | Main Platform (Successor) | Active |
| medical-ocr-postprocessor | Core Correction Engine | Active |
| medical-handwriting-ocr | Production OCR | Active |
License: MIT, Dr. Abdulmalek Tamer Al-husseini






