LangChain 1.0 + OCR Multimodal Document Analysis System

Integrates MinerU, PaddleOCR‑VL, and DeepSeek‑OCR, three top-performing OCR parsing projects

English | 中文

⚡ Overview

Deploy three leading OCR parsing projects (MinerU, DeepSeek‑OCR, and PaddleOCR‑VL) on the vLLM inference framework, and build a multimodal data analysis system on top of a unified parsing service interface. The project also optimizes and wraps the DeepSeek‑OCR and MinerU service interfaces with enterprise use in mind.

For installation steps and detailed usage of MinerU, PaddleOCR‑VL, and DeepSeek‑OCR, see the tutorial.

🎯 Key Features

Unified parsing interface: pluggable selection of MinerU, PaddleOCR‑VL, and DeepSeek‑OCR
Batch parsing: supports batch processing for PDFs and images; auto-splits multi‑page documents
High performance: powered by the vLLM inference framework
Multimodal support: extract text, tables, formulas, images, and more
Standardized outputs: unified format with Markdown/JSON and image exports
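
The pluggable engine selection listed above could be sketched as a small registry that maps an engine name to its service endpoint. This is only an illustration under stated assumptions, not the project's actual code: `EngineConfig`, `ENGINES`, and `resolve_engine` are hypothetical names, and the environment variable names are the ones used in backend/.env in the Quick Start section.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class EngineConfig:
    """Hypothetical description of one OCR backend."""
    name: str
    url_env_var: str  # environment variable that holds the service URL


# Hypothetical registry; the variable names follow backend/.env.
ENGINES = {
    "mineru": EngineConfig("MinerU", "MINERU_API_URL"),
    "deepseek-ocr": EngineConfig("DeepSeek-OCR", "DEEPSEEK_OCR_API_URL"),
    "paddleocr-vl": EngineConfig("PaddleOCR-VL", "PADDLEOCR_API_URL"),
}


def resolve_engine(
    engine: str, env: Optional[Mapping[str, str]] = None
) -> Tuple[str, str]:
    """Return (display name, endpoint URL) for the requested engine."""
    env = os.environ if env is None else env
    key = engine.strip().lower()
    if key not in ENGINES:
        raise ValueError(
            f"unknown engine {engine!r}; choose from {sorted(ENGINES)}"
        )
    config = ENGINES[key]
    url = env.get(config.url_env_var)
    if not url:
        raise RuntimeError(f"{config.url_env_var} is not set")
    return config.name, url
```

A caller would pass the engine chosen by the user, for example `resolve_engine("mineru")`, and send the document to the returned URL. Keeping the lookup in one table means adding a fourth engine would only require a new registry entry.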

🚀 Quick Start

Configure Backend Environment

Edit backend/.env:

    # Server Configuration
    PORT=8000
    HOST=0.0.0.0
    DEBUG=True

    # MinerU Configuration - Using Direct API
    MINERU_API_URL=http://192.168.130.4:50000/file_parse
    VLLM_SERVER_URL=http://192.168.130.4:40000
    MINERU_BACKEND=vlm-vllm-async-engine
    MINERU_TIMEOUT=600
    MINERU_VIZ_DIR=/home/MuyuWorkSpace/05_OcrProject/backend/mineru_visualizations

    # DeepSeek OCR Configuration
    DEEPSEEK_OCR_API_URL=http://192.168.130.4:8797/ocr

    # PaddleOCR Configuration
    PADDLEOCR_API_URL=http://192.168.130.4:10800/layout-parsing

    # File Upload Limits
    MAX_FILE_SIZE=10485760
    ALLOWED_FILE_TYPES=application/pdf,image/png,image/jpeg,image/jpg,image/webp

    # Storage Paths
    UPLOAD_DIR=./uploads
    EXPORT_DIR=./exports
    TEMP_DIR=./temp

    # Processing Timeout (seconds)
    OCR_TIMEOUT=300

    # CORS Settings
    ALLOWED_ORIGINS=http://localhost:3000,http://localhost:5173
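
To make the upload-related settings concrete, here is a minimal sketch of how a backend might read and apply them. It is not taken from the project: `UploadSettings`, `load_upload_settings`, and `is_upload_allowed` are hypothetical names, the fallback values are assumptions, and `MAX_FILE_SIZE` is assumed to be in bytes (10485760 bytes is 10 MiB).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class UploadSettings:
    """Hypothetical container for the upload and CORS settings."""
    max_file_size: int  # assumed to be bytes
    allowed_file_types: Tuple[str, ...]
    allowed_origins: Tuple[str, ...]


def _split_csv(value: str) -> Tuple[str, ...]:
    """Split a comma-separated value, dropping blanks and whitespace."""
    return tuple(item.strip() for item in value.split(",") if item.strip())


def load_upload_settings(
    env: Optional[Mapping[str, str]] = None,
) -> UploadSettings:
    """Read settings from the environment (or from a supplied mapping)."""
    env = os.environ if env is None else env
    return UploadSettings(
        # The fallback of 10 MiB is an assumption mirroring the sample .env.
        max_file_size=int(env.get("MAX_FILE_SIZE", "10485760")),
        allowed_file_types=_split_csv(env.get("ALLOWED_FILE_TYPES", "")),
        allowed_origins=_split_csv(env.get("ALLOWED_ORIGINS", "")),
    )


def is_upload_allowed(
    settings: UploadSettings, content_type: str, size_bytes: int
) -> bool:
    """Accept a non-empty file only if its MIME type and size are permitted."""
    return (
        content_type in settings.allowed_file_types
        and 0 < size_bytes <= settings.max_file_size
    )
```

With the sample values above, a 1 KiB `application/pdf` upload would pass the check, while a `text/plain` file or an image one byte over the limit would be rejected.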

Start Backend Service

    cd backend

    # Create and activate virtual environment
    python -m venv venv
    source venv/bin/activate  # Linux/Mac
    # or
    venv\Scripts\activate    # Windows

    # Install dependencies
    pip install -r requirements.txt

    # Start server
    python main.py
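
Once `python main.py` is running, one quick way to confirm that the server is listening is a plain TCP connection check. The sketch below is an assumption-laden convenience, not part of the project: `is_port_open` is a hypothetical helper, and it only verifies that something accepts connections on the port, not that the OCR endpoints respond correctly.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

With the sample configuration, `is_port_open("127.0.0.1", 8000)` would be the natural call. `HOST=0.0.0.0` is a bind address meaning "all interfaces", so the check connects to the loopback address instead.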

Start Frontend Service

    cd frontend

    # Install dependencies
    npm install

    # Start dev server
    npm run dev

🙈 Contributing

Contributions via GitHub PRs or issues are welcome. We appreciate any form of contribution, including feature improvements, bug fixes, and documentation.

😎 Community

Explore our tech community 👉 Large Model Tech Community | Fanfan Space

Scan the QR code to add the contact, then reply "OCR" to join the technical group and learn with other members.
