Multi-Agent AI Document Classifier & Processor

🚀 Smart Document Triage with PDF, JSON, and Email Routing
📁 Built with Python, LangChain (LLM), Redis, and Streamlit

🔍 Overview

This project is a multi-agent AI system that accepts documents in various formats (PDF, Email, JSON) and intelligently:

Classifies the document format and business intent (Invoice, RFQ, Complaint, Regulation)
Routes the input to the correct processing agent (EmailAgent, JSONAgent, PDFAgent)
Extracts structured data (e.g., sender info, invoice fields, anomalies)
Stores traceable output in Redis-based shared memory for chain-of-processing transparency

The system combines LLMs (accessed through LangChain) with agent orchestration to automate real-world document intake workflows.
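The detect-then-route flow described above can be illustrated with a small, self-contained sketch. It is not the project's actual orchestrator: the heuristics in `detect_format`, the stub agent functions, and the `AGENTS` table are assumptions made for illustration, and the real agents would call an LLM instead of returning placeholders.

```python
import json
from typing import Callable, Dict


def detect_format(raw: bytes) -> str:
    """Guess whether the input is a PDF, JSON, or Email (heuristic sketch)."""
    # PDF files start with the "%PDF-" magic bytes.
    if raw.lstrip().startswith(b"%PDF-"):
        return "PDF"
    text = raw.decode("utf-8", errors="replace").strip()
    try:
        parsed = json.loads(text)
        # Only objects and arrays count as JSON documents here.
        if isinstance(parsed, (dict, list)):
            return "JSON"
    except json.JSONDecodeError:
        pass
    # Assumption: anything else is treated as email text.
    return "Email"


# Hypothetical stand-ins for PDFAgent, JSONAgent, and EmailAgent.
def pdf_agent(raw: bytes) -> dict:
    return {"agent": "PDFAgent", "size_bytes": len(raw)}


def json_agent(raw: bytes) -> dict:
    return {"agent": "JSONAgent", "payload": json.loads(raw)}


def email_agent(raw: bytes) -> dict:
    return {"agent": "EmailAgent", "body": raw.decode("utf-8", errors="replace")}


AGENTS: Dict[str, Callable[[bytes], dict]] = {
    "PDF": pdf_agent,
    "JSON": json_agent,
    "Email": email_agent,
}


def route(raw: bytes) -> dict:
    """Detect the format, dispatch to the matching agent, and tag the result."""
    fmt = detect_format(raw)
    result = AGENTS[fmt](raw)
    result["format"] = fmt
    return result
```

A dispatch table keeps the routing logic flat: supporting a new format would mean adding one detection rule and one entry in `AGENTS`.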

🧠 Key Features

Feature	Description
📤 Format Detection	Automatically detects if input is PDF, JSON, or Email
🧾 Intent Classification	Understands if the doc is an Invoice, RFQ, Complaint, or Regulation
🧠 LLM-Powered Extraction	Uses LLMs to extract and clean structured fields from natural text
🔁 Multi-Agent Routing	Routes each document to the right processing agent
🧩 Shared Memory via Redis	Persists extracted fields, metadata, and thread ID for traceability
📊 Anomaly Detection (JSON)	Flags missing/suspicious fields in structured JSON
📨 CRM Formatting (Email)	Extracts sender info + urgency for downstream CRM integration
📄 Streamlit UI	Upload interface with live logs, previews, and Redis-stored results
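As an illustration of the JSON anomaly detection feature, here is a minimal rule-based sketch. The field names in `REQUIRED_FIELDS` and the "amount must be positive" rule are hypothetical; the project's JSON agent may use different fields or rely on the LLM for these checks.

```python
from typing import Any, Dict, List

# Hypothetical schema for an invoice payload: field name -> accepted type(s).
REQUIRED_FIELDS = {
    "invoice_id": str,
    "vendor": str,
    "amount": (int, float),
}


def find_anomalies(payload: Dict[str, Any]) -> List[str]:
    """Return human-readable flags for missing or suspicious fields."""
    anomalies: List[str] = []
    for name, expected in REQUIRED_FIELDS.items():
        value = payload.get(name)
        if name not in payload or value in (None, ""):
            anomalies.append(f"missing field: {name}")
        # bool is a subclass of int, so reject it explicitly.
        elif isinstance(value, bool) or not isinstance(value, expected):
            anomalies.append(
                f"unexpected type for {name}: {type(value).__name__}"
            )
    amount = payload.get("amount")
    if (
        isinstance(amount, (int, float))
        and not isinstance(amount, bool)
        and amount <= 0
    ):
        anomalies.append("suspicious value: amount must be positive")
    return anomalies
```

An empty list means the payload passed every check; a non-empty list could be stored next to the extracted fields so the flags remain traceable.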

📁 Folder Structure

.
├── agents/
│   ├── classifier_agent.py
│   ├── email_agent.py
│   ├── json_agent.py
│   └── pdf_agent.py
├── router/orchestrator.py
├── memory/redis_memory.py
├── utils/
│   ├── logger.py
│   ├── json_cleaner.py
│   └── file_handler.py
├── llm/langchain_llm.py
├── app.py  # Streamlit UI
├── requirements.txt
└── README.md

🖼️ Sample Output Screenshots

📦 Setup Instructions

Clone the repository

git clone https://github.com/yourname/multi-agent-ai-docs.git
cd multi-agent-ai-docs

Install dependencies

pip install -r requirements.txt

Configure the environment by creating a .env file:

GROQ_API_KEY=your_groq_api_key
REDIS_HOST=localhost
REDIS_PORT=6379
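The Redis settings above might be consumed along the following lines. This is a sketch, not the contents of memory/redis_memory.py: the `doc:<thread_id>` key layout and the helper names are assumptions, and it expects the .env values to already be in the process environment (for example via python-dotenv's `load_dotenv`).

```python
import json
import os
import time
from typing import Any, Dict


def redis_settings() -> Dict[str, Any]:
    """Read connection settings from the environment, with local defaults."""
    return {
        "host": os.environ.get("REDIS_HOST", "localhost"),
        "port": int(os.environ.get("REDIS_PORT", "6379")),
    }


def connect():
    """Create a redis-py client (imported lazily so the helpers below
    can be exercised without the redis package or a running server)."""
    import redis

    return redis.Redis(decode_responses=True, **redis_settings())


def save_result(client, thread_id: str, fmt: str, intent: str,
                fields: Dict[str, Any]) -> str:
    """Store one processing result as a Redis hash and return its key."""
    key = f"doc:{thread_id}"  # hypothetical key layout
    client.hset(key, mapping={
        "format": fmt,
        "intent": intent,
        # Hash values must be flat, so nested fields are JSON-encoded.
        "fields": json.dumps(fields),
        "saved_at": str(int(time.time())),
    })
    return key


def load_result(client, thread_id: str) -> Dict[str, Any]:
    """Fetch a stored result; returns an empty dict if nothing was saved."""
    record = client.hgetall(f"doc:{thread_id}")
    if record:
        record["fields"] = json.loads(record["fields"])
    return record
```

Typical use would be `client = connect()` followed by `save_result(client, "thread-42", "JSON", "Invoice", {"amount": 10})`, where the thread ID and field values are placeholders.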

Run the Streamlit App

streamlit run app.py

🛠️ Requirements

langchain
langchain-groq
redis
streamlit
python-dotenv
pdfplumber  # for PDF extraction

🌐 Project Highlights

Real-time classification using LLMs (via LangChain)
Orchestration across multiple document formats (PDF, JSON, Email)
Context-sharing using Redis-based memory architecture
Fully interactive UI with Streamlit
Highly modular structure for extensibility to other formats or intents
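To make the LLM-based classification concrete, here is a hedged sketch of intent classification over the four intents named in this README. The prompt wording, the 4000-character truncation, and the `llm` callable interface are assumptions; the project's classifier agent may be structured differently.

```python
from typing import Callable

# The four business intents listed in this README.
INTENTS = ("Invoice", "RFQ", "Complaint", "Regulation")


def build_prompt(text: str) -> str:
    """Build a single-label classification prompt (wording is illustrative)."""
    labels = ", ".join(INTENTS)
    return (
        f"Classify the business intent of the document below as one of: {labels}.\n"
        "Answer with the label only.\n\n"
        # Truncate long documents to keep the prompt small (arbitrary limit).
        f"Document:\n{text[:4000]}"
    )


def parse_intent(answer: str) -> str:
    """Map a free-text model answer onto a known label, else 'Unknown'."""
    lowered = answer.lower()
    for label in INTENTS:
        if label.lower() in lowered:
            return label
    return "Unknown"


def classify_intent(text: str, llm: Callable[[str], str]) -> str:
    """Classify text using any callable that maps a prompt to a reply string."""
    return parse_intent(llm(build_prompt(text)))
```

Passing the model in as a plain callable keeps the sketch testable without network access. One way to supply `llm` would be a small wrapper around a LangChain chat model's `invoke` method that returns the message's `content`.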

Thank you for checking out this project 🙌
