A powerful FastAPI-based multi-modal ingestion system that processes PDFs, scanned documents, images, videos, YouTube links, and text files — then optionally performs semantic retrieval using FAISS + HuggingFace embeddings and refines answers using an LLM (Ollama via NPMAI).
- 📄 Extract text from searchable PDFs
- 🖨️ OCR for scanned PDFs
- 🖼️ Image OCR (Tesseract + OpenCV preprocessing)
- 🎥 Local video speech-to-text (Whisper)
- 📺 YouTube video transcription (yt-dlp + Whisper)
- 📃 Plain text processing
- 🧠 FAISS vector database creation & loading
- 🔎 Semantic similarity search
- ♻️ Iterative refinement using LLM (Ollama)
- 🗂 Automatic ingestion routing based on file type
```
Client Request
      ↓
/ingestion Endpoint
      ↓
File Type Detection
      ↓
Text Extraction (PDF/OCR/Video/etc.)
      ↓
Optional Vector DB Retrieval (FAISS)
      ↓
Refinement via LLM
      ↓
Final Response
```
GET /

Returns:

```json
{ "ok": true }
```

POST /ingestion

Form fields:

- `file` → Upload file (pdf, txt, mp4, jpg, png, etc.)
- `query` → Optional semantic query
- `DB_PATH` → Path to vector database
- `link` → YouTube link
- `output_path` → Download location for video
- `temperature` → LLM temperature
- `model` → Ollama model name
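As a quick illustration of how a client might assemble these fields, here is a minimal sketch. The helper name `build_ingestion_form` is hypothetical and not part of the API; it simply mirrors the optional/required parameters listed above.

```python
def build_ingestion_form(file_path=None, query=None, db_path=None, link=None,
                         output_path=None, temperature=0.7, model="llama3"):
    """Assemble the form fields for POST /ingestion.

    Only the optional fields that were actually supplied are included,
    mirroring the endpoint's optional parameters.
    """
    form = {"temperature": str(temperature), "model": model}
    if file_path:
        form["file"] = file_path
    if query:
        form["query"] = query
    if db_path:
        form["DB_PATH"] = db_path
    if link:
        form["link"] = link
    if output_path:
        form["output_path"] = output_path
    return form
```

The resulting dict can then be passed to an HTTP client of your choice (e.g. as multipart form data).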
| Type | Processing Method |
|---|---|
| PDF (text-based) | PyMuPDF |
| PDF (scanned) | pdf2image + Tesseract |
| Image | OpenCV + Tesseract |
| TXT | Direct read |
| MP4 | Whisper transcription |
| YouTube | yt-dlp + Whisper |
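The routing in the table above can be sketched as a simple extension-based dispatch. The handler names here are illustrative placeholders, not the actual function names in the codebase:

```python
from pathlib import Path

# Hypothetical dispatch table mirroring the processing matrix above.
HANDLERS = {
    ".pdf": "extract_pdf",        # PyMuPDF, with OCR fallback for scans
    ".txt": "read_text",
    ".mp4": "transcribe_video",   # Whisper
    ".jpg": "ocr_image",          # OpenCV + Tesseract
    ".jpeg": "ocr_image",
    ".png": "ocr_image",
}

def route(filename: str) -> str:
    """Pick a handler based on the uploaded file's extension."""
    ext = Path(filename).suffix.lower()
    try:
        return HANDLERS[ext]
    except KeyError:
        raise ValueError(f"Unsupported file type: {ext}")
```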
If query and DB_PATH are provided:
- Check if FAISS DB exists
- If yes → Load and perform similarity search
- If no → Create embeddings & save DB
- Retrieve top 4 chunks
- Send to LLM refine loop
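The actual pipeline delegates similarity search to FAISS; the dependency-free sketch below only illustrates the idea behind the "retrieve top 4 chunks" step, ranking chunks by cosine similarity to the query embedding:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, chunk_vecs, chunks, k=4):
    """Return the k chunks whose embeddings are most similar to the query."""
    scored = sorted(zip(chunks, chunk_vecs),
                    key=lambda cv: cosine(query_vec, cv[1]),
                    reverse=True)
    return [chunk for chunk, _ in scored[:k]]
```

FAISS does the same ranking with approximate-nearest-neighbor indexes, which is what makes it fast at scale.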
- Embeddings: all-MiniLM-L6-v2
- Vector DB: FAISS
- Chunk Size: 1000
- Overlap: 200
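A simplified sketch of the chunking configuration above (fixed-size character windows with overlap, the basic idea behind LangChain's text splitters):

```python
def split_text(text, chunk_size=1000, overlap=200):
    """Split text into fixed-size character chunks with overlap.

    Each chunk repeats the last `overlap` characters of the previous one,
    so context spanning a chunk boundary is not lost.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks
```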
For each retrieved chunk:
- Pass context to LLM
- Iteratively refine previous answer
- Return final refined response
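The refine loop above can be sketched as follows. This is an illustration of the pattern, not the project's exact prompts; `llm` stands in for any callable that maps a prompt string to a completion (e.g. a thin wrapper around Ollama):

```python
def refine(chunks, question, llm):
    """Iterative refinement: draft an answer from the first chunk, then ask
    the LLM to improve the running answer with each subsequent chunk."""
    answer = llm(f"Context:\n{chunks[0]}\n\nQuestion: {question}\nAnswer:")
    for chunk in chunks[1:]:
        answer = llm(
            f"Existing answer:\n{answer}\n\n"
            f"Refine it using this additional context:\n{chunk}\n\n"
            f"Question: {question}\nRefined answer:"
        )
    return answer
```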
Install required packages:
```
pip install fastapi uvicorn
pip install langchain langchain-community
pip install faiss-cpu
pip install openai-whisper
pip install moviepy
pip install pytesseract
pip install pdf2image
pip install pymupdf
pip install yt-dlp
pip install opencv-python
pip install pillow
pip install numpy
```

Make sure:
- Tesseract OCR is installed on the system
- FFmpeg is installed
- Ollama is running locally
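A small startup check can catch missing system dependencies early. This helper is a suggestion, not part of the project; it only verifies that the required binaries are on `PATH`:

```python
import shutil

def check_external_tools(tools=("tesseract", "ffmpeg", "ollama")):
    """Return the list of required system binaries missing from PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

Call it at startup and fail fast with a clear message if the returned list is non-empty.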
```
uvicorn main:app --reload
```

POST /ingestion

Form Data:

```
file = document.pdf
query = "Summarize key points"
DB_PATH = vector_db
model = llama3
temperature = 0.7
```
- GPU is disabled (`CUDA_VISIBLE_DEVICES=""`)
- Whisper model loads once (thread-safe singleton)
- FAISS uses dangerous deserialization (use trusted DB paths only)
- Temporary audio is saved as `temp.wav`
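The thread-safe singleton pattern mentioned above can be sketched like this. The `loader` argument stands in for the expensive model load (e.g. `whisper.load_model("base")`); the class is illustrative, not the project's actual implementation:

```python
import threading

class WhisperSingleton:
    """Load an expensive model exactly once, even under concurrent requests."""
    _lock = threading.Lock()
    _model = None

    @classmethod
    def get(cls, loader):
        if cls._model is None:          # fast path: no lock once loaded
            with cls._lock:
                if cls._model is None:  # double-checked locking
                    cls._model = loader()
        return cls._model
```

Double-checked locking avoids taking the lock on every request while still guaranteeing the loader runs only once.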
- Streaming responses
- Async video processing
- Chunk-level caching
- Background task queue
- Better refine logic
- Support for multiple vector stores
- Docker support
- FastAPI
- FAISS
- HuggingFace Embeddings
- Whisper
- OpenCV
- Tesseract OCR
- PyMuPDF
- yt-dlp
- npmai
MIT License
This system acts as a universal AI ingestion pipeline capable of processing multi-modal data and performing intelligent semantic retrieval with LLM refinement.
It can serve as:
- AI document assistant
- Video summarizer
- Research helper
- OCR intelligence engine
- Knowledge base system
