Skip to content

kshitizj03/rag-pdf-chat

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

RAG Document Q&A

Ask questions about your PDF documents in plain English. Upload any document and get accurate, cited answers — powered by semantic search and Google Gemini.

Built to explore production patterns in LLM systems: vector retrieval, streaming responses, session isolation, and grounded generation.


Architecture

UPLOAD FLOW
─────────────────────────────────────────────────────────────
PDF ──► PyMuPDF ──► Chunks (512 tok, 50 overlap) ──► Gemini Embeddings ──► ChromaDB
                                                                               │
                                                                    (keyed by session ID)

QUERY FLOW
─────────────────────────────────────────────────────────────
User Question ──► Gemini Embeddings ──► Cosine Similarity Search ──► Top 5 Chunks
                                                                           │
                                                              ┌────────────▼────────────┐
                                                              │      Gemini LLM          │
                                                              │  + System Prompt         │
                                                              │  + Conversation History  │
                                                              │    (last 5 messages)     │
                                                              └────────────┬────────────┘
                                                                           │
                                                              Streamed Answer + Citations

Features

  • Semantic Search — vector similarity retrieval via ChromaDB, not keyword matching
  • Streaming Responses — token-by-token generation via Server-Sent Events
  • Source Citations — every answer links back to the exact page and chunk it came from
  • Conversation History — sliding window of last 5 messages for context-aware follow-ups
  • Multi-User Sessions — UUID-based session isolation; each user's documents are namespaced separately
  • Grounded Generation — system prompt enforces answers only from retrieved context, eliminating hallucination
  • Production Patterns — rate limiting, structured logging, retry logic with exponential backoff

Tech Stack

Layer Technology
Backend Python, FastAPI
Vector Store ChromaDB
LLM + Embeddings Google Gemini API
PDF Parsing PyMuPDF
Token Counting tiktoken
Frontend React, Tailwind CSS, Vite
Containerization Docker, Docker Compose

Getting Started

Prerequisites

  • Docker and Docker Compose
  • Google Gemini API key (get one here)

Run with Docker

git clone https://github.com/kshitizj03/rag-pdf-chat
cd RAG

cp .env.example .env
# Add your GEMINI_API_KEY to .env

docker compose up --build

Local Development

Backend

cd backend
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
uvicorn main:app --reload --port 8000

Frontend

cd frontend
npm install
cp .env.example .env
npm run dev

License

MIT

About

Chat with your PDF documents using RAG — semantic search, streaming responses, and source citations

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors