A web-based Retrieval-Augmented Generation (RAG) system that allows users to upload documents, ask questions, and receive answers grounded explicitly in retrieved source content.
RAG Studio demonstrates the core RAG pipeline:
- Upload PDF or TXT documents
- Index documents using semantic embeddings
- Query documents with natural language questions
- Generate answers grounded in retrieved document chunks
- Attribute answers by showing the exact source chunks used
The system emphasizes source transparency by always showing which document chunks were used to generate each answer.
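The indexing side of this pipeline hinges on splitting documents into chunks before embedding. As an illustrative sketch (the function name, chunk size, and overlap are assumptions, not the project's actual settings):

```python
# Illustrative sketch of the chunking step. Chunk size and overlap
# values here are assumptions, not the project's actual configuration.
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows for embedding."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Overlap keeps sentences that straddle a boundary retrievable
        # from either neighboring chunk.
        start += chunk_size - overlap
    return chunks
```

Each chunk is then embedded and stored so that queries can match fine-grained passages rather than whole documents.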
- Multi-Query Retrieval - Automatically rewrites queries in multiple ways for better document matching
- Smart Similarity Filtering - Applies the similarity threshold only when more than 5 documents are indexed
- Strict Document Grounding - Answers are generated only when relevant document context exists
- Multi-document Support - Upload and manage multiple documents
- Document Management - Delete documents and their embeddings
- Source Attribution - Shows which documents were used for each answer
- Semantic Search - ChromaDB vector database for retrieval
- Answer Generation - Google Gemini for grounded responses
- Clean UI - Streamlit chat interface with left-right layout
The system improves retrieval by:
- Query Analysis - LLM analyzes user intent
- Query Rewriting - Generates 2-3 alternative phrasings
- Multi-Vector Search - Searches with all query variations
- Result Aggregation - Deduplicates and ranks by best scores
- Quality Boost - Documents matching multiple queries rank higher
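The aggregation and quality-boost steps above can be sketched as follows. This is a minimal illustration, not the project's actual code; `search` is a stand-in for the real vector search, and all names are assumptions:

```python
# Sketch of result aggregation for multi-query retrieval: deduplicate
# chunks returned by several query variants, keep each chunk's best
# score, and rank chunks that matched multiple variants higher.
# `search(query)` is assumed to yield (chunk_id, chunk_text, score).
def aggregate_results(queries, search):
    best = {}  # chunk_id -> (best score, chunk text)
    hits = {}  # chunk_id -> number of query variants that matched it
    for q in queries:
        for chunk_id, text, score in search(q):
            hits[chunk_id] = hits.get(chunk_id, 0) + 1
            if chunk_id not in best or score > best[chunk_id][0]:
                best[chunk_id] = (score, text)
    # Rank by (match count, best score): the "quality boost" for chunks
    # matching multiple query variants.
    ranked = sorted(best, key=lambda c: (hits[c], best[c][0]), reverse=True)
    return [(c, best[c][1], best[c][0]) for c in ranked]
```

A chunk retrieved by two rewritten queries outranks one with a slightly higher single-query score, which is the intended "quality boost" behavior.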
The system will not answer questions if no relevant context is found (when more than 5 documents are indexed).
When you ask a question:
- System rewrites your query into multiple perspectives
- Searches uploaded documents with all variations
- If more than 5 documents: Applies similarity threshold filtering (0.5)
- If 5 or fewer documents: Returns all matches (no threshold)
- If no relevant content found (when threshold applies), it explicitly says so
- LLM is only called when relevant context exists
- All answers include source attribution
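The conditional threshold described above can be sketched in a few lines (the constant names are assumptions; the 0.5 threshold and 5-document cutoff come from the behavior described in this README):

```python
# Sketch of the conditional similarity filter: the 0.5 threshold is
# applied only when more than 5 documents are indexed.
SIMILARITY_THRESHOLD = 0.5
MIN_DOCS_FOR_FILTERING = 5

def filter_matches(matches, num_indexed_docs):
    """matches: list of (chunk, score) pairs; higher score = more similar."""
    if num_indexed_docs <= MIN_DOCS_FOR_FILTERING:
        # Small corpus: return every match, no threshold applied.
        return matches
    return [(chunk, s) for chunk, s in matches if s >= SIMILARITY_THRESHOLD]
```

Skipping the threshold for small corpora avoids the system refusing to answer when only a handful of documents exist and every match is necessarily the "best available" context.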
Backend (FastAPI):
- /upload - Accept and store document files
- /index - Extract text, chunk, embed, and store in vector database
- /query - Retrieve relevant chunks and generate grounded answers
- /documents - List all indexed documents
- /documents/(unknown) - Delete a document and its embeddings
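The /query flow ties retrieval and generation together with the strict-grounding check. A minimal sketch under stated assumptions (the function and helper names are illustrative; the real endpoint is a FastAPI route calling into the retriever and Gemini):

```python
# Sketch of the /query flow. `retrieve` and `generate` stand in for the
# real vector search and Gemini call; all names are assumptions.
def handle_query(question, retrieve, generate):
    """retrieve(question) -> list of (doc_name, chunk_text) pairs;
    generate(question, context) -> answer string."""
    chunks = retrieve(question)
    if not chunks:
        # Strict grounding: refuse explicitly instead of calling the LLM.
        return {"answer": "No relevant context found in the indexed documents.",
                "sources": []}
    context = "\n\n".join(text for _, text in chunks)
    return {"answer": generate(question, context),
            "sources": [{"document": doc, "chunk": text} for doc, text in chunks]}
```

Note that the LLM call only happens on the grounded path, and every successful answer carries its source chunks for attribution.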
Components:
- SentenceTransformers for embeddings (local model)
- ChromaDB for vector storage (persistent)
- Google Gemini for answer generation
- Streamlit frontend with chat interface
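The semantic-search component rests on vector similarity. As a toy illustration of the idea (the real system uses SentenceTransformers embeddings and ChromaDB's built-in search; these hand-made two-dimensional vectors are purely for demonstration):

```python
import math

# Toy illustration of vector similarity, the idea behind semantic search.
# Real embeddings are high-dimensional vectors from SentenceTransformers;
# ChromaDB performs this comparison internally.
def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def nearest(query_vec, index):
    """index: dict of chunk_id -> embedding vector. Returns best match."""
    return max(index, key=lambda cid: cosine_similarity(query_vec, index[cid]))
```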
- Python 3.8 or higher
- Google Gemini API key - Get one at https://aistudio.google.com/api-keys
- Clone the repository:

```
git clone https://github.com/nkmohit/rag-studio.git
cd rag-studio
```

- Install dependencies:

```
pip install -r requirements.txt
```

- Download the embedding model:

```
python download_model.py
```

This will download the sentence-transformers model to ./utils/models/retriever/

- Create environment file:

```
echo "GEMINI_API_KEY=your_api_key_here" > .env
```

Replace your_api_key_here with your actual Gemini API key.

- Start the backend server:

```
uvicorn main:app --reload
```

Backend will run on http://localhost:8000

- In a separate terminal, start the Streamlit UI:

```
streamlit run streamlit_app.py
```

UI will open automatically in your browser at http://localhost:8501
- Upload Documents - Click "Upload New Document" and select PDF or TXT files
- Manage Documents - View all indexed documents in the left panel
- Ask Questions - Type questions in the chat interface
- View Sources - See which documents and chunks were used for each answer
- Multi-Query Retrieval - Rewrite queries for comprehensive document coverage
- Strict Grounding - Never answer without relevant document context
- Similarity Filtering - Only retrieve chunks above threshold
- Source Attribution - Always show which documents were used
- Clear Separation of Concerns - Loader, embeddings, generation are independent
- Explicit Over Implicit - Clear error messages when context is missing
- No Hallucination - LLM instructed to answer only from provided context
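The "no hallucination" principle is typically enforced in the prompt itself. A sketch of what such a grounded prompt might look like (the wording and function name are illustrative assumptions, not the project's actual prompt):

```python
# Sketch of a strictly grounded prompt template. The exact wording is
# an assumption for illustration, not the project's actual prompt.
def build_grounded_prompt(question, context):
    return (
        "Answer the question using ONLY the context below. "
        "If the context does not contain the answer, say so explicitly.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
```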
MIT License - See LICENSE file for details.
