A Retrieval-Augmented Generation (RAG) based chat assistant that answers user questions strictly from internal documents such as academic calendars, institutional policies, SOPs, FAQs, and manuals.
This project is designed to simulate a real-world SMB / institutional knowledge assistant where accuracy, document grounding, and policy version control are critical.
Small and Medium Businesses (SMBs) and institutions face the following challenges:
- 🔍 Difficulty searching across multiple internal documents
- 🔁 Repetitive questions asked to HR / support teams
- ⚠️ Errors caused by outdated or conflicting policy information
- 🌐 Risk of LLMs hallucinating answers from the internet
"We want a chat assistant that answers questions only from our internal documents, not from the internet."
We built a Mini RAG Assistant that:
- Ingests internal documents (PDFs, TXT, DOCs, CSVs)
- Converts them into vector embeddings
- Stores embeddings in a FAISS vector database
- Retrieves relevant chunks for a user query
- Feeds retrieved context to an LLM to generate grounded answers
- 📄 Academic Calendar (PDF)
- 📄 Institutional Policies (PDF)
- 📄 Internal Documents
- 📄 SOPs, FAQs
Documents (PDF / TXT / DOC / CSV)
↓
Text Extraction + Chunking
↓
Embedding Generation (all-MiniLM-L6-v2)
↓
FAISS Vector Database
↓
User Query
↓
Query Embedding (all-MiniLM-L6-v2)
↓
Top-K Similarity Search (FAISS)
↓
Prompt Construction
↓
LLM (GPT / Google API / HuggingFace)
↓
Final Answer (Grounded in Documents)
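The Top-K similarity step in the flow above can be sketched in plain NumPy. FAISS's `IndexFlatIP` performs the same inner-product search over normalized vectors, just far faster at scale; the function and variable names here are illustrative, not the project's actual API.

```python
import numpy as np

def top_k_chunks(query_vec, chunk_vecs, chunk_texts, k=3):
    """Return the k chunks most similar to the query (cosine similarity).

    Illustrates what FAISS's IndexFlatIP does on normalized vectors;
    in the real pipeline faiss.IndexFlatIP replaces the NumPy math.
    """
    # Normalize so the dot product equals cosine similarity
    q = query_vec / np.linalg.norm(query_vec)
    c = chunk_vecs / np.linalg.norm(chunk_vecs, axis=1, keepdims=True)
    scores = c @ q
    # Indices of the k highest-scoring chunks, best first
    idx = np.argsort(scores)[::-1][:k]
    return [(chunk_texts[i], float(scores[i])) for i in idx]
```

In the full system, `query_vec` and `chunk_vecs` come from the same `all-MiniLM-L6-v2` encoder, which is why the query and documents live in a shared embedding space.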
- Language: Python
- Embedding Model: sentence-transformers/all-MiniLM-L6-v2
- Vector Database: FAISS
- LLM:
- OpenAI GPT API (optional)
- Google Generative API (optional)
- HuggingFace open-source models (Mistral, LLaMA, etc.)
- UI: Streamlit
- Documents are split into smaller text chunks
- Each chunk is embedded separately
- This helps with:
- Better semantic retrieval
- Reduced hallucinations
- Efficient similarity search
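A minimal character-window chunker with overlap illustrates the idea; the `chunk_size` and `overlap` defaults here are illustrative, not the project's actual settings.

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into overlapping character windows.

    Overlap keeps sentences that straddle a boundary retrievable
    from at least one chunk.
    """
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk.strip():
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break
    return chunks
```

Each chunk returned here would then be embedded separately before being added to the FAISS index.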
```
Instruction:
You are an institutional assistant. Answer only from the provided context.

Context:
[Top-K Retrieved Document Chunks]

User Question:
{query}
```
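Assembling this prompt from the retrieved chunks is a simple string operation. The wording follows the template above; the function name itself is illustrative.

```python
def build_prompt(query, retrieved_chunks):
    """Fill the grounding prompt with retrieved context and the user query."""
    # Separate chunks with blank lines so the LLM can tell them apart
    context = "\n\n".join(retrieved_chunks)
    return (
        "You are an institutional assistant. "
        "Answer only from the provided context.\n\n"
        f"Context:\n{context}\n\n"
        f"User Question:\n{query}"
    )
```

Keeping the instruction and context in a single prompt is what constrains the LLM to answer from internal documents rather than its pretraining knowledge.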
Supported formats:
- ✅ PDF
- ✅ TXT
- ✅ DOC / DOCX
- ✅ CSV
Each format is handled by an appropriate loader and text extractor before chunking.
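A loader dispatch keyed on file extension might look like the sketch below. TXT and CSV use only the standard library; PDF and DOCX would typically go through libraries such as pypdf and python-docx, omitted here to keep the example dependency-free.

```python
import csv
from pathlib import Path

def load_document(path):
    """Extract plain text from a file based on its extension.

    PDF/DOCX branches are left out of this sketch; in practice they
    would call a dedicated extraction library before returning text.
    """
    suffix = Path(path).suffix.lower()
    if suffix == ".txt":
        return Path(path).read_text(encoding="utf-8")
    if suffix == ".csv":
        # Flatten rows into comma-joined lines of plain text
        with open(path, newline="", encoding="utf-8") as f:
            return "\n".join(", ".join(row) for row in csv.reader(f))
    raise ValueError(f"No loader registered for {suffix}")
```

Whatever the source format, the loader's output is plain text, so the downstream chunking and embedding steps stay format-agnostic.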
We handle new or updated policies in two ways:
- Maintain multiple versions of a policy
- Assign metadata like `version`, `date`, `status`
- Apply re-ranking to prioritize the latest policy
- When a new policy is added:
- Old policy embeddings are removed or deactivated
- Only the latest policy remains active in the vector DB
➡️ This avoids conflicts and ensures answers are always up-to-date.
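The metadata-filtering variant can be sketched as a pre-retrieval filter: keep only chunks whose policy version is both active and the newest. The field names (`policy`, `date`, `status`) are illustrative, not the project's actual schema.

```python
def active_chunks(chunks):
    """Keep only chunks from the latest active version of each policy.

    Each chunk dict carries the metadata described above; ISO date
    strings compare correctly with plain string comparison.
    """
    # First pass: find the newest active date per policy
    latest = {}
    for c in chunks:
        if c["status"] == "active":
            policy = c["policy"]
            if policy not in latest or c["date"] > latest[policy]:
                latest[policy] = c["date"]
    # Second pass: keep only chunks matching that newest date
    return [c for c in chunks
            if c["status"] == "active" and c["date"] == latest[c["policy"]]]
```

Running this filter before (or instead of) re-ranking guarantees that a superseded policy can never surface in the context passed to the LLM.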
Navigate to the project directory and install dependencies:

```shell
cd RAG_Assistant
pip install -r requirements.txt
```

Generate the embeddings:

```shell
cd embeds
python embed.py
cd ..
```
This will:
- Load documents
- Chunk text
- Generate embeddings
- Store vectors in FAISS
Launch the app:

```shell
streamlit run app.py
```

Open the browser and go to http://localhost:8501 (or use the Streamlit-generated URL).
- Ask any question related to the document corpus
- Example queries:
- "What is the semester start date?"
- "What is the attendance policy?"
- "What are the academic regulations for grading?"
📌 The model will not answer questions outside the document corpus.
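One common way to enforce this corpus boundary is a retrieval-confidence gate: if even the best-matching chunk scores below a threshold, return a refusal instead of calling the LLM. The threshold value and names below are illustrative and would need tuning on the actual corpus.

```python
REFUSAL = "I can only answer questions covered by the internal documents."

def answer_or_refuse(best_score, generate, threshold=0.35):
    """Refuse when retrieval confidence is too low.

    best_score is the top cosine similarity from the vector search;
    generate is a zero-argument callable that invokes the LLM.
    """
    if best_score < threshold:
        # Nothing in the corpus is close enough to ground an answer
        return REFUSAL
    return generate()
```

Gating before generation also saves an LLM call for clearly out-of-scope questions.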
- ✅ Answers grounded only in internal documents
- ✅ No internet-based hallucinations
- ✅ Handles multiple file formats
- ✅ Policy version control
- ✅ Easy to deploy and extend
- 🔄 Incremental embedding updates
- 🧠 Cross-encoder re-ranking
- 🔍 Hybrid search (BM25 + FAISS)
- 👥 Role-based access control
- 📊 Admin dashboard for document management
This Mini RAG Assistant demonstrates a production-ready approach to building a secure, document-grounded AI assistant for institutions and SMBs.
It reduces manual effort, improves accuracy, and ensures users always receive up-to-date and trusted information.
✨ Built for real-world RAG system understanding and deployment readiness.