A simple Retrieval-Augmented Generation (RAG) agent that:

- Ingests PDF documents into a Chroma vector store
- Embeds text via the Hugging Face Inference API (`all-MiniLM-L6-v2`)
- Retrieves relevant excerpts for user queries
- Generates contextualized answers using a SEA-LION LLM (e.g. `Gemma-SEA-LION-v3-9B-IT`)
- Exports responses as downloadable PDF files

Built with Python, Gradio, LangChain, ChromaDB, PyPDF2 & FPDF.
## Table of Contents

- [Features](#features)
- [Prerequisites](#prerequisites)
- [Installation](#installation)
- [Configuration](#configuration)
- [Usage](#usage)
- [Project Structure](#project-structure)
- [API Reference](#api-reference)
- [Contributing](#contributing)
- [License](#license)
## Features

- **PDF Ingestion**: Splits each page into 1,000-character chunks (200-character overlap) for efficient indexing.
- **Embeddings**: Uses the Hugging Face Inference API endpoint for `sentence-transformers/all-MiniLM-L6-v2` to generate vector embeddings.
- **Similarity Search**: Retrieves the top-k most similar document chunks via ChromaDB (see the pipeline sketch after this list).
- **Contextual Q&A**: Builds a prompt including the retrieved context and queries SEA-LION's hosted LLM API for answers.
- **Downloadable Reports**: Packages answers (with listed source titles/pages) into a PDF for offline reference.
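
A minimal sketch of the ingest → embed → retrieve flow using LangChain helpers. The class usage, file name, and query below are illustrative assumptions, not the exact code in `Simple_RAG_Agent.py`:

```python
# Minimal sketch of the ingest -> embed -> retrieve pipeline (assumed
# LangChain usage; the real implementation lives in Simple_RAG_Agent.py).
import os

from langchain_community.document_loaders import PyPDFLoader
from langchain_community.embeddings import HuggingFaceInferenceAPIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_text_splitters import RecursiveCharacterTextSplitter

# 1. Load a PDF and split pages into 1,000-character chunks (200 overlap).
pages = PyPDFLoader("example.pdf").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(pages)

# 2. Embed chunks via the Hugging Face Inference API and index them in Chroma.
embeddings = HuggingFaceInferenceAPIEmbeddings(
    api_key=os.environ["HUGGINGFACE_API_TOKEN"],
    model_name="sentence-transformers/all-MiniLM-L6-v2",
)
store = Chroma.from_documents(chunks, embeddings, persist_directory="chroma_db")

# 3. Retrieve the top-k most similar chunks for a query.
excerpts = store.similarity_search("What does the document conclude?", k=4)
```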
## Prerequisites

- Python 3.8+
- Docker (optional, for running a local ChromaDB instance)
- A SEA-LION API key (`SEA_LION_API_KEY`)
- A Hugging Face Inference API token (`HUGGINGFACE_API_TOKEN`)
## Installation

1. Clone the repository

   ```bash
   git clone https://github.com/AlvesMH/RAG_agent.git
   cd simple-rag-agent
   ```

2. Create & activate a virtual environment

   ```bash
   python -m venv venv
   source venv/bin/activate     # macOS/Linux
   venv\Scripts\activate.bat    # Windows
   ```

3. Install dependencies

   ```bash
   pip install -r requirements.txt
   ```

4. (Optional) Run ChromaDB in Docker

   ```bash
   docker run -d --name chroma -p 8000:8000 \
     -v $(pwd)/chroma_db:/chroma chromadb/chroma:latest
   ```
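
If you run Chroma in Docker, you can verify the server is reachable from Python before launching the app. The collection name `documents` below is just an example, not taken from the app:

```python
# Minimal sketch: check connectivity to the Dockerized ChromaDB server.
import chromadb

client = chromadb.HttpClient(host="localhost", port=8000)
print(client.heartbeat())  # returns a nanosecond timestamp when healthy
collection = client.get_or_create_collection("documents")  # example name
```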
## Configuration

Copy the example environment file and fill in your API keys:

```bash
cp .env.example .env
# Then open .env and add:
# SEA_LION_API_KEY=your_sealion_key
# HUGGINGFACE_API_TOKEN=your_hf_token
```

- `SEA_LION_API_KEY`: Used to authenticate calls to SEA-LION's chat/completions endpoint.
- `HUGGINGFACE_API_TOKEN`: Grants access to the Hugging Face Inference API for embeddings.
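
A minimal sketch of reading these variables at startup with `python-dotenv` (assuming it is among the project's dependencies; if not, `pip install python-dotenv`):

```python
# Minimal sketch: load API keys from .env (assumes python-dotenv is installed).
import os

from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory
SEA_LION_API_KEY = os.environ["SEA_LION_API_KEY"]
HUGGINGFACE_API_TOKEN = os.environ["HUGGINGFACE_API_TOKEN"]
```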
## Usage

1. Launch the Gradio app

   ```bash
   python Simple_RAG_Agent.py
   ```

   This will open a local web interface (usually at http://localhost:7860).

2. Ingest PDFs

   - Click **Upload PDF Documents** → select one or more `.pdf` files.
   - Click **Add to VectorDB** → see ingestion status.
   - Check existing docs via **List Documents**.
3. Retrieve Excerpts

   - Enter a question in the **Enter your question** box.
   - Click **Retrieve Excerpts** to view the top-k matching chunks.

4. Generate Full Response

   - With your query entered, click **Generate Response**.
   - The agent will call SEA-LION, stitch the retrieved context into an answer, display it, and save a PDF you can download (a minimal sketch of this step follows).
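
A sketch of that last step, assuming a plain prompt template and FPDF for the report; the actual prompt wording and PDF layout are defined in `Simple_RAG_Agent.py`:

```python
# Minimal sketch: stitch retrieved excerpts into a prompt and export the
# model's answer as a PDF (assumed helper functions, not the app's exact code).
from fpdf import FPDF

def build_prompt(question: str, excerpts: list[str]) -> str:
    """Concatenate retrieved chunks into a context-grounded prompt."""
    context = "\n\n".join(excerpts)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

def save_report(answer: str, path: str = "answer.pdf") -> str:
    """Write the answer to a downloadable PDF with FPDF."""
    pdf = FPDF()
    pdf.add_page()
    pdf.set_font("Helvetica", size=11)
    pdf.multi_cell(0, 8, answer)  # auto-wraps long lines
    pdf.output(path)
    return path
```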
## Project Structure

```text
├── .env.example         # Example env file
├── requirements.txt     # Python dependencies
├── Simple_RAG_Agent.py  # Main application script
├── chroma_db/           # Vector store (persisted by ChromaDB)
├── README.md            # This file
```
## API Reference

### SEA-LION chat completions

- Endpoint: `POST https://api.sea-lion.ai/v1/chat/completions`
- Payload: `{ model, messages, temperature }`
- Response: `{ choices: [ { message: { content } } ] }`
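
For example, with `requests`. The model identifier below is taken from this README; confirm the exact id available to your key against SEA-LION's model list:

```python
# Minimal sketch: query the SEA-LION chat/completions endpoint.
import os

import requests

resp = requests.post(
    "https://api.sea-lion.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['SEA_LION_API_KEY']}"},
    json={
        "model": "Gemma-SEA-LION-v3-9B-IT",  # assumed id; see SEA-LION docs
        "messages": [{"role": "user", "content": "What is RAG?"}],
        "temperature": 0.2,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```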
### Hugging Face feature extraction

- Endpoint: `POST https://api-inference.huggingface.co/pipeline/feature-extraction/{model}`
- Payload: `{ inputs: [...texts] }`
- Response: `[[float, ...], ...]` (one embedding vector per input text)
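
The same call with `requests`, matching the payload and response shapes above:

```python
# Minimal sketch: fetch embeddings from the Hugging Face Inference API.
import os

import requests

MODEL = "sentence-transformers/all-MiniLM-L6-v2"
resp = requests.post(
    f"https://api-inference.huggingface.co/pipeline/feature-extraction/{MODEL}",
    headers={"Authorization": f"Bearer {os.environ['HUGGINGFACE_API_TOKEN']}"},
    json={"inputs": ["first chunk of text", "second chunk of text"]},
    timeout=60,
)
resp.raise_for_status()
embeddings = resp.json()  # [[float, ...], ...]: one 384-dim vector per input
```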
## License

This project is distributed under the MIT License. See [LICENSE](LICENSE) for more details.

Built with ❤️ in Singapore
