| title | RAG LangGraph Chatbot |
|---|---|
| sdk | gradio |
| sdk_version | 4.44.1 |
| app_file | app.py |
| python_version | 3.10 |
This project implements a RAG (Retrieval-Augmented Generation) chatbot that answers with either:
- Hugging Face router (when you provide an HF token and a router-available model; default `HF_MODEL_ID`: `meta-llama/Meta-Llama-3-8B-Instruct`), or
- Local transformers generation (no token; fallback `LOCAL_MODEL_ID`: `distilgpt2` by default). Quality is limited with the default; set a stronger local model if you need better offline answers.
- RAG Pipeline: Ingests, chunks, embeds, and indexes PDF documents for accurate retrieval.
- Inference Flexibility: Uses HF router when a token is provided; falls back to local transformers otherwise.
- LangGraph Agent: Retrieval + generation flow is orchestrated with LangGraph for clearer state handling.
- Gradio Interface: A user-friendly chat UI for interacting with the assistant.
- Modular Design: Clean separation of concerns (Ingestion, Vector Store, Agent, App).
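The chunking step of the RAG pipeline can be illustrated with a minimal sketch. This README does not say how `src/ingestion.py` splits documents, so the character-window strategy, the function name `chunk_text`, and the default sizes below are assumptions for illustration only, not the project's actual implementation.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper: the real ingestion code may split on tokens,
    sentences, or pages instead of raw characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text,
        # so no trailing chunk made only of overlap is emitted.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Overlapping windows keep a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of embedding some text twice.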
rag_agent_project/
├─ app.py # Gradio application
├─ requirements.txt # Dependencies
├─ data/ # Data storage (PDFs, Index)
├─ src/ # Source code
│ ├─ ingestion.py # Data processing
│ ├─ vectorstore.py # Embedding & Indexing
│ ├─ rag_tool.py # (legacy) retriever tool helper
│ ├─ agent.py # RAG + HF router/local agent
│ └─ config.py # Configuration
└─ tests/ # Automated tests
- Install Dependencies:
  `pip install -r requirements.txt`
- Configure (optional):
  - Set `HUGGINGFACEHUB_API_TOKEN` for router inference.
  - Override `HF_MODEL_ID` for router (default: `meta-llama/Meta-Llama-3-8B-Instruct`).
  - Override `LOCAL_MODEL_ID` for local fallback (default: `distilgpt2`; use a stronger local model if you need better offline answers).
- Run the Application:
  `python app.py`
- Interact:
  - Open the provided local URL (usually `http://127.0.0.1:7860`).
  - (Optional) Provide a Hugging Face token and router-supported model ID for cloud inference (default: `meta-llama/Meta-Llama-3-8B-Instruct`).
  - Without a token, the app uses a local fallback model (`LOCAL_MODEL_ID`, default: `distilgpt2`). Quality is limited; use router + token for good answers or set a stronger local model.
  - Upload a PDF and click "Initialize System".
  - Start chatting!
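The token-based choice between router and local inference described in the configuration steps can be sketched as follows. The environment variable names and default model IDs come from this README; the function `resolve_backend` and the constant names are hypothetical, and the real logic in `src/config.py` or `src/agent.py` may differ.

```python
import os
from typing import Mapping, Tuple

# Defaults as documented in this README.
DEFAULT_HF_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
DEFAULT_LOCAL_MODEL_ID = "distilgpt2"


def resolve_backend(env: Mapping[str, str]) -> Tuple[str, str]:
    """Return (backend, model_id) for the given environment mapping.

    Hypothetical helper: a non-empty token selects the HF router,
    otherwise the local transformers fallback is used.
    """
    token = env.get("HUGGINGFACEHUB_API_TOKEN", "").strip()
    if token:
        return "router", env.get("HF_MODEL_ID") or DEFAULT_HF_MODEL_ID
    return "local", env.get("LOCAL_MODEL_ID") or DEFAULT_LOCAL_MODEL_ID


# Resolve against the current process environment.
backend, model_id = resolve_backend(os.environ)
print(f"backend={backend} model={model_id}")
```

Passing the environment as a mapping, rather than reading `os.environ` inside the function, keeps the selection rule easy to exercise in tests with a plain dictionary.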
- Create a new Space on Hugging Face (SDK: Gradio).
- Upload the contents of `rag_agent_project` to the Space.
- Ensure `requirements.txt` is present.
- The app will build and launch automatically.
- Or click the "Deploy to Spaces" button above for a one-touch setup via `https://anandharajan-rag-langgraph.hf.space`, which clones this template Space (`Anandharajan/RAG_LangGraph`) and preselects the Gradio SDK.
- LLM: HF router (with token, default `meta-llama/Meta-Llama-3-8B-Instruct`) or local transformers fallback (`LOCAL_MODEL_ID`, default `distilgpt2`; change to a stronger model if running locally).
- Embeddings: sentence-transformers/all-MiniLM-L6-v2
- Vector Store: FAISS
- Orchestration: LangGraph (retrieve → generate), using a RAG prompt with retrieval context
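The retrieve → generate flow can be sketched in dependency-free Python. This is a stand-in under stated assumptions, not the project's code: it does not use the LangGraph or FAISS APIs, the three-dimensional vectors are toy values rather than real `all-MiniLM-L6-v2` embeddings, and the names `RAGState`, `INDEX`, `retrieve`, `generate`, and `run` are hypothetical. The `generate` node only assembles the prompt; the model call is left out.

```python
import math
from typing import TypedDict


class RAGState(TypedDict, total=False):
    question: str
    question_vec: list[float]
    context: list[str]
    prompt: str


# Toy index of (chunk text, embedding); stands in for the FAISS store.
INDEX = [
    ("FAISS stores dense vectors for similarity search.", [1.0, 0.0, 0.0]),
    ("Gradio builds the chat UI.", [0.0, 1.0, 0.0]),
    ("LangGraph wires nodes into a graph.", [0.0, 0.0, 1.0]),
]


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(state: RAGState, k: int = 2) -> RAGState:
    """Node 1: rank chunks by cosine similarity and keep the top k."""
    ranked = sorted(
        INDEX,
        key=lambda item: cosine(state["question_vec"], item[1]),
        reverse=True,
    )
    return {**state, "context": [text for text, _ in ranked[:k]]}


def generate(state: RAGState) -> RAGState:
    """Node 2: build the RAG prompt from the retrieved context."""
    context = "\n".join(f"- {chunk}" for chunk in state["context"])
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {state['question']}\nAnswer:"
    )
    return {**state, "prompt": prompt}


def run(question: str, question_vec: list[float]) -> RAGState:
    """Pass one state dict through the nodes in order."""
    state: RAGState = {"question": question, "question_vec": question_vec}
    for node in (retrieve, generate):
        state = node(state)
    return state
```

Each node takes the shared state and returns an updated copy, which is the same shape of contract a graph orchestrator relies on: nodes stay independent, and the ordering lives in one place.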
- Add your `HUGGINGFACEHUB_API_TOKEN` as a secret for router usage.
- If you want to pin a different router model, set `HF_MODEL_ID` in the Space variables. Override `LOCAL_MODEL_ID` if you want a specific offline fallback.
- The `data/` folder is persisted for uploads and FAISS index; it is git-ignored here but created at runtime.
- Entry point is `app.py`; `demo.queue().launch()` is enabled for Spaces concurrency.
- Current status: build verified on HF Space `Anandharajan/RAG_LangGraph` with Python 3.10 (via `space.yaml`/`runtime.txt`).