MedGuide is an AI-powered application that interprets blood and lab test reports, generates structured summaries, and supports contextual chat about the results.
It combines multi-agent reasoning, vector-based retrieval (LanceDB), and LLMs (OpenAI GPT + Cohere Reranker) to deliver factual, explainable, and user-friendly insights.
- AI-driven report analysis: Automatically extracts and analyzes lab report values.
- Page-wise interpretation: Asynchronous agents process each page independently.
- Structured final report: Merges multi-page insights into one comprehensive summary.
- Conversational interface: Chat with your analyzed reports using hybrid RAG retrieval.
- Hybrid search: Combines vector + keyword search (LanceDB + Cohere reranker).
- Local knowledge base: Builds an offline LanceDB store for efficient querying.
- Safe AI design: Focused on factual, educational insights, not medical advice.
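The page-wise processing and final merge listed above can be pictured with a small `asyncio` sketch. This is an illustration only, not MedGuide's actual agent code: `analyze_page`, `analyze_report`, and `merge_insights` are hypothetical names, and the per-page "analysis" is a placeholder for what would be an awaited LLM call.

```python
import asyncio


async def analyze_page(page_number: int, text: str) -> dict:
    """Hypothetical stand-in for a per-page analyzer agent call."""
    await asyncio.sleep(0)  # placeholder for an awaitable LLM request
    return {"page": page_number, "summary": f"{len(text.split())} words analyzed"}


async def analyze_report(pages: list[str]) -> list[dict]:
    """Start one analysis per page and wait for all of them concurrently."""
    tasks = [analyze_page(i + 1, text) for i, text in enumerate(pages)]
    # gather returns results in the same order as the awaitables passed in
    return await asyncio.gather(*tasks)


def merge_insights(results: list[dict]) -> str:
    """Combine per-page results into a single report string, ordered by page."""
    ordered = sorted(results, key=lambda r: r["page"])
    return "\n".join(f"Page {r['page']}: {r['summary']}" for r in ordered)


if __name__ == "__main__":
    sample_pages = ["Hemoglobin 13.5 g/dL", "TSH 2.1 mIU/L within range"]
    print(merge_insights(asyncio.run(analyze_report(sample_pages))))
```

Because each page is handled by its own coroutine, a slow response for one page does not block the others; the merge step only runs once every page has finished.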
MedGuide/
│
├── app/
│   ├── streamlit_app.py             # Main Streamlit UI
│   ├── app.py                       # FastAPI REST backend
│   ├── main.py                      # CLI pipeline for local testing
│   └── data/                        # Stores processed text, LanceDB, and uploads
│
├── agents/
│   ├── analyzer_agent.py            # Interprets lab values page-wise
│   ├── chat_agent.py                # Conversational RAG agent
│   ├── document_extraction_agent.py # Extracts test names, values, ranges
│   └── final_report_agent.py        # Combines all pages into a summary report
│
├── utils/
│   ├── pdf_extractor.py             # Extracts text from PDFs using PyMuPDF
│   └── pdf_to_txt.py                # Converts PDFs into text for preprocessing
│
├── vectordb/
│   └── create_vector_db.py          # Builds LanceDB-based knowledge base
│
├── data/
│   ├── knowledge_base/              # (Ignored in Git) Private extracted data
│   ├── sample_reports/              # Example lab reports
│   └── lancedb/                     # Vector DB storage
│
├── requirements.txt                 # Dependencies
├── .gitignore                       # Ignore rules
└── README.md                        # Documentation
Upload any PDF-based lab report via the Streamlit interface.
document_extraction_agent parses test names, values, and reference ranges.
analyzer_agent interprets results and gives concise, factual explanations.
final_report_agent merges all insights into a single structured health report.
create_vector_db.py generates a LanceDB vector store for RAG retrieval.
chat_agent enables conversations with your report using OpenAI GPT + Cohere.
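To make the retrieval step behind the chat agent more concrete, here is a minimal sketch of how a vector-search ranking and a keyword-search ranking could be combined before reranking. It uses reciprocal rank fusion, a generic technique chosen for illustration; it is not necessarily how LanceDB or MedGuide fuses results, and the Cohere reranking step is left out. The function name and the chunk IDs are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one ranked list.

    Each document scores sum(1 / (k + rank)) over the lists that contain it,
    with rank starting at 1. k = 60 is a commonly used default.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by ID so the output is deterministic
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result orders from the two search modes
vector_hits = ["chunk_hb", "chunk_tsh", "chunk_ldl"]
keyword_hits = ["chunk_ldl", "chunk_hb", "chunk_vitd"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Chunks that appear in both rankings rise to the top, which is the intuition behind hybrid search: a passage that matches both semantically and by keyword is more likely to be relevant than one found by a single method.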
git clone https://github.com/Sahil0015/MedGuide.git
cd MedGuide

python -m venv venv
venv\Scripts\activate # On Windows
# source venv/bin/activate # On Mac/Linux

pip install -r requirements.txt

Create a .env file in the project root:
OPENAI_API_KEY=your_openai_api_key
COHERE_API_KEY=your_cohere_api_key
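
As a rough illustration of how these two keys might be validated before the app starts, the sketch below parses `KEY=VALUE` text and reports which required keys are missing. It is a simplified stand-in written with the standard library only; the project lists python-dotenv, which would normally do the parsing. `parse_env_text` and `missing_keys` are hypothetical helpers, not part of MedGuide.

```python
from typing import Mapping

REQUIRED_KEYS = ("OPENAI_API_KEY", "COHERE_API_KEY")


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value
        values[key.strip()] = value.strip().strip("\"'")
    return values


def missing_keys(env: Mapping[str, str]) -> list[str]:
    """Return the required keys that are absent or empty in the mapping."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]


example = parse_env_text("OPENAI_API_KEY=your_openai_api_key\nCOHERE_API_KEY=\n")
print(missing_keys(example))
```

With python-dotenv installed, calling `load_dotenv()` and then `missing_keys(os.environ)` would be the more typical route; the check itself stays the same.
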

streamlit run app/streamlit_app.py

| Category | Technologies |
|---|---|
| Frontend | Streamlit |
| LLMs & Agents | OpenAI GPT, Agno, LangChain |
| Retrieval & DB | LanceDB, RedisVL, Cohere Reranker |
| PDF Processing | PyMuPDF, Tantivy |
| Utilities | Python-dotenv, TQDM, Pandas |

| File | Purpose |
|---|---|
| app/streamlit_app.py | Main Streamlit interface |
| app/app.py | FastAPI REST backend (upload, process, chat endpoints) |
| app/main.py | CLI pipeline runner for local testing |
| vectordb/create_vector_db.py | Creates LanceDB vector store |
| utils/pdf_extractor.py | PDF text extraction |
| utils/pdf_to_txt.py | Converts PDF to text |
| agents/*.py | Multi-agent logic for report extraction, analysis, and chat |

Your .gitignore excludes:
data/knowledge_base/
venv/
.env
.cache/
__pycache__/
This ensures that sensitive or auto-generated data is never uploaded to the repository.
- Launch the Streamlit app.
- Input your API keys and initialize.
- Upload a lab report (PDF).
- AI agents extract, analyze, and summarize results.
- Review the generated final report.
- Chat interactively with the app about your results.
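
The extract-and-analyze step in the list above can be illustrated with a deliberately simple, rule-based sketch. MedGuide itself relies on LLM agents for this; the regular expression below is a swapped-in technique that only handles one assumed line layout (`<test name> <value> <unit> <low>-<high>`), and `parse_lab_line` is a hypothetical helper.

```python
import re
from typing import Optional

# Assumed layout: "<test name> <value> <unit> <low>-<high>"
LINE_PATTERN = re.compile(
    r"^(?P<name>[A-Za-z][A-Za-z0-9 ()/%-]*?)\s+"
    r"(?P<value>\d+(?:\.\d+)?)\s+"
    r"(?P<unit>\S+)\s+"
    r"(?P<low>\d+(?:\.\d+)?)\s*-\s*(?P<high>\d+(?:\.\d+)?)$"
)


def parse_lab_line(line: str) -> Optional[dict]:
    """Extract one result and compare it with its printed reference range."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None  # line does not follow the assumed layout
    value, low, high = (float(match[g]) for g in ("value", "low", "high"))
    if value < low:
        status = "low"
    elif value > high:
        status = "high"
    else:
        status = "normal"
    return {
        "test": match["name"],
        "value": value,
        "unit": match["unit"],
        "reference_range": (low, high),
        "status": status,
    }


print(parse_lab_line("Hemoglobin 11.2 g/dL 12.0-15.5"))
```

The `status` field only says where a value sits relative to the range printed on the report; it carries no interpretation beyond that.
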
MedGuide provides AI-generated educational insights based on lab data.
It is not a medical diagnostic tool. Always consult a certified doctor for medical decisions.
- Expand multi-language report interpretation.
- Support FHIR / HL7 medical data formats.
- Implement persistent chat memory across sessions and report history.
- Add user authentication for personalized report management.
- Deploy as a Docker container on cloud platforms (AWS, GCP, Azure).
This project is licensed under the MIT License.
- OpenAI for GPT models
- Cohere for Reranker API
- LanceDB for high-speed vector storage
- Streamlit for a smooth frontend experience
- Agno and LangChain for agent orchestration
Pull requests and suggestions are welcome!
If you'd like to contribute, please fork the repository and create a PR.
Author: Sahil Aggarwal
GitHub: Sahil0015
Email: sahilaggarwal1532003@gmail.com