MedGuide is an AI-powered application that interprets blood and lab test reports, generates structured summaries, and supports contextual chat about the results.
It combines multi-agent reasoning, vector-based retrieval (LanceDB), an LLM (OpenAI GPT), and the Cohere reranker to deliver factual, explainable, and easy-to-read insights.
- 🧠 AI-driven report analysis: Automatically extracts and analyzes lab report values.
- 📄 Page-wise interpretation: Asynchronous agents process each page independently.
- 📊 Structured final report: Merges multi-page insights into one comprehensive summary.
- 💬 Conversational interface: Chat with your analyzed reports using hybrid RAG retrieval.
- ⚡ Hybrid search: Combines vector + keyword search (LanceDB + Cohere reranker).
- 🧩 Local knowledge base: Builds an offline LanceDB store for efficient querying.
- 🧑‍⚕️ Safe AI design: Focused on factual, educational insights, not medical advice.
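
The hybrid search feature combines a vector ranking with a keyword ranking. This README does not say how MedGuide fuses the two, so the sketch below uses reciprocal rank fusion (RRF), one common approach, in plain Python. The function name, the chunk IDs, and the constant `k=60` are illustrative assumptions, not MedGuide's or LanceDB's API.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. A higher fused score means a better position.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists for the query "low hemoglobin".
vector_hits = ["chunk_anemia", "chunk_iron", "chunk_b12"]
keyword_hits = ["chunk_iron", "chunk_hemoglobin", "chunk_anemia"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

In a setup like the one described here, a reranker such as Cohere's would then reorder the top fused candidates; that step needs an API call and is left out of the sketch.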
MedGuide/
│
├── app/
│ ├── streamlit_app.py # Main Streamlit UI
│ ├── app.py # FastAPI REST backend
│ ├── main.py # CLI pipeline for local testing
│ ├── data/ # Stores processed text, LanceDB, and uploads
│
├── agents/
│ ├── analyzer_agent.py # Interprets lab values page-wise
│ ├── chat_agent.py # Conversational RAG agent
│ ├── document_extraction_agent.py # Extracts test names, values, ranges
│ ├── final_report_agent.py # Combines all pages into a summary report
│
├── utils/
│ ├── pdf_extractor.py # Extracts text from PDFs using PyMuPDF
│ ├── pdf_to_txt.py # Converts PDFs into text for preprocessing
│
├── vectordb/
│ ├── create_vector_db.py # Builds LanceDB-based knowledge base
│
├── data/
│ ├── knowledge_base/ # (Ignored in Git) Private extracted data
│ ├── sample_reports/ # Example lab reports
│ ├── lancedb/ # Vector DB storage
│
├── requirements.txt # Dependencies
├── .gitignore # Ignore rules
└── README.md # Documentation
- Upload any PDF-based lab report via the Streamlit interface.
- document_extraction_agent parses test names, values, and reference ranges.
- analyzer_agent interprets results and gives concise, factual explanations.
- final_report_agent merges all insights into a single structured health report.
- create_vector_db.py generates a LanceDB vector store for RAG retrieval.
- chat_agent enables conversations with your report using OpenAI GPT + Cohere.
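
The extract, analyze, and merge steps above can be sketched as a small async pipeline. This is a minimal illustration under stated assumptions, not the project's implementation: the real agents call an LLM, whereas here a regular expression stands in for extraction and a simple range check stands in for analysis. `LabResult`, `parse_page`, `analyze_page`, `build_report`, and the one-line-per-test input format are all hypothetical.

```python
import asyncio
import re
from dataclasses import dataclass

# Assumed line format: "<test name> <value> <unit> <low>-<high>", e.g.
# "Hemoglobin 11.2 g/dL 13.0-17.0". Real reports vary widely.
LINE_RE = re.compile(
    r"^(?P<name>[A-Za-z][A-Za-z0-9 ()/-]*?)\s+(?P<value>\d+(?:\.\d+)?)\s+"
    r"(?P<unit>\S+)\s+(?P<low>\d+(?:\.\d+)?)\s*-\s*(?P<high>\d+(?:\.\d+)?)$"
)


@dataclass
class LabResult:
    name: str
    value: float
    unit: str
    low: float
    high: float

    @property
    def flag(self) -> str:
        """Classify the value against its reference range."""
        if self.value < self.low:
            return "low"
        if self.value > self.high:
            return "high"
        return "normal"


def parse_page(text: str) -> list[LabResult]:
    """Extraction step: keep only lines that match the assumed format."""
    results = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            results.append(LabResult(
                name=match["name"],
                value=float(match["value"]),
                unit=match["unit"],
                low=float(match["low"]),
                high=float(match["high"]),
            ))
    return results


async def analyze_page(page_number: int, text: str) -> dict:
    """Analysis step: stand-in for an LLM call, run once per page."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return {"page": page_number, "results": parse_page(text)}


async def build_report(pages: list[str]) -> dict:
    """Final step: analyze pages concurrently, then merge in page order."""
    analyses = await asyncio.gather(
        *(analyze_page(i, text) for i, text in enumerate(pages, start=1))
    )
    merged = [r for a in analyses for r in a["results"]]
    return {
        "pages": len(pages),
        "total_tests": len(merged),
        "abnormal": [(r.name, r.flag) for r in merged if r.flag != "normal"],
    }


pages = [
    "Patient report\nHemoglobin 11.2 g/dL 13.0-17.0\nWBC 6.1 10^3/uL 4.0-11.0",
    "Glucose 130 mg/dL 70-100",
]
report = asyncio.run(build_report(pages))
print(report)
```

`asyncio.gather` returns results in the order the coroutines were passed, so the merged report keeps page order even though pages are processed concurrently.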
git clone https://github.com/Sahil0015/MedGuide.git
cd MedGuide
python -m venv venv
venv\Scripts\activate # On Windows
# source venv/bin/activate # On Mac/Linux
pip install -r requirements.txt

Create a .env file in the project root:
OPENAI_API_KEY=your_openai_api_key
COHERE_API_KEY=your_cohere_api_key
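
Both keys must be available before the agents can run. The stack lists python-dotenv, which would normally load this file; the standard-library sketch below only illustrates the idea of reading the file and reporting missing keys early. `read_env_file` and `missing_keys` are hypothetical helper names, not part of MedGuide.

```python
import tempfile
from pathlib import Path

REQUIRED_KEYS = ("OPENAI_API_KEY", "COHERE_API_KEY")


def read_env_file(path):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


# Demo against a temporary file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    env_path.write_text(
        "OPENAI_API_KEY=your_openai_api_key\n# Cohere key not set yet\n",
        encoding="utf-8",
    )
    config = read_env_file(env_path)

print(missing_keys(config))
```

Checking for missing keys at startup gives a clear error message instead of a failed API call partway through a report.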
streamlit run app/streamlit_app.py

| Category | Technologies |
|---|---|
| Frontend | Streamlit |
| LLMs & Agents | OpenAI GPT, Agno, LangChain |
| Retrieval & DB | LanceDB, RedisVL, Cohere Reranker, Tantivy (full-text search) |
| PDF Processing | PyMuPDF |
| Utilities | Python-dotenv, TQDM, Pandas |
| File | Purpose |
|---|---|
| app/streamlit_app.py | Main Streamlit interface |
| app/app.py | FastAPI REST backend (upload, process, chat endpoints) |
| app/main.py | CLI pipeline runner for local testing |
| vectordb/create_vector_db.py | Creates LanceDB vector store |
| utils/pdf_extractor.py | PDF text extraction |
| utils/pdf_to_txt.py | Converts PDF to text |
| agents/*.py | Multi-agent logic for report extraction, analysis, and chat |

Your .gitignore excludes:
data/knowledge_base/
venv/
.env
.cache/
__pycache__/
This keeps sensitive and auto-generated files out of the repository.
- Launch the Streamlit app.
- Input your API keys and initialize.
- Upload a lab report (PDF).
- AI agents extract, analyze, and summarize results.
- Review the generated final report.
- Chat interactively with the app about your results.
MedGuide provides AI-generated educational insights based on lab data.
It is not a medical diagnostic tool — always consult a certified doctor for medical decisions.
- Expand multi-language report interpretation.
- Support FHIR / HL7 medical data formats.
- Implement persistent chat memory across sessions and report history.
- Add user authentication for personalized report management.
- Deploy as a Docker container on cloud platforms (AWS, GCP, Azure).
This project is licensed under the MIT License.
- OpenAI for GPT models
- Cohere for Reranker API
- LanceDB for high-speed vector storage
- Streamlit for a smooth frontend experience
- Agno and LangChain for agent orchestration
Pull requests and suggestions are welcome!
If you’d like to contribute, please fork the repository and create a PR.
👨‍💻 Author: Sahil Aggarwal
📂 GitHub: Sahil0015
✉️ Email: sahilaggarwal1532003@gmail.com