Skip to content

AreeshaM/MediAgent

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

MediAgent

Autonomous Clinical Decision Support System

Python XGBoost Streamlit Groq License


Overview

MediAgent is an end-to-end autonomous clinical decision support system that predicts hospital readmission risk for diabetic patients using a 4-agent AI pipeline. It combines machine learning, explainable AI, and large language models to generate physician-ready PDF reports — bridging the gap between data science and clinical practice.

Built on 101,766 real-world patient records from 130 US hospitals (1999-2008).


Problem Statement

Hospital readmissions within 30 days cost the US healthcare system $26 billion annually. Early identification of high-risk patients enables targeted interventions that reduce readmissions and improve patient outcomes.

MediAgent addresses this by providing physicians with instant, explainable, AI-generated risk assessments at the point of care.


System Architecture

+----------------------------------------------------------+
|                    MEDIAGENT PIPELINE                    |
|                                                          |
|   Patient Data                                           |
|       |                                                  |
|       v                                                  |
|   +---------------+                                      |
|   |  Data Agent   |  Clean, encode, preprocess           |
|   +-------+-------+                                      |
|           |                                              |
|           v                                              |
|   +---------------+                                      |
|   |  Risk Agent   |  XGBoost prediction + SHAP           |
|   +-------+-------+                                      |
|           |                                              |
|           v                                              |
|   +---------------+                                      |
|   |   LLM Agent   |  Groq LLaMA 3.3-70B summary         |
|   +-------+-------+                                      |
|           |                                              |
|           v                                              |
|   +---------------+                                      |
|   | Report Agent  |  Auto-generated physician PDF        |
|   +---------------+                                      |
+----------------------------------------------------------+

Key Features

Feature Description
Multi-Agent Architecture 4 independent agents each with single responsibility
XGBoost Prediction Trained on 101,766 patient records, 89% accuracy
SHAP Explainability Top risk factors identified per patient
LLM Clinical Summary Groq LLaMA 3.3-70B generates physician-ready summaries
Auto PDF Reports Color-coded, professional reports downloadable instantly
Streamlit Interface Interactive web app, no technical knowledge required
Secure by Design API keys in .env, never hardcoded

Model Performance

Metric Value
Dataset Size 101,766 records
Records After Cleaning 99,492
Model Accuracy 89%
Readmission Rate 11.2%
Top Risk Factor number_inpatient
Training Algorithm XGBoost (100 estimators)

SHAP — Top Risk Factors Identified

  1. number_inpatient — Previous inpatient visits
  2. time_in_hospital — Length of current stay
  3. number_emergency — Emergency visit history
  4. number_diagnoses — Complexity of medical profile
  5. age — Patient age group

Tech Stack

Machine Learning: XGBoost, scikit-learn, SHAP, pandas, NumPy

AI / LLM: Groq API (LLaMA 3.3-70B-Versatile)

Web and Deployment: Streamlit, Hugging Face Spaces

Report Generation: fpdf2

Dev Tools: Python 3.14, Git, VS Code, python-dotenv


Project Structure

MediAgent/
├── agents/
│   ├── data_agent.py       # Agent 1: Data cleaning and preprocessing
│   ├── risk_agent.py       # Agent 2: XGBoost prediction + SHAP
│   ├── llm_agent.py        # Agent 3: LLM clinical summary
│   └── report_agent.py     # Agent 4: PDF report generation
├── data/
│   └── clean_data.csv      # Preprocessed dataset
├── models/
│   ├── risk_model.pkl      # Trained XGBoost model
│   └── feature_names.pkl   # Feature alignment
├── utils/
│   └── data_cleaner.py     # Preprocessing pipeline
├── reports/                # Generated PDF reports
├── app.py                  # Streamlit web interface
├── main.py                 # Pipeline orchestrator
├── requirements.txt        # Dependencies
└── .env.example            # Environment variables template

Setup and Installation

# 1. Clone the repository
git clone https://github.com/AreeshaM/MediAgent.git
cd MediAgent

# 2. Create virtual environment
python -m venv venv
venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Set up environment variables
# Create .env file and add your GROQ_API_KEY

# 5. Download dataset
# Place diabetic_data.csv in data/ folder
# Dataset: https://www.kaggle.com/datasets/brandao/diabetes

# 6. Train the model
python models/risk_model.py

# 7. Run the app
streamlit run app.py

Environment Variables

Create a .env file in the root directory:

GROQ_API_KEY=your_groq_api_key_here

Get your free API key at: https://console.groq.com


Author

Areesha Mubeen Data Analyst and ML Engineer BS Computer Systems Engineering — Riphah International University areeshamubeen85@gmail.com GitHub: https://github.com/AreeshaM


License

This project is licensed under the MIT License.


Disclaimer: MediAgent is an AI-assisted tool intended to support, not replace, clinical judgment. Always consult a qualified physician before making medical decisions.

About

Autonomous Clinical Decision Support System using Multi-Agent AI, XGBoost, SHAP, and LLM

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages