Synaptix - It was always there.

"7,000 rare diseases have no treatment. What if the cure already exists in a different pill bottle?"

Live Demo: http://144.202.54.117

Synaptix is a collaborative, AI-powered drug repurposing engine that uses knowledge graph embeddings, molecular similarity analysis, and autonomous research agents to discover new therapeutic uses for existing FDA-approved drugs. Built for Hacklytics 2026 (Georgia Tech Healthcare Track).


How It Works

The Problem

Bringing a new drug to market costs an estimated $2.6B and takes 10-15 years, and 7,000+ rare diseases have no approved treatment. Before AI, drug repurposing happened by accident (Viagra's best-known use surfaced as a side effect during trials for a heart condition), by one researcher's hunch (aspirin for heart attacks took 30 years to validate), or, more recently, inside $300M AI companies like BenevolentAI.

Synaptix gives that same power to any researcher, for free.

Dual-Signal Prediction Engine

A single scoring method isn't reliable -- drugs that look promising biologically might be chemically implausible. Synaptix fuses two independent signals that no existing tool combines:

Knowledge Graph Signal (70%) -- 97,238 biological entities embedded in 400-dimensional space, trained on 5.8M relationships from 6 medical databases (DrugBank, Hetionet, GNBR, STRING, IntAct, DGIdb). We indexed all 24,313 drug embeddings in Actian VectorAI DB with HNSW indexing, replacing O(n) brute-force search with sub-millisecond approximate nearest-neighbor retrieval. This matters for interactive exploration, where researchers query dozens of diseases in a session.
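
As a rough illustration of the knowledge graph signal: TransE models a relation as a translation in embedding space, so a (drug, relation, disease) triple looks plausible when drug + relation lands close to disease. The sketch below uses 3-dimensional toy vectors instead of the 400-dimensional DRKG embeddings. The function names and all values are hypothetical, not taken from the Synaptix code, and the scoring is simplified to a plain negative L2 distance.

```python
import math

def transe_score(head, relation, tail):
    """Negative L2 distance between (head + relation) and tail; higher is more plausible."""
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

def rank_drugs(drug_vectors, relation, disease_vector, top_k=10):
    """Brute-force O(n) ranking of every drug for one disease."""
    scored = [
        (name, transe_score(vec, relation, disease_vector))
        for name, vec in drug_vectors.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]

# Toy embeddings (made up for illustration).
drugs = {
    "drug_a": [0.1, 0.2, 0.0],
    "drug_b": [0.9, 0.9, 0.9],
    "drug_c": [0.0, 0.3, 0.1],
}
treats = [0.4, 0.1, 0.5]
disease = [0.5, 0.3, 0.5]

top = rank_drugs(drugs, treats, disease, top_k=2)
print([name for name, _ in top])
```

This linear scan is the O(n) baseline; an HNSW index answers the same "nearest vectors" question approximately without visiting every drug.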

Molecular Signal (30%) -- RDKit computes Morgan fingerprints (2048-bit, radius-2) for 8,807 drugs and scores candidates by Tanimoto similarity to known treatments. A drug must pass both biological AND chemical tests to rank high.
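
A minimal sketch of how the two signals might be fused. It assumes fingerprints are represented as sets of on-bit indices (in the real pipeline RDKit would produce them), that a candidate's molecular score is its best Tanimoto similarity to any known treatment, and that both scores are already normalized to [0, 1]. Those choices and the function names are assumptions for illustration; only the 70/30 weighting comes from the description above.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not bits_a and not bits_b:
        return 0.0
    shared = len(bits_a & bits_b)
    return shared / (len(bits_a) + len(bits_b) - shared)

def molecular_score(candidate_bits, known_treatment_bits):
    """Best similarity of the candidate to any known treatment (assumed aggregation)."""
    return max(
        (tanimoto(candidate_bits, known) for known in known_treatment_bits),
        default=0.0,
    )

def combined_score(kg_score, mol_score, kg_weight=0.7, mol_weight=0.3):
    """Weighted fusion; both inputs are assumed to lie in [0, 1]."""
    return kg_weight * kg_score + mol_weight * mol_score

candidate = {1, 5, 9, 42}
known_treatments = [{1, 5, 9, 77}, {3, 4}]

mol = molecular_score(candidate, known_treatments)  # 3 shared bits out of 5 distinct
score = combined_score(0.8, mol)
print(round(mol, 3), round(score, 3))
```

With these weights a drug that scores well on the graph but shares no structure with known treatments is capped at 0.7, which is how the chemical signal demotes biologically plausible but chemically unrelated candidates.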

Explainable Predictions

Scores alone don't help researchers. Synaptix explains every prediction at three levels:

  • Biological Pathways: BFS traversal through the knowledge graph finds chains like Drug -> binds Gene -> participates in Pathway -> disrupted in Disease
  • Plain English: Google Gemini 2.0 Flash translates complex paths into language anyone understands -- integrated across the platform for drug explanations, 4-section research briefs for no-cure diseases, and real-time PubMed paper extraction in our AI research agents
  • Interactive Research Assistant: A context-aware chat interface powered by Gemini that understands the current disease, selected drug, prediction scores, and active pathway -- researchers can ask follow-up questions naturally
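
The pathway search in the first bullet can be sketched as a breadth-first search over an adjacency list. The graph below is a three-edge toy example, and the function name, the `max_hops` limit, and the entity labels are hypothetical. The real path explainer may prune or rank paths differently.

```python
from collections import deque

def find_path(adjacency, start, goal, max_hops=4):
    """Return the shortest path as a list of (node, relation, neighbor) hops, or None."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        if len(path) >= max_hops:
            continue  # do not expand beyond the hop limit
        for relation, neighbor in adjacency.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [(node, relation, neighbor)]))
    return None

# Toy graph mirroring the Drug -> Gene -> Pathway -> Disease chain.
graph = {
    "Drug::X": [("binds", "Gene::G1")],
    "Gene::G1": [("participates_in", "Pathway::P1")],
    "Pathway::P1": [("disrupted_in", "Disease::D1")],
}

path = find_path(graph, "Drug::X", "Disease::D1")
for head, relation, tail in path:
    print(f"{head} -[{relation}]-> {tail}")
```

Because BFS explores hop by hop, the first path it returns is a shortest one, which tends to give the most readable explanation.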

Living Knowledge Graph

Static databases go stale. DRKG was built in 2020. Synaptix turns a frozen snapshot into a collaborative, living graph:

  • Researchers contribute new edges with evidence (PubMed links)
  • Peer review -- agree/disagree/uncertain voting on any edge
  • Four AI Research Agents continuously scan external sources:
    • PubMed scanner extracts drug-disease relationships from recent papers using Gemini
    • Clinical Trial monitor queries ClinicalTrials.gov for active trials
    • PubChem scanner checks drug-likeness, bioactivity assays, and structural analogs via PUG REST API
    • Hypothesis Validator cross-references all sources to compute external evidence scores (0.0-1.0)

Every new edge makes every future prediction smarter.

Rare Disease Discovery

We systematically scanned every disease in DRKG and identified 1,434 diseases with zero known treatment. For the top 100, we generated drug candidates -- 500 new drug-disease predictions that didn't exist before Synaptix. For 7 critical diseases (ALS, Huntington's, Glioblastoma, Pancreatic Cancer, Lupus, CJD, Fibromyalgia), we provide full research briefs and exportable CSV reports.

Cloud-Native Deployment

The entire platform runs on Vultr Cloud Compute -- Dockerized FastAPI backend with nginx reverse proxy. Background async startup pre-caches predictions for 21 diseases in ~50 seconds with a real-time progress ring. Cached predictions serve in <0.15 seconds.

The Experience

Inspired by retro computing aesthetics, Synaptix features a Windows 95-themed interface with a cinematic CRT boot sequence, 3D force-directed knowledge graph with interactive legend filtering and node inspection, and a floating research assistant -- making complex biological data feel approachable and explorable.


Quick Start (For Teammates)

Prerequisites

| Tool | Version | Check |
|---|---|---|
| Python | 3.10+ | python3 --version |
| Node.js | 18+ | node --version |
| Git | any | git --version |

Step 1: Clone and switch to branch

git clone https://github.com/Sanjavan7/Drugreuse.git
cd Drugreuse
git checkout main

Step 2: Download DRKG data (REQUIRED -- not in GitHub)

The Drug Repurposing Knowledge Graph (~500MB) is too large for Git. You must download it manually:

mkdir -p data/drkg
cd data/drkg
curl -O https://dgl-data.s3-us-west-2.amazonaws.com/dataset/DRKG/drkg.tar.gz
tar -xvzf drkg.tar.gz
cd ../..

This extracts:

  • drkg.tsv -- 5.8M biomedical relationships (366MB)
  • embed/ -- Pre-trained TransE embeddings (entity + relation vectors)
  • drugbank_info/ -- DrugBank metadata (weights, SMILES, molecule types)
  • drug_repurpose/ -- Example notebooks and COVID-19 repurposing data

Step 3: Python backend setup

python3 -m venv venv
source venv/bin/activate        # Mac/Linux
# venv\Scripts\activate         # Windows
pip install -r requirements.txt

Dependencies installed: FastAPI, Uvicorn, NumPy, RDKit (molecular chemistry), Google Generative AI (Gemini), PubChemPy (PubChem API), Actian VectorAI DB.

Step 4: Create .env file

echo "GEMINI_API_KEY=your_key_here" > .env

Get a free Gemini API key from: https://aistudio.google.com/apikey

Note: The platform works without a Gemini key -- you just won't get AI-generated explanations, research briefs, and the research assistant chat. All ML predictions, paths, and scores still work.

Step 5: Start the backend

source venv/bin/activate
uvicorn backend.main:app --reload --port 8000

First startup takes ~5 minutes. The backend:

  1. Loads TransE embeddings (~148MB entity vectors)
  2. Builds in-memory adjacency list from 5.8M edges
  3. Initializes molecular similarity engine (RDKit)
  4. Pre-caches predictions for all 21 diseases (instant responses after startup)
  5. Runs model validation (Hits@10, Hits@50, MRR)
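
Step 2 above can be pictured with a short sketch. It assumes each row of drkg.tsv is a tab-separated head, relation, tail triple and that edges are stored in both directions so paths can be walked either way. The function name and the sample rows (written in the DRKG naming style) are made up for illustration.

```python
from collections import defaultdict

def build_adjacency(lines):
    """Build {entity: [(relation, neighbor), ...]} from tab-separated triples."""
    adjacency = defaultdict(list)
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # skip blank or malformed rows
        head, relation, tail = parts
        adjacency[head].append((relation, tail))
        # Assumption: also store the reverse edge so traversal works in both directions.
        adjacency[tail].append((relation, head))
    return adjacency

# Made-up rows in the DRKG naming style.
sample = [
    "Compound::DB00001\tDRUGBANK::target::Compound:Gene\tGene::2147\n",
    "Gene::2147\tHetionet::GpPW::Gene:Pathway\tPathway::PC7_1234\n",
    "\n",
]

adjacency = build_adjacency(sample)
print(len(adjacency), "entities")
```

The same function accepts an open file handle, for example one opened on data/drkg/drkg.tsv, since it only iterates over lines.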

Wait for the log message: Backend ready!

Health check: http://localhost:8000/health

Step 6: Frontend setup (new terminal)

cd drugreuse-frontend
npm install
npm run dev

Step 7: Open browser

Go to http://localhost:5173

Type a disease name (e.g., "Alzheimer's Disease") to see drug repurposing candidates.


Project Structure

Synaptix/
|
|-- backend/
|   |-- main.py                     # FastAPI app, all API endpoints, Gemini assistant, startup caching
|   |-- ml/
|   |   |-- kg_predictor.py         # TransE knowledge graph scoring engine
|   |   |-- path_explainer.py       # BFS path finder through knowledge graph
|   |   |-- combined_scorer.py      # 0.7*KG + 0.3*Mol combined ranking
|   |   |-- molecular_similarity.py # RDKit molecular fingerprint similarity
|   |   |-- llm_explainer.py        # Gemini-powered natural language explanations
|   |   |-- entity_names.py         # DrugBank ID -> human-readable name resolution
|   |   |-- vector_db.py            # Actian VectorAI DB integration
|   |   |-- validation.py           # Hits@K and MRR validation metrics
|   |-- graph/
|   |   |-- living_graph.py         # SQLite-backed collaborative knowledge graph
|   |-- agents/
|   |   |-- research_agents.py      # 4 autonomous AI research agents
|   |-- pipelines/
|   |   |-- rare_diseases.py        # Batch pipeline for 100 rare diseases
|   |-- deploy/
|       |-- vultr_deploy.py         # Vultr cloud provisioning via API
|
|-- drugreuse-frontend/
|   |-- src/
|   |   |-- App.jsx                 # Root component, layout orchestration, Win95 loading screens
|   |   |-- api/
|   |   |   |-- client.js           # Axios API client (all backend endpoints)
|   |   |-- components/
|   |   |   |-- SearchBar.jsx       # Disease search with autocomplete
|   |   |   |-- CandidateList.jsx   # Ranked drug candidates with molecular structures & research briefs
|   |   |   |-- GraphVisualization.jsx  # 3D force-directed knowledge graph (Win95 style)
|   |   |   |-- PathExplainer.jsx   # Visual biological pathway explanation
|   |   |   |-- ValidationPanel.jsx # Model accuracy metrics display
|   |   |   |-- QuickTags.jsx       # One-click disease quick-select buttons
|   |   |   |-- DiscoveryDashboard.jsx  # 100 rare disease discovery grid
|   |   |   |-- AgentsPanel.jsx     # AI research agents control panel
|   |   |   |-- ContributeModal.jsx # Add edges/entities to knowledge graph
|   |   |   |-- ActivityFeed.jsx    # Real-time activity sidebar with peer review
|   |   |   |-- GraphStats.jsx      # Live graph statistics banner
|   |   |   |-- ImpactDashboard.jsx # Animated platform metrics counters
|   |   |   |-- DiscoveryFeed.jsx   # Chronological discovery feed
|   |   |   |-- ContributorProfiles.jsx # Researcher & agent profiles
|   |   |   |-- IdentityModal.jsx   # User login/identity management
|   |   |   |-- Navbar.jsx          # Win95 taskbar with system status + auth controls
|   |   |   |-- TerminalIntro.jsx   # Cinematic CRT boot sequence (3D Macintosh animation)
|   |   |   |-- RetroAssistant.jsx  # Floating Gemini-powered research chat assistant
|   |   |-- hooks/
|   |   |   |-- usePrediction.js    # React hook for prediction state management
|   |   |-- styles/
|   |       |-- globals.css         # Win95 design system, CRT effects, CSS variables, animations
|   |-- package.json
|   |-- vite.config.js
|
|-- data/
|   |-- drkg/                       # DRKG dataset (download separately, ~500MB)
|   |   |-- drkg.tsv                # 5.8M biomedical triples
|   |   |-- embed/                  # TransE embeddings (.npy files)
|   |   |-- drugbank_info/          # Drug metadata (weights, SMILES, types)
|   |   |-- drug_repurpose/         # Example notebooks
|   |-- rare_disease/               # Generated rare disease predictions
|       |-- no_cure_candidates.json # 100 diseases x top candidates
|
|-- Dockerfile                      # Production container (Python 3.11-slim)
|-- docker-compose.yml              # Actian VectorAI DB service
|-- requirements.txt                # Python dependencies
|-- deploy.sh                       # Docker build/push/run deployment
|-- deploy_vultr.sh                 # One-command Vultr cloud deployment
|-- .env                            # API keys (not in git)

Features

Core ML Engine

  • TransE Knowledge Graph Predictions -- Scores drug-disease pairs using pre-trained TransE embeddings on DRKG (97K entities, 5.8M relationships)
  • Molecular Similarity Scoring -- RDKit fingerprint-based chemical similarity between candidates and known treatments
  • Combined Ranking -- Weighted fusion (70% KG + 30% molecular) for robust predictions
  • BFS Path Explainer -- Finds biological pathways connecting drugs to diseases through genes, proteins, and pathways
  • AI Explanations -- Gemini 2.0 Flash generates natural language explanations of why each drug might work
  • Model Validation -- Hits@10, Hits@50, and MRR metrics against known drug-disease associations
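
For readers unfamiliar with the validation metrics, here is a minimal sketch of Hits@K and MRR. It assumes that for each known drug-disease pair we already have the 1-based rank at which the true drug appeared in the model's candidate list. The ranks below are invented, not Synaptix results.

```python
def hits_at_k(ranks, k):
    """Fraction of known pairs whose true drug is ranked within the top k."""
    return sum(1 for rank in ranks if rank <= k) / len(ranks)

def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all known pairs; 1.0 means always ranked first."""
    return sum(1.0 / rank for rank in ranks) / len(ranks)

# Invented 1-based ranks of the true drug for four known treatments.
ranks = [1, 3, 12, 60]

hits10 = hits_at_k(ranks, 10)
hits50 = hits_at_k(ranks, 50)
mrr = mean_reciprocal_rank(ranks)
print(hits10, hits50, round(mrr, 4))
```

Hits@K rewards any appearance within the cutoff equally, while MRR gives more credit the closer the true drug sits to the top.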

Living Knowledge Graph

  • Collaborative Contributions -- Anyone can add new edges (drug-gene-disease relationships) with evidence
  • Peer Review System -- Agree/disagree/uncertain voting on contributed edges
  • Real-time Activity Feed -- See what humans and AI agents are discovering
  • Entity Autocomplete -- Search across 97K+ biological entities

AI Research Agents

  • PubMed Paper Scanner -- Searches NCBI for relevant papers, extracts drug-disease relationships using Gemini
  • Clinical Trial Monitor -- Queries ClinicalTrials.gov API v2 for active/recruiting trials
  • PubChem Bioactivity Scanner -- Checks Lipinski drug-likeness, bioactivity assays, and structural analogs
  • Hypothesis Validator -- Cross-references all evidence sources to compute confidence scores

Rare Disease Discovery

  • 100 No-Cure Diseases -- Pre-computed candidates for diseases with no approved treatment
  • Frontier Predictions -- Highlighted section for ALS, Huntington's, Glioblastoma, Pancreatic Cancer, Lupus, CJD, Fibromyalgia
  • Research Briefs -- AI-generated summaries of the therapeutic hypothesis for each candidate
  • CSV Export -- Download predictions for individual diseases or the entire discovery set

Collaboration Platform

  • User Profiles -- Researcher/Student/Clinician identity with contribution tracking
  • Contributor Leaderboard -- See who's contributing the most to the knowledge graph
  • Impact Dashboard -- Animated counters showing platform-wide metrics
  • Discovery Feed -- Live feed of all human and AI contributions

Interactive 3D Knowledge Graph

  • 3D Force-Directed Graph -- Three.js-powered visualization with clean solid spheres and wireframe outlines, running on react-force-graph-3d
  • Interactive Legend Filtering -- Click any node type (Drug, Gene, Pathway, Disease) or relation type in the legend to toggle visibility; hidden items dim with strikethrough
  • Node Click Detail Panel -- Click any node to open a Win95-style inspector showing type, connection count, relations breakdown, repurposing score, and clickable neighbor list
  • Drug Path Highlighting -- Select a drug candidate to illuminate its biological pathway through the graph with animated directional particles
  • Live Physics -- Drag any node and the entire graph reacts (d3 force reheat); auto-orbits when idle
  • PubChem Molecular Structures -- 2D structure images for every drug candidate (in CandidateList)
  • Biological Pathway Diagrams -- Visual step-by-step path from drug to disease (in PathExplainer)

Research Assistant

  • Context-Aware Chat -- Floating Win95-style chatbot powered by Gemini 2.0 Flash that understands the current disease, selected drug, prediction scores, and active pathway
  • Conversation History -- Maintains last 8 messages for multi-turn dialogue
  • Graceful Fallback -- Works without Gemini API key via rule-based pattern matching
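
One simple way to keep only the last 8 messages is a bounded deque, sketched below. The class name and message shape are hypothetical. The actual assistant may store history differently, for example on the frontend.

```python
from collections import deque

class ChatHistory:
    """Keeps the most recent messages and silently drops older ones."""

    def __init__(self, max_messages=8):
        self._messages = deque(maxlen=max_messages)

    def add(self, role, text):
        self._messages.append({"role": role, "text": text})

    def as_list(self):
        """Oldest-to-newest copy, suitable for sending as chat context."""
        return list(self._messages)

history = ChatHistory()
for i in range(10):
    history.add("user" if i % 2 == 0 else "assistant", f"message {i}")

print(len(history.as_list()), history.as_list()[0]["text"])
```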

Windows 95 Interface

  • CRT Boot Sequence -- Full-screen 3D Macintosh animation with phosphor scanlines, dust particles, and terminal startup messages (Framer Motion)
  • Win95 Design System -- Authentic beveled borders, navy gradient titlebars, beige panels, Tahoma font, system status taskbar
  • Win95 Loading Screens -- Progress bars in inset panels with navy titlebars for all loading states

Tech Stack

| Layer | Technology |
|---|---|
| ML Engine | TransE embeddings, NumPy, BFS graph traversal |
| Chemistry | RDKit (molecular fingerprints, Tanimoto similarity) |
| Knowledge Graph | DRKG (5.8M triples from DrugBank, GNBR, Hetionet, STRING, IntAct, DGIdb) |
| LLM | Google Gemini 2.0 Flash (explanations, research briefs, assistant chat, paper extraction) |
| Vector DB | Actian VectorAI DB (HNSW-indexed embedding similarity search) |
| Backend | FastAPI, Uvicorn, Python 3.11 |
| Frontend | React 19, Vite 7, Axios, Framer Motion |
| Visualization | react-force-graph-3d, Three.js (3D force-directed graph) |
| Graph Store | SQLite (WAL mode, living knowledge graph) |
| External APIs | PubMed E-utilities, ClinicalTrials.gov v2, PubChem PUG REST |
| Deployment | Docker, nginx, Vultr Cloud Compute |

API Reference

Prediction Endpoints

| Method | Path | Description |
|---|---|---|
| GET | /api/diseases | List all supported diseases |
| GET | /api/predict/{disease}?top_k=10 | Get ranked drug candidates |
| GET | /api/viz-graph/{disease} | Get knowledge subgraph for visualization |
| GET | /api/paths/{disease}/{drug} | Compute biological pathways on-demand |
| GET | /api/explain/{disease}/{drug_name} | Generate Gemini explanation per drug |
| GET | /api/drug/{drug_id} | Get detailed drug info (DrugBank) |
| GET | /api/validation | Model validation metrics |
| GET | /api/frontier | No-cure diseases with candidate counts |
| GET | /api/rare-diseases | Full rare disease discovery data |
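
A hedged example of calling the prediction endpoint from Python using only the standard library. It assumes the backend is running locally on port 8000 as in the Quick Start and that the endpoint returns JSON. The response schema is not documented here, so the sketch does not assume any field names. The helper names are hypothetical.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # assumed local backend from the Quick Start

def predict_url(disease, top_k=10, base_url=BASE_URL):
    """Build the /api/predict URL, percent-encoding the disease name."""
    query = urlencode({"top_k": top_k})
    return f"{base_url}/api/predict/{quote(disease, safe='')}?{query}"

def fetch_predictions(disease, top_k=10, base_url=BASE_URL, timeout=30):
    """GET the ranked candidates and return the decoded JSON payload as-is."""
    with urlopen(predict_url(disease, top_k, base_url), timeout=timeout) as response:
        return json.load(response)

url = predict_url("Alzheimer's Disease", top_k=5)
print(url)
# With the backend running: print(fetch_predictions("Alzheimer's Disease", top_k=5))
```

Encoding the disease name matters because names such as "Alzheimer's Disease" contain spaces and apostrophes that are not valid in a raw URL path.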

Assistant Endpoint

| Method | Path | Description |
|---|---|---|
| POST | /api/assistant/chat | Context-aware Gemini chat (sends disease, drug, scores, pathway context) |

Living Graph Endpoints

| Method | Path | Description |
|---|---|---|
| POST | /api/graph/add-edge | Add a new relationship |
| POST | /api/graph/add-entity | Add a new entity |
| GET | /api/graph/stats | Graph statistics |
| GET | /api/graph/recent?limit=20 | Recent contributions |
| GET | /api/graph/search-entities?q=aspirin | Entity autocomplete |
| POST | /api/graph/review/{edge_id} | Submit peer review |
| GET | /api/graph/edge-reviews/{edge_id} | Get reviews for an edge |
| GET | /api/graph/history/{entity_id} | Entity contribution history |

AI Agent Endpoints

| Method | Path | Description |
|---|---|---|
| POST | /api/agents/scan/{disease} | Run all 4 research agents |
| POST | /api/agents/validate/{disease} | Validate existing predictions |
| GET | /api/agents/status | Agent status and discoveries |
| GET | /api/agents/log | Chronological agent action log |

Collaboration Endpoints

| Method | Path | Description |
|---|---|---|
| GET | /api/feed?limit=50 | Platform activity feed |
| GET | /api/contributors | Contributor profiles and stats |

Export Endpoints

| Method | Path | Description |
|---|---|---|
| GET | /api/export/{disease} | Download disease predictions as CSV |
| GET | /api/export/rare-diseases | Download all 100 disease discoveries as CSV |

Utility

| Method | Path | Description |
|---|---|---|
| GET | /health | Health check + startup progress + loaded model status |

Environment Variables

| Variable | Required | Description |
|---|---|---|
| GEMINI_API_KEY | Optional | Google Gemini API key for AI explanations and research assistant. Platform works without it. |
| VULTR_API_KEY | Optional | Vultr API key for cloud deployment. Only needed for deploy_vultr.sh. |
| VITE_API_URL | Optional | Backend URL for frontend production build. Defaults to http://localhost:8000. |

Deployment

Docker (Local)

# Start Actian VectorAI DB
docker compose up -d

# Build and run backend
docker build -t drugreuse-backend .
docker run -d -p 8000:8000 drugreuse-backend

Vultr Cloud (Production)

# Set API key
echo "VULTR_API_KEY=your_key" >> .env

# One-command deploy (provisions instance + installs Docker + deploys)
./deploy_vultr.sh

This provisions a Vultr instance, installs Docker, and deploys the backend container.

Current Production Deployment

  • Full app (frontend + backend): http://144.202.54.117
  • Server: Vultr Cloud Compute (2 CPU, 4GB RAM, Chicago)
  • Frontend: Vite React app served via nginx on port 80 (/var/www/synaptix/)
  • Backend: Dockerized FastAPI on port 8000, proxied through nginx (/api/ and /health)
  • Cache headers: no-cache on index.html, 1-year immutable on hashed /assets/
  • All 21 diseases pre-cached -- predictions serve in <0.15s

Team

| Name | Role |
|---|---|
| Sanjavan Ghodasara | ML Engine + Frontend |
| Mohit Shah | Backend Integration |
| Quyen Tran | Domain Expert + Data Pipeline |
| Sivan Reddy | Molecular Similarity + Graph Viz |

Sponsor Tracks

  • Healthcare Track -- Grand Prize
  • Actian VectorAI DB -- Best Use (HNSW-indexed embedding similarity search for 24K+ drug candidates)
  • Vultr Cloud -- Best Use (deployment-ready with one-command cloud provisioning)

License

Built for Hacklytics 2026. DRKG dataset is licensed under CC BY-NC 4.0 by AWS.
