Disha 🧭

Automated Market Intelligence & Career Optimization Platform for India's AI/ML job landscape.

A production-grade multi-agent system built on LangGraph. It scrapes corporate career pages and financial data, performs investment analysis, and matches opportunities against a configurable user profile. A Supervisor agent orchestrates the specialist agents with cyclic state management.

What It Does

Disha answers questions like:

"Find Agentic AI and backend roles in Bangalore above 20 LPA"
"Should I apply to Razorpay or Swiggy given my current skill set?"
"What LLMOps skills am I missing for Staff ML Engineer roles?"
"Suggest an ArXiv-backed learning roadmap for my skill gaps"

It responds with structured recommendations — scored, ranked, with compensation fit, skill overlap, and explicit reasoning — not generic LLM output.

Current Status

Component	Status	Notes
Supervisor orchestration	✅ Working	Cyclic routing, iteration guard, guardrails
Career scoring engine	✅ Working	Skill match %, LPA benchmarking, experience fit
Financial analyst	✅ Working	Burn multiple, ESOP, runway scoring (India-first)
Learning companion	✅ Working	Gap analysis, ArXiv roadmap, phase-based curriculum
FastAPI + SSE gateway	✅ Working	`/api/chat` sync + `/api/chat/stream` SSE
RSS feed ingestion	✅ Working	Live feeds via `feedparser`
Live Playwright scraping	✅ Working	Real browser rendering via `sync_playwright`
LLM job extraction	✅ Working	Gemini `with_structured_output` from scraped pages
Gemini learning companion	✅ Working	Dynamic gap analysis and ArXiv roadmap
LLM resume evaluation	✅ Working	Gemini `with_structured_output` tool
pgvector schema	✅ Working	`Vector(768)` columns + native `cosine_distance` queries
Greenhouse API integration	🔧 Phase 2	Structured JSON ingestion next priority
Next.js frontend	🔧 Phase 3	Architecture documented; implementation pending
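
The pgvector row above mentions `cosine_distance` queries. As a rough illustration of what that metric computes, here is a plain-Python sketch. It is not the project's query code: the function names are hypothetical, and short vectors stand in for the 768-dimensional embeddings.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance: 1 minus the cosine similarity of a and b."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # This sketch refuses zero vectors instead of returning NaN.
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query: list[float], rows: list[tuple[str, list[float]]], k: int = 3) -> list[str]:
    """Return the labels of the k rows closest to the query embedding."""
    ranked = sorted(rows, key=lambda row: cosine_distance(query, row[1]))
    return [label for label, _ in ranked[:k]]
```

A distance of 0 means the two embeddings point in the same direction, 1 means they are orthogonal, and 2 means they are opposite. In the database the sort would run inside Postgres rather than in Python.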

Architecture

┌─────────────────────────────────────────────────────────────────────────┐
│                          SUPERVISOR AGENT                               │
│   Intent Analysis • Dynamic Delegation • Aggregation • Iteration Guard │
└──────────────────────────────┬──────────────────────────────────────────┘
                               │
         ┌─────────────────────┼──────────────────────┐
         ▼                     ▼                      ▼
┌──────────────┐      ┌────────────────┐    ┌─────────────────┐
│  SCRAPER     │      │  FINANCIAL     │    │  CAREER         │
│  AGENT       │      │  ANALYST       │    │  STRATEGY       │
│              │      │                │    │                 │
│ • Playwright │      │ • ARR Growth   │    │ • Skill Match   │
│ • RSS Feeds  │      │ • Burn Multiple│    │ • LPA Bench.    │
│ • Gemini Ext │      │ • Runway       │    │ • India Filter  │
│ • (Greenhouse│      │ • ESOP Score   │    │ • Priority Rank │
│    API P2)   │      │ • Risk Flags   │    │                 │
└──────┬───────┘      └────────┬───────┘    └────────┬────────┘
       │                       │                     │
       │                       │                     ▼
       │                       │            ┌─────────────────┐
       │                       │            │  LEARNING       │
       │                       │            │  COMPANION      │
       │                       │            │                 │
       │                       │            │ • Gap Analysis  │
       │                       │            │ • ArXiv Papers  │
       │                       │            │ • Phase Roadmap │
       │                       │            │ • LLMOps/MLOps  │
       └───────────────────────┼────────────┴────────┬────────┘
                               ▼                     ▼
                    ┌──────────────────────────────────────┐
                    │           GUARDRAIL NODE             │
                    │  Domain Filter • Visa Strip • Dedup  │
                    └─────────────────┬────────────────────┘
                                      ▼
                    ┌──────────────────────────────────────┐
                    │          SYNTHESIZE NODE             │
                    │  Final Answer • Citations • Score    │
                    └──────────────────────────────────────┘

The Supervisor routes agents sequentially based on query intent, not in parallel. A career query routes: scraper → career_strategy → [learning_companion] → guardrail → synthesize, where the bracketed step is optional. A financial query routes: scraper → financial_analyst → guardrail → synthesize.
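
The sequential routing described above can be sketched in plain Python. This is an illustration only, not the code in `supervisor_agent.py`: the keyword classifier, the `include_learning` flag, and the choice to jump to synthesis once the iteration guard trips are all assumptions. The node names and the limit of 6 come from this README.

```python
MAX_ITERATIONS = 6  # the "max-6 iteration guard" from the components table

# Node sequences copied from the routing paragraph above.
ROUTES = {
    "career": ["scraper", "career_strategy", "learning_companion", "guardrail", "synthesize"],
    "financial": ["scraper", "financial_analyst", "guardrail", "synthesize"],
}


def classify_intent(query: str) -> str:
    """Hypothetical keyword classifier; the real Supervisor may use an LLM."""
    q = query.lower()
    financial_terms = ("invest", "financial", "runway", "burn", "esop")
    return "financial" if any(term in q for term in financial_terms) else "career"


def plan_route(query: str, include_learning: bool = False) -> list[str]:
    """Pick the node sequence for a query, dropping the optional learning step."""
    route = list(ROUTES[classify_intent(query)])
    if not include_learning and "learning_companion" in route:
        route.remove("learning_companion")
    return route


def next_agent(route: list[str], completed: set[str], iteration: int) -> str:
    """Return the next node to run, one at a time (never in parallel)."""
    if iteration >= MAX_ITERATIONS:
        return "synthesize"  # assumed guard behaviour: stop cycling and answer
    for node in route:
        if node not in completed:
            return node
    return "synthesize"
```

For example, `plan_route("Analyze Razorpay financial health")` yields the financial sequence, and `next_agent` walks it node by node until the guard or the end of the route is reached.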

Core Components

Component	Technology	Responsibility
Supervisor	LangGraph + Pydantic	Cyclic orchestration, intent routing, max-6 iteration guard
Scraper Agent	Playwright, `feedparser`, Gemini	Live Playwright scraping, RSS feeds, LLM job extraction
Financial Analyst	Custom scoring engine	ARR growth, burn multiple, ESOP transparency, runway — India private-market metrics
Career Strategy	Skill-gap + comp matching	Stack extraction, INR/LPA benchmarking, city/remote filter, priority ranking
Learning Companion	Gap analysis + Gemini AI	Generates dynamic ArXiv roadmap and phases using LLM
Guardrail Node	Rule-based filter	Strips excluded domains (HFT, firmware), deduplicates before synthesis
Knowledge Base	PostgreSQL + pgvector (async)	Vector search over jobs/resumes/papers, LangGraph checkpoints
API Gateway	FastAPI + SSE	`/api/chat` sync, `/api/chat/stream` SSE, `/health`, `/api/v1/status`
Frontend	Next.js 14 + Tailwind + Shadcn/UI	Chat UI, job dashboard, learning roadmap (Phase 3)
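
The Financial Analyst row names ARR growth, burn multiple, and runway. Using the common definitions of those metrics, a minimal sketch might look like the following. The `CompanyFinancials` class and its field names are hypothetical (the project's `CompanyMetrics` model may differ), and how Disha turns these numbers into a score is not shown here.

```python
from dataclasses import dataclass


@dataclass
class CompanyFinancials:
    """Hypothetical input shape, all amounts in INR crore."""
    cash: float               # cash in the bank
    monthly_net_burn: float   # cash spent minus cash received, per month
    arr_start: float          # ARR at the start of the period
    arr_end: float            # ARR at the end of the period
    period_months: int = 12


def arr_growth(f: CompanyFinancials) -> float:
    """Fractional ARR growth over the period (0.6 means 60%)."""
    if f.arr_start <= 0:
        raise ValueError("arr_start must be positive")
    return (f.arr_end - f.arr_start) / f.arr_start


def burn_multiple(f: CompanyFinancials) -> float:
    """Net burn divided by net new ARR; lower is more capital-efficient."""
    net_new_arr = f.arr_end - f.arr_start
    if net_new_arr <= 0:
        return float("inf")  # burning cash without adding ARR
    return (f.monthly_net_burn * f.period_months) / net_new_arr


def runway_months(f: CompanyFinancials) -> float:
    """Months of cash left at the current burn rate."""
    if f.monthly_net_burn <= 0:
        return float("inf")  # not burning cash
    return f.cash / f.monthly_net_burn
```

With made-up inputs of 120 crore cash, 5 crore monthly burn, and ARR moving from 100 to 160 crore in a year, this gives 60% ARR growth, a burn multiple of 1.0, and 24 months of runway.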

Quick Start

git clone https://github.com/anmolsharma152/Disha.git
cd Disha

python -m venv venv
source venv/bin/activate

pip install -r requirements.txt

# Download the browser binaries Playwright needs for live scraping
playwright install

# Run a query directly
python main.py "Find Agentic AI and backend roles in Bangalore"

# Stream output
python main.py "Should I invest in Indian AI companies?" --stream

# JSON output
python main.py "Analyze Razorpay financial health" --json

# Start the API server
uvicorn api.server:app --reload --host 0.0.0.0 --port 8000

Then hit the API:

curl -X POST http://localhost:8000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"query": "Find Agentic AI roles in Bangalore above 20 LPA"}'

API docs available at http://localhost:8000/docs.
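
The same call can be made from Python with only the standard library. The sketch below mirrors the `curl` request above and adds a small parser for the SSE endpoint. It assumes `/api/chat/stream` accepts the same POST body as `/api/chat`, which this README does not confirm, and it makes no assumption about the shape of the response payloads. All function names are illustrative.

```python
import json
import urllib.request
from typing import Iterable, Iterator

BASE_URL = "http://localhost:8000"


def build_chat_request(query: str, stream: bool = False) -> urllib.request.Request:
    """Build the POST request; the stream path reuses the same body (assumption)."""
    path = "/api/chat/stream" if stream else "/api/chat"
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def iter_sse_data(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield the data payload of each complete server-sent event."""
    buffer: list[str] = []
    for raw in lines:
        line = raw.decode("utf-8").rstrip("\r\n")
        if line.startswith("data:"):
            value = line[len("data:"):]
            buffer.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and buffer:
            yield "\n".join(buffer)  # a blank line terminates the event
            buffer = []
        # other lines (comments, event/id fields) are ignored in this sketch


def stream_chat(query: str) -> Iterator[str]:
    """Open the SSE endpoint and yield event payloads as they arrive."""
    with urllib.request.urlopen(build_chat_request(query, stream=True)) as resp:
        yield from iter_sse_data(resp)
```

With the server running, `for chunk in stream_chat("Find Agentic AI roles in Bangalore above 20 LPA"): print(chunk)` would print each event as it arrives.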

Example Output

### 1. Senior AI/ML Engineer — Agentic Workflows @ Razorpay — 67.4/100 (MEDIUM)
- Location: Bangalore | Remote: Hybrid
- Base: ₹55 LPA | Est. Total: ₹73 LPA
- Skill Match: 68.4% (LangGraph, LangChain, Kubernetes, MLflow, vLLM)
- Gaps: multi-agent orchestration, model serving, drift detection
- Experience Fit: stretch
- Apply: razorpay.com/careers
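
To make the "Skill Match" and "Gaps" fields concrete, here is one simple way such an overlap could be computed. It is a sketch with hypothetical names, it uses plain case-insensitive set overlap, and it makes no attempt to reproduce the exact figures in the sample above, which come from the project's own scoring engine.

```python
def normalize(skill: str) -> str:
    """Case- and whitespace-insensitive key for comparing skill names."""
    return skill.strip().lower()


def skill_match(required: list[str], candidate: list[str]) -> tuple[float, list[str], list[str]]:
    """Return (match percentage, matched skills, gaps) for one job."""
    have = {normalize(s) for s in candidate}
    matched = [s for s in required if normalize(s) in have]
    gaps = [s for s in required if normalize(s) not in have]
    pct = round(100 * len(matched) / len(required), 1) if required else 0.0
    return pct, matched, gaps
```

A candidate who has 5 of a job's 8 required skills scores 62.5%, and the remaining 3 are returned as gaps that a learning roadmap could target.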

Project Structure

Disha/
├── main.py                  # LangGraph compilation, CLI entry point
├── schemas.py               # Pydantic v2 models: CompanyMetrics, JobOpening, AgentState
├── README.md
├── agents/
│   ├── scraper_agent.py     # India-focused scraping pipeline
│   ├── financial_agent.py   # Private-market valuation (burn multiple, ESOP, runway)
│   ├── career_agent.py      # Skill-gap scoring, LPA benchmarking
│   ├── supervisor_agent.py  # Cyclic routing + guardrail pre-synthesis
│   └── learning_agent.py    # ArXiv gap analysis, phase roadmap
├── api/
│   └── server.py            # FastAPI + SSE endpoints
├── storage/
│   └── db.py                # Async SQLAlchemy 2.0 + pgvector scaffold
├── tools/
│   ├── scraper_tools.py     # RSS + Playwright (live)
│   └── career_tools.py      # Resume evaluation (Gemini, live)
└── frontend/
    └── README.md            # Next.js 14 architecture spec (Phase 3)

Configuration

Disha's profile matching is fully configurable via `user_profile.yaml`. The default profile targets India-based AI/ML engineering roles:

Parameter	Default
Target roles	AI/ML Engineer, LLM Engineer, LLMOps Engineer, ML Platform Engineer
Target cities	Bangalore, Delhi NCR, Pune, Hyderabad, Remote India
Salary floor	₹20 LPA base
Excluded domains	HFT, embedded, firmware, kernel
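
As an illustration of how these defaults might be applied, the sketch below filters a list of jobs by city, salary floor, and excluded domains, then removes duplicates. The `Job` class, the dictionary keys, and the dedup key are hypothetical; the actual keys in `user_profile.yaml` and the Guardrail Node's logic may differ. Only the default values are taken from the table above.

```python
from dataclasses import dataclass

# Default values from the table above; the key names are illustrative.
PROFILE = {
    "target_cities": {"bangalore", "delhi ncr", "pune", "hyderabad", "remote india"},
    "salary_floor_lpa": 20.0,
    "excluded_domains": {"hft", "embedded", "firmware", "kernel"},
}


@dataclass(frozen=True)
class Job:
    """Hypothetical minimal job record for this example."""
    title: str
    company: str
    city: str
    base_lpa: float
    domain_tags: frozenset = frozenset()


def passes_profile(job: Job, profile: dict = PROFILE) -> bool:
    """True if the job satisfies the city, salary, and domain constraints."""
    if job.city.lower() not in profile["target_cities"]:
        return False
    if job.base_lpa < profile["salary_floor_lpa"]:
        return False
    tags = {tag.lower() for tag in job.domain_tags}
    return not (tags & profile["excluded_domains"])


def filter_and_dedup(jobs: list[Job], profile: dict = PROFILE) -> list[Job]:
    """Keep the first occurrence of each (company, title, city) that passes."""
    seen: set[tuple[str, str, str]] = set()
    kept: list[Job] = []
    for job in jobs:
        key = (job.company.lower(), job.title.lower(), job.city.lower())
        if key in seen or not passes_profile(job, profile):
            continue
        seen.add(key)
        kept.append(job)
    return kept
```

Running the filter before scoring keeps excluded or below-floor roles out of the ranked output entirely.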

Add your Google Gemini API key to a `.env` file in the project root:

GEMINI_API_KEY="your_api_key_here"

Roadmap

Phase 1 — Modular Framework & Async Postgres Scaffold ✅

Supervisor-Specialist multi-agent architecture (LangGraph)
FastAPI gateway with SSE streaming
Async PostgreSQL + pgvector schema (SQLAlchemy 2.0)
India job localization — INR/LPA benchmarking, city filters
Financial scoring engine — burn multiple, ESOP, runway (India private-market)
Career scoring engine — skill overlap, comp fit, experience fit
Guardrail node — domain/tech exclusions pre-synthesis
Demo pipeline — real Playwright scraping + Gemini extraction + deterministic scoring

Phase 2 — Live Data & LLM Integration 🔧

Live Playwright scraping (real browser rendering, not stub)
LLM job extraction (Gemini `with_structured_output`)
Dynamic Generative AI Learning Companion (Gemini 2.5 Flash integrated)
LLM-based resume evaluation (Gemini `with_structured_output`)
pgvector schema + native `cosine_distance` queries
Greenhouse API integration (structured JSON ingestion)
Lever API integration
Error propagation (activate error_recovery node)
Cover letter generator

Phase 3 — Frontend & Deployment 🔧

Next.js 14 chat UI with SSE streaming
Job dashboard — filterable cards, skill gap bars, one-click apply
Learning roadmap UI — phase cards, paper viewer, progress tracking
Deployment — Vercel + Railway + Neon Postgres

Phase 4 — Production Integrations 🔧

MCP servers — LinkedIn, Glassdoor, Yahoo Finance, Wellfound
PDF parsing — earnings transcripts, resume analysis
Automated email digests — daily market scans, weekly match refresh
LangSmith tracing, cost/token observability
Circuit breakers — per-domain failure tracking, fallback chains

License

MIT
