Disha 🧭

Automated Market Intelligence & Career Optimization Platform for India's AI/ML job landscape.

Disha is a production-grade, multi-agent system built on LangGraph. It scrapes corporate career pages and financial data, performs investment analysis, and matches opportunities against a personalized user profile. A Supervisor agent orchestrates the specialist agents with cyclic state management.

Python 3.12+ | LangGraph | PostgreSQL | FastAPI | License: MIT


What It Does

Disha answers questions like:

  • "Find Agentic AI and backend roles in Bangalore above 20 LPA"
  • "Should I apply to Razorpay or Swiggy given my current skill set?"
  • "What LLMOps skills am I missing for Staff ML Engineer roles?"
  • "Suggest an ArXiv-backed learning roadmap for my skill gaps"

It responds with structured recommendations — scored, ranked, with compensation fit, skill overlap, and explicit reasoning — not generic LLM output.


Current Status

| Component | Status | Notes |
|---|---|---|
| Supervisor orchestration | ✅ Working | Cyclic routing, iteration guard, guardrails |
| Career scoring engine | ✅ Working | Skill match %, LPA benchmarking, experience fit |
| Financial analyst | ✅ Working | Burn multiple, ESOP, runway scoring (India-first) |
| Learning companion | ✅ Working | Gap analysis, ArXiv roadmap, phase-based curriculum |
| FastAPI + SSE gateway | ✅ Working | /api/chat sync + /api/chat/stream SSE |
| RSS feed ingestion | ✅ Working | Live feeds via feedparser |
| Live Playwright scraping | ✅ Working | Real browser rendering via sync_playwright |
| LLM job extraction | ✅ Working | Gemini with_structured_output from scraped pages |
| Gemini learning companion | ✅ Working | Dynamic gap analysis and ArXiv roadmap |
| LLM resume evaluation | ✅ Working | Gemini with_structured_output tool |
| pgvector schema | ✅ Working | Vector(768) columns + native cosine_distance queries |
| Greenhouse API integration | 🔧 Phase 2 | Structured JSON ingestion, next priority |
| Next.js frontend | 🔧 Phase 3 | Architecture documented; implementation pending |
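
The pgvector row above refers to cosine_distance queries. As a rough illustration of what that metric computes (pgvector defines cosine distance as 1 minus cosine similarity), here is a pure-Python sketch. The function names and sample vectors are hypothetical; in Disha the comparison runs inside PostgreSQL over 768-dimensional columns, not in application code like this.

```python
import math
from typing import List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def rank_by_distance(
    query: Sequence[float],
    candidates: List[Tuple[str, Sequence[float]]],
) -> List[Tuple[str, Sequence[float]]]:
    """Sort (label, vector) pairs from nearest to farthest from the query."""
    return sorted(candidates, key=lambda item: cosine_distance(query, item[1]))
```

A smaller distance means the two embeddings point in a more similar direction, which is why nearest-neighbour lookups sort in ascending order.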

Architecture

┌─────────────────────────────────────────────────────────────────────────┐
│                          SUPERVISOR AGENT                               │
│   Intent Analysis • Dynamic Delegation • Aggregation • Iteration Guard  │
└──────────────────────────────┬──────────────────────────────────────────┘
                               │
         ┌─────────────────────┼──────────────────────┐
         ▼                     ▼                      ▼
┌──────────────┐      ┌────────────────┐    ┌─────────────────┐
│  SCRAPER     │      │  FINANCIAL     │    │  CAREER         │
│  AGENT       │      │  ANALYST       │    │  STRATEGY       │
│              │      │                │    │                 │
│ • Playwright │      │ • ARR Growth   │    │ • Skill Match   │
│ • RSS Feeds  │      │ • Burn Multiple│    │ • LPA Bench.    │
│ • Gemini Ext │      │ • Runway       │    │ • India Filter  │
│ • (Greenhouse│      │ • ESOP Score   │    │ • Priority Rank │
│    API P2)   │      │ • Risk Flags   │    │                 │
└──────┬───────┘      └────────┬───────┘    └────────┬────────┘
       │                       │                     │
       │                       │                     ▼
       │                       │            ┌─────────────────┐
       │                       │            │  LEARNING       │
       │                       │            │  COMPANION      │
       │                       │            │                 │
       │                       │            │ • Gap Analysis  │
       │                       │            │ • ArXiv Papers  │
       │                       │            │ • Phase Roadmap │
       │                       │            │ • LLMOps/MLOps  │
       └───────────────────────┼────────────┴────────┬────────┘
                               ▼                     ▼
                    ┌──────────────────────────────────────┐
                    │           GUARDRAIL NODE             │
                    │  Domain Filter • Visa Strip • Dedup  │
                    └─────────────────┬────────────────────┘
                                      ▼
                    ┌──────────────────────────────────────┐
                    │          SYNTHESIZE NODE             │
                    │  Final Answer • Citations • Score    │
                    └──────────────────────────────────────┘

Agents are routed sequentially by the Supervisor based on query intent — not in parallel. A career query routes: scraper → career_strategy → [learning_companion] → guardrail → synthesize. A financial query routes: scraper → financial_analyst → guardrail → synthesize.
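
The sequential routes above can be sketched as a small lookup plus an iteration cap. This is a minimal illustration in plain Python, not the LangGraph graph Disha compiles: the names ROUTES, plan_route, and walk are hypothetical, and counting one iteration per visited node is an assumption about how the max-6 guard works.

```python
from typing import Dict, List

# Mirrors the "max-6 iteration guard"; how Disha counts iterations is assumed.
MAX_ITERATIONS = 6

# Hypothetical route table keyed by intent; node names follow the routes above.
ROUTES: Dict[str, List[str]] = {
    "career": ["scraper", "career_strategy", "guardrail", "synthesize"],
    "financial": ["scraper", "financial_analyst", "guardrail", "synthesize"],
}


def plan_route(intent: str, include_learning: bool = False) -> List[str]:
    """Return the ordered node list for an intent."""
    if intent not in ROUTES:
        raise ValueError(f"unknown intent: {intent!r}")
    route = list(ROUTES[intent])  # copy so the table is never mutated
    if intent == "career" and include_learning:
        # The learning companion is optional and runs just before the guardrail.
        route.insert(route.index("guardrail"), "learning_companion")
    return route


def walk(route: List[str], max_iterations: int = MAX_ITERATIONS) -> List[str]:
    """Visit nodes one at a time and stop once the iteration cap is reached."""
    visited: List[str] = []
    for node in route:
        if len(visited) >= max_iterations:
            break  # guard tripped: stop delegating
        visited.append(node)
    return visited
```

In a cyclic graph the Supervisor may revisit a specialist, so a cap of this kind bounds the worst case even when a route would otherwise keep looping.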


Core Components

| Component | Technology | Responsibility |
|---|---|---|
| Supervisor | LangGraph + Pydantic | Cyclic orchestration, intent routing, max-6 iteration guard |
| Scraper Agent | Playwright, feedparser, Gemini | Live Playwright scraping, RSS feeds, LLM job extraction |
| Financial Analyst | Custom scoring engine | ARR growth, burn multiple, ESOP transparency, runway (India private-market metrics) |
| Career Strategy | Skill-gap + comp matching | Stack extraction, INR/LPA benchmarking, city/remote filter, priority ranking |
| Learning Companion | Gap analysis + Gemini AI | Generates dynamic ArXiv roadmap and phases using LLM |
| Guardrail Node | Rule-based filter | Strips excluded domains (HFT, firmware), deduplicates before synthesis |
| Knowledge Base | PostgreSQL + pgvector (async) | Vector search over jobs/resumes/papers, LangGraph checkpoints |
| API Gateway | FastAPI + SSE | /api/chat sync, /api/chat/stream SSE, /health, /api/v1/status |
| Frontend | Next.js 14 + Tailwind + Shadcn/UI | Chat UI, job dashboard, learning roadmap (Phase 3) |
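
For readers unfamiliar with the Financial Analyst metrics, the sketch below computes three of them using their common definitions: ARR growth as the relative change in ARR, burn multiple as net burn divided by net new ARR, and runway as cash divided by monthly net burn. The Financials record and function names are hypothetical, and the sketch deliberately stops at raw metrics because the weights Disha uses to turn them into a score are not documented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Financials:
    """Hypothetical input record; all amounts share one unit (e.g. INR crore)."""
    arr_start: float          # ARR at the start of the period
    arr_end: float            # ARR at the end of the period
    net_burn: float           # cash burned over the period (negative if profitable)
    cash: float               # cash on hand at the end of the period
    months_in_period: int = 12


def arr_growth(f: Financials) -> float:
    """Relative ARR change over the period, e.g. 0.5 for +50%."""
    if f.arr_start <= 0:
        raise ValueError("arr_start must be positive to compute growth")
    return (f.arr_end - f.arr_start) / f.arr_start


def burn_multiple(f: Financials) -> float:
    """Net burn per unit of net new ARR; lower is better."""
    if f.net_burn <= 0:
        return 0.0  # not burning cash
    net_new_arr = f.arr_end - f.arr_start
    if net_new_arr <= 0:
        return float("inf")  # burning cash without adding ARR
    return f.net_burn / net_new_arr


def runway_months(f: Financials) -> float:
    """Months of cash left at the period's average monthly burn."""
    if f.net_burn <= 0:
        return float("inf")  # no burn, so cash does not run out
    return f.cash / (f.net_burn / f.months_in_period)
```

As a worked example, a company that grows ARR from 100 to 150 while burning 75 has a burn multiple of 1.5, and with 150 in the bank it has 24 months of runway.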

Quick Start

```bash
git clone https://github.com/anmolsharma152/Disha.git
cd Disha

python -m venv venv
source venv/bin/activate

pip install -r requirements.txt

# Download the browser binaries Playwright needs for live scraping
playwright install

# Run a query directly
python main.py "Find Agentic AI and backend roles in Bangalore"

# Stream output
python main.py "Should I invest in Indian AI companies?" --stream

# JSON output
python main.py "Analyze Razorpay financial health" --json

# Start the API server
uvicorn api.server:app --reload --host 0.0.0.0 --port 8000
```

Then hit the API:

```bash
curl -X POST http://localhost:8000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"query": "Find Agentic AI roles in Bangalore above 20 LPA"}'
```

API docs available at http://localhost:8000/docs.
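
For the streaming endpoint (/api/chat/stream), a client has to split the text/event-stream body into events. The sketch below follows the standard Server-Sent Events framing: data lines accumulate until a blank line ends the event, and lines starting with a colon are comments. The function name iter_sse_data is hypothetical, and the shape of each payload Disha emits is not documented here, so the function only returns raw strings.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each event in a text/event-stream body."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # the format strips one leading space
        if field == "data":
            buffer.append(value)
        # Other fields (event, id, retry) are ignored in this sketch.
    # An event not terminated by a blank line is dropped, as the format specifies.
```

With an HTTP client that can iterate over response lines, the same function can be fed the lines of a POST to /api/chat/stream, assuming that endpoint accepts the same JSON body as /api/chat.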


Example Output

```
### 1. Senior AI/ML Engineer — Agentic Workflows @ Razorpay — 67.4/100 (MEDIUM)
- Location: Bangalore | Remote: Hybrid
- Base: ₹55 LPA | Est. Total: ₹73 LPA
- Skill Match: 68.4% (LangGraph, LangChain, Kubernetes, MLflow, vLLM)
- Gaps: multi-agent orchestration, model serving, drift detection
- Experience Fit: stretch
- Apply: razorpay.com/careers
```
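
To make the Skill Match and Gaps fields concrete, here is one simple way such values could be derived: case-insensitive set overlap between a job's required skills and the profile's skills. This is an illustrative assumption, not Disha's documented formula; the function name skill_match is hypothetical, and the real engine may weight skills differently.

```python
from typing import Iterable, List, Set, Tuple


def skill_match(
    required: Iterable[str],
    profile: Iterable[str],
) -> Tuple[float, List[str], List[str]]:
    """Return (match percent, matched skills, missing skills)."""
    known: Set[str] = {skill.strip().lower() for skill in profile}
    matched: List[str] = []
    gaps: List[str] = []
    seen: Set[str] = set()
    for skill in required:
        key = skill.strip().lower()
        if key in seen:
            continue  # ignore repeated requirements
        seen.add(key)
        (matched if key in known else gaps).append(skill)
    total = len(matched) + len(gaps)
    percent = round(100.0 * len(matched) / total, 1) if total else 0.0
    return percent, matched, gaps
```

The matched and missing lists keep the order of the job's requirements, which makes them easy to render directly into a report like the one above.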

Project Structure

Disha/
├── main.py                  # LangGraph compilation, CLI entry point
├── schemas.py               # Pydantic v2 models: CompanyMetrics, JobOpening, AgentState
├── README.md
├── agents/
│   ├── scraper_agent.py     # India-focused scraping pipeline
│   ├── financial_agent.py   # Private-market valuation (burn multiple, ESOP, runway)
│   ├── career_agent.py      # Skill-gap scoring, LPA benchmarking
│   ├── supervisor_agent.py  # Cyclic routing + guardrail pre-synthesis
│   └── learning_agent.py    # ArXiv gap analysis, phase roadmap
├── api/
│   └── server.py            # FastAPI + SSE endpoints
├── storage/
│   └── db.py                # Async SQLAlchemy 2.0 + pgvector scaffold
├── tools/
│   ├── scraper_tools.py     # RSS + Playwright (live)
│   └── career_tools.py      # Resume evaluation (Gemini, live)
└── frontend/
    └── README.md            # Next.js 14 architecture spec (Phase 3)

Configuration

Disha's profile matching is fully configurable via user_profile.yaml. The default profile targets India-based AI/ML engineering roles:

| Parameter | Default |
|---|---|
| Target roles | AI/ML Engineer, LLM Engineer, LLMOps Engineer, ML Platform Engineer |
| Target cities | Bangalore, Delhi NCR, Pune, Hyderabad, Remote India |
| Salary floor | ₹20 LPA base |
| Excluded domains | HFT, embedded, firmware, kernel |

Also, add your Google Gemini API key to a .env file in the root directory:

```
GEMINI_API_KEY="your_api_key_here"
```
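
As an illustration of how the default profile could drive filtering, the sketch below applies the city list, salary floor, and excluded domains from the table above to a list of jobs, and drops duplicates. The Profile class, the filter_jobs function, and the job dictionary keys (company, title, city, base_lpa, domain) are all hypothetical; the actual layout of user_profile.yaml and of Disha's job records is not shown in this README.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Profile:
    """Defaults copied from the configuration table; field names are assumed."""
    cities: Tuple[str, ...] = ("Bangalore", "Delhi NCR", "Pune", "Hyderabad", "Remote India")
    salary_floor_lpa: float = 20.0
    excluded_domains: Tuple[str, ...] = ("HFT", "embedded", "firmware", "kernel")


def filter_jobs(jobs: List[Dict], profile: Profile = Profile()) -> List[Dict]:
    """Keep jobs that fit the profile, dropping duplicates and excluded domains."""
    excluded = {term.lower() for term in profile.excluded_domains}
    seen: Set[Tuple[str, str]] = set()
    kept: List[Dict] = []
    for job in jobs:
        key = (job["company"].lower(), job["title"].lower())
        if key in seen:
            continue  # same company and title already processed
        seen.add(key)
        if job["city"] not in profile.cities:
            continue
        if job["base_lpa"] < profile.salary_floor_lpa:
            continue
        # Match whole words so an excluded term never hits inside another word.
        text = f'{job["title"]} {job.get("domain", "")}'.lower()
        if excluded & set(re.findall(r"[a-z0-9]+", text)):
            continue
        kept.append(job)
    return kept
```

Whole-word matching is a deliberately blunt rule: it would also reject a role that mentions "kernel" in a machine-learning sense, so a real filter would likely need a more careful domain field.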

Roadmap

Phase 1 — Modular Framework & Async Postgres Scaffold ✅

  • Supervisor-Specialist multi-agent architecture (LangGraph)
  • FastAPI gateway with SSE streaming
  • Async PostgreSQL + pgvector schema (SQLAlchemy 2.0)
  • India job localization — INR/LPA benchmarking, city filters
  • Financial scoring engine — burn multiple, ESOP, runway (India private-market)
  • Career scoring engine — skill overlap, comp fit, experience fit
  • Guardrail node — domain/tech exclusions pre-synthesis
  • Demo pipeline — real Playwright scraping + Gemini extraction + deterministic scoring

Phase 2 — Live Data & LLM Integration 🔧

  • ✅ Live Playwright scraping (real browser rendering, not stub)
  • ✅ LLM job extraction (Gemini with_structured_output)
  • ✅ Dynamic Generative AI Learning Companion (Gemini 2.5 Flash integrated)
  • ✅ LLM-based resume evaluation (Gemini with_structured_output)
  • ✅ pgvector schema + native cosine_distance queries
  • Greenhouse API integration (structured JSON ingestion)
  • Lever API integration
  • Error propagation (activate error_recovery node)
  • Cover letter generator

Phase 3 — Frontend & Deployment 🔧

  • Next.js 14 chat UI with SSE streaming
  • Job dashboard — filterable cards, skill gap bars, one-click apply
  • Learning roadmap UI — phase cards, paper viewer, progress tracking
  • Deployment — Vercel + Railway + Neon Postgres

Phase 4 — Production Integrations 🔧

  • MCP servers — LinkedIn, Glassdoor, Yahoo Finance, Wellfound
  • PDF parsing — earnings transcripts, resume analysis
  • Automated email digests — daily market scans, weekly match refresh
  • LangSmith tracing, cost/token observability
  • Circuit breakers — per-domain failure tracking, fallback chains

License

MIT
