Agentic Recruitment Orchestrator
An AI-powered recruitment pipeline that parses Job Descriptions and Resumes, performs gap analysis using LLM reasoning, and drafts personalised outreach emails, all orchestrated by a multi-agent CrewAI crew.
Note: The live link may not work because the backend is not deployed (hosting it is costly due to the heavy models used). It does work locally. Comments are added throughout the project for better understanding.
Demo video: Recruitment.Orchestration.Demo.mp4
┌────────────────────────────────────────────────────────────────────┐
│                        Next.js 14 Frontend                         │
│  ┌──────────────┐  ┌────────────────┐  ┌──────────────────────┐    │
│  │ Upload Panel │  │ Candidate Cards│  │ Email Preview/Edit   │    │
│  │ + Top-N Ctrl │  │ + AI Insights  │  │ Modal                │    │
│  └──────────────┘  └────────────────┘  └──────────────────────┘    │
└───────────────────────────┬────────────────────────────────────────┘
                            │ REST API (proxied via Next.js rewrites)
┌───────────────────────────┼────────────────────────────────────────┐
│                      FastAPI Backend (async)                       │
│  ┌───────────┐  ┌───────────────────────────────────────────────┐  │
│  │ Ingestion │  │             CrewAI Agent Pipeline             │  │
│  │ Engine    │  │  ┌────────────┐   ┌───────────┐   ┌────────┐  │  │
│  │ (PyMuPDF) │  │  │ Researcher │──▶│ Evaluator │──▶│ Writer │  │  │
│  └───────────┘  │  └────────────┘   └───────────┘   └────────┘  │  │
│  ┌───────────┐  └───────────────────────────────────────────────┘  │
│  │ ChromaDB  │    Human-in-the-Loop approval gate                  │
│  │ (Vectors) │    Session isolation (auto-reset per JD)            │
│  └───────────┘                                                     │
└────────────────────────────────────────────────────────────────────┘
| Layer | Technology |
|---|---|
| LLM Provider | Groq (Llama 3.3 70B Versatile) via CrewAI + LiteLLM |
| Embeddings | all-MiniLM-L6-v2 (local sentence-transformers; no API key needed) |
| Vector DB | ChromaDB (persistent, cosine similarity) |
| Agents | CrewAI (sequential 3-agent crew) |
| PDF Parsing | PyMuPDF (fitz) |
| Backend | FastAPI (async, in-memory state) |
| Frontend | Next.js 14 + React 18 + Tailwind CSS + Radix UI |
| Agent | Role |
|---|---|
| Researcher | Analyses the JD: extracts technical requirements, soft skills, and culture fit |
| Evaluator | Scores candidates with reasoning and gap analysis (flags trainable skills) |
| Writer | Drafts personalised outreach emails referencing specific projects |
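A minimal sketch of how the three agents chain sequentially. Plain functions stand in for the CrewAI agents; the field names and the toy keyword check in `evaluator` are illustrative only (the real Evaluator uses LLM reasoning, not keyword matching), and do not reflect the project's actual agents.py:

```python
# Illustrative stand-ins for the sequential Researcher -> Evaluator -> Writer
# crew. The real pipeline wires CrewAI Agent/Task objects; these functions
# only show how each agent's output feeds the next one.

def researcher(jd_text: str) -> dict:
    # Extracts requirements from the JD (the real agent uses LLM reasoning).
    return {"requirements": ["Python", "FastAPI"], "culture": "collaborative"}

def evaluator(profile: dict, resume_text: str) -> dict:
    # Scores a candidate against the extracted profile with gap analysis.
    # NOTE: keyword matching here only keeps the sketch runnable; the real
    # Evaluator scores via chain-of-thought LLM reasoning.
    matched = [r for r in profile["requirements"] if r.lower() in resume_text.lower()]
    return {
        "score": round(100 * len(matched) / len(profile["requirements"])),
        "gaps": [r for r in profile["requirements"] if r not in matched],
    }

def writer(candidate: str, evaluation: dict) -> str:
    # Drafts a personalised outreach email for an approved candidate.
    return f"Hi {candidate}, your profile scored {evaluation['score']}% ..."

profile = researcher("We need Python and FastAPI experience.")
evaluation = evaluator(profile, "5 years of Python, some Flask.")
email = writer("Priya", evaluation)
```

The handoff shape is the point: each agent consumes only the previous agent's structured output, which is what makes the sequential crew composable.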
1. Upload JD + Resumes
2. Set Top-N candidates to analyse (defaults to 5, clamped to the resume count)
3. Launch the pipeline → Researcher and Evaluator run automatically
4. Pipeline pauses at "Awaiting Approval"
5. User reviews the AI shortlist and selects approved candidates
6. Writer drafts emails only for the approved candidates
7. User can edit emails before sending
- Automatic reset: uploading a new JD clears all previous state (documents, pipeline runs, and ChromaDB embeddings)
- Explicit reset: POST /api/session/reset wipes everything for a clean slate
- No cross-session leakage: each JD upload starts a fresh session with no stale data
cd backend
python -m venv venv
venv\Scripts\activate          # Windows
# source venv/bin/activate     # macOS/Linux
pip install -r requirements.txt
# Create .env with your Groq API key
echo GROQ_API_KEY=gsk_your_key_here > .env
# Run
uvicorn app.main:app --reload --port 8000
cd frontend
npm install
npm run dev
Open http://localhost:3000
| Variable | Required | Default | Description |
|---|---|---|---|
| GROQ_API_KEY | Yes | (none) | Groq API key for LLM calls |
| GROQ_MODEL | No | llama-3.3-70b-versatile | Groq model identifier |
| EMBEDDING_MODEL | No | all-MiniLM-L6-v2 | Local sentence-transformers model |
| DEFAULT_TOP_N | No | 5 | Default number of top candidates |
| FRONTEND_URL | No | http://localhost:3000 | Allowed CORS origin |
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/upload/jd | Upload a Job Description (PDF/TXT); resets the session |
| POST | /api/upload/resumes | Upload multiple resume PDFs |
| GET | /api/documents | List all uploaded documents |
| POST | /api/pipeline/start | Start the agent pipeline (top_n clamped to resume count) |
| GET | /api/pipeline/{run_id} | Poll pipeline status and results |
| POST | /api/pipeline/{run_id}/approve | Approve shortlisted candidates (HITL) |
| PUT | /api/pipeline/{run_id}/emails/{rid} | Edit a drafted outreach email |
| POST | /api/session/reset | Explicitly reset all session state |
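The top_n clamping that /api/pipeline/start performs reduces to a couple of lines. A sketch (the function name and the exact bounds are assumptions, not the backend's literal code):

```python
def clamp_top_n(requested: int, resume_count: int, default: int = 5) -> int:
    # Fall back to the default for a missing/non-positive request, then
    # never analyse more candidates than there are resumes. The floor of 1
    # assumes at least one resume has been uploaded.
    n = requested if requested > 0 else default
    return max(1, min(n, resume_count))

clamp_top_n(10, 3)   # asked for 10, only 3 resumes -> 3
clamp_top_n(0, 8)    # no explicit request -> default of 5
```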
├── backend/
│   ├── app/
│   │   ├── __init__.py
│   │   ├── config.py          # Env-based configuration (Groq keys, paths)
│   │   ├── models.py          # Pydantic schemas
│   │   ├── ingestion.py       # PDF/TXT extraction (PyMuPDF)
│   │   ├── vector_store.py    # ChromaDB embeddings, search & reset
│   │   ├── agents.py          # CrewAI agent definitions (Groq-powered)
│   │   └── main.py            # FastAPI application & endpoints
│   ├── uploads/               # Uploaded files (gitignored)
│   ├── chroma_db/             # Persistent vector store (gitignored)
│   └── requirements.txt
├── frontend/
│   ├── src/
│   │   ├── app/
│   │   │   ├── globals.css
│   │   │   ├── layout.tsx
│   │   │   └── page.tsx       # Command Center dashboard
│   │   ├── components/
│   │   │   ├── ui/                # Shadcn/UI primitives
│   │   │   ├── CandidateCard.tsx  # Match scores, AI insights, gap analysis
│   │   │   ├── EmailModal.tsx     # Edit outreach emails
│   │   │   ├── UploadPanel.tsx    # JD/resume upload + Top-N control
│   │   │   └── StatusBanner.tsx   # Pipeline status indicator
│   │   └── lib/
│   │       ├── api.ts         # API client (incl. session reset)
│   │       ├── types.ts       # TypeScript types
│   │       └── utils.ts       # cn() utility
│   ├── package.json
│   ├── next.config.js         # API proxy rewrites to FastAPI
│   ├── tailwind.config.js
│   └── tsconfig.json
└── README.md
- Groq-only: all LLM calls use Groq (Llama 3.3 70B); no OpenAI dependency
- Local embeddings: sentence-transformers all-MiniLM-L6-v2 runs locally; no embedding API key needed
- No keyword matching: all evaluation uses LLM chain-of-thought reasoning
- Session isolation: uploading a new JD auto-resets ChromaDB and in-memory state to prevent cross-session data leakage
- Dynamic Top-N: the user chooses how many candidates to analyse; the backend clamps to the actual resume count
- Async pipeline: FastAPI background tasks with 3-second polling from the frontend
- Human-in-the-loop: the pipeline pauses for approval before email drafting
- Chunked embeddings: resumes are chunked (2000 chars, 200 overlap) for better retrieval
- Cosine similarity: ChromaDB uses cosine distance for semantic search
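The chunking scheme (2000-character windows with 200-character overlap) can be sketched in pure Python. This is illustrative; the real vector_store.py feeds these chunks into a ChromaDB collection configured for cosine distance:

```python
def chunk_text(text: str, size: int = 2000, overlap: int = 200) -> list[str]:
    # Fixed-size sliding window: each chunk shares `overlap` characters with
    # the previous one, so a skill mentioned near a chunk boundary is never
    # separated from its surrounding context.
    if size <= overlap:
        raise ValueError("chunk size must exceed overlap")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap   # advance by the non-overlapping portion
    return chunks

chunks = chunk_text("x" * 4500)
# 4500 chars -> 3 chunks starting at offsets 0, 1800, 3600
```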