Merged
1 change: 1 addition & 0 deletions .gitignore
@@ -3,6 +3,7 @@
*.env

.vscode/
.claude/

credentials*.json
.run/
148 changes: 148 additions & 0 deletions AGENTS.md
@@ -0,0 +1,148 @@
# Compass Project — AI Agent Instructions

## Project Overview

Compass is an AI-powered chatbot that helps job-seekers discover and articulate their skills using the ESCO (European Skills, Competences, Qualifications and Occupations) taxonomy. Users describe their work experiences in a conversational interface, and the system maps those experiences to standardized occupations and skills.

> **Terminology note**: "Agent" in this codebase refers to a **Compass conversation agent** — a backend Python class that handles one phase of the user's chat conversation (e.g., welcome, experience collection, skills exploration, farewell). These are *not* AI coding agents. See the [backend instructions](copilot-instructions-backend.md) for the full agent architecture.
Copilot AI Feb 22, 2026

The link to the backend instructions is incorrect. The file is located at instructions/backend.instructions.md, not copilot-instructions-backend.md. This broken link will prevent users from navigating to the backend-specific instructions.

Suggested change
> **Terminology note**: "Agent" in this codebase refers to a **Compass conversation agent** — a backend Python class that handles one phase of the user's chat conversation (e.g., welcome, experience collection, skills exploration, farewell). These are *not* AI coding agents. See the [backend instructions](copilot-instructions-backend.md) for the full agent architecture.
> **Terminology note**: "Agent" in this codebase refers to a **Compass conversation agent** — a backend Python class that handles one phase of the user's chat conversation (e.g., welcome, experience collection, skills exploration, farewell). These are *not* AI coding agents. See the [backend instructions](instructions/backend.instructions.md) for the full agent architecture.

## Repository Structure

This is a monorepo with three main packages:

```
compass/
├── backend/ # Python/FastAPI REST API + multi-agent LLM system
├── frontend-new/ # React/TypeScript SPA (chat UI)
├── iac/ # Pulumi infrastructure-as-code (GCP)
└── .github/workflows # CI/CD pipelines
```
Comment on lines +11 to +19

Copilot AI Feb 22, 2026

The repo structure snippet shows a compass/ top-level directory and also lists .github/workflows alongside the three packages. In this repo the top-level directories are backend/, frontend-new/, and iac/ directly at the root, so this tree (and the “three main packages” wording) is currently misleading. Please update the diagram/text to match the actual repository root layout.

Path-specific instructions are automatically applied by Copilot when working in the relevant directories:
- [Backend instructions](backend/AGENTS.md) — applies to `backend/**`
- [Frontend instructions](frontend-new/AGENTS.md) — applies to `frontend-new/**`

## Domain Context

### What is ESCO?

ESCO (European Skills, Competences, Qualifications and Occupations) is a taxonomy developed by the European Commission that standardizes how occupations and skills are classified. It was chosen over alternatives like O*NET and ISCO because it offers:

- **Global breadth** with local adaptability (multi-language, region-specific skills)
- **Simpler skill descriptions** and "alternative labels" for occupations (e.g., "data engineer" as an alternative for "data scientist")
- **Soft skills coverage** ("attitudes and values") absent from other frameworks
- **Green and digital economy** skill frameworks built in
- **Frequent updates** and growing adoption, especially in Latin America

### Inclusive Livelihoods Taxonomy

Compass uses Tabiya's **Inclusive Livelihoods Taxonomy**, which extends ESCO to cover the full spectrum of economic activities — including informal and unpaid work that traditional frameworks exclude. It classifies work into **four categories**:

1. **Wage employment** — traditional salaried/hourly work
2. **Self-employment** — independent/freelance work
3. **Unpaid training** — internships, apprenticeships, volunteering
4. **Unseen/unpaid work** — caregiving, household management, community work

This equity focus is core to the product — Compass must recognize and validate skills from *all* types of work, not just formal employment.

### Target Users

- **Primary audience**: Job-seekers in emerging markets, particularly those with informal economy experience
- **Device context**: Mobile-first, optimized for mid-range smartphones (Samsung Galaxy A23 as reference device)
- **Language**: Moderate English proficiency expected; multi-language support is expanding
- **Accessibility**: 88.9% of testers found Compass easy to use — maintain this standard

### Product Mission

Compass helps users discover skills they already have but may not know how to articulate. It does *not* answer career questions directly — instead, it **guides users through structured conversation** to extract, classify, and present their skills in a standardized format useful for CVs, job matching, and career development.

---

## Tech Stack

| Layer | Technology |
| -------------- | ---------------------------------------------------------------- |
| Backend | Python 3.11+, FastAPI, Uvicorn, Poetry |
| LLM | Google Vertex AI (Gemini), structured output with Pydantic |
| Database | MongoDB (4 instances via Motor async driver) |
| Vector Search | MongoDB Atlas Search with Vertex AI embeddings |
| Frontend | React 18, TypeScript 5.4+, MUI 7, Webpack 5 |
| Auth | Firebase Authentication (email, Google OAuth, anonymous) |
| i18n           | i18next (backend + frontend), locales: en-GB, en-US, es-ES, etc. |
| Infra | GCP (Cloud Run, Cloud Storage, API Gateway), Pulumi, Docker |
| CI/CD | GitHub Actions |
| Error Tracking | Sentry (both backend and frontend) |
| Testing | pytest + in-memory MongoDB (backend), Jest + RTL (frontend) |

---

## Infrastructure (`iac/`)

### Pulumi Stacks

```
iac/
├── realm/ # GCP org root, projects, user groups
├── environment/ # Per-env GCP project creation, API enablement
├── auth/ # Identity Platform, Firebase, OAuth providers
├── backend/ # Cloud Run service + API Gateway
├── frontend/ # Cloud Storage bucket for static assets
├── common/ # Load balancer, SSL certificates, DNS records
├── dns/ # DNS zone management
├── aws-ns/ # AWS Route 53 name server delegation
├── lib/ # Shared utilities and types
└── scripts/ # Deployment orchestration (prepare.py, up.py)
```

### Deployment

- **Backend**: Docker image → GCP Artifact Registry → Cloud Run (port 8080, linux/amd64)
- **Frontend**: Build artifact (tar.gz) → GCP Artifact Registry → Cloud Storage bucket
- **DNS**: GCP Cloud DNS + AWS Route 53 for delegation

### Environment Hierarchy

- **Realm**: Top-level container (`compass-realm`) with org access
- **Environment naming**: `{realm}.{env}` (e.g., `compass.dev`, `compass.prod`)
- **Types**: `dev`, `test`, `prod` — separate GCP service accounts for lower vs production envs

---

## CI/CD (`.github/workflows/`)

### Pipeline Flow

1. **Every push**: Frontend CI (format, lint, compile, test, a11y) + Backend CI (bandit, pylint, pytest) run in parallel
2. **Main branch** with `[pulumi up]` in commit message: Build artifacts + deploy to dev
3. **Release creation**: Build artifacts + deploy to test, then production

### Key Workflows

| File | Purpose |
| ----------------- | -------------------------------------- |
| `main.yml` | Orchestrates all CI/CD jobs |
| `frontend-ci.yml` | Frontend checks, build, artifact upload |
| `backend-ci.yml` | Backend checks, Docker build & push |
| `config-ci.yml` | Template/config uploads |
| `deploy.yml` | Pulumi deployment to target env |

---

## Development Guidelines

### File Organization

- Tests alongside source files (`*_test.py`, `*.test.tsx`)
- No separate `tests/` directories
- Feature modules are self-contained with routes, services, models, and tests
Comment on lines +135 to +137

Copilot AI Feb 22, 2026

This says there are no separate tests/ directories, but the repo contains dedicated test folders (e.g. backend/evaluation_tests/, backend/smoke_test/, frontend-new/test/smoke/). Please adjust this guideline (e.g., clarify that unit tests live alongside source, with smoke/e2e/evaluation tests in dedicated dirs).

Suggested change
- Tests alongside source files (`*_test.py`, `*.test.tsx`)
- No separate `tests/` directories
- Feature modules are self-contained with routes, services, models, and tests
- Unit tests live alongside source files (`*_test.py`, `*.test.tsx`)
- Higher-level suites (smoke, e2e, evaluation) use dedicated test directories (e.g. `backend/evaluation_tests/`, `backend/smoke_test/`, `frontend-new/test/smoke/`)
- Feature modules are self-contained with routes, services, models, and unit tests

### Code Style

- **Backend**: Python type hints, Pydantic models, async/await, pylint + bandit
- **Frontend**: TypeScript strict mode, ESLint + Prettier, MUI styled components

### Environment Variables

- Backend: see `backend/.env.example`
- Frontend: see env vars loaded in `frontend-new/src/envService.ts`
- Infrastructure: see `iac/templates/env.template` for full reference
1 change: 1 addition & 0 deletions CLAUDE.md
174 changes: 174 additions & 0 deletions backend/AGENTS.md
@@ -0,0 +1,174 @@
# Compass Backend — AI Agent Instructions

## Entry Point & Server

- **`backend/app/server.py`** — FastAPI application with async lifespan management. Initializes 4 MongoDB connections in parallel, validates environment, sets up CORS and Brotli middleware, and exposes the conversation API.
Copilot AI Mar 2, 2026

backend/app/server.py does not initialize 4 MongoDB connections in parallel; it fetches 4 DB handles and then runs initialization for application/userdata/metrics plus taxonomy validation concurrently. Please adjust this sentence to reflect the actual startup behavior so the instructions remain accurate.

Suggested change
- **`backend/app/server.py`** — FastAPI application with async lifespan management. Initializes 4 MongoDB connections in parallel, validates environment, sets up CORS and Brotli middleware, and exposes the conversation API.
- **`backend/app/server.py`** — FastAPI application with async lifespan management. Fetches 4 MongoDB DB handles and then concurrently initializes the application/userdata/metrics databases and runs taxonomy validation, validates environment, sets up CORS and Brotli middleware, and exposes the conversation API.
- Runs on port 8080 via Uvicorn.
- Configuration is loaded from environment variables (see `backend/.env.example`).

## Multi-Agent Architecture

> **Terminology note**: "Agent" in this codebase refers to a **Compass conversation agent** — a backend Python class that handles one phase of the user's skills exploration conversation. These are *not* AI coding agents. Each agent wraps LLM calls with domain-specific prompts, manages conversation state, and produces structured responses. They live in `backend/app/agent/`.

### What is a Compass Agent?

Every agent extends the abstract base class `Agent` (`backend/app/agent/agent.py`) and implements a single method:

```python
async def execute(self, user_input: AgentInput, context: ConversationContext) -> AgentOutput
```

- **`AgentInput`** — The user's message text, a message ID, and a timestamp.
- **`ConversationContext`** — Full conversation history plus a summary of older turns.
- **`AgentOutput`** — The agent's response message, a `finished` flag (signals phase transition), and LLM usage stats.

There are two implementation patterns:
- **`SimpleLLMAgent`** — For stateless agents that make a single LLM call per turn (e.g., `FarewellAgent`, `QnaAgent`). Just provide system instructions.
- **`Agent` (direct)** — For complex agents with internal state, multiple LLM calls, or sub-agent orchestration (e.g., `WelcomeAgent`, `ExploreExperiencesAgentDirector`).

Stateful agents persist their state to MongoDB via a state object (e.g., `WelcomeAgentState`).
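As a rough illustration of the `execute` contract, here is a minimal, self-contained sketch. The dataclasses below are deliberately simplified stand-ins — the real `AgentInput`, `ConversationContext`, and `AgentOutput` classes in `backend/app/agent/` carry more fields — and `EchoFarewellAgent` is a hypothetical toy, not an actual agent in the codebase:

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class AgentInput:
    message: str          # the user's message text
    message_id: str
    timestamp: float


@dataclass
class ConversationContext:
    history: list = field(default_factory=list)   # full turn history
    summary: str = ""                             # summary of older turns


@dataclass
class AgentOutput:
    message_for_user: str
    finished: bool        # True signals a phase transition to the Director


class Agent(ABC):
    @abstractmethod
    async def execute(self, user_input: AgentInput,
                      context: ConversationContext) -> AgentOutput:
        ...


class EchoFarewellAgent(Agent):
    """Toy stand-in for a SimpleLLMAgent: one 'LLM call' per turn, no state."""

    async def execute(self, user_input, context):
        reply = f"Thanks for chatting! You said: {user_input.message}"
        return AgentOutput(message_for_user=reply, finished=True)


out = asyncio.run(
    EchoFarewellAgent().execute(
        AgentInput("bye", "m1", 0.0),
        ConversationContext(),
    )
)
```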

### Agent Hierarchy

```
Agent Director (LLM-based router)
├── WelcomeAgent — Greets user, explains process, handles country selection
├── ExploreExperiencesAgentDirector (sub-director)
│ ├── CollectExperiencesAgent — Gathers work experience details
│ ├── SkillsExplorerAgent — Explores and validates identified skills
│ └── Experience Pipeline — LLM-driven skill linking & ranking
│ ├── ClusterResponsibilitiesTool
│ ├── InferOccupationTool
│ ├── SkillLinkingTool
│ └── PickTopSkillsTool
└── FarewellAgent — Concludes conversation, returns summary
```

### Conversation Phases & Routing

**Phases**: `INTRO → COUNSELING → CHECKOUT → ENDED`

The **Agent Director** (`agent_director/llm_agent_director.py`) selects the appropriate agent for each user message via an LLM router. The router produces structured output (`RouterModelResponse` with `reasoning` and `agent_type` fields) and falls back to a default agent per phase if the LLM fails.

When an agent returns `finished=True`, the Director transitions to the next phase. It can auto-advance through multiple phases in one turn by sending an artificial `"(silence)"` message to the next agent.
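The fallback behavior described above can be sketched roughly as follows. This is a hypothetical simplification — the real router in `agent_director/llm_agent_director.py` validates into a `RouterModelResponse` Pydantic model; here stdlib `json` stands in for that parsing, and the agent/phase tables are illustrative only:

```python
import json

# Illustrative phase-to-default mapping; the real configuration may differ.
DEFAULT_AGENT_BY_PHASE = {
    "INTRO": "WelcomeAgent",
    "COUNSELING": "ExploreExperiencesAgentDirector",
    "CHECKOUT": "FarewellAgent",
}
VALID_AGENTS = set(DEFAULT_AGENT_BY_PHASE.values()) | {"QnaAgent"}


def route(llm_text: str, phase: str) -> str:
    """Pick an agent from the router LLM's structured output.

    Falls back to the phase's default agent whenever the LLM response
    is malformed or names an unknown agent.
    """
    try:
        parsed = json.loads(llm_text)           # expects {"reasoning": ..., "agent_type": ...}
        agent = parsed["agent_type"]
        if agent in VALID_AGENTS:
            return agent
    except (json.JSONDecodeError, KeyError, TypeError):
        pass                                    # fall through to the default
    return DEFAULT_AGENT_BY_PHASE[phase]
```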

## LLM Strategy

The system uses LLMs for four distinct purposes — understand which one applies before modifying agent code:

1. **Conversational direction** — Generates guided questions to steer users through the skills exploration process. The LLM asks questions, it does *not* answer them.
2. **NLP tasks** — Clustering, entity extraction, and classification without model fine-tuning. Used in the skills identification pipeline.
3. **Explainability** — Chain-of-Thought reasoning traces outputs back to user inputs, making the system's decisions transparent.
4. **Taxonomy filtering** — Hybrid approach combining semantic vector search with LLM-based filtering to match user input against ESCO skills/occupations.

### Model Versions

Configured in `backend/app/agent/config.py`:

| Purpose | Model |
| -------------------- | ----------------------- |
| LLM (default/fast) | `gemini-2.5-flash-lite` |
| LLM (deep reasoning) | `gemini-2.5-flash` |
| LLM (ultra reasoning) | `gemini-2.5-pro` |
| Embeddings | `text-embedding-005` |

### Hallucination Prevention

When modifying agent prompts or LLM interactions, preserve these guardrails:

- **Task decomposition** — Each agent has a narrow, specific objective. Don't merge responsibilities.
- **State guardrails** — Favor rule-based logic over LLM decisions for control flow (e.g., phase transitions).
- **Guided outputs** — Use few-shot examples, JSON schemas with Pydantic validation, and semantic ordering to constrain LLM responses.
- **Taxonomy grounding** — All skill/occupation outputs must be linked to taxonomy entries. Never let the LLM invent skills or occupations.
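The taxonomy-grounding guardrail can be illustrated with a toy filter: any skill the LLM proposes must resolve to a real taxonomy entry, or it is dropped. The `TAXONOMY` dict and URIs below are entirely hypothetical — the real lookup goes through vector search against ESCO entries:

```python
# Hypothetical taxonomy entries for illustration only.
TAXONOMY = {
    "manage budgets": "esco:skill/S1.2.3",
    "team leadership": "esco:skill/S4.5.6",
}


def ground_skills(llm_skills: list) -> list:
    """Keep only skills that link to a real taxonomy entry."""
    grounded = []
    for skill in llm_skills:
        uri = TAXONOMY.get(skill.lower().strip())
        if uri is not None:               # invented skills are silently dropped
            grounded.append((skill, uri))
    return grounded
```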

## Skills Identification Pipeline

After gathering user experiences, the system processes data through a multi-stage pipeline in `backend/app/agent/`:

```
User experiences (raw text)
→ ClusterResponsibilitiesTool — Groups related responsibilities via LLM clustering
→ InferOccupationTool — Maps clusters to ESCO occupations via vector search + LLM filtering
→ SkillLinkingTool — Links occupations to specific ESCO skills
→ PickTopSkillsTool — Ranks and selects the user's top skills
```

The pipeline ensures outputs are **grounded in the taxonomy** — skills are never hallucinated, they are always linked to real ESCO entries via entity linking.
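The chaining of the four stages can be sketched as a sequence of function calls. This is a deliberately synchronous toy — the real tools are async, LLM-backed classes — and the lambdas below are fake stand-ins whose behavior is invented purely to make the data flow visible:

```python
def run_pipeline(responsibilities, cluster, infer_occupation, link_skills, pick_top):
    """Chain the four pipeline stages over raw responsibility strings."""
    clusters = cluster(responsibilities)                    # ClusterResponsibilitiesTool
    occupations = [infer_occupation(c) for c in clusters]   # InferOccupationTool
    skills = [s for occ in occupations for s in link_skills(occ)]  # SkillLinkingTool
    return pick_top(skills)                                 # PickTopSkillsTool


# Toy stand-ins so the end-to-end flow is visible:
top = run_pipeline(
    ["wrote reports", "wrote emails", "managed a team"],
    cluster=lambda rs: [[r for r in rs if "wrote" in r],
                        [r for r in rs if "managed" in r]],
    infer_occupation=lambda c: "office clerk" if "wrote" in c[0] else "manager",
    link_skills=lambda occ: {"office clerk": ["written communication"],
                             "manager": ["team leadership"]}[occ],
    pick_top=lambda skills: sorted(set(skills))[:5],
)
```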

## Conversation API

- **`backend/app/conversations/routes.py`** — Two endpoints:
- `POST /conversations/{session_id}/messages` — Send user message, get agent response
- `GET /conversations/{session_id}/messages` — Retrieve conversation history
- Max message length: 1000 characters
- Session ownership is verified per request
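The per-request checks can be sketched as plain logic, separate from FastAPI. The status codes and check order here are assumptions for illustration — only the 1000-character limit and the ownership check are stated above:

```python
MAX_MESSAGE_LENGTH = 1000  # from the constraint above


def check_request(session_owner: str, requester: str, message: str) -> int:
    """Toy sketch of the per-request validation: ownership, then length.

    Returns a hypothetical HTTP status code; the real endpoint raises
    FastAPI HTTPExceptions instead.
    """
    if session_owner != requester:
        return 403                        # not the session's owner
    if len(message) > MAX_MESSAGE_LENGTH:
        return 413                        # assumed code for an oversized message
    return 200
```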

## Database Architecture

Four separate MongoDB instances (connected via Motor async driver):

| Database | Purpose |
| ----------- | ------------------------------------------------ |
| Taxonomy | ESCO occupations, skills, embeddings for search |
| Application | Conversation state, session data, user reactions |
| Userdata | Encrypted PII, CV uploads |
| Metrics | Application state snapshots, analytics |

Connection management uses a singleton provider pattern (`CompassDBProvider`) with async locks.
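A minimal sketch of that singleton-with-async-lock pattern, assuming a much simpler shape than the real `CompassDBProvider` (plain `object()` instances stand in for Motor database handles):

```python
import asyncio


class DBProviderSketch:
    """Simplified singleton provider: the lock serializes first-time init
    so concurrent callers share one handle per database name."""

    _handles = None
    _lock = None

    @classmethod
    async def get_db(cls, name: str):
        if cls._lock is None:
            cls._lock = asyncio.Lock()        # created lazily, inside a running loop
        async with cls._lock:
            if cls._handles is None:
                cls._handles = {}             # stand-in for creating Motor clients
            if name not in cls._handles:
                cls._handles[name] = object() # stand-in database handle
        return cls._handles[name]


async def main():
    # Two concurrent requests for the same database must get the same handle.
    a, b = await asyncio.gather(
        DBProviderSketch.get_db("taxonomy"),
        DBProviderSketch.get_db("taxonomy"),
    )
    return a is b


same_handle = asyncio.run(main())
```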

## Vector Search & Embeddings

- **`backend/app/vector_search/`** — Template method pattern for occupation and skill search
- Embeds user input via Vertex AI (`text-embedding-005` model)
- Searches MongoDB Atlas vector indexes
- Async LRU cache for occupation-skill associations (up to ~223MB)

## LLM Integration

- **`backend/common_libs/llm/`** — Gemini generative model wrapper
- Structured output: LLM responses are parsed into Pydantic models
- Retry logic: 3 attempts with increasing temperature (0.1 → 1.0) and top-P variation
- JSON extraction from LLM text with validation
- Token usage tracking via `LLMStats`
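The retry strategy can be sketched as follows. The temperature schedule endpoints (0.1 → 1.0) come from the description above, but the middle value, the top-P handling (omitted here), and the plain `json.loads` parsing are simplifications — the real wrapper validates into Pydantic models:

```python
import json

TEMPERATURES = [0.1, 0.5, 1.0]   # 3 attempts, hotter each time (middle value assumed)


def generate_with_retries(call_llm):
    """Call the model up to 3 times, raising temperature until output parses."""
    last_error = None
    for temp in TEMPERATURES:
        raw = call_llm(temperature=temp)
        try:
            return json.loads(raw), temp     # parsed output + temperature that worked
        except json.JSONDecodeError as e:
            last_error = e                   # malformed JSON: retry hotter
    raise last_error


# Fake model that only emits valid JSON on the second attempt:
responses = iter(["not json at all", '{"skill": "welding"}'])
result, used_temp = generate_with_retries(lambda temperature: next(responses))
```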

## Authentication & Rate Limiting

- JWT-based (Firebase tokens) via `HTTPBearer`
- API key auth for search endpoints (header: `x-api-key`)
- Supported providers: anonymous, password, Google OAuth
- **Rate limit**: 2 requests per minute per API key by default (HTTP 429 when exceeded)

## Key Patterns

- **Dependency injection** via FastAPI's `Depends()`
- **Async-first**: all I/O is async (Motor, LLM calls, HTTP)
- **Repository pattern** for data access
- **Pydantic v2** for all data models and validation
- **Feature flags** via `BACKEND_FEATURES` environment variable (JSON)
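Reading the `BACKEND_FEATURES` variable might look roughly like this. The flag name and the fall-back-to-empty behavior are illustrative assumptions, not the actual parsing code:

```python
import json
import os


def load_feature_flags(env=None) -> dict:
    """Parse the BACKEND_FEATURES JSON env var, defaulting to no flags."""
    if env is None:
        env = os.environ
    raw = env.get("BACKEND_FEATURES", "{}")
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return {}                 # assumed behavior: ignore malformed JSON


flags = load_feature_flags({"BACKEND_FEATURES": '{"new_pipeline": true}'})
```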

## Testing

```bash
poetry run pytest -m "not (evaluation_test or smoke_test)" # Unit/integration tests
poetry run pylint --recursive=y . # Linting
poetry run bandit -c bandit.yaml -r . # Security scanning
```

- Tests live alongside source: `*_test.py`
- In-memory MongoDB via `pymongo_inmemory`
- Async test support via `pytest-asyncio`

## Adding a New Agent

1. Create agent class in `backend/app/agent/` implementing the agent interface
2. Register it in the Agent Director's phase configuration
3. Define LLM prompt and response schema (Pydantic model)
4. Add routing rules in `_llm_router.py`
5. Write tests with mocked LLM responses

## Adding a New API Endpoint

1. Create route in appropriate module under `backend/app/`
2. Use FastAPI `Depends()` for auth and service injection
3. Define Pydantic request/response models
4. Add tests using in-memory MongoDB fixtures
1 change: 1 addition & 0 deletions backend/CLAUDE.md