diff --git a/.gitignore b/.gitignore
index e4cc3111d..392e47f2a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -3,6 +3,7 @@
 *.env
 .vscode/
+.claude/
 credentials*.json
 .run/
diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 000000000..f453fc988
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,148 @@
+# Compass Project — AI Agent Instructions
+
+## Project Overview
+
+Compass is an AI-powered chatbot that helps job-seekers discover and articulate their skills using the ESCO (European Skills, Competences, Qualifications and Occupations) taxonomy. Users describe their work experiences in a conversational interface, and the system maps those experiences to standardized occupations and skills.
+
+> **Terminology note**: "Agent" in this codebase refers to a **Compass conversation agent** — a backend Python class that handles one phase of the user's chat conversation (e.g., welcome, experience collection, skills exploration, farewell). These are *not* AI coding agents. See the [backend instructions](backend/AGENTS.md) for the full agent architecture.
+
+## Repository Structure
+
+This is a monorepo with three main packages:
+
+```
+compass/
+├── backend/            # Python/FastAPI REST API + multi-agent LLM system
+├── frontend-new/       # React/TypeScript SPA (chat UI)
+├── iac/                # Pulumi infrastructure-as-code (GCP)
+└── .github/workflows   # CI/CD pipelines
+```
+
+Path-specific instructions are automatically applied by Copilot when working in the relevant directories:
+- [Backend instructions](backend/AGENTS.md) — applies to `backend/**`
+- [Frontend instructions](frontend-new/AGENTS.md) — applies to `frontend-new/**`
+
+## Domain Context
+
+### What is ESCO?
+
+ESCO (European Skills, Competences, Qualifications and Occupations) is a taxonomy developed by the European Commission that standardizes how occupations and skills are classified. It was chosen over alternatives like O*NET and ISCO because it offers:
+
+- **Global breadth** with local adaptability (multi-language, region-specific skills)
+- **Simpler skill descriptions** and "alternative labels" for occupations (e.g., "data engineer" as an alternative for "data scientist")
+- **Soft skills coverage** ("attitudes and values") absent from other frameworks
+- **Green and digital economy** skill frameworks built in
+- **Frequent updates** and growing adoption, especially in Latin America
+
+### Inclusive Livelihoods Taxonomy
+
+Compass uses Tabiya's **Inclusive Livelihoods Taxonomy**, which extends ESCO to cover the full spectrum of economic activities — including informal and unpaid work that traditional frameworks exclude. It classifies work into **four categories**:
+
+1. **Wage employment** — traditional salaried/hourly work
+2. **Self-employment** — independent/freelance work
+3. **Unpaid training** — internships, apprenticeships, volunteering
+4. **Unseen/unpaid work** — caregiving, household management, community work
+
+This equity focus is core to the product — Compass must recognize and validate skills from *all* types of work, not just formal employment.
+
+### Target Users
+
+- **Primary audience**: Job-seekers in emerging markets, particularly those with informal economy experience
+- **Device context**: Mobile-first, optimized for mid-range smartphones (Samsung Galaxy A23 as reference device)
+- **Language**: Moderate English proficiency expected; multi-language support is expanding
+- **Accessibility**: 88.9% of testers found Compass easy to use — maintain this standard
+
+### Product Mission
+
+Compass helps users discover skills they already have but may not know how to articulate. It does *not* answer career questions directly — instead, it **guides users through structured conversation** to extract, classify, and present their skills in a standardized format useful for CVs, job matching, and career development.
+
+---
+
+## Tech Stack
+
+| Layer          | Technology                                                       |
+| -------------- | ---------------------------------------------------------------- |
+| Backend        | Python 3.11+, FastAPI, Uvicorn, Poetry                           |
+| LLM            | Google Vertex AI (Gemini), structured output with Pydantic       |
+| Database       | MongoDB (4 instances via Motor async driver)                     |
+| Vector Search  | MongoDB Atlas Search with Vertex AI embeddings                   |
+| Frontend       | React 18, TypeScript 5.4+, MUI 7, Webpack 5                      |
+| Auth           | Firebase Authentication (email, Google OAuth, anonymous)         |
+| i18n           | i18next (backend + frontend), locales: en-GB, en-US, es-ES, etc. |
+| Infra          | GCP (Cloud Run, Cloud Storage, API Gateway), Pulumi, Docker      |
+| CI/CD          | GitHub Actions                                                   |
+| Error Tracking | Sentry (both backend and frontend)                               |
+| Testing        | pytest + in-memory MongoDB (backend), Jest + RTL (frontend)      |
+
+---
+
+## Infrastructure (`iac/`)
+
+### Pulumi Stacks
+
+```
+iac/
+├── realm/          # GCP org root, projects, user groups
+├── environment/    # Per-env GCP project creation, API enablement
+├── auth/           # Identity Platform, Firebase, OAuth providers
+├── backend/        # Cloud Run service + API Gateway
+├── frontend/       # Cloud Storage bucket for static assets
+├── common/         # Load balancer, SSL certificates, DNS records
+├── dns/            # DNS zone management
+├── aws-ns/         # AWS Route 53 name server delegation
+├── lib/            # Shared utilities and types
+└── scripts/        # Deployment orchestration (prepare.py, up.py)
+```
+
+### Deployment
+
+- **Backend**: Docker image → GCP Artifact Registry → Cloud Run (port 8080, linux/amd64)
+- **Frontend**: Build artifact (tar.gz) → GCP Artifact Registry → Cloud Storage bucket
+- **DNS**: GCP Cloud DNS + AWS Route 53 for delegation
+
+### Environment Hierarchy
+
+- **Realm**: Top-level container (`compass-realm`) with org access
+- **Environment naming**: `{realm}.{env}` (e.g., `compass.dev`, `compass.prod`)
+- **Types**: `dev`, `test`, `prod` — separate GCP service accounts for lower vs production envs
+
+---
+
+## CI/CD (`.github/workflows/`)
+
+### Pipeline Flow
+
+1. **Every push**: Frontend CI (format, lint, compile, test, a11y) + Backend CI (bandit, pylint, pytest) run in parallel
+2. **Main branch** with `[pulumi up]` in commit message: Build artifacts + deploy to dev
+3. **Release creation**: Build artifacts + deploy to test, then production
+
+### Key Workflows
+
+| File              | Purpose                                 |
+| ----------------- | --------------------------------------- |
+| `main.yml`        | Orchestrates all CI/CD jobs             |
+| `frontend-ci.yml` | Frontend checks, build, artifact upload |
+| `backend-ci.yml`  | Backend checks, Docker build & push     |
+| `config-ci.yml`   | Template/config uploads                 |
+| `deploy.yml`      | Pulumi deployment to target env         |
+
+---
+
+## Development Guidelines
+
+### File Organization
+
+- Tests alongside source files (`*_test.py`, `*.test.tsx`)
+- No separate `tests/` directories
+- Feature modules are self-contained with routes, services, models, and tests
+
+### Code Style
+
+- **Backend**: Python type hints, Pydantic models, async/await, pylint + bandit
+- **Frontend**: TypeScript strict mode, ESLint + Prettier, MUI styled components
+
+### Environment Variables
+
+- Backend: see `backend/.env.example`
+- Frontend: see env vars loaded in `frontend-new/src/envService.ts`
+- Infrastructure: see `iac/templates/env.template` for full reference
diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 120000
index 000000000..47dc3e3d8
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1 @@
+AGENTS.md
\ No newline at end of file
diff --git a/backend/AGENTS.md b/backend/AGENTS.md
new file mode 100644
index 000000000..b4633787b
--- /dev/null
+++ b/backend/AGENTS.md
@@ -0,0 +1,174 @@
+# Compass Backend — AI Agent Instructions
+
+## Entry Point & Server
+
+- **`backend/app/server.py`** — FastAPI application with async lifespan management. Initializes 4 MongoDB connections in parallel, validates environment, sets up CORS and Brotli middleware, and exposes the conversation API.
+- Runs on port 8080 via Uvicorn.
+- Configuration is loaded from environment variables (see `backend/.env.example`).
+
+## Multi-Agent Architecture
+
+> **Terminology note**: "Agent" in this codebase refers to a **Compass conversation agent** — a backend Python class that handles one phase of the user's skills exploration conversation. These are *not* AI coding agents. Each agent wraps LLM calls with domain-specific prompts, manages conversation state, and produces structured responses. They live in `backend/app/agent/`.
+
+### What is a Compass Agent?
+
+Every agent extends the abstract base class `Agent` (`backend/app/agent/agent.py`) and implements a single method:
+
+```python
+async def execute(self, user_input: AgentInput, context: ConversationContext) -> AgentOutput
+```
+
+- **`AgentInput`** — The user's message text, a message ID, and a timestamp.
+- **`ConversationContext`** — Full conversation history plus a summary of older turns.
+- **`AgentOutput`** — The agent's response message, a `finished` flag (signals phase transition), and LLM usage stats.
+
+There are two implementation patterns:
+- **`SimpleLLMAgent`** — For stateless agents that make a single LLM call per turn (e.g., `FarewellAgent`, `QnaAgent`). Just provide system instructions.
+- **`Agent` (direct)** — For complex agents with internal state, multiple LLM calls, or sub-agent orchestration (e.g., `WelcomeAgent`, `ExploreExperiencesAgentDirector`).
+
+Stateful agents persist their state to MongoDB via a state object (e.g., `WelcomeAgentState`).
+
+### Agent Hierarchy
+
+```
+Agent Director (LLM-based router)
+├── WelcomeAgent — Greets user, explains process, handles country selection
+├── ExploreExperiencesAgentDirector (sub-director)
+│   ├── CollectExperiencesAgent — Gathers work experience details
+│   ├── SkillsExplorerAgent — Explores and validates identified skills
+│   └── Experience Pipeline — LLM-driven skill linking & ranking
+│       ├── ClusterResponsibilitiesTool
+│       ├── InferOccupationTool
+│       ├── SkillLinkingTool
+│       └── PickTopSkillsTool
+└── FarewellAgent — Concludes conversation, returns summary
+```
+
+### Conversation Phases & Routing
+
+**Phases**: `INTRO → COUNSELING → CHECKOUT → ENDED`
+
+The **Agent Director** (`agent_director/llm_agent_director.py`) selects the appropriate agent for each user message via an LLM router. The router produces structured output (`RouterModelResponse` with `reasoning` and `agent_type` fields) and falls back to a default agent per phase if the LLM fails.
+
+When an agent returns `finished=True`, the Director transitions to the next phase. It can auto-advance through multiple phases in one turn by sending an artificial `"(silence)"` message to the next agent.
+
+## LLM Strategy
+
+The system uses LLMs for four distinct purposes — understand which one applies before modifying agent code:
+
+1. **Conversational direction** — Generates guided questions to steer users through the skills exploration process. The LLM asks questions; it does *not* answer them.
+2. **NLP tasks** — Clustering, entity extraction, and classification without model fine-tuning. Used in the skills identification pipeline.
+3. **Explainability** — Chain-of-Thought reasoning traces outputs back to user inputs, making the system's decisions transparent.
+4. **Taxonomy filtering** — Hybrid approach combining semantic vector search with LLM-based filtering to match user input against ESCO skills/occupations.
+
+### Model Versions
+
+Configured in `backend/app/agent/config.py`:
+
+| Purpose               | Model                   |
+| --------------------- | ----------------------- |
+| LLM (default/fast)    | `gemini-2.5-flash-lite` |
+| LLM (deep reasoning)  | `gemini-2.5-flash`      |
+| LLM (ultra reasoning) | `gemini-2.5-pro`        |
+| Embeddings            | `text-embedding-005`    |
+
+### Hallucination Prevention
+
+When modifying agent prompts or LLM interactions, preserve these guardrails:
+
+- **Task decomposition** — Each agent has a narrow, specific objective. Don't merge responsibilities.
+- **State guardrails** — Favor rule-based logic over LLM decisions for control flow (e.g., phase transitions).
+- **Guided outputs** — Use few-shot examples, JSON schemas with Pydantic validation, and semantic ordering to constrain LLM responses.
+- **Taxonomy grounding** — All skill/occupation outputs must be linked to taxonomy entries. Never let the LLM invent skills or occupations.
+
+## Skills Identification Pipeline
+
+After gathering user experiences, the system processes data through a multi-stage pipeline in `backend/app/agent/`:
+
+```
+User experiences (raw text)
+  → ClusterResponsibilitiesTool — Groups related responsibilities via LLM clustering
+  → InferOccupationTool — Maps clusters to ESCO occupations via vector search + LLM filtering
+  → SkillLinkingTool — Links occupations to specific ESCO skills
+  → PickTopSkillsTool — Ranks and selects the user's top skills
+```
+
+The pipeline ensures outputs are **grounded in the taxonomy**: skills are never hallucinated; they are always linked to real ESCO entries via entity linking.
+
+## Conversation API
+
+- **`backend/app/conversations/routes.py`** — Two endpoints:
+  - `POST /conversations/{session_id}/messages` — Send user message, get agent response
+  - `GET /conversations/{session_id}/messages` — Retrieve conversation history
+- Max message length: 1000 characters
+- Session ownership is verified per request
+
+## Database Architecture
+
+Four separate MongoDB instances (connected via Motor async driver):
+
+| Database    | Purpose                                          |
+| ----------- | ------------------------------------------------ |
+| Taxonomy    | ESCO occupations, skills, embeddings for search  |
+| Application | Conversation state, session data, user reactions |
+| Userdata    | Encrypted PII, CV uploads                        |
+| Metrics     | Application state snapshots, analytics           |
+
+Connection management uses a singleton provider pattern (`CompassDBProvider`) with async locks.
+
+## Vector Search & Embeddings
+
+- **`backend/app/vector_search/`** — Template method pattern for occupation and skill search
+- Embeds user input via Vertex AI (`text-embedding-005` model)
+- Searches MongoDB Atlas vector indexes
+- Async LRU cache for occupation-skill associations (up to ~223MB)
+
+## LLM Integration
+
+- **`backend/common_libs/llm/`** — Gemini generative model wrapper
+- Structured output: LLM responses are parsed into Pydantic models
+- Retry logic: 3 attempts with increasing temperature (0.1 → 1.0) and top-P variation
+- JSON extraction from LLM text with validation
+- Token usage tracking via `LLMStats`
+
+## Authentication & Rate Limiting
+
+- JWT-based (Firebase tokens) via `HTTPBearer`
+- API key auth for search endpoints (header: `x-api-key`)
+- Supported providers: anonymous, password, Google OAuth
+- **Rate limit**: 2 requests per minute per API key by default (HTTP 429 when exceeded)
+
+## Key Patterns
+
+- **Dependency injection** via FastAPI's `Depends()`
+- **Async-first**: all I/O is async (Motor, LLM calls, HTTP)
+- **Repository pattern** for data access
+- **Pydantic v2** for all data models and validation
+- **Feature flags** via `BACKEND_FEATURES` environment variable (JSON)
+
+## Testing
+
+```bash
+poetry run pytest -m "not (evaluation_test or smoke_test)"  # Unit/integration tests
+poetry run pylint --recursive=y .                           # Linting
+poetry run bandit -c bandit.yaml -r .                       # Security scanning
+```
+
+- Tests live alongside source: `*_test.py`
+- In-memory MongoDB via `pymongo_inmemory`
+- Async test support via `pytest-asyncio`
+
+## Adding a New Agent
+
+1. Create agent class in `backend/app/agent/` implementing the agent interface
+2. Register it in the Agent Director's phase configuration
+3. Define LLM prompt and response schema (Pydantic model)
+4. Add routing rules in `_llm_router.py`
+5. Write tests with mocked LLM responses
+
+## Adding a New API Endpoint
+
+1. Create route in appropriate module under `backend/app/`
+2. Use FastAPI `Depends()` for auth and service injection
+3. Define Pydantic request/response models
+4. Add tests using in-memory MongoDB fixtures
diff --git a/backend/CLAUDE.md b/backend/CLAUDE.md
new file mode 120000
index 000000000..47dc3e3d8
--- /dev/null
+++ b/backend/CLAUDE.md
@@ -0,0 +1 @@
+AGENTS.md
\ No newline at end of file
diff --git a/frontend-new/AGENTS.md b/frontend-new/AGENTS.md
new file mode 100644
index 000000000..4176f543c
--- /dev/null
+++ b/frontend-new/AGENTS.md
@@ -0,0 +1,128 @@
+# Compass Frontend — AI Agent Instructions
+
+## Entry Point & Providers
+
+- **`frontend-new/src/index.tsx`** — Initializes Sentry, validates env vars, applies branding, loads i18n, wraps app in providers (Theme, Snackbar, IsOnline, ViewPort).
+- **`frontend-new/src/app/index.tsx`** — Hash-based React Router with auth initialization. Lazy-loads main components.
+
+## Routes
+
+| Path              | Component         | Description                  |
+| ----------------- | ----------------- | ---------------------------- |
+| `/`               | Chat              | Main chat interface          |
+| `/landing`        | Landing           | Landing page                 |
+| `/login`          | Login             | Authentication               |
+| `/register`       | Register          | User registration (optional) |
+| `/verify-email`   | VerifyEmail       | Email verification           |
+| `/consent`        | Consent           | Terms & conditions           |
+| `/sensitive-data` | SensitiveDataForm | PII collection               |
+
+All routes except NotFound are protected via `ProtectedRoute`, which enforces auth flow (T&C acceptance, sensitive data completion).
+
+## Chat Architecture
+
+**`frontend-new/src/chat/Chat.tsx`** (main component, ~1000 lines) manages:
+
+- Session creation and history loading
+- Optimistic message insertion + API calls via `ChatService`
+- AI typing indicators
+- CV upload with polling (60s max)
+- User inactivity detection (3-minute timeout)
+- Page refresh interception (F5, Ctrl+R)
+- Experiences drawer and skills ranking integration
+- Conversation phase progress bar
+
+## State Management
+
+No Redux — uses React patterns:
+
+- **Context API**: `ChatContext` for cross-component chat state
+- **Service singletons**: `AuthenticationStateService`, `UserPreferencesStateService`, `ChatService`, `ExperienceService`, `CVService`, etc.
+- **Persistent storage**: localStorage wrapper for tokens and personal info
+- **Cross-tab sync**: `BroadcastChannel` API for auth state
+
+## Authentication
+
+- Firebase Auth with multiple providers (email, Google OAuth, anonymous)
+- JWT token validation with clock tolerance
+- Auto-refresh on token expiry if provider session is valid
+- Cross-tab logout/login via BroadcastChannel
+
+## Internationalization
+
+- i18next with browser language detection
+- Locales: `en-GB` (default), `en-US`, `es-ES`, `es-AR`, `fr-FR`, and more
+- Translation files in `frontend-new/src/i18n/locales/`
+- Configured via `FRONTEND_SUPPORTED_LOCALES` and `FRONTEND_DEFAULT_LOCALE` env vars
+
+## UX & Conversational Design Principles
+
+When working on the chat UI or anything user-facing, follow these design principles established through user testing:
+
+- **Natural dialogue** — The conversation should feel human. Users should be able to respond naturally without rigid formatting.
+- **Predictability** — Establish recognizable patterns in question flow so users can anticipate what comes next and batch their responses.
+- **Persistence** — Repeat headline questions across experience categories to ensure comprehensive skills capture, especially for informal economy work.
+- **Agility** — Handle off-topic input gracefully and redirect back without friction. Recover from input errors without breaking flow.
+
+### Target Device & Audience
+
+- **Mobile-first**: Optimized for mid-range smartphones (Samsung Galaxy A23 as reference device)
+- **Target users**: Job-seekers in emerging markets, often with informal economy backgrounds
+- **Language**: Moderate English proficiency — keep UI text simple and clear
+- **Key metric**: 88.9% of testers found Compass easy to use — maintain this bar
+
+### Conversational Pitfalls to Avoid
+
+- Ambiguous phrasing (e.g., "what was a typical day like" confuses users)
+- Assuming formal employment structures for all work types (wage, self-employment, unpaid training, unseen work all need different framing)
+- Asking users to repeat information already captured in earlier conversation phases
+
+## Customization System
+
+Compass supports per-deployment customization without code changes. When working on configurable features, be aware of what's customizable:
+
+- **Branding**: App name, logos, icons, favicon, color scheme (must maintain WCAG AA contrast)
+- **Authentication**: Login codes and registration codes can be independently enabled/disabled
+- **CV features**: Entire CV functionality can be toggled off (removes all related UI)
+- **Skills report**: Configurable logo, export format (PDF/DOCX), and section visibility
+- **Language**: Default locale and available language options
+- **Sensitive data fields**: Which PII fields are collected (name, email, gender, age, education, main activity) — each supports multi-language translations
+
+Settings are applied at deployment via environment configuration. Missing/incorrect settings fall back to defaults rather than erroring.
+
+## Theming
+
+- MUI theme with light/dark modes (`frontend-new/src/theme/`)
+- Brand colors from CSS variables (`--brand-primary`, etc.)
+- Custom spacing system: `theme.tabiyaSpacing`
+- WCAG AA contrast ratio (4.5:1)
+- Runtime branding overrides via environment config
+
+## Environment Configuration
+
+- Loaded from `window.tabiyaConfig` (set in `public/data/env.js`)
+- All values are **base64-encoded** and decoded at runtime
+- Key vars: `FIREBASE_API_KEY`, `BACKEND_URL`, `SENSITIVE_PERSONAL_DATA_RSA_ENCRYPTION_KEY`, locale config
+
+## Testing
+
+```bash
+yarn test                # Jest + React Testing Library
+yarn lint                # ESLint
+yarn compile             # TypeScript type checking
+yarn format:check        # Prettier
+yarn test:accessibility  # axe-playwright WCAG testing via Storybook
+```
+
+- Tests live alongside source: `*.test.tsx` / `*.test.ts`
+- Test utilities in `src/_test_utilities/`
+- Storybook for component development: `yarn storybook` (port 6006)
+
+## Adding Frontend Features
+
+1. Create component in appropriate directory under `frontend-new/src/`
+2. Add translations to all locale files in `frontend-new/src/i18n/locales/`
+3. Use MUI components and the application theme
+4. Write tests with React Testing Library
+5. Create Storybook stories for visual components
+6. Ensure WCAG accessibility (test with `yarn test:accessibility`)
diff --git a/frontend-new/CLAUDE.md b/frontend-new/CLAUDE.md
new file mode 120000
index 000000000..47dc3e3d8
--- /dev/null
+++ b/frontend-new/CLAUDE.md
@@ -0,0 +1 @@
+AGENTS.md
\ No newline at end of file