feat: Add interview pipeline, TTS, and transcription endpoints #5

Merged
mlarsen-source merged 1 commit into main from feat/interview-pipeline on Jun 4, 2026

@mlarsen-source (Owner)

Summary

Implements the core interview pipeline. POST /speech/tts streams Groq Orpheus audio for the interviewer voice. POST /speech/transcribe sends audio to Groq Whisper for segment-level transcription, then scores each segment with the local wav2vec2 emotion model. Introduces InterviewClient, a full-session interview UI that replaces the home page placeholder and provides interviewer persona selection, voice activity detection, an automatic recording flow, and per-segment arousal/dominance/valence display. Moves AGENTS.md and CLAUDE.md from frontend/ to the repository root, expands AGENTS.md to project-wide scope, and adds docs/architecture.md with a full system data-flow diagram. Updates all setup documentation and agent workflows to reflect Groq as the speech provider and the one-time Orpheus terms-acceptance requirement.

What's new

| Surface | Status | Layer |
| --- | --- | --- |
| POST /speech/tts | new | Backend |
| POST /speech/transcribe | new (was empty TODO) | Backend |
| Interview UI (/) | new | Frontend |
| /dev/transcribe debug page | new | Frontend |
| Question bank + interviewer personas | new | Frontend data |
| AGENTS.md / CLAUDE.md | moved from frontend/ to root, expanded | Config |
| docs/architecture.md | new | Documentation |
| docs/agent-workflows/run-project.md | new | Documentation |

File-by-file changes

Backend — API

  • backend/app.py: Added OPTIONS to allow_methods in CORS middleware so multipart file-upload preflight requests from the browser are not blocked.
  • backend/requirements.txt: Added groq>=0.9.0 as the Groq SDK dependency; updated the speech-to-text comment to reflect Groq Whisper replacing OpenAI.
  • backend/services/speech_to_text/router.py: Replaced the empty TODO stub with full implementations of POST /speech/tts (Groq Orpheus → WAV) and POST /speech/transcribe (Groq Whisper segmentation + per-segment wav2vec2 scoring); uses os.environ.get for the API key and raises HTTP 503 at request time when the key is absent so a missing key does not crash backend startup.
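
The request-time key check and per-segment scoring described for router.py can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `ServiceUnavailable`, `Segment`, `require_api_key`, `transcribe_and_score`, and the response field names are all hypothetical, and the Groq Whisper call and wav2vec2 model are passed in as plain callables so the sketch runs without either. In a FastAPI router the exception would presumably be `fastapi.HTTPException(status_code=503, ...)`.

```python
import os
from dataclasses import dataclass
from typing import Callable, Dict, List, Mapping


class ServiceUnavailable(Exception):
    """Stand-in for an HTTP 503 error (hypothetical; not the PR's class)."""
    status_code = 503


@dataclass
class Segment:
    """One transcribed span of audio (hypothetical shape)."""
    start: float  # seconds from the beginning of the audio
    end: float
    text: str


def require_api_key(env: Mapping[str, str] = os.environ) -> str:
    # The lookup happens inside the request path, so importing this module
    # never fails just because the key is missing.
    key = env.get("GROQ_API_KEY")
    if not key:
        raise ServiceUnavailable("GROQ_API_KEY is not configured")
    return key


def transcribe_and_score(
    audio: bytes,
    transcribe: Callable[[bytes, str], List[Segment]],
    score: Callable[[bytes, float, float], Dict[str, float]],
    env: Mapping[str, str] = os.environ,
) -> List[dict]:
    """Transcribe audio into segments, then attach emotion scores to each."""
    api_key = require_api_key(env)  # raises the 503 stand-in if the key is absent
    results = []
    for seg in transcribe(audio, api_key):
        # score() is assumed to return arousal/dominance/valence for the span.
        scores = score(audio, seg.start, seg.end)
        results.append(
            {"start": seg.start, "end": seg.end, "text": seg.text, **scores}
        )
    return results
```

Deferring the key lookup to request time means a developer without a Groq key can still start the backend and use any endpoint that does not need it; only the speech endpoints fail, and they fail with a clear status code.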

Frontend — Interview UI

  • frontend/app/InterviewClient.tsx: New client component implementing the full interview state machine with five states: idle, playing (TTS), recording (MediaRecorder + Web Audio VAD), processing, and done. Also handles interviewer selection, fetch calls to /speech/tts and /speech/transcribe that are gated by an abort signal, and per-session used-question tracking via a caller-owned Set ref.
  • frontend/app/InterviewClient.module.css: New module stylesheet for InterviewClient; all color values reference semantic tokens defined in globals.css.
  • frontend/app/page.tsx: Replaced the static placeholder home page with InterviewClient so the root route is the live interview UI.

Frontend — Dev page

  • frontend/app/dev/transcribe/page.tsx: New route entry point for the transcription debug page at /dev/transcribe.
  • frontend/app/dev/transcribe/TranscribeDevClient.tsx: New dev-only client component that records audio in accumulated WebM chunks and sends each to POST /speech/transcribe, displaying sent-chunk status and returned segments side by side in real time.
  • frontend/app/dev/transcribe/TranscribeDevClient.module.css: New module stylesheet for the dev transcribe page; all color values reference semantic tokens.

Frontend — Styles

  • frontend/app/globals.css: Added 18 semantic color tokens (--color-surface, --color-danger, --color-muted, etc.) so component module files reference tokens rather than hard-coded hex values.

Frontend — Data

  • frontend/lib/prompts/interviewers.ts: New module defining the six Groq Orpheus interviewer personas (name, title, voice name) and the default selection.
  • frontend/lib/prompts/questions.ts: New module with a 21-question bank (1 intro, 10 technical, 10 behavioral) and pickRandomQuestion(usedIds), which accepts a caller-owned Set to avoid module-level mutable state.
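
The caller-owned Set pattern behind pickRandomQuestion(usedIds) can be illustrated as below. The actual module is TypeScript; this Python rendering only shows the idea, and the question records, the `bank` and `rng` parameters, the choice to mark the id as used inside the function, and the `None` return on exhaustion are all assumptions rather than details taken from the PR.

```python
import random
from typing import Optional, Sequence, Set


def pick_random_question(
    used_ids: Set[str],
    bank: Sequence[dict],
    rng: Optional[random.Random] = None,
) -> Optional[dict]:
    """Pick an unused question and record its id in the caller's set."""
    if rng is None:
        rng = random.Random()
    # Only questions the caller has not seen in this session are eligible.
    remaining = [q for q in bank if q["id"] not in used_ids]
    if not remaining:
        return None  # bank exhausted for this session
    choice = rng.choice(remaining)
    used_ids.add(choice["id"])  # the state lives in the caller's set, not the module
    return choice


# Placeholder records; the real question text is not reproduced here.
bank = [
    {"id": "q1", "category": "intro"},
    {"id": "q2", "category": "technical"},
    {"id": "q3", "category": "behavioral"},
]

session_used: Set[str] = set()
first = pick_random_question(session_used, bank)
```

Because the set is passed in, each interview session can own its own history, and nothing persists between sessions or across module reloads.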

Agent configuration

  • frontend/AGENTS.md: Deleted — file moved to repository root and expanded to project-wide scope.
  • frontend/CLAUDE.md: Deleted — file moved to repository root.
  • AGENTS.md: New at repository root; expanded from frontend-only conventions to project-wide scope covering pipeline architecture, backend endpoints, Groq Orpheus setup requirement, run-project trigger phrases, and environment variable table.
  • CLAUDE.md: New at repository root; thin pointer to AGENTS.md, replacing the deleted frontend/CLAUDE.md.

Documentation

  • README.md: Updated setup steps (added numbered Groq Orpheus terms-acceptance step with symptom description), updated repo structure tree, and updated stack table to reflect Groq Whisper and Orpheus as the speech provider.
  • backend/.env.example: Added GROQ_API_KEY entry with console URL; reorganized comments to reflect Groq as the primary speech dependency.
  • backend/README.md: Added POST /speech/tts endpoint entry and full request/response documentation; updated POST /speech/transcribe status to done with curl example; added score-interpretation table.
  • docs/architecture.md: New document with full system diagram, step-by-step pipeline walkthrough for all seven stages, data schemas, key design decisions, environment variable table, and local dev URLs.
  • docs/agent-workflows/README.md: Updated references from frontend/AGENTS.md to AGENTS.md; added run-project.md to the workflow table.
  • docs/agent-workflows/code-quality-review.md: Updated frontend/AGENTS.md references to AGENTS.md.
  • docs/agent-workflows/css-and-component-standards-review.md: Updated frontend/AGENTS.md references to AGENTS.md.
  • docs/agent-workflows/feature-implementation-planning.md: Updated frontend/AGENTS.md references to AGENTS.md.
  • docs/agent-workflows/figma-design-to-code.md: Updated frontend/AGENTS.md references to AGENTS.md.
  • docs/agent-workflows/pre-merge-full-review.md: Updated frontend/AGENTS.md references to AGENTS.md.
  • docs/agent-workflows/run-project.md: New workflow document for agent-driven project startup; covers state assessment, prerequisite fixes, Groq Orpheus terms check, sequential backend/frontend launch, stop instructions, and error reference table.
  • docs/agent-workflows/test-suite-quality-review.md: Updated frontend/AGENTS.md references to AGENTS.md.

Environment variables

  • GROQ_API_KEY: add (Groq API key used by the speech endpoints; see backend/.env.example)

@mlarsen-source mlarsen-source self-assigned this Jun 1, 2026
@mlarsen-source mlarsen-source merged commit f8babc5 into main Jun 4, 2026
2 checks passed
@mlarsen-source mlarsen-source deleted the feat/interview-pipeline branch June 4, 2026 19:59