feat: Add interview pipeline, TTS, and transcription endpoints #5
Merged
JacobWoodbury
approved these changes
Jun 4, 2026
Summary
Implements the core interview pipeline: `POST /speech/tts` streams Groq Orpheus audio for the interviewer voice, and `POST /speech/transcribe` sends audio to Groq Whisper for segment-level transcription, then scores each segment with the local wav2vec2 emotion model. Introduces `InterviewClient`, a full-session interview UI that replaces the home page placeholder, with interviewer persona selection, voice activity detection, automatic recording flow, and per-segment arousal/dominance/valence display. Moves `AGENTS.md` and `CLAUDE.md` from `frontend/` to the repository root, expands them to project-wide scope, and adds `docs/architecture.md` with a full system data-flow diagram. Updates all setup documentation and agent workflows to reflect Groq as the speech provider and the one-time Orpheus terms-acceptance requirement.
What's new
- `POST /speech/tts`
- `POST /speech/transcribe`
- Interview UI (`/`)
- `/dev/transcribe` debug page
- `AGENTS.md` / `CLAUDE.md`: moved from `frontend/` to root, expanded
- `docs/architecture.md`
- `docs/agent-workflows/run-project.md`
File-by-file changes
Backend: API
- `backend/app.py`: Added `OPTIONS` to `allow_methods` in the CORS middleware so multipart file-upload preflight requests from the browser are not blocked.
- `backend/requirements.txt`: Added `groq>=0.9.0` as the Groq SDK dependency; updated the speech-to-text comment to reflect Groq Whisper replacing OpenAI.
- `backend/services/speech_to_text/router.py`: Replaced the empty TODO stub with full implementations of `POST /speech/tts` (Groq Orpheus → WAV) and `POST /speech/transcribe` (Groq Whisper segmentation + per-segment wav2vec2 scoring); uses `os.environ.get` for the API key and raises HTTP 503 at request time when the key is absent, so a missing key does not crash backend startup.
Frontend: Interview UI
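The interview state machine in `InterviewClient.tsx` moves through the phases idle, playing, recording, processing, and done. The description names the phases but not the transitions between them, so the following is only a minimal sketch of how such a machine might be modelled. The event names, the transition table, and the `nextPhase` helper are all hypothetical and are not taken from the component.

```typescript
// Phase names come from the PR description; everything else is an assumption.
type Phase = "idle" | "playing" | "recording" | "processing" | "done";

// Hypothetical events that could drive the machine.
type InterviewEvent =
  | "START" // user starts the session
  | "TTS_ENDED" // interviewer audio finished playing
  | "SILENCE_DETECTED" // voice activity detection decided the answer is over
  | "NEXT_QUESTION" // transcription returned and more questions remain
  | "FINISH"; // transcription returned and the session is over

// Assumed transition table: each phase lists only the events it reacts to.
const transitions: Record<Phase, Partial<Record<InterviewEvent, Phase>>> = {
  idle: { START: "playing" },
  playing: { TTS_ENDED: "recording" },
  recording: { SILENCE_DETECTED: "processing" },
  processing: { NEXT_QUESTION: "playing", FINISH: "done" },
  done: {},
};

// Returns the next phase, or the current phase when the event is not valid here.
function nextPhase(current: Phase, event: InterviewEvent): Phase {
  return transitions[current][event] ?? current;
}
```

Keeping the transitions in one table means an out-of-order event (for example a silence event arriving while audio is still playing) is ignored rather than pushing the UI into an inconsistent phase.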
- `frontend/app/InterviewClient.tsx`: New client component implementing the full interview state machine, with states idle, playing (TTS), recording (MediaRecorder + Web Audio VAD), processing, and done, plus interviewer selection, abort-signal-gated fetch calls to `/speech/tts` and `/speech/transcribe`, and per-session used-question tracking via a caller-owned `Set` ref.
- `frontend/app/InterviewClient.module.css`: New module stylesheet for `InterviewClient`; all color values reference semantic tokens defined in `globals.css`.
- `frontend/app/page.tsx`: Replaced the static placeholder home page with `InterviewClient` so the root route is the live interview UI.
Frontend: Dev page
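The dev transcribe client is described as recording audio in accumulated WebM chunks and sending each to `POST /speech/transcribe`. One plausible reading, assumed here, is that every upload contains all chunks recorded so far, since a later MediaRecorder chunk typically lacks the container header and is not decodable on its own. The sketch below shows that accumulation on raw bytes; `concatChunks` and `ChunkAccumulator` are hypothetical names, not identifiers from `TranscribeDevClient.tsx`.

```typescript
// Joins byte chunks in recording order into one contiguous payload.
function concatChunks(chunks: readonly Uint8Array[]): Uint8Array {
  const total = chunks.reduce((sum, c) => sum + c.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}

// Assumption: each upload is cumulative, so it always starts with the
// first chunk (which carries the WebM header).
class ChunkAccumulator {
  private readonly chunks: Uint8Array[] = [];

  // Stores the newest chunk and returns the cumulative payload to upload.
  push(chunk: Uint8Array): Uint8Array {
    this.chunks.push(chunk);
    return concatChunks(this.chunks);
  }
}
```

In the browser the same effect is usually achieved by keeping the `Blob` parts from each `dataavailable` event and building `new Blob(parts, { type: "audio/webm" })` before every upload.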
- `frontend/app/dev/transcribe/page.tsx`: New route entry point for the transcription debug page at `/dev/transcribe`.
- `frontend/app/dev/transcribe/TranscribeDevClient.tsx`: New dev-only client component that records audio in accumulated WebM chunks and sends each to `POST /speech/transcribe`, displaying sent-chunk status and returned segments side by side in real time.
- `frontend/app/dev/transcribe/TranscribeDevClient.module.css`: New module stylesheet for the dev transcribe page; all color values reference semantic tokens.
Frontend: Styles
- `frontend/app/globals.css`: Added 18 semantic color tokens (`--color-surface`, `--color-danger`, `--color-muted`, etc.) so component module files reference tokens rather than hard-coded hex values.
Frontend: Data
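The `pickRandomQuestion(usedIds)` helper in `frontend/lib/prompts/questions.ts` takes a caller-owned `Set` so that the module holds no mutable state of its own. A minimal sketch of that pattern follows. The `Question` fields, the three-entry bank, the `null` return on exhaustion, the in-function `usedIds.add` call, and the optional `random` parameter (added only to make the sketch testable) are assumptions rather than the module's actual contents.

```typescript
// Hypothetical question shape and a tiny stand-in bank for illustration.
interface Question {
  id: string;
  category: "intro" | "technical" | "behavioral";
  text: string;
}

const QUESTION_BANK: readonly Question[] = [
  { id: "intro-1", category: "intro", text: "Tell me about yourself." },
  { id: "tech-1", category: "technical", text: "Explain how a hash map works." },
  { id: "beh-1", category: "behavioral", text: "Describe a conflict you resolved." },
];

// The caller owns `usedIds` (for example in a React ref), so two sessions
// never share picked-question state through this module.
function pickRandomQuestion(
  usedIds: Set<string>,
  random: () => number = Math.random,
): Question | null {
  const remaining = QUESTION_BANK.filter((q) => !usedIds.has(q.id));
  const choice = remaining[Math.floor(random() * remaining.length)];
  if (choice === undefined) return null; // every question was already used
  usedIds.add(choice.id); // assumed: the helper records the pick itself
  return choice;
}
```

Because the set lives with the caller, starting a new interview only requires passing a fresh `Set`; nothing in the module has to be reset.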
- `frontend/lib/prompts/interviewers.ts`: New module defining the six Groq Orpheus interviewer personas (name, title, voice name) and the default selection.
- `frontend/lib/prompts/questions.ts`: New module with a 20-question bank (1 intro, 10 technical, 10 behavioral) and `pickRandomQuestion(usedIds)`, which accepts a caller-owned `Set` to avoid module-level mutable state.
Agent configuration
- `frontend/AGENTS.md`: Deleted; file moved to repository root and expanded to project-wide scope.
- `frontend/CLAUDE.md`: Deleted; file moved to repository root.
- `AGENTS.md`: New at repository root; expanded from frontend-only conventions to project-wide scope covering pipeline architecture, backend endpoints, Groq Orpheus setup requirement, run-project trigger phrases, and environment variable table.
- `CLAUDE.md`: New at repository root; thin pointer to `AGENTS.md`, replacing the deleted `frontend/CLAUDE.md`.
Documentation
- `README.md`: Updated setup steps (added numbered Groq Orpheus terms-acceptance step with symptom description), updated repo structure tree, and updated stack table to reflect Groq Whisper and Orpheus as the speech provider.
- `backend/.env.example`: Added `GROQ_API_KEY` entry with console URL; reorganized comments to reflect Groq as the primary speech dependency.
- `backend/README.md`: Added `POST /speech/tts` endpoint entry and full request/response documentation; updated `POST /speech/transcribe` status to done with curl example; added score-interpretation table.
- `docs/architecture.md`: New document with full system diagram, step-by-step pipeline walkthrough for all seven stages, data schemas, key design decisions, environment variable table, and local dev URLs.
- `docs/agent-workflows/README.md`: Updated references from `frontend/AGENTS.md` to `AGENTS.md`; added `run-project.md` to the workflow table.
- `docs/agent-workflows/code-quality-review.md`: Updated `frontend/AGENTS.md` references to `AGENTS.md`.
- `docs/agent-workflows/css-and-component-standards-review.md`: Updated `frontend/AGENTS.md` references to `AGENTS.md`.
- `docs/agent-workflows/feature-implementation-planning.md`: Updated `frontend/AGENTS.md` references to `AGENTS.md`.
- `docs/agent-workflows/figma-design-to-code.md`: Updated `frontend/AGENTS.md` references to `AGENTS.md`.
- `docs/agent-workflows/pre-merge-full-review.md`: Updated `frontend/AGENTS.md` references to `AGENTS.md`.
- `docs/agent-workflows/run-project.md`: New workflow document for agent-driven project startup; covers state assessment, prerequisite fixes, Groq Orpheus terms check, sequential backend/frontend launch, stop instructions, and error reference table.
- `docs/agent-workflows/test-suite-quality-review.md`: Updated `frontend/AGENTS.md` references to `AGENTS.md`.
Environment variables
GROQ_API_KEY— add