⚡ Bolt: Replace synchronous Groq client with AsyncGroq #81

Adityasingh-8858 wants to merge 1 commit into
Conversation
- Changed `Groq` imports and instantiations to `AsyncGroq` in `backend/main.py`
- Switched the `.create()` calls to use `await` inside the `/ai-voice` and `/initiate-transfer` endpoints
- Added a journal entry to `.jules/bolt.md` documenting this performance anti-pattern and solution

Co-authored-by: Deepaksingh7238 <110552872+Deepaksingh7238@users.noreply.github.com>
Pull request overview
This PR updates the FastAPI backend to use Groq’s async client so LLM calls don’t block the event loop, and records the learning in the Jules journal.
Changes:
- Switched Groq client usage in `/ai-voice` and `/initiate-transfer` from sync to `AsyncGroq` with `await`.
- Added a Bolt journal entry documenting the async-client performance rationale.
- Includes an incidental `package-lock.json` metadata change.
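The core change can be sketched as follows. Since neither FastAPI nor the Groq SDK is assumed to be installed here, a minimal stub stands in for `AsyncGroq`; the real client in the Groq Python SDK exposes the same awaitable `chat.completions.create(...)` shape, and the model name and response handling below are illustrative only.

```python
import asyncio
from types import SimpleNamespace


class FakeAsyncGroq:
    """Stand-in for groq.AsyncGroq: .create() is a coroutine, so callers must await it."""

    class _Completions:
        async def create(self, model, messages):
            await asyncio.sleep(0.01)  # simulated network latency
            reply = f"echo: {messages[-1]['content']}"
            return SimpleNamespace(
                choices=[SimpleNamespace(message=SimpleNamespace(content=reply))]
            )

    def __init__(self):
        self.chat = SimpleNamespace(completions=self._Completions())


client = FakeAsyncGroq()


async def ai_voice(prompt: str) -> dict:
    # Mirrors the PR's pattern: the .create() call is awaited, so the
    # event loop stays free to serve other requests while it is in flight.
    resp = await client.chat.completions.create(
        model="llama-3.1-8b-instant",  # illustrative model name, not from the PR
        messages=[{"role": "user", "content": prompt}],
    )
    return {"text": resp.choices[0].message.content}


result = asyncio.run(ai_voice("hello"))
print(result)  # {'text': 'echo: hello'}
```

With the synchronous `Groq` client, the same `.create()` call would run on the event loop thread and block every other coroutine until it returned.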
Reviewed changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `backend/main.py` | Replaces synchronous Groq client calls with `AsyncGroq` + `await` inside async FastAPI routes. |
| `frontend/package-lock.json` | Removes a dev flag from the `fsevents` entry (appears unrelated to the PR's stated goal). |
| `.jules/bolt.md` | Adds a journal note about using async clients in FastAPI to avoid event loop blocking. |
Files not reviewed (1)
- `frontend/package-lock.json`: Language not supported
```json
"version": "2.3.2",
"resolved": "https://registry.npmjs.org/fsevents/-/fsevents-2.3.2.tgz",
"integrity": "sha512-xiqMQR4xAeHTuB9uWm+fFRcIOgKBMiOBP+eXiyT7jsgVCq1bkVygt00oASowB7EdtpOHaaPgKt812P9ab+DDKA==",
"dev": true,
"hasInstallScript": true,
"license": "MIT",
"optional": true,
```
This PR is scoped to switching Groq usage in the backend, but it also changes frontend/package-lock.json (removing the "dev": true flag for fsevents). If this wasn’t intentional, please revert the lockfile change; if it was, please regenerate/commit the full lockfile changes from a single npm install run so the lock state is consistent and reviewable.
```python
from groq import AsyncGroq
from livekit import api
from contextlib import asynccontextmanager
try:
```
`asynccontextmanager` is imported near the top of the file, but it's also imported again later (`from contextlib import asynccontextmanager`). Since this PR is already touching the import block, please remove the duplicate import to avoid confusion and keep imports centralized.
💡 What:

Replaced the synchronous `Groq` client instantiation and completions creation with `AsyncGroq` and `await` inside the `backend/main.py` FastAPI routes (`/ai-voice` and `/initiate-transfer`). Added a new journal entry to `.jules/bolt.md` recording this FastAPI optimization learning.

🎯 Why:
Using a synchronous IO-bound client inside an `async def` route function in FastAPI is a severe performance bottleneck. While the synchronous network request is in flight, the entire event loop is blocked, so all other concurrent requests hang until the call completes. This directly impacts the scalability and responsiveness of the application.

📊 Impact:
Massive improvement to concurrent throughput. The event loop is no longer suspended while waiting for the LLM to stream or return responses, meaning the server can serve multiple users or endpoints simultaneously without blocking.
🔬 Measurement:
This can be verified by load testing the `/initiate-transfer` or `/ai-voice` endpoint with multiple concurrent clients; without `await`, subsequent requests queue behind the first one. Tests have been run locally, verifying that functionality remains identical.

PR created automatically by Jules for task 2496928201364128679 started by @Deepaksingh7238
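The queuing behavior described above can be reproduced without any server at all. The sketch below times five concurrent handlers, once with a synchronous sleep (standing in for a blocking HTTP call on the event loop thread) and once with an awaited one; the handler and timing helper names are illustrative.

```python
import asyncio
import time


async def blocking_handler():
    # Anti-pattern: a synchronous call inside async def (time.sleep here
    # stands in for a sync HTTP request) blocks the whole event loop.
    time.sleep(0.2)


async def awaited_handler():
    # Fix: awaiting an async call yields control back to the event loop,
    # so the other handlers' waits can overlap.
    await asyncio.sleep(0.2)


async def timed(handler):
    start = time.perf_counter()
    await asyncio.gather(*(handler() for _ in range(5)))
    return time.perf_counter() - start


serial = asyncio.run(timed(blocking_handler))     # ~1.0s: 5 x 0.2s, serialized
overlapped = asyncio.run(timed(awaited_handler))  # ~0.2s: the sleeps overlap
print(f"blocking: {serial:.2f}s, awaited: {overlapped:.2f}s")
```

The same gap shows up in a real load test: with the sync client, total latency grows linearly with the number of concurrent requests; with `AsyncGroq` it stays close to a single request's latency.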