⚡ Implement in-memory caching for TTS API #25

Open
google-labs-jules[bot] wants to merge 1 commit into main from perf/tts-caching-11748698826946078814

@google-labs-jules
Contributor

💡 What:
Implemented an in-memory, LRU-like cache for the Voice Proxy API (api/proxy-voice.ts).

  • Uses a Map to store generated base64 audio, keyed by input text.
  • Caps the cache at 50 entries to bound memory growth in the serverless environment.
  • Returns cached audio immediately for identical text inputs.
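The bullets above could be realized roughly as follows. This is a minimal sketch, not the code from api/proxy-voice.ts: the class name `BoundedAudioCache` and its methods are hypothetical, and the refresh-on-read behavior is only one possible reading of "LRU-like". It relies on the fact that a JavaScript `Map` iterates keys in insertion order, so the first key is the oldest entry.

```typescript
// Hypothetical sketch of a size-bounded, LRU-like cache built on Map.
// Names are illustrative; the actual implementation in the PR may differ.
const MAX_ENTRIES = 50; // cap stated in the PR description

class BoundedAudioCache {
  private readonly entries = new Map<string, string>();
  private readonly maxEntries: number;

  constructor(maxEntries: number = MAX_ENTRIES) {
    this.maxEntries = maxEntries;
  }

  // Returns the cached base64 audio for this text, or undefined on a miss.
  get(text: string): string | undefined {
    const audio = this.entries.get(text);
    if (audio === undefined) return undefined;
    // Re-insert so this key becomes the most recently used entry.
    this.entries.delete(text);
    this.entries.set(text, audio);
    return audio;
  }

  // Stores audio, evicting the oldest entry when the cache is full.
  set(text: string, base64Audio: string): void {
    if (this.entries.has(text)) {
      this.entries.delete(text);
    } else if (this.entries.size >= this.maxEntries) {
      // Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(text, base64Audio);
  }

  get size(): number {
    return this.entries.size;
  }
}
```

If eviction were purely insertion-ordered (no refresh in `get`), the cache would behave as FIFO rather than LRU; either variant keeps memory bounded at the configured number of entries.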

🎯 Why:
Calling the Google GenAI TTS API is expensive, both in latency and potentially in cost. Because the system always requests the same "neutral, calm, cold, mechanical robotic British voice", the output for a given text input is effectively deterministic and highly cacheable. Caching reduces latency from ~500ms+ (API dependent) to <1ms for repeated requests served by the same warm instance (e.g., users re-playing a message or different users requesting common system phrases).
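The hit/miss flow described above can be sketched as a wrapper around the expensive call. This is a hedged illustration only: `withAudioCache`, `Synthesize`, and `fakeSynthesize` are hypothetical names, the stand-in synthesizer returns a placeholder string instead of calling any real TTS API, and the size cap is omitted for brevity.

```typescript
// Hypothetical names throughout; the real handler in api/proxy-voice.ts may differ.
type Synthesize = (text: string) => Promise<string>;

// Wraps an expensive synthesizer so repeated text is served from memory.
function withAudioCache(
  synthesize: Synthesize,
  cache: Map<string, string> = new Map<string, string>(),
): Synthesize {
  return async (text: string): Promise<string> => {
    const cached = cache.get(text);
    if (cached !== undefined) return cached; // cache hit: no upstream call

    const audio = await synthesize(text); // cache miss: pay the full latency once
    cache.set(text, audio);
    return audio;
  };
}

// Stand-in for the TTS call (an assumption for this sketch, not a real API).
let upstreamCalls = 0;
const fakeSynthesize: Synthesize = async (text: string) => {
  upstreamCalls += 1;
  return `audio-for:${text}`; // placeholder, not real base64 audio
};

const cachedSynthesize = withAudioCache(fakeSynthesize);
```

Calling `cachedSynthesize("hello")` twice in sequence invokes `fakeSynthesize` only once. Note that this sketch does not deduplicate in-flight requests: two identical requests that arrive before the first one resolves would both reach the upstream call.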

📊 Measured Improvement:
A synthetic benchmark (scripts/perf-test-cache.ts) was created to simulate the API latency (500ms) and measure the impact of the caching strategy.

Results (5 Iterations):

  • Baseline (No Cache): 2500ms total execution time.
  • Optimized (With Cache): 501ms total execution time.
  • Speedup: ~5x faster overall for repeated calls.
  • Latency reduction: ~99.9% for cache hits (from 500ms to ~0ms).
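The reported totals are consistent with a simple cost model, sketched below. The model is an assumption, not the contents of scripts/perf-test-cache.ts: every miss costs the simulated 500ms, only the first of the 5 calls misses, and each hit costs a hypothetical 0.25ms chosen so that the cached total matches the reported 501ms.

```typescript
// Assumed cost model: a miss costs apiLatencyMs, a hit costs hitLatencyMs.
function totalMs(
  iterations: number,
  apiLatencyMs: number,
  hitLatencyMs: number,
  cached: boolean,
): number {
  if (!cached) return iterations * apiLatencyMs; // every call pays full latency
  // With caching, the first call misses and the remaining calls hit.
  return apiLatencyMs + (iterations - 1) * hitLatencyMs;
}

const ITERATIONS = 5; // from the results above
const API_LATENCY_MS = 500; // simulated latency from the benchmark description
const HIT_LATENCY_MS = 0.25; // hypothetical per-hit cost, back-solved from 501ms

const baseline = totalMs(ITERATIONS, API_LATENCY_MS, HIT_LATENCY_MS, false); // 2500
const optimized = totalMs(ITERATIONS, API_LATENCY_MS, HIT_LATENCY_MS, true); // 501
const speedup = baseline / optimized; // about 4.99
```

Under this model the overall speedup approaches the iteration count as hits get cheaper, which is why 5 iterations yield roughly 5x; a longer run of repeated requests would show a larger overall factor.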

Note: As this is a serverless function, the cache lives in the memory of a single instance and persists only as long as that container is "warm". This is a standard optimization pattern for Vercel functions (also used in api/admin-stats.ts) to handle burst traffic efficiently without external infrastructure overhead.


PR created automatically by Jules for task 11748698826946078814 started by @cjo93

@google-labs-jules
Contributor Author

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@vercel

vercel bot commented Feb 5, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| thisisdefrag | Ready | Preview, Comment | Feb 5, 2026 7:15am |
| v0-thisisdefrag | Ready | Preview, Comment, Open in v0 | Feb 5, 2026 7:15am |
