Skip to content

feat: SSE streaming support + flexible model routing for CCR integration#67

Open
gyy0592 wants to merge 3 commits intoAmm1rr:masterfrom
gyy0592:master
Open

feat: SSE streaming support + flexible model routing for CCR integration#67
gyy0592 wants to merge 3 commits intoAmm1rr:masterfrom
gyy0592:master

Conversation

@gyy0592
Copy link
Copy Markdown

@gyy0592 gyy0592 commented Mar 15, 2026

Description:

Summary

  • Fix streaming: /v1/chat/completions now returns a proper SSE
    StreamingResponse (text/event-stream) when stream=true, instead
    of a plain JSON blob. This allows claude-code-router (CCR) to correctly
    relay responses back to Claude Code.

  • Flexible model names: Removed the GeminiModels enum restriction.
    model field is now a plain str, accepting any model name without
    schema drift.

  • Model alias resolution: Added MODEL_ALIASES in models/gemini.py
    with verified short names:

    • flashgemini-2.0-flash-exp (UI: Gemini 3.1 Flash)
    • thinkinggemini-2.0-exp-advanced (UI: Gemini 3.1 Flash Thinking)
    • progemini-1.5-pro (UI: Gemini 3.1 Pro)
      Note: the Gemini web UI shows marketing names ("3.1 Flash") which differ
      from the internal identifiers accepted by gemini-webapi.

Files Changed

  • src/app/endpoints/chat.py — SSE streaming via stream_openai_format()
    generator
  • src/app/endpoints/gemini.py — remove .value call (model no longer an
    enum)
  • src/models/gemini.pyMODEL_ALIASES + resolve_model_name()
  • src/schemas/request.py — replace GeminiModels enum with str

gyy0592 and others added 3 commits March 15, 2026 14:53
- chat.py: implement stream_openai_format() generator returning proper
  SSE events (role delta -> content delta -> stop); use StreamingResponse
  with text/event-stream so CCR can relay responses to Claude Code
- models/gemini.py: add MODEL_ALIASES dict + resolve_model_name() for
  short aliases (flash, thinking, pro) and compatibility mappings
- gemini.py / chat.py: remove .value calls now that model is plain str
- schemas/request.py: replace GeminiModels enum with plain str field to
  allow arbitrary model names without schema drift

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The Gemini web UI shows marketing names (e.g. "3.1 Flash") which differ
from the internal identifiers accepted by gemini-webapi. Removed unverified
aliases and documented the confirmed mapping:
  flash    -> gemini-2.0-flash-exp     (web UI: Gemini 3.1 Flash)
  thinking -> gemini-2.0-exp-advanced  (web UI: Gemini 3.1 Flash Thinking)
  pro      -> gemini-1.5-pro           (web UI: Gemini 3.1 Pro)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Old model names used stale x-goog-ext header hashes from v1.8.3.
Upgrade to v1.21.0 with correct identifiers:
  flash    -> gemini-3.0-flash
  thinking -> gemini-3.0-flash-thinking
  pro      -> gemini-3.1-pro

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@gyy0592
Copy link
Copy Markdown
Author

gyy0592 commented Mar 15, 2026

Update: gemini-webapi upgraded to v1.21.0 + model name fixes

Root cause found

The previous model aliases (gemini-2.0-flash-exp, gemini-1.5-pro, etc.) used
stale x-goog-ext-525001261-jspb header hashes from gemini-webapi v1.8.3.
Google changed these identifiers, so all requests silently fell back to Flash
regardless of the model selected.

Changes since last update

  • src/models/gemini.py: Updated MODEL_ALIASES to v1.21.0 model names:
    • flashgemini-3.0-flash
    • thinkinggemini-3.0-flash-thinking
    • progemini-3.1-pro
  • Dependency: requires gemini-webapi >= 1.21.0

Recommendation

Update requirements.txt to pin gemini-webapi>=1.21.0 to prevent regression.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant