Background
The Tabbit browser has a built-in image generation capability (triggered via task mode), but tabbit2api currently bridges only the text chat interface (/v1/chat/completions) and does not expose image generation to downstream clients such as CherryStudio, Codex, or Claude Code.
Packet Capture Analysis
We captured network traffic with Playwright to reverse-engineer how Tabbit handles image generation internally.
Image Generation API Call Chain
1. POST /api/v1/chat/completion: the user sends an image prompt (e.g., "画一只猫", "draw a cat").
2. GET /proxy/v0/browser-task/graph: Tabbit polls the task mode status.
3. POST /proxy/v0/cos/presigned-download-url: fetches a presigned download URL from Tencent COS.
   - The image URL is returned in the SSE stream, embedded as a Markdown image.

Key Internal Endpoints
| Endpoint | Method | Purpose |
|---|---|---|
| /api/v1/chat/completion | POST | Send chat/image generation request |
| /proxy/v0/browser-task/graph | GET | Poll task execution status |
| /proxy/v0/cos/presigned-download-url | POST | Get COS presigned download URL |
Image Storage
- Provider: Tencent Cloud COS (Singapore region)
- Path pattern: https://tab-sg-1300456063.cos.ap-singapore.myqcloud.com/{user_id}/{session_id}/image_generation/{YYYYMMDD}/{UUID}.png
- Access: Presigned URLs with time-limited signatures
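A bridge that relays these URLs may want to recognize them and pull out their components, for example for logging or cache keys. The following is a minimal sketch that assumes the path layout above holds for every generated image; `parse_cos_image_url` and the field name `image_id` are hypothetical names chosen for illustration, not part of tabbit2api.

```python
import re
from urllib.parse import urlsplit

# Mirrors the path layout observed in the capture:
# /{user_id}/{session_id}/image_generation/{YYYYMMDD}/{UUID}.png
_COS_PATH = re.compile(
    r"^/(?P<user_id>[^/]+)/(?P<session_id>[^/]+)/image_generation/"
    r"(?P<date>\d{8})/(?P<image_id>[^/]+)\.png$"
)


def parse_cos_image_url(url):
    """Return the path components of a generated-image URL, or None if it does not match."""
    parts = urlsplit(url)
    # Assumption: generated images always live on a *.myqcloud.com host.
    if not parts.netloc.endswith(".myqcloud.com"):
        return None
    # urlsplit keeps the presigned query string out of .path, so the
    # signature parameters do not interfere with the match.
    match = _COS_PATH.match(parts.path)
    return match.groupdict() if match else None
```

Because only the path is matched, the same function works on both the bare object URL and its presigned form.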
Example Request
POST /api/v1/chat/completion
{
  "messages": [
    {"role": "user", "content": "生成一张猫的图片"}
  ],
  "model": "gpt-4o",
  "stream": true
}
Example Response (SSE)
data: {"choices":[{"delta":{"content":""}}]}
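To work with a stream shaped like the sample above, the bridge needs to pull the `delta.content` fragments out of each `data:` line. The sketch below assumes OpenAI-style chunks and an OpenAI-style `[DONE]` terminator; the capture does not confirm the terminator, and `iter_delta_content` is a hypothetical helper name.

```python
import json


def iter_delta_content(sse_lines):
    """Yield each non-empty delta.content string from an iterable of SSE lines."""
    for raw in sse_lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alives, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed OpenAI-style end-of-stream marker
            break
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue  # ignore malformed chunks rather than abort the stream
        if not isinstance(event, dict):
            continue
        for choice in event.get("choices", []):
            content = (choice.get("delta") or {}).get("content")
            if content:
                yield content
```

Joining the yielded fragments with `"".join(...)` gives the full assistant message, which is where any embedded image link would appear.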
Proposed Solutions
Option A: Extend Existing /v1/chat/completions (Recommended for MVP)
Parse Markdown image syntax from the SSE stream and return image URLs as part of the assistant message content.
Pros: Minimal code changes, backward compatible.
Cons: Clients need to parse Markdown images; no standard /v1/images/generations endpoint.
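For Option A, the detection step could look roughly like the sketch below. It assumes the image arrives as standard Markdown image syntax, `![alt](url)`, and that the text has already been accumulated from the stream, since a link may be split across several SSE chunks. `extract_image_urls` is a hypothetical helper name.

```python
import re

# Matches ![alt](url) and ![alt](url "title"); only the URL is captured.
_MD_IMAGE = re.compile(r'!\[[^\]]*\]\(\s*(\S+?)(?:\s+"[^"]*")?\s*\)')


def extract_image_urls(markdown):
    """Return image URLs found in Markdown text, in order, without duplicates."""
    urls = []
    for url in _MD_IMAGE.findall(markdown):
        if url not in urls:
            urls.append(url)
    return urls
```

The pattern stops at the first closing parenthesis, so a URL that itself contains `)` would be truncated; whether such URLs occur in practice is not known from the capture.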
Option B: Add /v1/images/generations Endpoint (OpenAI Compatible)
Implement the standard OpenAI image generation API:
POST /v1/images/generations
{
  "prompt": "a cat",
  "n": 1,
  "size": "1024x1024"
}
Response:
{
  "created": 1234567890,
  "data": [{ "url": "https://..." }]
}
Pros: Fully compatible with OpenAI API spec; any OpenAI client can use it directly.
Cons: Requires waiting for SSE stream to complete; higher latency.
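Once the stream has finished and the image URLs have been collected, shaping them into the response body shown above is straightforward. This is only a sketch: `build_images_response` is a hypothetical name, and it assumes the bridge simply truncates to `n` URLs rather than issuing additional generation requests.

```python
import time


def build_images_response(urls, n=1, created=None):
    """Shape collected image URLs into an OpenAI-style images response body."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return {
        # Unix timestamp in seconds; callers may pass one in for reproducibility.
        "created": int(time.time()) if created is None else created,
        # Assumption: extra URLs are dropped rather than treated as an error.
        "data": [{"url": url} for url in urls[:n]],
    }
```

If fewer than `n` URLs were produced, `data` is simply shorter; whether that case should instead return an error is a design decision left open here.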
Option C: Browser Automation Click (Fallback)
Use Playwright to simulate clicking the "image generation" button in the Tabbit UI.
Pros: Does not depend on internal API stability.
Cons: Fragile to UI changes; high overhead.
Recommended Implementation
Phase 1 (Short-term): Extend chat/completions to detect and relay image URLs.
Phase 2 (Medium-term): Add native /v1/images/generations endpoint.
Environment
- tabbit2api v0.1.3
- Tabbit browser 1.41.10
- OS: Windows 10
- Capture date: 2026-06-23
Additional Context
- Models with the supports_images: true flag (e.g., GPT-5.5, Claude-Opus-4.7) should work.
- Image generation currently requires Tabbit's "task mode" to be activated in the UI.
- The existing Playwright-based bridge (tabbit-web-bridge.js) could be extended to pass image prompts.