diff --git a/README.md b/README.md index 90a676b..f85a06f 100644 --- a/README.md +++ b/README.md @@ -1,8 +1,12 @@ # opencode-llm-proxy -An [OpenCode](https://opencode.ai) plugin that starts a local OpenAI-compatible HTTP server backed by your OpenCode providers. +An [OpenCode](https://opencode.ai) plugin that starts a local HTTP server backed by your OpenCode providers, with support for multiple LLM API formats: -Any tool or application that speaks the OpenAI Chat Completions or Responses API can use it — including LangChain, custom scripts, local frontends, etc. +- **OpenAI** Chat Completions (`POST /v1/chat/completions`) and Responses (`POST /v1/responses`) +- **Anthropic** Messages API (`POST /v1/messages`) +- **Google Gemini** API (`POST /v1beta/models/:model:generateContent`) + +Any tool or SDK that targets one of these APIs can point at the proxy without code changes. ## Quickstart @@ -92,7 +96,7 @@ curl http://127.0.0.1:4010/v1/models Returns all models from all providers configured in your OpenCode setup (e.g. `github-copilot/claude-sonnet-4.6`, `ollama/qwen3.5:9b`, etc.). -### Chat completions +### OpenAI Chat Completions ```bash curl http://127.0.0.1:4010/v1/chat/completions \ @@ -105,7 +109,7 @@ curl http://127.0.0.1:4010/v1/chat/completions \ }' ``` -Use the fully-qualified `provider/model` ID from `GET /v1/models`. +Use the fully-qualified `provider/model` ID from `GET /v1/models`. Supports `"stream": true` for SSE streaming. ### OpenAI Responses API @@ -118,6 +122,69 @@ curl http://127.0.0.1:4010/v1/responses \ }' ``` +Supports `"stream": true` for SSE streaming. 
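For the Chat Completions stream, the `data:` SSE lines carry standard chunk objects (`{"choices":[{"delta":{"content":"..."}}]}`) and, by OpenAI's convention, end with `data: [DONE]` — both assumed here. A minimal consumer sketch (a hypothetical helper, not shipped with the plugin):

```js
// Collect assistant text from an OpenAI-style SSE buffer.
// Assumes the standard chat-completion chunk shape and the
// "data: [DONE]" terminator; error handling is omitted.
function collectSseText(buffer) {
  let out = ""
  for (const line of buffer.split("\n")) {
    if (!line.startsWith("data: ")) continue
    const payload = line.slice("data: ".length).trim()
    if (payload === "[DONE]") break
    const content = JSON.parse(payload).choices?.[0]?.delta?.content
    if (typeof content === "string") out += content
  }
  return out
}

const sample =
  'data: {"choices":[{"delta":{"content":"Hel"}}]}\n\n' +
  'data: {"choices":[{"delta":{"content":"lo"}}]}\n\n' +
  "data: [DONE]\n\n"

console.log(collectSseText(sample)) // "Hello"
```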
+
+### Anthropic Messages API
+
+The proxy exposes the Anthropic Messages endpoint directly:
+
+```bash
+curl http://127.0.0.1:4010/v1/messages \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "anthropic/claude-3-5-sonnet",
+    "max_tokens": 1024,
+    "system": "You are a helpful assistant.",
+    "messages": [{"role": "user", "content": "Hello!"}]
+  }'
+```
+
+Supports `"stream": true` for SSE streaming with standard Anthropic streaming events (`message_start`, `content_block_delta`, `message_stop`, etc.).
+
+To point the official Anthropic SDK at this proxy:
+
+```js
+import Anthropic from "@anthropic-ai/sdk"
+
+const client = new Anthropic({
+  baseURL: "http://127.0.0.1:4010",
+  apiKey: "unused", // or your OPENCODE_LLM_PROXY_TOKEN
+})
+```
+
+### Google Gemini API
+
+```bash
+# Non-streaming
+curl http://127.0.0.1:4010/v1beta/models/google/gemini-2.0-flash:generateContent \
+  -H "Content-Type: application/json" \
+  -d '{
+    "contents": [{"role": "user", "parts": [{"text": "Hello!"}]}]
+  }'
+
+# Streaming (newline-delimited JSON)
+curl http://127.0.0.1:4010/v1beta/models/google/gemini-2.0-flash:streamGenerateContent \
+  -H "Content-Type: application/json" \
+  -d '{
+    "contents": [{"role": "user", "parts": [{"text": "Hello!"}]}]
+  }'
+```
+
+The model name in the URL path is resolved the same way as on the other endpoints (use `provider/model`, or a bare model ID if it is unambiguous).
+
+To point the Google Generative AI SDK at this proxy, set the `baseUrl` option to `http://127.0.0.1:4010`.
+
+## Selecting a provider
+
+All endpoints accept an optional `x-opencode-provider` header to force a specific provider when the model ID is ambiguous:
+
+```bash
+curl http://127.0.0.1:4010/v1/chat/completions \
+  -H "x-opencode-provider: anthropic" \
+  -H "Content-Type: application/json" \
+  -d '{"model": "claude-3-5-sonnet", "messages": [...]}'
+```
+
 ## Configuration
 
 All configuration is done through environment variables. No configuration file is needed. 
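The resolution rule just described — fully-qualified `provider/model` always wins, a bare ID resolves only when a single provider offers it, and `x-opencode-provider` acts as a tiebreaker — can be sketched as follows. `pickModel` is a hypothetical standalone helper mirroring the documented behavior, not the plugin's actual resolver:

```js
// "provider/model" selects exactly; a bare model ID resolves only when one
// provider offers it, unless providerOverride names the provider to use.
function pickModel(providers, requested, providerOverride) {
  const slash = requested.indexOf("/")
  if (slash !== -1) {
    const providerID = requested.slice(0, slash)
    const modelID = requested.slice(slash + 1)
    const provider = providers.find((p) => p.id === providerID)
    if (provider && provider.models[modelID]) return { providerID, modelID }
    throw new Error(`Unknown model: ${requested}`)
  }
  const matches = providers.filter((p) => p.models[requested])
  const chosen = providerOverride
    ? matches.find((p) => p.id === providerOverride)
    : matches.length === 1
      ? matches[0]
      : undefined
  if (!chosen) throw new Error(`Ambiguous or unknown model: ${requested}`)
  return { providerID: chosen.id, modelID: requested }
}

const providers = [
  { id: "anthropic", models: { "claude-3-5-sonnet": {} } },
  { id: "google", models: { "gemini-2.0-flash": {} } },
]

console.log(pickModel(providers, "google/gemini-2.0-flash"))
// { providerID: "google", modelID: "gemini-2.0-flash" }
console.log(pickModel(providers, "claude-3-5-sonnet", "anthropic").providerID)
// "anthropic"
```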
@@ -149,15 +216,14 @@ curl http://:4010/v1/models \ ## How it works -The plugin hooks into OpenCode at startup and spawns a Bun HTTP server. Incoming OpenAI-format requests are translated into OpenCode SDK calls (`client.session.create` + `client.session.prompt`), routed through whichever provider/model is requested, and the response is returned in OpenAI format. +The plugin hooks into OpenCode at startup and spawns a Bun HTTP server. Incoming requests (in OpenAI, Anthropic, or Gemini format) are translated into OpenCode SDK calls (`client.session.create` + `client.session.prompt`), routed through whichever provider/model is requested, and the response is returned in the matching API format. Each request creates a temporary OpenCode session, so prompts and responses appear in the OpenCode session list. ## Limitations -- Streaming (`"stream": true`) is not yet implemented — requests will return a 400 error. - Tool/function calling is not forwarded; all built-in OpenCode tools are disabled for proxy sessions. -- The proxy only handles `POST /v1/chat/completions` and `POST /v1/responses`. Other OpenAI endpoints are not implemented. +- Only text content is handled; image and file inputs are ignored. 
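The two-SDK-call flow described under "How it works" can be sketched with the call shapes visible in this repo's test mocks (`session.create` returning `{ data: { id } }`, `session.prompt` receiving an object with `body`). The `path` field and the exact body layout are assumptions; the real handler passes more (model selection, tool settings, etc.):

```js
// Sketch of one proxy request: create a throwaway session, send the prompt,
// flatten the assistant's text parts. Shapes follow the repo's test mocks;
// the { path, body } argument layout is an assumption, not the real SDK call.
async function proxyOnce(client, system, messages) {
  const session = await client.session.create({}) // temporary session, visible in OpenCode
  const response = await client.session.prompt({
    path: { id: session.data.id },
    body: { system, messages },
  })
  return response.data.parts
    .filter((part) => part.type === "text")
    .map((part) => part.text)
    .join("")
}

// Tiny mock client, mirroring the fixtures used in index.test.js.
const mockClient = {
  session: {
    create: async () => ({ data: { id: "sess-1" } }),
    prompt: async () => ({ data: { parts: [{ type: "text", text: "hi" }] } }),
  },
}

proxyOnce(mockClient, "You are terse.", [{ role: "user", content: "Hello" }])
  .then((text) => console.log(text)) // "hi"
```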
## License diff --git a/index.js b/index.js index c9c1918..227f01e 100644 --- a/index.js +++ b/index.js @@ -569,6 +569,130 @@ function createModelResponse(models) { } } +// --------------------------------------------------------------------------- +// Anthropic Messages API helpers +// --------------------------------------------------------------------------- + +export function normalizeAnthropicMessages(messages) { + return messages + .map((message) => { + let content = "" + if (typeof message.content === "string") { + content = message.content.trim() + } else if (Array.isArray(message.content)) { + content = message.content + .filter((block) => block && block.type === "text" && typeof block.text === "string") + .map((block) => block.text.trim()) + .filter(Boolean) + .join("\n\n") + } + return { role: message.role, content } + }) + .filter((message) => message.content.length > 0) +} + +export function mapFinishReasonToAnthropic(finish) { + if (!finish) return "end_turn" + if (finish.includes("length")) return "max_tokens" + if (finish.includes("tool")) return "tool_use" + return "end_turn" +} + +function createAnthropicResponse(result, model) { + const tokensIn = result.completion.data.info?.tokens?.input ?? 0 + const tokensOut = result.completion.data.info?.tokens?.output ?? 
0 + return { + id: `msg_${crypto.randomUUID().replace(/-/g, "")}`, + type: "message", + role: "assistant", + content: [{ type: "text", text: result.content }], + model: model.id, + stop_reason: mapFinishReasonToAnthropic(result.completion.data.info?.finish), + stop_sequence: null, + usage: { input_tokens: tokensIn, output_tokens: tokensOut }, + } +} + +function anthropicBadRequest(message, status = 400, request) { + return json( + { type: "error", error: { type: "invalid_request_error", message } }, + status, + {}, + request, + ) +} + +function anthropicInternalError(message, status = 500, request) { + return json( + { type: "error", error: { type: "api_error", message } }, + status, + {}, + request, + ) +} + +// --------------------------------------------------------------------------- +// Google Gemini API helpers +// --------------------------------------------------------------------------- + +export function normalizeGeminiContents(contents) { + if (!Array.isArray(contents)) return [] + return contents + .map((item) => { + const role = item.role === "model" ? "assistant" : (item.role ?? "user") + const content = Array.isArray(item.parts) + ? item.parts + .map((part) => (typeof part?.text === "string" ? part.text.trim() : "")) + .filter(Boolean) + .join("\n\n") + : "" + return { role, content } + }) + .filter((m) => m.content.length > 0) +} + +export function extractGeminiSystemInstruction(systemInstruction) { + if (!systemInstruction) return null + if (typeof systemInstruction === "string") return systemInstruction.trim() + if (Array.isArray(systemInstruction.parts)) { + return systemInstruction.parts + .map((part) => (typeof part?.text === "string" ? 
part.text.trim() : ""))
+      .filter(Boolean)
+      .join("\n\n")
+  }
+  return null
+}
+
+export function mapFinishReasonToGemini(finish) {
+  if (!finish) return "STOP"
+  if (finish.includes("length")) return "MAX_TOKENS"
+  if (finish.includes("tool")) return "STOP"
+  return "STOP"
+}
+
+function createGeminiResponse(content, finish, tokens) {
+  return {
+    candidates: [
+      {
+        content: { role: "model", parts: [{ text: content }] },
+        finishReason: mapFinishReasonToGemini(finish),
+        index: 0,
+      },
+    ],
+    usageMetadata: {
+      promptTokenCount: tokens?.input ?? 0,
+      candidatesTokenCount: tokens?.output ?? 0,
+      totalTokenCount: (tokens?.input ?? 0) + (tokens?.output ?? 0),
+    },
+  }
+}
+
+function geminiModelFromPath(pathname) {
+  // Matches /v1beta/models/{model}:generateContent or :streamGenerateContent.
+  // The model segment is matched greedily so provider-qualified IDs
+  // (e.g. google/gemini-2.0-flash) and ollama-style ":tag" IDs resolve.
+  const match = pathname.match(/^\/v1beta\/models\/(.+):(?:generateContent|streamGenerateContent)$/)
+  return match ? match[1] : null
+}
+
 export function createProxyFetchHandler(client) {
   return async (request) => {
     const url = new URL(request.url)
@@ -890,6 +1014,250 @@ export function createProxyFetchHandler(client) {
     }
   }
 
+    // -----------------------------------------------------------------------
+    // Anthropic Messages API POST /v1/messages
+    // -----------------------------------------------------------------------
+
+    if (request.method === "POST" && url.pathname === "/v1/messages") {
+      let body
+      try {
+        body = await request.json()
+      } catch {
+        return anthropicBadRequest("Request body must be valid JSON.", 400, request)
+      }
+
+      if (!body.model) {
+        return anthropicBadRequest("The 'model' field is required.", 400, request)
+      }
+
+      if (!Array.isArray(body.messages) || body.messages.length === 0) {
+        return anthropicBadRequest("The 'messages' field must contain at least one message.", 400, request)
+      }
+
+      const messages = normalizeAnthropicMessages(body.messages)
+      if (messages.length === 0) {
+        return anthropicBadRequest("No text content was found in the 
supplied messages.", 400, request) + } + + // Prepend Anthropic top-level system string as a system message so buildSystemPrompt picks it up. + const allMessages = + typeof body.system === "string" && body.system.trim() + ? [{ role: "system", content: body.system.trim() }, ...messages] + : messages + + const system = buildSystemPrompt(allMessages, { + temperature: body.temperature, + max_tokens: body.max_tokens, + }) + + let model + try { + const providerOverride = request.headers.get("x-opencode-provider") + model = await resolveModel(client, body.model, providerOverride) + } catch (error) { + const message = error instanceof Error ? error.message : String(error) + await safeLog(client, "error", "Anthropic proxy call failed (model resolve)", { error: message, requestedModel: body.model }) + return anthropicBadRequest(message, 400, request) + } + + if (body.stream) { + const msgID = `msg_${crypto.randomUUID().replace(/-/g, "")}` + const queue = createSseQueue() + + function sseEvent(eventType, data) { + return `event: ${eventType}\ndata: ${JSON.stringify(data)}\n\n` + } + + async function* generateSse() { + queue.enqueue(sseEvent("message_start", { + type: "message_start", + message: { + id: msgID, + type: "message", + role: "assistant", + content: [], + model: model.id, + stop_reason: null, + stop_sequence: null, + usage: { input_tokens: 0, output_tokens: 0 }, + }, + })) + queue.enqueue(sseEvent("content_block_start", { + type: "content_block_start", + index: 0, + content_block: { type: "text", text: "" }, + })) + + const runPromise = executePromptStreaming( + client, + model, + messages, + system, + (delta) => { + queue.enqueue(sseEvent("content_block_delta", { + type: "content_block_delta", + index: 0, + delta: { type: "text_delta", text: delta }, + })) + }, + ) + .then((streamResult) => { + queue.enqueue(sseEvent("content_block_stop", { type: "content_block_stop", index: 0 })) + queue.enqueue(sseEvent("message_delta", { + type: "message_delta", + delta: { + 
stop_reason: mapFinishReasonToAnthropic(streamResult.finish), + stop_sequence: null, + }, + usage: { output_tokens: streamResult.tokens.output }, + })) + queue.enqueue(sseEvent("message_stop", { type: "message_stop" })) + }) + .catch(async (err) => { + const errMsg = err instanceof Error ? err.message : String(err) + await safeLog(client, "error", "Anthropic proxy streaming call failed", { error: errMsg, requestedModel: body.model }) + queue.enqueue(sseEvent("error", { type: "error", error: { type: "api_error", message: errMsg } })) + }) + .finally(() => { + queue.finish() + }) + + yield* queue.generateChunks() + await runPromise + } + + return sseResponse(corsHeaders(request), generateSse()) + } + + try { + const result = await executePrompt(client, body, model, messages, system) + return json(createAnthropicResponse(result, model), 200, {}, request) + } catch (error) { + const message = error instanceof Error ? error.message : String(error) + await safeLog(client, "error", "Anthropic proxy call failed", { error: message, requestedModel: body.model }) + return anthropicInternalError(message, 500, request) + } + } + + // ----------------------------------------------------------------------- + // Google Gemini API POST /v1beta/models/:model:generateContent (non-streaming) + // POST /v1beta/models/:model:streamGenerateContent (streaming) + // ----------------------------------------------------------------------- + + const isGeminiNonStream = request.method === "POST" && url.pathname.endsWith(":generateContent") + const isGeminiStream = request.method === "POST" && url.pathname.endsWith(":streamGenerateContent") + + if (isGeminiNonStream || isGeminiStream) { + const geminiModelName = geminiModelFromPath(url.pathname) + if (!geminiModelName) { + return badRequest("Could not extract model name from URL.", 400, request) + } + + let body + try { + body = await request.json() + } catch { + return badRequest("Request body must be valid JSON.", 400, request) + } + + if 
(!Array.isArray(body.contents) || body.contents.length === 0) { + return badRequest("The 'contents' field must contain at least one item.", 400, request) + } + + const messages = normalizeGeminiContents(body.contents) + if (messages.length === 0) { + return badRequest("No text content was found in the supplied contents.", 400, request) + } + + const systemText = extractGeminiSystemInstruction(body.systemInstruction) + const systemMessages = systemText ? [{ role: "system", content: systemText }, ...messages] : messages + const system = buildSystemPrompt(systemMessages, { + temperature: body.generationConfig?.temperature, + max_tokens: body.generationConfig?.maxOutputTokens, + }) + + let model + try { + const providerOverride = request.headers.get("x-opencode-provider") + model = await resolveModel(client, geminiModelName, providerOverride) + } catch (error) { + const message = error instanceof Error ? error.message : String(error) + await safeLog(client, "error", "Gemini proxy call failed (model resolve)", { error: message, requestedModel: geminiModelName }) + return badRequest(message, 400, request) + } + + if (isGeminiStream) { + const queue = createSseQueue() + + async function* generateNdJson() { + const runPromise = executePromptStreaming( + client, + model, + messages, + system, + (delta) => { + const chunk = JSON.stringify(createGeminiResponse(delta, null, null)) + queue.enqueue(chunk + "\n") + }, + ) + .then((streamResult) => { + const finalChunk = JSON.stringify( + createGeminiResponse("", streamResult.finish, streamResult.tokens), + ) + queue.enqueue(finalChunk + "\n") + }) + .catch(async (err) => { + const errMsg = err instanceof Error ? 
err.message : String(err) + await safeLog(client, "error", "Gemini proxy streaming call failed", { error: errMsg, requestedModel: geminiModelName }) + const errChunk = JSON.stringify({ error: { code: 500, message: errMsg, status: "INTERNAL" } }) + queue.enqueue(errChunk + "\n") + }) + .finally(() => { + queue.finish() + }) + + yield* queue.generateChunks() + await runPromise + } + + const encoder = new TextEncoder() + const body_ = new ReadableStream({ + async start(controller) { + try { + for await (const chunk of generateNdJson()) { + controller.enqueue(encoder.encode(chunk)) + } + } catch { + // errors surfaced via data + } finally { + controller.close() + } + }, + }) + + return new Response(body_, { + status: 200, + headers: { + "content-type": "application/json", + "cache-control": "no-cache", + connection: "keep-alive", + ...corsHeaders(request), + }, + }) + } + + try { + const result = await executePrompt(client, body, model, messages, system) + const finish = result.completion.data.info?.finish + const tokens = result.completion.data.info?.tokens + return json(createGeminiResponse(result.content, finish, tokens), 200, {}, request) + } catch (error) { + const message = error instanceof Error ? 
error.message : String(error) + await safeLog(client, "error", "Gemini proxy call failed", { error: message, requestedModel: geminiModelName }) + return badRequest(message, 500, request) + } + } + return text("Not found", 404, request) } } diff --git a/index.test.js b/index.test.js index f26b033..f663ea0 100644 --- a/index.test.js +++ b/index.test.js @@ -12,6 +12,11 @@ import { extractAssistantText, mapFinishReason, resolveModel, + normalizeAnthropicMessages, + mapFinishReasonToAnthropic, + normalizeGeminiContents, + extractGeminiSystemInstruction, + mapFinishReasonToGemini, } from "./index.js" // --------------------------------------------------------------------------- @@ -1131,3 +1136,553 @@ test("POST /v1/responses stream: true with session.error emits response.failed", const text = await response.text() assert.ok(text.includes("response.failed") || text.includes("Rate limit exceeded")) }) + +// --------------------------------------------------------------------------- +// Unit: normalizeAnthropicMessages +// --------------------------------------------------------------------------- +describe("normalizeAnthropicMessages", () => { + it("passes through string content unchanged", () => { + const input = [{ role: "user", content: "hello" }] + assert.deepEqual(normalizeAnthropicMessages(input), [{ role: "user", content: "hello" }]) + }) + + it("trims whitespace from string content", () => { + const input = [{ role: "user", content: " hi " }] + assert.deepEqual(normalizeAnthropicMessages(input), [{ role: "user", content: "hi" }]) + }) + + it("joins text blocks from array content", () => { + const input = [ + { + role: "user", + content: [ + { type: "text", text: "first" }, + { type: "text", text: "second" }, + ], + }, + ] + assert.deepEqual(normalizeAnthropicMessages(input), [{ role: "user", content: "first\n\nsecond" }]) + }) + + it("ignores non-text blocks in array content", () => { + const input = [ + { + role: "user", + content: [ + { type: "image", source: {} 
}, + { type: "text", text: "only this" }, + ], + }, + ] + assert.deepEqual(normalizeAnthropicMessages(input), [{ role: "user", content: "only this" }]) + }) + + it("drops messages with empty content", () => { + const input = [ + { role: "user", content: "" }, + { role: "assistant", content: "response" }, + ] + assert.deepEqual(normalizeAnthropicMessages(input), [{ role: "assistant", content: "response" }]) + }) +}) + +// --------------------------------------------------------------------------- +// Unit: mapFinishReasonToAnthropic +// --------------------------------------------------------------------------- +describe("mapFinishReasonToAnthropic", () => { + it("returns end_turn for undefined", () => { + assert.equal(mapFinishReasonToAnthropic(undefined), "end_turn") + }) + + it("returns end_turn for null", () => { + assert.equal(mapFinishReasonToAnthropic(null), "end_turn") + }) + + it("returns max_tokens when finish includes length", () => { + assert.equal(mapFinishReasonToAnthropic("max_length"), "max_tokens") + }) + + it("returns tool_use when finish includes tool", () => { + assert.equal(mapFinishReasonToAnthropic("tool_use"), "tool_use") + }) + + it("returns end_turn for unrecognised values", () => { + assert.equal(mapFinishReasonToAnthropic("stop"), "end_turn") + }) +}) + +// --------------------------------------------------------------------------- +// Unit: normalizeGeminiContents +// --------------------------------------------------------------------------- +describe("normalizeGeminiContents", () => { + it("returns empty array for non-array input", () => { + assert.deepEqual(normalizeGeminiContents(null), []) + assert.deepEqual(normalizeGeminiContents("string"), []) + }) + + it("converts user role and joins text parts", () => { + const contents = [{ role: "user", parts: [{ text: "hello" }] }] + assert.deepEqual(normalizeGeminiContents(contents), [{ role: "user", content: "hello" }]) + }) + + it("maps model role to assistant", () => { + const contents = 
[{ role: "model", parts: [{ text: "hi there" }] }] + assert.deepEqual(normalizeGeminiContents(contents), [{ role: "assistant", content: "hi there" }]) + }) + + it("joins multiple parts with double newline", () => { + const contents = [{ role: "user", parts: [{ text: "line one" }, { text: "line two" }] }] + assert.deepEqual(normalizeGeminiContents(contents), [{ role: "user", content: "line one\n\nline two" }]) + }) + + it("drops items with no text content", () => { + const contents = [ + { role: "user", parts: [{ text: "" }] }, + { role: "user", parts: [{ text: "kept" }] }, + ] + assert.deepEqual(normalizeGeminiContents(contents), [{ role: "user", content: "kept" }]) + }) +}) + +// --------------------------------------------------------------------------- +// Unit: extractGeminiSystemInstruction +// --------------------------------------------------------------------------- +describe("extractGeminiSystemInstruction", () => { + it("returns null for null/undefined input", () => { + assert.equal(extractGeminiSystemInstruction(null), null) + assert.equal(extractGeminiSystemInstruction(undefined), null) + }) + + it("returns trimmed string for string input", () => { + assert.equal(extractGeminiSystemInstruction(" be helpful "), "be helpful") + }) + + it("joins parts array", () => { + const si = { parts: [{ text: "be concise" }, { text: "and clear" }] } + assert.equal(extractGeminiSystemInstruction(si), "be concise\n\nand clear") + }) + + it("returns null for object without parts", () => { + assert.equal(extractGeminiSystemInstruction({ role: "system" }), null) + }) +}) + +// --------------------------------------------------------------------------- +// Unit: mapFinishReasonToGemini +// --------------------------------------------------------------------------- +describe("mapFinishReasonToGemini", () => { + it("returns STOP for undefined", () => { + assert.equal(mapFinishReasonToGemini(undefined), "STOP") + }) + + it("returns MAX_TOKENS when finish includes length", () 
=> { + assert.equal(mapFinishReasonToGemini("max_length"), "MAX_TOKENS") + }) + + it("returns STOP for tool_use", () => { + assert.equal(mapFinishReasonToGemini("tool_use"), "STOP") + }) + + it("returns STOP for end_turn", () => { + assert.equal(mapFinishReasonToGemini("end_turn"), "STOP") + }) +}) + +// --------------------------------------------------------------------------- +// Integration: POST /v1/messages (Anthropic Messages API) +// --------------------------------------------------------------------------- + +function createAnthropicClient(responseContent = "Hello from Anthropic.") { + return { + app: { log: async () => {} }, + tool: { ids: async () => ({ data: [] }) }, + config: { + providers: async () => ({ + data: { + providers: [ + { + id: "anthropic", + models: { "claude-3-5-sonnet": { id: "claude-3-5-sonnet", name: "Claude 3.5 Sonnet" } }, + }, + ], + }, + }), + }, + session: { + create: async () => ({ data: { id: "sess-ant-1" } }), + prompt: async () => ({ + data: { + parts: [{ type: "text", text: responseContent }], + info: { tokens: { input: 15, output: 10, reasoning: 0, cache: { read: 0, write: 0 } }, finish: "end_turn" }, + }, + }), + }, + } +} + +test("POST /v1/messages returns a well-formed Anthropic response", async () => { + const handler = createProxyFetchHandler(createAnthropicClient("Hi there!")) + const request = new Request("http://127.0.0.1:4010/v1/messages", { + method: "POST", + headers: { "content-type": "application/json" }, + body: JSON.stringify({ + model: "anthropic/claude-3-5-sonnet", + max_tokens: 1024, + messages: [{ role: "user", content: "Say hello." 
}], + }), + }) + + const response = await handler(request) + const body = await response.json() + + assert.equal(response.status, 200) + assert.equal(body.type, "message") + assert.equal(body.role, "assistant") + assert.ok(body.id.startsWith("msg_")) + assert.ok(Array.isArray(body.content)) + assert.equal(body.content[0].type, "text") + assert.equal(body.content[0].text, "Hi there!") + assert.equal(body.stop_reason, "end_turn") + assert.equal(body.usage.input_tokens, 15) + assert.equal(body.usage.output_tokens, 10) +}) + +test("POST /v1/messages system string is included in prompt", async () => { + let capturedSystem = null + const client = { + app: { log: async () => {} }, + tool: { ids: async () => ({ data: [] }) }, + config: { + providers: async () => ({ + data: { + providers: [{ id: "anthropic", models: { "claude-3-5-sonnet": { id: "claude-3-5-sonnet" } } }], + }, + }), + }, + session: { + create: async () => ({ data: { id: "sess-ant-sys" } }), + prompt: async ({ body }) => { + capturedSystem = body.system + return { + data: { + parts: [{ type: "text", text: "ok" }], + info: { tokens: { input: 1, output: 1, reasoning: 0, cache: { read: 0, write: 0 } }, finish: "end_turn" }, + }, + } + }, + }, + } + + const handler = createProxyFetchHandler(client) + const request = new Request("http://127.0.0.1:4010/v1/messages", { + method: "POST", + headers: { "content-type": "application/json" }, + body: JSON.stringify({ + model: "anthropic/claude-3-5-sonnet", + system: "You are a pirate.", + messages: [{ role: "user", content: "Hello." 
}], + }), + }) + + await handler(request) + assert.ok(capturedSystem?.includes("You are a pirate.")) +}) + +test("POST /v1/messages missing model returns Anthropic error format", async () => { + const handler = createProxyFetchHandler(createAnthropicClient()) + const request = new Request("http://127.0.0.1:4010/v1/messages", { + method: "POST", + headers: { "content-type": "application/json" }, + body: JSON.stringify({ messages: [{ role: "user", content: "hi" }] }), + }) + + const response = await handler(request) + const body = await response.json() + + assert.equal(response.status, 400) + assert.equal(body.type, "error") + assert.ok(body.error.type === "invalid_request_error") + assert.ok(body.error.message.includes("model")) +}) + +test("POST /v1/messages missing messages returns 400", async () => { + const handler = createProxyFetchHandler(createAnthropicClient()) + const request = new Request("http://127.0.0.1:4010/v1/messages", { + method: "POST", + headers: { "content-type": "application/json" }, + body: JSON.stringify({ model: "anthropic/claude-3-5-sonnet" }), + }) + + const response = await handler(request) + const body = await response.json() + + assert.equal(response.status, 400) + assert.equal(body.type, "error") +}) + +test("POST /v1/messages malformed JSON returns 400", async () => { + const handler = createProxyFetchHandler(createAnthropicClient()) + const request = new Request("http://127.0.0.1:4010/v1/messages", { + method: "POST", + headers: { "content-type": "application/json" }, + body: "{ bad json", + }) + + const response = await handler(request) + const body = await response.json() + + assert.equal(response.status, 400) + assert.equal(body.type, "error") +}) + +test("POST /v1/messages stream: true returns Anthropic SSE events", async () => { + const events = [ + { + type: "message.part.updated", + properties: { + part: { sessionID: "sess-123", type: "text" }, + delta: "Hello", + }, + }, + { + type: "message.part.updated", + properties: { + 
part: { sessionID: "sess-123", type: "text" }, + delta: " world", + }, + }, + { type: "session.idle", properties: { sessionID: "sess-123" } }, + ] + + const handler = createProxyFetchHandler(createStreamingClient(events)) + const request = new Request("http://127.0.0.1:4010/v1/messages", { + method: "POST", + headers: { "content-type": "application/json" }, + body: JSON.stringify({ + model: "gpt-4o", + stream: true, + messages: [{ role: "user", content: "hi" }], + }), + }) + + const response = await handler(request) + + assert.equal(response.status, 200) + assert.ok(response.headers.get("content-type")?.includes("text/event-stream")) + + const text = await response.text() + assert.ok(text.includes("message_start")) + assert.ok(text.includes("content_block_start")) + assert.ok(text.includes("content_block_delta")) + assert.ok(text.includes("Hello")) + assert.ok(text.includes(" world")) + assert.ok(text.includes("message_stop")) +}) + +test("POST /v1/messages stream: true with session.error emits SSE error event", async () => { + const events = [ + { + type: "session.error", + properties: { + sessionID: "sess-123", + error: { message: "Model overloaded" }, + }, + }, + { type: "session.idle", properties: { sessionID: "sess-123" } }, + ] + + const handler = createProxyFetchHandler(createStreamingClient(events)) + const request = new Request("http://127.0.0.1:4010/v1/messages", { + method: "POST", + headers: { "content-type": "application/json" }, + body: JSON.stringify({ + model: "gpt-4o", + stream: true, + messages: [{ role: "user", content: "hi" }], + }), + }) + + const response = await handler(request) + assert.equal(response.status, 200) + + const text = await response.text() + assert.ok(text.includes("error") || text.includes("Model overloaded")) +}) + +// --------------------------------------------------------------------------- +// Integration: POST /v1beta/models/:model:generateContent (Gemini API) +// 
--------------------------------------------------------------------------- + +function createGeminiClient(responseContent = "Hello from Gemini.") { + return { + app: { log: async () => {} }, + tool: { ids: async () => ({ data: [] }) }, + config: { + providers: async () => ({ + data: { + providers: [ + { + id: "google", + models: { "gemini-2.0-flash": { id: "gemini-2.0-flash", name: "Gemini 2.0 Flash" } }, + }, + ], + }, + }), + }, + session: { + create: async () => ({ data: { id: "sess-gem-1" } }), + prompt: async () => ({ + data: { + parts: [{ type: "text", text: responseContent }], + info: { tokens: { input: 12, output: 7, reasoning: 0, cache: { read: 0, write: 0 } }, finish: "end_turn" }, + }, + }), + }, + } +} + +test("POST /v1beta/models/gemini-2.0-flash:generateContent returns Gemini response", async () => { + const handler = createProxyFetchHandler(createGeminiClient("Gemini says hi!")) + const request = new Request("http://127.0.0.1:4010/v1beta/models/gemini-2.0-flash:generateContent", { + method: "POST", + headers: { "content-type": "application/json" }, + body: JSON.stringify({ + contents: [{ role: "user", parts: [{ text: "Say hi." 
}] }], + }), + }) + + const response = await handler(request) + const body = await response.json() + + assert.equal(response.status, 200) + assert.ok(Array.isArray(body.candidates)) + assert.equal(body.candidates[0].content.role, "model") + assert.equal(body.candidates[0].content.parts[0].text, "Gemini says hi!") + assert.equal(body.candidates[0].finishReason, "STOP") + assert.equal(body.usageMetadata.promptTokenCount, 12) + assert.equal(body.usageMetadata.candidatesTokenCount, 7) + assert.equal(body.usageMetadata.totalTokenCount, 19) +}) + +test("POST /v1beta/models/:model:generateContent systemInstruction is included", async () => { + let capturedSystem = null + const client = { + app: { log: async () => {} }, + tool: { ids: async () => ({ data: [] }) }, + config: { + providers: async () => ({ + data: { + providers: [{ id: "google", models: { "gemini-2.0-flash": { id: "gemini-2.0-flash" } } }], + }, + }), + }, + session: { + create: async () => ({ data: { id: "sess-gem-sys" } }), + prompt: async ({ body }) => { + capturedSystem = body.system + return { + data: { + parts: [{ type: "text", text: "ok" }], + info: { tokens: { input: 1, output: 1, reasoning: 0, cache: { read: 0, write: 0 } }, finish: "end_turn" }, + }, + } + }, + }, + } + + const handler = createProxyFetchHandler(client) + const request = new Request("http://127.0.0.1:4010/v1beta/models/gemini-2.0-flash:generateContent", { + method: "POST", + headers: { "content-type": "application/json" }, + body: JSON.stringify({ + contents: [{ role: "user", parts: [{ text: "Hello." }] }], + systemInstruction: { parts: [{ text: "You are a helpful assistant." 
}] }, + }), + }) + + await handler(request) + assert.ok(capturedSystem?.includes("You are a helpful assistant.")) +}) + +test("POST /v1beta/models/:model:generateContent missing contents returns 400", async () => { + const handler = createProxyFetchHandler(createGeminiClient()) + const request = new Request("http://127.0.0.1:4010/v1beta/models/gemini-2.0-flash:generateContent", { + method: "POST", + headers: { "content-type": "application/json" }, + body: JSON.stringify({ generationConfig: { maxOutputTokens: 100 } }), + }) + + const response = await handler(request) + const body = await response.json() + + assert.equal(response.status, 400) + assert.ok(body.error.message.includes("contents")) +}) + +test("POST /v1beta/models/:model:generateContent malformed JSON returns 400", async () => { + const handler = createProxyFetchHandler(createGeminiClient()) + const request = new Request("http://127.0.0.1:4010/v1beta/models/gemini-2.0-flash:generateContent", { + method: "POST", + headers: { "content-type": "application/json" }, + body: "{ not json", + }) + + const response = await handler(request) + + assert.equal(response.status, 400) +}) + +test("POST /v1beta/models/:model:streamGenerateContent returns NDJSON stream", async () => { + const events = [ + { + type: "message.part.updated", + properties: { + part: { sessionID: "sess-123", type: "text" }, + delta: "Gem", + }, + }, + { + type: "message.part.updated", + properties: { + part: { sessionID: "sess-123", type: "text" }, + delta: "ini", + }, + }, + { type: "session.idle", properties: { sessionID: "sess-123" } }, + ] + + // Use streaming client but swap provider to google + const streamingClient = createStreamingClient(events) + streamingClient.config = { + providers: async () => ({ + data: { + providers: [ + { id: "google", models: { "gemini-2.0-flash": { id: "gemini-2.0-flash", name: "Gemini 2.0 Flash" } } }, + ], + }, + }), + } + + const handler = createProxyFetchHandler(streamingClient) + const request = new 
Request( + "http://127.0.0.1:4010/v1beta/models/gemini-2.0-flash:streamGenerateContent", + { + method: "POST", + headers: { "content-type": "application/json" }, + body: JSON.stringify({ + contents: [{ role: "user", parts: [{ text: "Stream this." }] }], + }), + }, + ) + + const response = await handler(request) + + assert.equal(response.status, 200) + assert.ok(response.headers.get("content-type")?.includes("application/json")) + + const text = await response.text() + // Should contain NDJSON lines with candidates + assert.ok(text.includes("candidates")) + assert.ok(text.includes("Gem")) + assert.ok(text.includes("ini")) +})
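The Gemini path-matching rule can also be sanity-checked standalone. Since `geminiModelFromPath` is not exported, this re-implements the documented behavior: the model segment may itself contain slashes (provider-qualified IDs) and colons (ollama-style tags such as `qwen3.5:9b`):

```js
// Extract the model segment from a Gemini-style route. A greedy match plus an
// anchored action suffix lets "provider/model" and ":tag" model IDs through.
function modelFromGeminiPath(pathname) {
  const match = pathname.match(/^\/v1beta\/models\/(.+):(?:generateContent|streamGenerateContent)$/)
  return match ? match[1] : null
}

console.log(modelFromGeminiPath("/v1beta/models/google/gemini-2.0-flash:generateContent"))
// "google/gemini-2.0-flash"
console.log(modelFromGeminiPath("/v1beta/models/ollama/qwen3.5:9b:streamGenerateContent"))
// "ollama/qwen3.5:9b"
console.log(modelFromGeminiPath("/v1beta/models/foo"))
// null
```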