Skip to content

fix(openai): harden Chat-to-Responses compatibility#5772

Merged
Calcium-Ion merged 1 commit into
mainfrom
feat/chat-responses-compat
Jun 27, 2026
Merged

fix(openai): harden Chat-to-Responses compatibility#5772
Calcium-Ion merged 1 commit into
mainfrom
feat/chat-responses-compat

Conversation

@Calcium-Ion

@Calcium-Ion Calcium-Ion commented Jun 27, 2026

Copy link
Copy Markdown
Member

⚠️ 提交说明 / PR Notice

Important

  • 请提供人工撰写的简洁摘要,避免直接粘贴未经整理的 AI 输出。

📝 变更描述 / Description

完善 ChatCompletions -> Responses -> ChatCompletions 兼容链路,将 Responses 流式回转逻辑收敛到 service/openaicompat 的状态机中,relay 层只负责 SSE 读写、格式回转和 usage 回传。

主要修复:

  • 保留 assistant 正文与 tool_calls 并存,不再因为工具调用清空正文。
  • Responses streaming 工具参数优先按 output_index 归位,并兼容 item_id / call_id
  • 支持 delta 先到、item 后到、缺少 call_id、EOF finalize 等场景。
  • 映射 response.incomplete 的结束原因:max_output_tokens -> lengthcontent_filter -> content_filter
  • 支持 reasoning summary/text delta 输出到 Chat reasoning 字段。
  • 支持 function_callcustom_tool_call,以及对应参数 delta。
  • 非流式客户端遇到上游 SSE 时可 buffered 聚合为 Chat JSON。
  • Claude/Gemini 回转时允许同一 assistant 输出同时包含 text 和 tool/function parts。

Related to #5745. That PR surfaced overlapping Responses streaming tool-call compatibility gaps; this PR implements the fix independently around a shared state machine and adds broader deterministic/relay coverage.

🚀 变更类型 / Type of change

  • 🐛 Bug 修复 (Bug fix) - 请关联对应 Issue,避免将设计取舍、理解偏差或预期不一致直接归类为 bug
  • ✨ 新功能 (New feature) - 重大特性建议先通过 Issue 沟通
  • ⚡ 性能优化 / 重构 (Refactor)
  • 📝 文档更新 (Documentation)

🔗 关联任务 / Related Issue

✅ 提交前检查项 / Checklist

  • 人工确认: 我已亲自整理并撰写此描述,没有直接粘贴未经处理的 AI 输出。
  • 非重复提交: 我已搜索现有的 IssuesPRs,确认不是重复提交。
  • Bug fix 说明: 若此 PR 标记为 Bug fix,我已提交或关联对应 Issue,且不会将设计取舍、预期不一致或理解偏差直接归类为 bug。
  • 变更理解: 我已理解这些更改的工作原理及可能影响。
  • 范围聚焦: 本 PR 未包含任何与当前任务无关的代码改动。
  • 本地验证: 已在本地运行并通过测试或手动验证,维护者可以据此复核结果。
  • 安全合规: 代码中无敏感凭据,且符合项目代码规范。

📸 运行证明 / Proof of Work

已运行并通过:

go test ./service/openaicompat ./relay/channel/openai ./relay/...

关键覆盖:

  • Chat request -> Responses request: system/developer instructions、assistant text + tool calls、tool output、multimodal input、n>1 拒绝。
  • 非流式 Responses -> Chat: text + function call 共存、reasoning summary、incomplete finish reason、buffered SSE 聚合。
  • 流式 Responses -> Chat: output_index 工具参数归位、delta 先到 item 后到、custom tool、reasoning delta、terminal done/completed/incomplete/failed、EOF finalize。
  • Relay OpenAI SSE 输出顺序:role/start chunk、content/tool delta、finish chunk、usage chunk、[DONE]

Summary by CodeRabbit

  • New Features

    • Added broader support for Responses-based chat conversions, including buffered handling for streamed results and improved output formatting.
    • Improved handling of reasoning text, tool calls, and completion details in chat responses.
  • Bug Fixes

    • Fixed streaming behavior so non-streaming clients can now receive buffered results from streamed upstream responses.
    • Corrected finish-reason and usage reporting for incomplete or partial responses.
    • Improved conversion of tool-call arguments and content ordering for more consistent output.

Add a shared Responses-to-Chat stream state machine and use it from the OpenAI relay path. Preserve assistant text alongside tool calls, bind tool argument deltas by output_index, map incomplete finish reasons, support reasoning/custom tool events, and buffer upstream SSE for non-stream Chat clients.

Add deterministic service tests and relay SSE tests for the conversion path.

Related to #5745.
@coderabbitai

coderabbitai Bot commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Walkthrough

This PR updates Responses-to-chat compatibility across shared response mappings, streaming conversion, buffered handling, relay routing, and Claude/Gemini output shaping. It also adds tests for request/response conversion, stream events, buffered reconstruction, and local live-test artifact handling.

Changes

Responses-to-chat compatibility

Layer / File(s) Summary
Response contracts and finish reasons
dto/openai_response.go, service/openai_chat_responses_compat.go, service/openaicompat/responses_to_chat.go, service/openaicompat/chat_responses_compat_test.go
IncompleteDetails.reason replaces reasoning, a finish-reason wrapper is added, and Responses status, usage, and reasoning are mapped into chat-completion responses with compatibility tests.
Streaming state and buffered reconstruction
service/openaicompat/responses_to_chat.go, service/openaicompat/chat_responses_compat_test.go, .gitignore
Responses stream events are accumulated into chat chunks, finalized, and reconstructed from buffered output; tests cover deltas, terminal output, finalization, and buffered backfilling, and local live-test artifacts are ignored.
Claude and Gemini response shaping
service/convert.go
Claude and Gemini conversions now append text before tool parts and parse tool arguments with common.Unmarshal fallbacks.
Relay routing and handlers
relay/chat_completions_via_responses.go, relay/channel/openai/chat_via_responses.go, relay/channel/openai/chat_via_responses_test.go
Routing now separates client streaming intent from upstream SSE, adds buffered stream handling, and tests both streamed and buffered outputs.

Sequence Diagram(s)

sequenceDiagram
  participant chatCompletionsViaResponses
  participant OaiResponsesToChatStreamHandler
  participant OaiResponsesToChatBufferedStreamHandler
  participant ResponsesStreamEventToChatChunks
  participant FinalizeResponsesToChatStream
  participant ResponsesBufferedAccumulator

  chatCompletionsViaResponses->>OaiResponsesToChatStreamHandler: clientStream && upstreamStream
  chatCompletionsViaResponses->>OaiResponsesToChatBufferedStreamHandler: upstreamStream && !clientStream
  OaiResponsesToChatStreamHandler->>ResponsesStreamEventToChatChunks: convert each stream event
  ResponsesStreamEventToChatChunks-->>OaiResponsesToChatStreamHandler: chat chunks
  OaiResponsesToChatStreamHandler->>FinalizeResponsesToChatStream: emit remaining chunks
  OaiResponsesToChatBufferedStreamHandler->>ResponsesBufferedAccumulator: ProcessEvent(data lines)
  ResponsesBufferedAccumulator-->>OaiResponsesToChatBufferedStreamHandler: rebuilt output
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • QuantumNous/new-api#1484: Overlaps with the service/convert.go changes that adjust OpenAI-to-Gemini response conversion and tool argument parsing.
  • QuantumNous/new-api#2837: Related to the Responses-to-chat streaming conversion path and reasoning/tool-chunk handling changed in openaicompat.
  • QuantumNous/new-api#2889: Touches the same relay/channel OpenAI Responses handling path and stream output behavior.

Suggested reviewers

  • seefs001

Poem

A rabbit hopped through streams of light,
and buffered crumbs turned out just right.
🐇 JSON shimmered, chunks sang true,
with carrots, tools, and finish reasons too.
The moon winked once at “data: [DONE]”.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 2.44% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title is concise and accurately reflects the main change: hardening OpenAI Chat-to-Responses compatibility.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/chat-responses-compat

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (1)
relay/channel/openai/chat_via_responses_test.go (1)

10-16: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Use assert for non-fatal value checks.

Keep require for setup/fatal checks, but switch value/content assertions to assert so one mismatch does not hide the rest. As per coding guidelines, “New or substantially rewritten Go backend tests MUST use require for setup and fatal assertions, and assert for non-fatal value checks.”

Example adjustment
 	"github.com/gin-gonic/gin"
+	"github.com/stretchr/testify/assert"
 	"github.com/stretchr/testify/require"
 )
...
-	require.Equal(t, 2, usage.PromptTokens)
-	require.Equal(t, 3, usage.CompletionTokens)
-	require.Equal(t, 5, usage.TotalTokens)
+	assert.Equal(t, 2, usage.PromptTokens)
+	assert.Equal(t, 3, usage.CompletionTokens)
+	assert.Equal(t, 5, usage.TotalTokens)
...
-	require.Contains(t, got, `"role":"assistant"`)
+	assert.Contains(t, got, `"role":"assistant"`)

Also applies to: 63-86, 106-116

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@relay/channel/openai/chat_via_responses_test.go` around lines 10 - 16, The
test file currently uses require for value/content checks in
chat_via_responses_test.go, which makes later assertions stop running after the
first mismatch. Keep require only for setup and fatal preconditions, and change
the non-fatal checks in the affected test cases (including the blocks around the
mentioned helper/test symbols) to assert so multiple failures can be reported in
one run.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@relay/channel/openai/chat_via_responses.go`:
- Around line 138-146: The terminal response handling in chat_via_responses.go
only backfills defaults when finalResponse is nil, so partial response objects
can still miss metadata like created and model. Update the response
normalization in the response.done/response.completed path to also fill any
missing fields on an existing finalResponse, using the same defaults currently
set in the nil case (helper.GetResponseID, time.Now().Unix, and
info.UpstreamModelName) before calling accumulator.SupplementResponseOutput.

In `@relay/chat_completions_via_responses.go`:
- Line 149: The upstream stream detection in the routing logic is too strict
because `HasPrefix` on `httpResp.Header.Get("Content-Type")` misses valid SSE
values with case differences or parameters. Update the `upstreamStream` check in
`chat_completions_via_responses.go` to parse the header as a media type (using
the existing or new `mime` import) and compare the media type against
`text/event-stream`, so SSE responses are routed correctly before the JSON
handler.

In `@service/openaicompat/responses_to_chat.go`:
- Around line 785-795: The buffered delta handling in responses_to_chat.go is
storing the same tool-argument fragments under both pendingByOutputIndex and
pendingByItemID, which can cause double replay when ensureTool() later resolves
both keys. Update the responsesEventFunctionArgsDelta and
responsesEventCustomToolInputDelta path to use one canonical pending key for
buffered mode, matching the streaming behavior, and keep the logic in a single
place around a.findToolIndex, pendingByOutputIndex, and pendingByItemID.
- Around line 806-841: BuildOutput currently serializes only a.tools, so
arg-only tool calls left in pendingByOutputIndex/pendingByItemID are dropped in
the non-stream SSE path. Update ResponsesBufferedAccumulator.BuildOutput to
synthesize any remaining buffered tools from those pending maps before
assembling the final dto.ResponsesOutput list, using the same recovery logic as
FinalizeResponsesToChatStream. Keep the existing reasoning/text handling intact
and ensure unmatched tool deltas are converted into function-call outputs even
when no output_item.added/done arrives.
- Around line 437-445: The unmatched tool-delta handling in
findToolForEvent/ensureToolForEvent is queuing the same delta twice when an
event has both OutputIndex and ItemID. Update the logic in responses_to_chat.go
so a delta is stored in only one pending bucket for a given event—prefer a
single canonical key or track whether one identifier was already used before
appending to the second map. Make the change in the tool lookup path around
findToolForEvent and ensureToolForEvent so draining pendingArgsByOutputIndex and
pendingArgsByItemID cannot emit duplicate arguments for the same delta.

---

Nitpick comments:
In `@relay/channel/openai/chat_via_responses_test.go`:
- Around line 10-16: The test file currently uses require for value/content
checks in chat_via_responses_test.go, which makes later assertions stop running
after the first mismatch. Keep require only for setup and fatal preconditions,
and change the non-fatal checks in the affected test cases (including the blocks
around the mentioned helper/test symbols) to assert so multiple failures can be
reported in one run.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: de4d5289-dd6d-476c-a2fa-69e2ba17f6fa

📥 Commits

Reviewing files that changed from the base of the PR and between 626dadb and 9250d1c.

📒 Files selected for processing (9)
  • .gitignore
  • dto/openai_response.go
  • relay/channel/openai/chat_via_responses.go
  • relay/channel/openai/chat_via_responses_test.go
  • relay/chat_completions_via_responses.go
  • service/convert.go
  • service/openai_chat_responses_compat.go
  • service/openaicompat/chat_responses_compat_test.go
  • service/openaicompat/responses_to_chat.go

Comment on lines +138 to +146
if finalResponse == nil {
finalResponse = &dto.OpenAIResponsesResponse{
ID: helper.GetResponseID(c),
CreatedAt: int(time.Now().Unix()),
Model: info.UpstreamModelName,
Status: []byte(`"completed"`),
}
}
accumulator.SupplementResponseOutput(finalResponse)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Backfill missing metadata on partial terminal responses.

When response.done/response.completed includes a response object with only status/usage/output fields, this skips defaults and can emit created:0 or an empty model in the Chat response.

Proposed fix
 	if finalResponse == nil {
-		finalResponse = &dto.OpenAIResponsesResponse{
-			ID:        helper.GetResponseID(c),
-			CreatedAt: int(time.Now().Unix()),
-			Model:     info.UpstreamModelName,
-			Status:    []byte(`"completed"`),
-		}
+		finalResponse = &dto.OpenAIResponsesResponse{}
+	}
+	if finalResponse.ID == "" {
+		finalResponse.ID = helper.GetResponseID(c)
+	}
+	if finalResponse.CreatedAt == 0 {
+		finalResponse.CreatedAt = int(time.Now().Unix())
+	}
+	if finalResponse.Model == "" {
+		finalResponse.Model = info.UpstreamModelName
+	}
+	if len(finalResponse.Status) == 0 {
+		finalResponse.Status = []byte(`"completed"`)
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if finalResponse == nil {
finalResponse = &dto.OpenAIResponsesResponse{
ID: helper.GetResponseID(c),
CreatedAt: int(time.Now().Unix()),
Model: info.UpstreamModelName,
Status: []byte(`"completed"`),
}
}
accumulator.SupplementResponseOutput(finalResponse)
if finalResponse == nil {
finalResponse = &dto.OpenAIResponsesResponse{}
}
if finalResponse.ID == "" {
finalResponse.ID = helper.GetResponseID(c)
}
if finalResponse.CreatedAt == 0 {
finalResponse.CreatedAt = int(time.Now().Unix())
}
if finalResponse.Model == "" {
finalResponse.Model = info.UpstreamModelName
}
if len(finalResponse.Status) == 0 {
finalResponse.Status = []byte(`"completed"`)
}
accumulator.SupplementResponseOutput(finalResponse)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@relay/channel/openai/chat_via_responses.go` around lines 138 - 146, The
terminal response handling in chat_via_responses.go only backfills defaults when
finalResponse is nil, so partial response objects can still miss metadata like
created and model. Update the response normalization in the
response.done/response.completed path to also fill any missing fields on an
existing finalResponse, using the same defaults currently set in the nil case
(helper.GetResponseID, time.Now().Unix, and info.UpstreamModelName) before
calling accumulator.SupplementResponseOutput.

httpResp = resp.(*http.Response)
info.IsStream = info.IsStream || strings.HasPrefix(httpResp.Header.Get("Content-Type"), "text/event-stream")
clientStream := info.IsStream
upstreamStream := strings.HasPrefix(httpResp.Header.Get("Content-Type"), "text/event-stream")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Parse Content-Type as a media type before routing.

HasPrefix misses valid SSE headers like Text/Event-Stream; charset=utf-8, which would route an upstream SSE body into the non-stream JSON handler.

Proposed fix
-	upstreamStream := strings.HasPrefix(httpResp.Header.Get("Content-Type"), "text/event-stream")
+	mediaType, _, parseErr := mime.ParseMediaType(httpResp.Header.Get("Content-Type"))
+	upstreamStream := parseErr == nil && strings.EqualFold(mediaType, "text/event-stream")

Also add the mime import if it is not already present.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
upstreamStream := strings.HasPrefix(httpResp.Header.Get("Content-Type"), "text/event-stream")
mediaType, _, parseErr := mime.ParseMediaType(httpResp.Header.Get("Content-Type"))
upstreamStream := parseErr == nil && strings.EqualFold(mediaType, "text/event-stream")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@relay/chat_completions_via_responses.go` at line 149, The upstream stream
detection in the routing logic is too strict because `HasPrefix` on
`httpResp.Header.Get("Content-Type")` misses valid SSE values with case
differences or parameters. Update the `upstreamStream` check in
`chat_completions_via_responses.go` to parse the header as a media type (using
the existing or new `mime` import) and compare the media type against
`text/event-stream`, so SSE responses are routed correctly before the JSON
handler.

Comment on lines +437 to +445
tool := s.findToolForEvent(event)
if tool == nil {
if event.OutputIndex != nil {
s.pendingArgsByOutputIndex[*event.OutputIndex] += event.Delta
}
if itemID := strings.TrimSpace(event.ItemID); itemID != "" {
s.pendingArgsByItemID[itemID] += event.Delta
}
return nil

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Don't queue the same tool delta twice.

Line 439-444 appends one unmatched delta to both pendingArgsByOutputIndex and pendingArgsByItemID when an event carries both identifiers. ensureToolForEvent() later drains both maps into the same tool, so the emitted arguments get duplicated.

Suggested fix
-	if event.OutputIndex != nil {
-		s.pendingArgsByOutputIndex[*event.OutputIndex] += event.Delta
-	}
-	if itemID := strings.TrimSpace(event.ItemID); itemID != "" {
-		s.pendingArgsByItemID[itemID] += event.Delta
-	}
+	if event.OutputIndex != nil {
+		s.pendingArgsByOutputIndex[*event.OutputIndex] += event.Delta
+	} else if itemID := strings.TrimSpace(event.ItemID); itemID != "" {
+		s.pendingArgsByItemID[itemID] += event.Delta
+	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
tool := s.findToolForEvent(event)
if tool == nil {
if event.OutputIndex != nil {
s.pendingArgsByOutputIndex[*event.OutputIndex] += event.Delta
}
if itemID := strings.TrimSpace(event.ItemID); itemID != "" {
s.pendingArgsByItemID[itemID] += event.Delta
}
return nil
tool := s.findToolForEvent(event)
if tool == nil {
if event.OutputIndex != nil {
s.pendingArgsByOutputIndex[*event.OutputIndex] += event.Delta
} else if itemID := strings.TrimSpace(event.ItemID); itemID != "" {
s.pendingArgsByItemID[itemID] += event.Delta
}
return nil
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@service/openaicompat/responses_to_chat.go` around lines 437 - 445, The
unmatched tool-delta handling in findToolForEvent/ensureToolForEvent is queuing
the same delta twice when an event has both OutputIndex and ItemID. Update the
logic in responses_to_chat.go so a delta is stored in only one pending bucket
for a given event—prefer a single canonical key or track whether one identifier
was already used before appending to the second map. Make the change in the tool
lookup path around findToolForEvent and ensureToolForEvent so draining
pendingArgsByOutputIndex and pendingArgsByItemID cannot emit duplicate arguments
for the same delta.

Comment on lines +785 to +795
case responsesEventFunctionArgsDelta, responsesEventCustomToolInputDelta:
if idx, ok := a.findToolIndex(event); ok {
a.tools[idx].Arguments.WriteString(event.Delta)
return
}
if event.OutputIndex != nil {
a.pendingByOutputIndex[*event.OutputIndex] += event.Delta
}
if itemID := strings.TrimSpace(event.ItemID); itemID != "" {
a.pendingByItemID[itemID] += event.Delta
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Use one canonical pending key in buffered mode too.

Line 790-795 has the same double-buffer path as the streaming state. If a later response.output_item.added resolves both output_index and item_id, ensureTool() replays both pending entries and the buffered JSON path returns duplicated tool arguments.

Suggested fix
-		if event.OutputIndex != nil {
-			a.pendingByOutputIndex[*event.OutputIndex] += event.Delta
-		}
-		if itemID := strings.TrimSpace(event.ItemID); itemID != "" {
-			a.pendingByItemID[itemID] += event.Delta
-		}
+		if event.OutputIndex != nil {
+			a.pendingByOutputIndex[*event.OutputIndex] += event.Delta
+		} else if itemID := strings.TrimSpace(event.ItemID); itemID != "" {
+			a.pendingByItemID[itemID] += event.Delta
+		}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
case responsesEventFunctionArgsDelta, responsesEventCustomToolInputDelta:
if idx, ok := a.findToolIndex(event); ok {
a.tools[idx].Arguments.WriteString(event.Delta)
return
}
if event.OutputIndex != nil {
a.pendingByOutputIndex[*event.OutputIndex] += event.Delta
}
if itemID := strings.TrimSpace(event.ItemID); itemID != "" {
a.pendingByItemID[itemID] += event.Delta
}
case responsesEventFunctionArgsDelta, responsesEventCustomToolInputDelta:
if idx, ok := a.findToolIndex(event); ok {
a.tools[idx].Arguments.WriteString(event.Delta)
return
}
if event.OutputIndex != nil {
a.pendingByOutputIndex[*event.OutputIndex] += event.Delta
} else if itemID := strings.TrimSpace(event.ItemID); itemID != "" {
a.pendingByItemID[itemID] += event.Delta
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@service/openaicompat/responses_to_chat.go` around lines 785 - 795, The
buffered delta handling in responses_to_chat.go is storing the same
tool-argument fragments under both pendingByOutputIndex and pendingByItemID,
which can cause double replay when ensureTool() later resolves both keys. Update
the responsesEventFunctionArgsDelta and responsesEventCustomToolInputDelta path
to use one canonical pending key for buffered mode, matching the streaming
behavior, and keep the logic in a single place around a.findToolIndex,
pendingByOutputIndex, and pendingByItemID.

Comment on lines +806 to +841
func (a *ResponsesBufferedAccumulator) BuildOutput() []dto.ResponsesOutput {
if a == nil {
return nil
}
out := make([]dto.ResponsesOutput, 0, 2+len(a.tools))
if a.reasoning.Len() > 0 {
out = append(out, dto.ResponsesOutput{
Type: responsesOutputTypeReasoning,
Content: []dto.ResponsesOutputContent{
{Type: "summary_text", Text: a.reasoning.String()},
},
})
}
if a.text.Len() > 0 {
out = append(out, dto.ResponsesOutput{
Type: responsesOutputTypeMessage,
Role: "assistant",
Content: []dto.ResponsesOutputContent{
{Type: "output_text", Text: a.text.String()},
},
})
}
for _, tool := range a.tools {
if tool == nil {
continue
}
argsRaw, _ := common.Marshal(tool.Arguments.String())
out = append(out, dto.ResponsesOutput{
Type: responsesOutputTypeFunctionCall,
ID: tool.ItemID,
CallId: tool.CallID,
Name: tool.Name,
Arguments: argsRaw,
})
}
return out

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Buffered reconstruction still drops arg-only tools.

BuildOutput() only serializes a.tools. But ProcessEvent() can leave unmatched deltas in pendingByOutputIndex / pendingByItemID when the stream ends before any output_item.added/done arrives, so the non-stream SSE path loses that tool call entirely.

Suggested fix
 func (a *ResponsesBufferedAccumulator) BuildOutput() []dto.ResponsesOutput {
 	if a == nil {
 		return nil
 	}
+	a.flushPendingTools()
 	out := make([]dto.ResponsesOutput, 0, 2+len(a.tools))

That helper should synthesize buffered tools from the pending maps before the final output is built, mirroring FinalizeResponsesToChatStream().

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func (a *ResponsesBufferedAccumulator) BuildOutput() []dto.ResponsesOutput {
if a == nil {
return nil
}
out := make([]dto.ResponsesOutput, 0, 2+len(a.tools))
if a.reasoning.Len() > 0 {
out = append(out, dto.ResponsesOutput{
Type: responsesOutputTypeReasoning,
Content: []dto.ResponsesOutputContent{
{Type: "summary_text", Text: a.reasoning.String()},
},
})
}
if a.text.Len() > 0 {
out = append(out, dto.ResponsesOutput{
Type: responsesOutputTypeMessage,
Role: "assistant",
Content: []dto.ResponsesOutputContent{
{Type: "output_text", Text: a.text.String()},
},
})
}
for _, tool := range a.tools {
if tool == nil {
continue
}
argsRaw, _ := common.Marshal(tool.Arguments.String())
out = append(out, dto.ResponsesOutput{
Type: responsesOutputTypeFunctionCall,
ID: tool.ItemID,
CallId: tool.CallID,
Name: tool.Name,
Arguments: argsRaw,
})
}
return out
func (a *ResponsesBufferedAccumulator) BuildOutput() []dto.ResponsesOutput {
if a == nil {
return nil
}
a.flushPendingTools()
out := make([]dto.ResponsesOutput, 0, 2+len(a.tools))
if a.reasoning.Len() > 0 {
out = append(out, dto.ResponsesOutput{
Type: responsesOutputTypeReasoning,
Content: []dto.ResponsesOutputContent{
{Type: "summary_text", Text: a.reasoning.String()},
},
})
}
if a.text.Len() > 0 {
out = append(out, dto.ResponsesOutput{
Type: responsesOutputTypeMessage,
Role: "assistant",
Content: []dto.ResponsesOutputContent{
{Type: "output_text", Text: a.text.String()},
},
})
}
for _, tool := range a.tools {
if tool == nil {
continue
}
argsRaw, _ := common.Marshal(tool.Arguments.String())
out = append(out, dto.ResponsesOutput{
Type: responsesOutputTypeFunctionCall,
ID: tool.ItemID,
CallId: tool.CallID,
Name: tool.Name,
Arguments: argsRaw,
})
}
return out
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@service/openaicompat/responses_to_chat.go` around lines 806 - 841,
BuildOutput currently serializes only a.tools, so arg-only tool calls left in
pendingByOutputIndex/pendingByItemID are dropped in the non-stream SSE path.
Update ResponsesBufferedAccumulator.BuildOutput to synthesize any remaining
buffered tools from those pending maps before assembling the final
dto.ResponsesOutput list, using the same recovery logic as
FinalizeResponsesToChatStream. Keep the existing reasoning/text handling intact
and ensure unmatched tool deltas are converted into function-call outputs even
when no output_item.added/done arrives.

@Calcium-Ion Calcium-Ion merged commit 3a506f5 into main Jun 27, 2026
2 checks passed
@Calcium-Ion Calcium-Ion deleted the feat/chat-responses-compat branch June 28, 2026 05:42
@coderabbitai coderabbitai Bot mentioned this pull request Jun 28, 2026
11 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant