Fix OpenAI content encoding for thinking parts and empty arrays#428

Open
laudney wants to merge 3 commits into agentjido:main from mmonad:fix/openai-thinking-content-encoding
Conversation

@laudney (Contributor) commented Feb 16, 2026

Summary

  • Strip :thinking ContentParts in the OpenAI encoder — reasoning models (e.g. gpt-oss on vLLM) emit chain-of-thought thinking parts that have no OpenAI encoding. The catch-all returns nil, but an explicit clause makes intent clear.
  • Collapse empty content arrays to "" — when all parts are filtered (e.g. assistant message with only :thinking before tool calls), content: [] is rejected by vLLM's strict Pydantic validation. Now normalizes to "" instead.
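The first fix can be sketched roughly as follows. Module and struct shapes are illustrative, not the library's actual code; the point is the explicit `:thinking` clause sitting above the catch-all:

```elixir
defmodule ThinkingPartSketch do
  # Explicit clause: :thinking parts have no OpenAI representation,
  # so return nil (the part is filtered out downstream).
  def encode_content_part(%{type: :thinking}), do: nil

  def encode_content_part(%{type: :text, text: text}),
    do: %{"type" => "text", "text" => text}

  # Catch-all, as before: anything unknown is also dropped.
  def encode_content_part(_other), do: nil
end
```

Behaviorally this is the same as letting the catch-all handle `:thinking`; the dedicated clause just documents that the drop is deliberate.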

Test plan

  • Tested with gpt-oss-12b on vLLM — multi-turn tool calling with thinking content now works
  • Tested with Qwen3-Coder on vLLM — no regression
  • Add unit test for :thinking content part encoding
  • Add unit test for empty content array normalization

L.B.R. added 3 commits February 13, 2026 17:29
Gemini 3 models require thoughtSignature to be echoed back on
functionCall parts in conversation history. Without it, the API
returns a 400 error.

- Add provider_meta field to ToolCall struct for opaque provider data
- Capture thoughtSignature from Gemini response parts (streaming and
  non-streaming)
- Propagate through StreamChunk metadata, accumulation, and
  normalization pipelines
- Re-attach as thoughtSignature on functionCall parts when encoding
  requests back to Gemini

Includes temporary debug logging in convert_tool_call_to_function_call.
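The capture/re-attach round trip described above can be sketched like this. Function names and the exact part shapes are hypothetical; only the `provider_meta` field and the `thoughtSignature` key come from the commit message:

```elixir
defmodule ThoughtSignatureSketch do
  # Decode a Gemini functionCall part, stashing the opaque
  # thoughtSignature (if any) in provider_meta.
  def decode_part(%{"functionCall" => call} = part) do
    %{
      name: call["name"],
      arguments: call["args"],
      provider_meta: Map.take(part, ["thoughtSignature"])
    }
  end

  # Encode a tool call back into a functionCall part, echoing the
  # thoughtSignature so Gemini 3 accepts the conversation history.
  def encode_tool_call(%{name: name, arguments: args, provider_meta: meta}) do
    part = %{"functionCall" => %{"name" => name, "args" => args}}

    case meta do
      %{"thoughtSignature" => sig} -> Map.put(part, "thoughtSignature", sig)
      _ -> part
    end
  end
end
```

Decoding then re-encoding a part should be lossless with respect to the signature, which is exactly what the 400 error indicates the API requires.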

Two bugs in build_request_body prevented multi-turn tool calling:

1. Assistant messages with tool_calls were skipped entirely, so the
   input array had function_call_output items with no matching
   function_call items. Now serializes tool calls as function_call
   input items per the Responses API spec.

2. previous_response_id was sent without store: true, so OpenAI
   couldn't look up the referenced response. Now sets store: true
   alongside previous_response_id.
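A minimal sketch of both fixes, assuming tool calls carry an id, a name, and already-serialized JSON arguments (helper names are hypothetical):

```elixir
defmodule ResponsesBodySketch do
  # Fix 1: emit a function_call input item for each assistant tool call,
  # so every function_call_output has a matching function_call.
  def tool_call_to_input_item(%{id: call_id, name: name, arguments: json_args}) do
    %{
      "type" => "function_call",
      "call_id" => call_id,
      "name" => name,
      "arguments" => json_args
    }
  end

  # Fix 2: chaining on previous_response_id only works if the referenced
  # response was stored, so set store: true alongside it.
  def maybe_chain(body, nil), do: body

  def maybe_chain(body, previous_response_id) do
    body
    |> Map.put("previous_response_id", previous_response_id)
    |> Map.put("store", true)
  end
end
```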

Two issues in the OpenAI content encoder:

1. :thinking ContentParts (from reasoning models like gpt-oss) have no
   encoding clause. The catch-all returns nil, which gets filtered out,
   potentially leaving content: []. Add an explicit clause that returns
   nil (strip thinking from OpenAI encoding since the format has no
   standard representation).

2. When all content parts are filtered (e.g. assistant message with only
   :thinking content before tool calls), the encoded content becomes [].
   vLLM's strict Pydantic validation rejects "content": [] — it expects
   null or a string. Collapse empty arrays to "" after filtering.

Rename maybe_flatten_single_text to normalize_encoded_content to reflect
its expanded responsibility.
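The renamed helper might look roughly like this; the single-text flattening clause is an assumption inferred from the old name `maybe_flatten_single_text`, and only the `[] -> ""` clause is the new behavior:

```elixir
defmodule NormalizeSketch do
  # Old responsibility: flatten a lone text part to a bare string.
  def normalize_encoded_content([%{"type" => "text", "text" => text}]), do: text

  # New responsibility: collapse an empty array to "" so strict
  # validators (e.g. vLLM's Pydantic models) accept the message.
  def normalize_encoded_content([]), do: ""

  def normalize_encoded_content(parts), do: parts
end
```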
@mikehostetler added the needs_work (Changes requested before merge) label on Feb 16, 2026