Fix encoding of channel, recipient, and constrain by bbrowning · Pull Request #114 · openai/harmony

bbrowning · 2026-04-17T16:07:11Z

We're seeing a lot of failures with gpt-oss-20b and gpt-oss-120b models in vLLM where we get <|channel|> markers leaking into tool call names and the model outputting reasoning after <|constrain|> tokens in very long multi-turn conversations.

These are fairly trivial to reproduce with gpt-oss-20b using BFCL multi_turn_base as a simple eval, and occasionally show up with gpt-oss-120b as well. In SWE Bench Verified with gpt-oss-120b, this shows up a lot more.

Here's an example of the kind of errors we see across 500 tests cases with SWE Bench Verified and gpt-oss-120b:

  HarmonyErrors (special token leaks in message parsing):
      122x  Unknown role: assistant<|channel|>commentary
       25x  Unknown role: assistant<|channel|>bash<|channel|>commentary
       24x  unexpected tokens remaining in message header: Some("to=functions.bash")
        3x  Unknown role: final
        2x  Unknown role: assistant<|channel|>bash<|channel|>analysis
        2x  unexpected tokens remaining in message header: Some("<|constrain|>commentary")
        2x  Unknown role: assistant<|channel|>analysis
        1x  unexpected tokens remaining in message header: Some("<|constrain|>control<|end|><|start|>assistant<|channel|>commentary"
        1x  unexpected tokens remaining in message header: Some("bash")
        1x  unexpected tokens remaining in message header: Some("to=functions.bash # We'll create conftest.py with needed code<|end|
        1x  unexpected tokens remaining in message header: Some("to=functions.bash (We attempt to")
        1x  unexpected tokens remaining in message header: Some("<|constrain|>Complete_TASK_AND_SUBMIT_FINAL_OUTPUT &&")
        1x  could not decode header
        1x  Unknown role: assistant<|channel|>bash<|channel|>commentary,json
        1x  unexpected tokens remaining in message header: Some("to=functions.bash\"I'm thinking the repository is not trusted. We'l
        1x  unexpected tokens remaining in message header: Some("to=functions.bash by using git checkout...<|end|><|start|>assistant
        1x  unexpected tokens remaining in message header: Some("to=functions.bash as<|end|><|start|>assistant<|channel|>commentary"

Through experimentation and closely monitoring what the models output versus the tokens we send back in as input, I discovered two important things.

First, for tool calls output to the commentary channel, the models always output the recipient and channel order backwards from what encoding.rs was doing. So, this swaps the order of the channel and recipient when outputting tool calls to the commentary channel to match what the model does, which presumably is a closer match to the training data. In anecdotal testing this seems to greatly reduce if not eliminate all the extra <|channel|> markers we were seeing in the model output. So, this change may obviate the need for #97.

Second, we caught examples of the model emitting reasoning text after a <|constrain|> token in long multi-turn conversations. A real-world example of that we saw: <|start|>assistant<|channel|>commentary to=functions.echo<|constrain|>2.0? Actually we need to write content. Use echo.<|end|><|start|>assistant<|channel|>commentary to=functions.echo<|constrain|>json<|message|>{"content": "2.0000", "file_name": "result.txt"}<|call|>. To fix this, instead of ever passing in a bare content-type (such as json), this PR always emits the <|constrain|> special token before that content type. In long multi-turn scenarios this appears to solve the problem of the model incorrectly reasoning after that token. This is perhaps also partially a side-effect of roundtripping this model's outputs via APIs that have no provision to include a "content type" that the client can send back in, so we loose the details of whether the model output a json or <|constrain|>json and have to reconstruct as best we can on the inference side.

We'll do some more testing on our side, but I wanted to go ahead and open this PR to get some eyes on the issue and any feedback about these fixes.

We're seeing a lot of failures with gpt-oss-20b and gpt-oss-120b models in vLLM where we get <|channel|> markers leaking into tool call names and the model outputting reasoning after <|constrain|> tokens in very long multi-turn conversations. These are fairly trivial to reproduce with gpt-oss-20b using BFCL multi_turn_base as a simple eval, and occasionally show up with gpt-oss-120b as well. In SWE Bench Verified with gpt-oss-120b, this shows up a lot more. Here's an example of the kind of errors we see across 500 tests cases with SWE Bench Verified and gpt-oss-120b: ``` HarmonyErrors (special token leaks in message parsing): 122x Unknown role: assistant<|channel|>commentary 25x Unknown role: assistant<|channel|>bash<|channel|>commentary 24x unexpected tokens remaining in message header: Some("to=functions.bash") 3x Unknown role: final 2x Unknown role: assistant<|channel|>bash<|channel|>analysis 2x unexpected tokens remaining in message header: Some("<|constrain|>commentary") 2x Unknown role: assistant<|channel|>analysis 1x unexpected tokens remaining in message header: Some("<|constrain|>control<|end|><|start|>assistant<|channel|>commentary" 1x unexpected tokens remaining in message header: Some("bash") 1x unexpected tokens remaining in message header: Some("to=functions.bash # We'll create conftest.py with needed code<|end| 1x unexpected tokens remaining in message header: Some("to=functions.bash (We attempt to") 1x unexpected tokens remaining in message header: Some("<|constrain|>Complete_TASK_AND_SUBMIT_FINAL_OUTPUT &&") 1x could not decode header 1x Unknown role: assistant<|channel|>bash<|channel|>commentary,json 1x unexpected tokens remaining in message header: Some("to=functions.bash\"I'm thinking the repository is not trusted. We'l 1x unexpected tokens remaining in message header: Some("to=functions.bash by using git checkout...<|end|><|start|>assistant 1x unexpected tokens remaining in message header: Some("to=functions.bash as<|end|><|start|>assistant<|channel|>commentary" ``` Through experimentation and closely monitoring what the models output versus the tokens we send back in as input, I discovered two important things. First, for tool calls output to the commentary channel, the models always output the recipient and channel order backwards from what encoding.rs was doing. So, this swaps the order of the channel and recipient when outputting tool calls to the commentary channel to match what the model does, which presumably is a closer match to the training data. In anecdotal testing this seems to greatly reduce if not eliminate all the extra <|channel|> markers we were seeing in the model output. So, this change may obviate the need for openai#97. Second, we caught examples of the model emitting reasoning text after a <|constrain|> token in long multi-turn conversations. A real-world example of that we saw: `<|start|>assistant<|channel|>commentary to=functions.echo<|constrain|>2.0? Actually we need to write content. Use echo.<|end|><|start|>assistant<|channel|>commentary to=functions.echo<|constrain|>json<|message|>{"content": "2.0000", "file_name": "result.txt"}<|call|>`. To fix this, instead of ever passing in a bare content-type (such as `json`), this PR always emits the `<|constrain|>` special token before that content type. In long multi-turn scenarios this appears to solve the problem of the model incorrectly reasoning after that token. This is perhaps also partially a side-effect of roundtripping this model's outputs via APIs that have no provision to include a "content type" that the client can send back in, so we loose the details of whether the model output a `json` or `<|constrain|>json` and have to reconstruct as best we can on the inference side. We'll do some more testing on our side, but I wanted to go ahead and open this PR to get some eyes on the issue and any feedback about these fixes. Signed-off-by: Ben Browning <bbrownin@redhat.com>

madclaws mentioned this pull request May 27, 2026

Tool call fails often due to improper harmony parser encoding tilesprivacy/tiles#145

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix encoding of channel, recipient, and constrain#114

Fix encoding of channel, recipient, and constrain#114
bbrowning wants to merge 1 commit into
openai:mainfrom
bbrowning:channel-constrain-multi-turn-fixes

bbrowning commented Apr 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

bbrowning commented Apr 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant