[codex] fix codex streaming output and telegram duplicate retries #2462

Open · Glucksberg wants to merge 1 commit into sipeed:main from
The story behind this fix
I ran into this while trying to keep a small Android TV box alive as a real PicoClaw node. The setup was a little unusual but very real-world: Android 7, Termux, Telegram, OpenAI OAuth, and gpt-5.4. For about two days everything worked well enough that I stopped suspecting auth or model access entirely. Then the bot suddenly started failing with an error that sent me in the wrong direction at first.
I re-checked the obvious things first: auth, model access, network, Telegram, and session state. None of those explained the failure cleanly, because the same model still worked elsewhere and PicoClaw still looked "authenticated".
So I started tracing the problem from the bottom instead of from the config.
What actually turned out to be broken
1. Codex/OpenAI streaming response parsing
The first regression was subtle.
The backend was sometimes delivering the assistant text through `response.output_item.done`, while the final `response.completed` event arrived with an empty `response.output` array.

From the user's perspective, that looks absurd: the model did answer, but PicoClaw surfaced it as an empty response. That was especially confusing because it made the OAuth/model path look broken when the real issue was response assembly.
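In sketch form, the failure mode and its workaround look like this. The type and method names below are illustrative, not PicoClaw's actual code; the point is only the shape of the fix: accumulate items by index while streaming, and fall back to the accumulated text when the final event arrives empty.

```go
package main

import (
	"fmt"
	"sort"
)

// streamState accumulates output items as they arrive, keyed by index,
// so the final answer survives even if response.completed carries an
// empty output array. (Names here are a sketch, not the repo's types.)
type streamState struct {
	items map[int]string
}

func newStreamState() *streamState {
	return &streamState{items: make(map[int]string)}
}

// OnOutputItemDone records a completed output item at its index.
func (s *streamState) OnOutputItemDone(index int, text string) {
	s.items[index] = text
}

// OnCompleted returns the final output: the server's own output when it
// is non-empty, otherwise the text rebuilt from streamed items.
func (s *streamState) OnCompleted(finalOutput []string) string {
	if len(finalOutput) > 0 {
		out := ""
		for _, t := range finalOutput {
			out += t
		}
		return out
	}
	// Rebuild in index order from what we saw during streaming.
	idxs := make([]int, 0, len(s.items))
	for i := range s.items {
		idxs = append(idxs, i)
	}
	sort.Ints(idxs)
	out := ""
	for _, i := range idxs {
		out += s.items[i]
	}
	return out
}

func main() {
	s := newStreamState()
	s.OnOutputItemDone(1, "world")
	s.OnOutputItemDone(0, "hello ")
	// Simulate response.completed arriving with an empty output array.
	fmt.Println(s.OnCompleted(nil)) // prints "hello world"
}
```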
The fix here is to preserve streamed output items by index and rebuild the final output if `response.completed` comes in empty.

2. Telegram duplicate sends on flaky connections
While debugging the first issue, I also hit a second one that made everything feel much worse than it really was.
On Telegram, a single user message could produce repeated replies.
That turned out to be caused by `sendMessage` falling back to a second plain-text send for any error, not just parse failures. On unstable connections, that means the first send may already have landed, but PicoClaw still retries with a second message. The result is visible duplicate replies in chat.

That behavior makes debugging much harder because it looks like the model or agent is looping, when in reality the channel layer is duplicating delivery.
The fix here is to keep the plain-text fallback only for real parse failures (`Bad Request`) and to treat likely post-connect transport errors as "probably delivered" so the user does not get duplicates.

Why I think these belong together
These two bugs amplify each other in a very misleading way.
One bug makes a valid streamed answer disappear.
The other bug makes channel delivery look unstable and repetitive.
Together, they create the kind of failure mode that is painful to diagnose from user reports: the answer seems to vanish while the channel seems to spam duplicates. That is exactly the situation I hit.
What this PR changes
- `CodexProvider` now preserves `response.output_item.done` items and reconstructs the final output when `response.completed.output` is empty.
- The Telegram channel now falls back to a plain-text send only on parse failures (`Bad Request`), instead of on any error.

Validation
I validated this in two ways:
1. Reproducible tests in the repo: `go test ./pkg/providers ./pkg/channels/telegram`
2. Real-device behavior on the Android 7 + Termux + Telegram + OpenAI OAuth + gpt-5.4 setup described above.

That real-device path is what exposed the issue clearly enough to isolate both failures.
Why I'm sending this
This was one of those bugs that wastes a lot of time because every layer looks suspicious for a different reason: auth, model access, network, Telegram, session state.
In the end, the fixes were small compared to the amount of confusion they caused.
So this PR is mainly an attempt to save the next person from spending hours thinking their OAuth or model access is broken, when the real problem is lower-level response handling or duplicate delivery behavior.