[codex] fix codex streaming output and telegram duplicate retries#2462

Open
Glucksberg wants to merge 1 commit into sipeed:main from Glucksberg:codex/fix-codex-telegram-regressions

@Glucksberg Glucksberg commented Apr 9, 2026

The story behind this fix

I ran into this while trying to keep a small Android TV box alive as a real PicoClaw node.

The setup was a little unusual but very real-world: Android 7, Termux, Telegram, OpenAI OAuth, and gpt-5.4. For about two days everything worked well enough that I stopped suspecting auth or model access entirely. Then the bot suddenly started failing with:

The model returned an empty response.

That error sent me in the wrong direction at first.

I re-checked the obvious things:

  • OAuth login state
  • token refresh
  • model access
  • network changes after moving the box to a different Wi‑Fi
  • Telegram behavior

None of those explained the failure cleanly, because the same model still worked elsewhere, and PicoClaw still looked "authenticated".

So I started tracing the problem from the bottom instead of from the config.

What actually turned out to be broken

1. Codex/OpenAI streaming response parsing

The first regression was subtle.

The backend was sometimes delivering the assistant text through response.output_item.done, while the final response.completed event arrived with an empty response.output array.

From the user's perspective, that looks absurd: the model did answer, but PicoClaw surfaced it as an empty response.

That was especially confusing because it made the OAuth/model path look broken when the real issue was response assembly.

The fix here is to preserve streamed output items by index and rebuild the final output if response.completed comes in empty.
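
The reconstruction described above can be sketched roughly as follows. This is a minimal illustration, not the actual CodexProvider code: the `StreamEvent` and `OutputItem` types, their fields, and the `AssembleOutput` function are hypothetical simplifications invented for this example. Only the event names `response.output_item.done` and `response.completed` come from the description above.

```go
package main

import "sort"

// OutputItem is a hypothetical, simplified stand-in for one entry of response.output.
type OutputItem struct {
	Type string
	Text string
}

// StreamEvent is a hypothetical, simplified view of one streaming event.
type StreamEvent struct {
	Type        string       // "response.output_item.done" or "response.completed"
	OutputIndex int          // position of Item in the final output (output_item.done only)
	Item        OutputItem   // payload of an output_item.done event
	Output      []OutputItem // payload of response.completed; may arrive empty
}

// AssembleOutput returns the final output for a sequence of events.
// Items seen in output_item.done events are kept by index. If the
// response.completed event carries an empty output, the result is
// rebuilt from those preserved items in index order.
func AssembleOutput(events []StreamEvent) []OutputItem {
	streamed := map[int]OutputItem{}
	var final []OutputItem

	for _, ev := range events {
		switch ev.Type {
		case "response.output_item.done":
			streamed[ev.OutputIndex] = ev.Item
		case "response.completed":
			final = ev.Output
		}
	}

	// A non-empty final output is used as delivered.
	if len(final) > 0 {
		return final
	}

	// Otherwise rebuild from the streamed items, sorted by index.
	indexes := make([]int, 0, len(streamed))
	for i := range streamed {
		indexes = append(indexes, i)
	}
	sort.Ints(indexes)

	rebuilt := make([]OutputItem, 0, len(indexes))
	for _, i := range indexes {
		rebuilt = append(rebuilt, streamed[i])
	}
	return rebuilt
}
```

Keying the preserved items by index rather than appending them in arrival order means the rebuilt output stays correctly ordered even if the done events arrive out of sequence.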

2. Telegram duplicate sends on flaky connections

While debugging the first issue, I also hit a second one that made everything feel much worse than it really was.

On Telegram, a single user message could produce repeated replies.

That turned out to be caused by sendMessage falling back to a second plain-text send for any error, not just parse failures. On unstable connections, that means the first send may already have landed, but PicoClaw still retries with a second message. The result is visible duplicate replies in chat.

That behavior makes debugging much harder because it looks like the model or agent is looping, when in reality the channel layer is duplicating delivery.

The fix here is to keep the plain-text fallback only for real parse failures (Bad Request) and to treat likely post-connect transport errors as "probably delivered" so the user does not get duplicates.

Why I think these belong together

These two bugs amplify each other in a very misleading way.

One bug makes a valid streamed answer disappear.
The other bug makes channel delivery look unstable and repetitive.

Together, they create the exact kind of failure mode that is painful to diagnose from user reports:

  • "the bot is authenticated but says the model returned nothing"
  • "Telegram is sending duplicates"
  • "it worked yesterday and now it feels haunted"

That is exactly the situation I hit.

What this PR changes

  • CodexProvider now preserves response.output_item.done items and reconstructs final output when response.completed.output is empty.
  • Telegram only falls back to plain text on parse errors.
  • Telegram now treats likely post-connect transport failures as probably delivered and does not retry, so the user no longer sees duplicate sends.
  • Regression tests were added for both cases.

Validation

I validated this in two ways:

  1. reproducible tests in the repo

    • go test ./pkg/providers ./pkg/channels/telegram

  2. real-device behavior

    • Android TV box
    • Termux
    • OpenAI OAuth
    • gpt-5.4
    • Telegram bot path

That real-device path is what exposed the issue clearly enough to isolate both failures.

Why I'm sending this

This was one of those bugs that wastes a lot of time because every layer looks suspicious for a different reason: auth, model access, network, Telegram, session state.

In the end, the fixes were small compared to the amount of confusion they caused.

So this PR is mainly an attempt to save the next person from spending hours thinking their OAuth or model access is broken, when the real problem is lower-level response handling or duplicate delivery behavior.

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


Markus seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@Glucksberg Glucksberg marked this pull request as ready for review April 9, 2026 23:24
@sipeed-bot added the labels type: bug, domain: provider, domain: channel, and go on Apr 9, 2026