fix: diagnose cron message delivery delay #690
Open
dorey-agent[bot] wants to merge 3 commits into
added 2 commits
May 17, 2026 01:17
Add info-level tracing at key points in the message delivery chain to identify where cron-triggered messages get buffered:

1. ChannelsManager: when event arrives from orchestrator
2. ChannelsManager: when notification is sent to channel subprocess
3. Channel harness: when deliverMessage is received from stdin
4. Presenter actor: when render() is called on the adapter

This will help identify whether the delay is in:

- Orchestrator → ChannelsManager routing
- ChannelsManager → channel subprocess stdin
- Channel subprocess stdin → harness dispatch
- Presenter actor → adapter render
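Once the four trace points are logged, finding the buffering layer comes down to finding the largest gap between adjacent timestamps for one message. The following is a minimal sketch of that comparison, not code from this PR: the `Hop` enum, `slowest_segment`, and the timestamps are all hypothetical, and it only measures the gaps between the four trace points (it has no orchestrator-side timestamp).

```rust
use std::time::Duration;

/// The four trace points named in the commit message, in delivery order.
/// (Hypothetical enum; the real code logs via info-level tracing.)
#[derive(Debug, Clone, Copy, PartialEq)]
enum Hop {
    ManagerEventReceived,    // ChannelsManager: event arrives from orchestrator
    ManagerNotificationSent, // ChannelsManager: notification sent to subprocess
    HarnessDeliverReceived,  // Channel harness: deliverMessage read from stdin
    PresenterRenderCalled,   // Presenter actor: render() called on the adapter
}

/// Given (hop, time-since-first-log-line) pairs for one message, return the
/// adjacent pair of hops with the largest gap, plus the size of that gap.
fn slowest_segment(trace: &[(Hop, Duration)]) -> Option<(Hop, Hop, Duration)> {
    trace
        .windows(2)
        .map(|w| (w[0].0, w[1].0, w[1].1.saturating_sub(w[0].1)))
        .max_by_key(|&(_, _, gap)| gap)
}

fn main() {
    // Made-up timestamps, as they might be read off the log lines.
    let trace = [
        (Hop::ManagerEventReceived, Duration::from_millis(0)),
        (Hop::ManagerNotificationSent, Duration::from_millis(2)),
        (Hop::HarnessDeliverReceived, Duration::from_millis(45_002)),
        (Hop::PresenterRenderCalled, Duration::from_millis(45_005)),
    ];
    if let Some((from, to, gap)) = slowest_segment(&trace) {
        println!("largest gap: {:?} -> {:?} ({} ms)", from, to, gap.as_millis());
    }
}
```

With these made-up numbers the gap sits between the ChannelsManager send and the harness read, which would point at the subprocess stdin path.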
Add unsolicited_delay_ms and unsolicited_message options to agent-mock that send a session/update notification after a prompt completes without being prompted again. This simulates cron-triggered agent output.

Add flow test that verifies unsolicited notifications are delivered to the channel without requiring a new user message.

The test passes with debug-http channel, confirming the core delivery path works correctly. The production bug is likely Telegram-specific (session_chat_map race or Docker attach buffering).
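The behaviour the mock options describe can be sketched with plain threads and channels. This is an illustration only, not agent-mock's implementation: the option names `unsolicited_delay_ms` and `unsolicited_message` come from the commit message, while `MockOptions`, `run_mock_agent`, and the message strings are assumptions.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for agent-mock's configuration. Only the two field
/// names are taken from the commit message.
struct MockOptions {
    unsolicited_delay_ms: u64,
    unsolicited_message: String,
}

/// Answer one prompt, then emit an unsolicited update after the configured
/// delay without waiting for another prompt.
fn run_mock_agent(opts: MockOptions, prompts: Receiver<String>, updates: Sender<String>) {
    if let Ok(prompt) = prompts.recv() {
        updates.send(format!("reply to: {}", prompt)).ok();
        thread::sleep(Duration::from_millis(opts.unsolicited_delay_ms));
        updates.send(opts.unsolicited_message).ok();
    }
}

fn main() {
    let (prompt_tx, prompt_rx) = mpsc::channel();
    let (update_tx, update_rx) = mpsc::channel();
    let opts = MockOptions {
        unsolicited_delay_ms: 50,
        unsolicited_message: "cron: nightly report".to_string(),
    };
    let agent = thread::spawn(move || run_mock_agent(opts, prompt_rx, update_tx));

    prompt_tx.send("hello".to_string()).unwrap();
    let reply = update_rx.recv_timeout(Duration::from_secs(1)).unwrap();
    // No second prompt is sent: the next update has to arrive on its own,
    // which is the property the flow test checks end to end.
    let unsolicited = update_rx.recv_timeout(Duration::from_secs(1)).unwrap();
    println!("{}\n{}", reply, unsolicited);
    agent.join().unwrap();
}
```

The receive timeout plays the role of the flow test's assertion: if the second update only arrived after another prompt, `recv_timeout` would fail.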
The Docker attach demux task writes agent stdout/stderr chunks to DuplexStream via write_all() but never flushes. While DuplexStream itself makes data immediately available, adding explicit flush ensures downstream readers (the SDK transport's BufReader) see each chunk without delay. This is a defensive fix for the cron message delivery delay — when an agent sends a small unsolicited notification (e.g., from a cron job), the data must be immediately available to the SDK's line reader rather than potentially sitting in any intermediate buffer.
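The general hazard that the explicit flush guards against can be shown with the standard library alone. This sketch uses `std::io::BufWriter` over a `Vec<u8>` as a stand-in for "any intermediate buffer"; it is not the demux task from this PR and does not model `DuplexStream`, which the commit message notes already makes data available immediately. The helper name `visible_before_and_after_flush` is hypothetical.

```rust
use std::io::{BufWriter, Write};

/// Write one small chunk through a buffered writer and report how many bytes
/// the underlying sink holds before and after an explicit flush.
fn visible_before_and_after_flush(chunk: &[u8]) -> std::io::Result<(usize, usize)> {
    // The Vec<u8> stands in for what a downstream reader could observe.
    let mut writer = BufWriter::new(Vec::new());

    writer.write_all(chunk)?;
    // A chunk smaller than BufWriter's internal buffer stays in that buffer.
    let before = writer.get_ref().len();

    writer.flush()?;
    // After the flush the whole chunk has reached the sink.
    let after = writer.get_ref().len();

    Ok((before, after))
}

fn main() -> std::io::Result<()> {
    let line = b"{\"method\":\"session/update\"}\n";
    let (before, after) = visible_before_and_after_flush(line)?;
    println!("visible before flush: {} bytes, after flush: {} bytes", before, after);
    Ok(())
}
```

A small unsolicited notification is exactly the case where a buffered layer would hold data back, since nothing else arrives to push it through.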
Motivation
Cron-triggered agent messages (unsolicited `session/update` notifications) are delayed until the user sends a new message. The output appears buffered: it only flushes when the next incoming message triggers activity.

Solution
Two-pronged approach:
1. Diagnostic tracing (to identify the exact buffering point)
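Added info-level tracing at key delivery points:
- ChannelsManager: when an event arrives from the orchestrator
- ChannelsManager: when the notification is sent to the channel subprocess
- Channel harness: when `deliverMessage` is received from stdin
- Presenter actor: when `render()` is called on the adapter

2. Reproduction infrastructure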
- Added `unsolicited_delay_ms` and `unsolicited_message` options to agent-mock
- Added flow test `when_agent_sends_unsolicited_notification_then_delivered_without_user_ping`

The flow test passes with the debug-http channel, confirming the core delivery path (orchestrator → channels manager → channel subprocess → harness → presenter → adapter) works correctly for unsolicited notifications.
This means the production bug is Telegram-specific, likely one of:
- `session_chat_map` race (session not registered when the unsolicited message arrives)
- Docker attach buffering

Testing
- `when_agent_sends_unsolicited_notification_then_delivered_without_user_ping`: passes

Next Steps
Deploy this build and check logs when cron fires to identify which layer delays delivery.
[ai-assisted]