[orchestrator-v2] Grok v2 steer rows vanish and runs wedge on Working after reply #3580

Description

@mwolson

Summary

On orchestrator v2 with Grok, three user-visible reliability problems showed up in manual testing against the t3code/codex-turn-mapping stack (#2829):

  1. Steered user text vanishes on the v2 timeline between server events.
  2. Runs stay on Working after Grok already streamed a full reply; the next message is auto-routed as a steer instead of starting a new turn.
  3. Tool turns stop early after an assistant preamble — the run completes before tools execute.

Area

  • apps/web (steer row visibility)
  • apps/server (Grok ACP v2 root-turn settlement)

Steps to reproduce

Steer visibility

  1. Start T3 Code on orchestrator v2 with Grok.
  2. Send a message that triggers a long-running Grok turn.
  3. While the run is in flight, steer with a short follow-up (for example, "continue").
  4. Watch the timeline as message.updated and turn-item.updated events arrive.

Hung Working / mis-steer

  1. On orchestrator v2 with Grok, send a message that produces a full assistant reply.
  2. Wait for the reply to finish streaming in the UI.
  3. Observe whether the run indicator clears (Working → idle).
  4. Send a new question intended as a fresh turn (for example, "what's strategy A?").
  5. Check whether the second message is treated as steer (restart_active) instead of turn_start.
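The mis-steer in step 5 follows from the hung run state. A minimal sketch of the routing decision, assuming the client picks the route purely from run status (the type and function names here are hypothetical, not taken from the T3 Code source):

```typescript
// Hypothetical types: only the route names restart_active and turn_start
// come from this issue; everything else is illustrative.
type RunStatus = "running" | "idle";
type SendRoute = "restart_active" | "turn_start";

// Assumed rule: a send while a run is active is treated as a steer.
function routeUserMessage(status: RunStatus): SendRoute {
  return status === "running" ? "restart_active" : "turn_start";
}
```

Under this assumption, a run that never leaves "running" sends every later message down the steer path, which matches the symptom described here.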

Tool turn stops after preamble

  1. On orchestrator v2 with Grok, send a message that should invoke tools (for example a file search or command).
  2. Watch for an assistant preamble (for example "Running the Codex review now.").
  3. Observe whether the run completes before tool execution turn items appear.

Expected behavior

  • Steered user text stays visible until the committed user_message turn item lands on the timeline.
  • After Grok finishes replying, the run completes and the next user message starts a normal turn.
  • Tool turns keep running through tool execution and final assistant output before completing.
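The first expectation can be sketched as a reconciliation rule on the web client: keep the optimistic steer row until a committed user_message turn item with the same id is present. This is a minimal sketch; the shapes and the assumption that the optimistic row and the committed item share an id are hypothetical, not confirmed by the T3 Code source.

```typescript
// Hypothetical shapes; the real apps/web types are not shown in this issue.
interface OptimisticSteer {
  clientMessageId: string; // assumed to match the committed item's messageId
  text: string;
}

interface TurnItem {
  kind: "user_message" | "assistant_message" | "tool_call";
  messageId: string;
  text: string;
}

interface TimelineRow {
  key: string;
  text: string;
  pending: boolean;
}

// Committed turn items come first; an optimistic steer is dropped only once a
// committed user_message with the same id exists, so it never disappears
// between intermediate server events.
function buildTimelineRows(
  items: TurnItem[],
  optimistic: OptimisticSteer[],
): TimelineRow[] {
  const committedUserIds = new Set(
    items.filter((i) => i.kind === "user_message").map((i) => i.messageId),
  );
  const rows: TimelineRow[] = items.map((i) => ({
    key: i.messageId,
    text: i.text,
    pending: false,
  }));
  for (const steer of optimistic) {
    if (!committedUserIds.has(steer.clientMessageId)) {
      rows.push({ key: steer.clientMessageId, text: steer.text, pending: true });
    }
  }
  return rows;
}
```

With this rule, events that do not carry the committed user_message (such as message.updated for the assistant) leave the pending row in place.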

Actual behavior

| Symptom | What happens |
| --- | --- |
| Steer row flash | Optimistic steer row disappears briefly (or until refresh) even though the message is persisted server-side. |
| Hung Working | Run stays running / UI shows Working after the assistant reply is visible; next send is auto-routed as steer. |
| Early tool-turn stop | Run completes right after assistant preamble; tools never run or never appear in the timeline. |

Impact

Major degradation or frequent failure — blocks normal Grok conversation flow on orchestrator v2.

Environment

Notes

  • The hung-Working case correlates with a stranded root session/prompt RPC: Grok streamed output but the provider turn never terminalized.
  • Many xAI prompt_complete notifications are subagent completions, not root-turn completion — root settlement must not rely on foreign session ids.
  • Fix tracked in draft PR #3578, "fix(orchestrator): Harden Grok v2 settlement and steer message visibility" (fix/grok-v2).
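The second note can be illustrated with a small guard on the server side. This is a sketch only: the notification shape and state fields are hypothetical, and it does not reflect the actual change in the PR.

```typescript
// Hypothetical notification shape; the real xAI / ACP payload is not shown
// in this issue.
interface PromptCompleteNotification {
  type: "prompt_complete";
  sessionId: string;
}

// Hypothetical root-turn state tracked by the server.
interface RootTurnState {
  rootSessionId: string;
  settled: boolean;
}

// Settle the root turn only when the completion belongs to the root session.
// Completions carrying a foreign session id (assumed to be subagents) leave
// the state unchanged.
function applyPromptComplete(
  state: RootTurnState,
  notification: PromptCompleteNotification,
): RootTurnState {
  if (notification.sessionId !== state.rootSessionId) {
    return state;
  }
  return { ...state, settled: true };
}
```

A guard like this covers only the early-settlement side. The hung-Working case, where no root completion arrives at all, would need a separate settlement signal.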
