The agent runner is a Node.js TypeScript process located at agent-runner/src/index.ts (~1,754 lines). It serves as the bridge between the Go backend and the Anthropic Claude API, wrapping the Claude Agent SDK (@anthropic-ai/claude-agent-sdk) to provide a multi-turn, streaming conversation experience.
The Go backend spawns one agent-runner process per active conversation. Communication happens via JSON lines over stdin/stdout — the backend writes user messages and control commands to the process's stdin, and the agent runner emits streaming events on stdout. Stderr is captured for diagnostics.
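The JSON-lines framing described above can be sketched as follows. This is a minimal illustration, not the runner's actual code: the helper names `parseInputLine` and `encodeEvent` and the loose message shapes are assumptions, since only the `type` field is documented for every message.

```typescript
// Hypothetical shapes: only the `type` field is documented for every message.
interface InputMessage {
  type: string;
  [key: string]: unknown;
}

interface OutputEvent {
  type: string;
  [key: string]: unknown;
}

// Parse one stdin line. Returns null for blank lines, malformed JSON,
// or JSON values that are not objects with a string `type` field.
function parseInputLine(line: string): InputMessage | null {
  const trimmed = line.trim();
  if (trimmed === "") return null;
  try {
    const value: unknown = JSON.parse(trimmed);
    if (typeof value !== "object" || value === null || Array.isArray(value)) {
      return null;
    }
    const record = value as Record<string, unknown>;
    const type = record["type"];
    return typeof type === "string" ? { ...record, type } : null;
  } catch {
    return null;
  }
}

// Serialize one event. JSON.stringify escapes embedded newlines, so every
// event occupies exactly one stdout line.
function encodeEvent(event: OutputEvent): string {
  return JSON.stringify(event);
}
```

A reader loop would call `parseInputLine` on each stdin line and write `encodeEvent(...)` followed by a newline to stdout.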
The agent runner has four core responsibilities:
- Lifecycle management — Initialize the SDK, maintain the session across turns, handle shutdown
- Event translation — Convert SDK events into JSON events the Go backend understands
- Runtime control — Handle model changes, permission mode switches, interrupts, and file rewinds mid-session
- MCP orchestration — Merge and manage MCP servers from multiple sources
```mermaid
graph LR
    Backend[Go Backend] -->|stdin JSON| AR[Agent Runner]
    AR -->|stdout JSON| Backend
    AR <-->|SDK| Claude[Claude API]
    AR <-->|stdio/SSE/HTTP| MCP[MCP Servers]
```
The most important architectural decision in the agent runner is its use of a streaming input generator pattern. Rather than spawning a new process for each turn (which would require --resume to reload state), a single query() call persists for the entire session lifetime.
An async generator function called messageStream() yields user messages to the SDK one at a time. Between yields, the generator blocks waiting for the next message from stdin. The SDK processes each message, generates a response (which the agent runner captures via a for-await loop), and then asks the generator for the next message.
This design provides several critical benefits:
- No subprocess restarts — The process stays alive across turns, avoiding startup overhead
- Natural state persistence — Session state persists in memory without serialization
- MCP connections stay alive — MCP servers don't need to reconnect between turns
- stdin stays open: hooks, `canUseTool` callbacks, and MCP servers continue to function
```mermaid
sequenceDiagram
    participant B as Go Backend
    participant AR as Agent Runner
    participant SDK as Claude SDK
    participant API as Claude API
    B->>AR: spawn process
    AR->>SDK: query(messageStream(), options)
    Note over AR: Turn 1
    B->>AR: {"type":"message","content":"..."}
    AR->>SDK: yield SDKUserMessage
    SDK->>API: Messages API call
    API-->>SDK: streaming response
    SDK-->>AR: for-await events
    AR-->>B: assistant_text, tool_start, tool_end, result
    Note over AR: Turn 2 (same process, same query())
    B->>AR: {"type":"message","content":"..."}
    AR->>SDK: yield SDKUserMessage
    SDK->>API: Messages API call (with full history)
    API-->>SDK: streaming response
    SDK-->>AR: for-await events
    AR-->>B: assistant_text, tool_start, tool_end, result
    Note over AR: Session ends
    B->>AR: {"type":"stop"}
    AR->>AR: mainLoopRunning = false
    AR->>AR: generator returns
    AR-->>B: {"type":"complete"}
```
The Go backend spawns the agent runner with these command-line arguments:
| Argument | Type | Default | Purpose |
|---|---|---|---|
| `--cwd` | string | `process.cwd()` | Working directory (worktree path) |
| `--conversation-id` | string | `"default"` | Conversation identifier for tracking |
| `--resume` | string | none | SDK session ID to resume from |
| `--fork` | flag | false | Fork the resumed session |
| `--linear-issue` | string | none | Linear issue identifier (e.g., "LIN-123") |
| `--target-branch` | string | none | Base branch for PR operations |
| `--tool-preset` | string | `"full"` | Tool restriction level |
| `--enable-checkpointing` | flag | false | Enable file checkpoint support |
| `--structured-output` | JSON | none | JSON schema for structured responses |
| `--max-budget-usd` | number | none | Maximum cost limit in USD |
| `--max-turns` | number | none | Maximum conversation turn count |
| `--max-thinking-tokens` | number | none | Extended thinking token budget |
| `--permission-mode` | string | `"bypassPermissions"` | Initial permission mode |
| `--setting-sources` | CSV | none | Where to load settings (project, user, local) |
| `--betas` | CSV | none | Beta feature flags |
| `--model` | string | none | Model override |
| `--fallback-model` | string | none | Fallback model if primary unavailable |
| `--instructions-file` | string | none | Path to system prompt file |
| `--mcp-servers-file` | string | none | Path to user-configured MCP servers JSON |
| `--sdk-debug` | flag | false | Enable SDK debugging |
| `--sdk-debug-file` | string | none | Write SDK debug output to file |
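As an illustration of how a few of these arguments and their defaults might be read, here is a small hand-rolled parser. It is a sketch under assumptions: `parseRunnerArgs` and `RunnerOptions` are hypothetical names, only a subset of the flags is handled, and the real runner may well use a library instead.

```typescript
// Subset of the documented options; the other flags would follow the same pattern.
interface RunnerOptions {
  cwd: string;
  conversationId: string;
  toolPreset: string;
  permissionMode: string;
  enableCheckpointing: boolean;
  resume?: string;
  maxTurns?: number;
  settingSources?: string[];
}

function parseRunnerArgs(argv: string[], defaultCwd: string): RunnerOptions {
  // Defaults taken from the table above.
  const opts: RunnerOptions = {
    cwd: defaultCwd,
    conversationId: "default",
    toolPreset: "full",
    permissionMode: "bypassPermissions",
    enableCheckpointing: false,
  };
  for (let i = 0; i < argv.length; i++) {
    const flag = argv[i];
    // Boolean flags take no value.
    if (flag === "--enable-checkpointing") {
      opts.enableCheckpointing = true;
      continue;
    }
    const value = argv[i + 1];
    if (value === undefined) break; // a trailing flag without a value is ignored
    switch (flag) {
      case "--cwd": opts.cwd = value; i++; break;
      case "--conversation-id": opts.conversationId = value; i++; break;
      case "--tool-preset": opts.toolPreset = value; i++; break;
      case "--permission-mode": opts.permissionMode = value; i++; break;
      case "--resume": opts.resume = value; i++; break;
      case "--max-turns": opts.maxTurns = Number(value); i++; break;
      case "--setting-sources":
        // CSV values are split and trimmed.
        opts.settingSources = value.split(",").map((s) => s.trim()).filter((s) => s !== "");
        i++;
        break;
      default:
        break; // flags outside this sketch are skipped
    }
  }
  return opts;
}
```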
Tool presets restrict which tools the agent can use. They are applied via allowedTools and disallowedTools in the SDK query options:
| Preset | Allowed Tools | Use Case |
|---|---|---|
| `full` | All tools | Default: unrestricted access |
| `read-only` | Read, Glob, Grep, WebFetch, WebSearch | Code exploration without modifications |
| `no-bash` | All except Bash | Prevent shell command execution |
| `safe-edit` | Read, Glob, Grep, Edit, WebFetch, WebSearch | File editing without Bash or Write |
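One plausible way to express the presets as `allowedTools` / `disallowedTools` pairs is shown below. The encoding is an assumption: the table only lists which tools each preset permits, so the choice of an allow-list versus a deny-list per preset, the `TOOL_PRESETS` and `resolvePreset` names, and the error on unknown names are all illustrative.

```typescript
type ToolPreset = "full" | "read-only" | "no-bash" | "safe-edit";

interface PresetConfig {
  allowedTools?: string[];
  disallowedTools?: string[];
}

// Assumed encoding: allow-list presets set allowedTools, "no-bash" uses a
// deny-list, and "full" sets neither so that nothing is restricted.
const TOOL_PRESETS: Record<ToolPreset, PresetConfig> = {
  "full": {},
  "read-only": { allowedTools: ["Read", "Glob", "Grep", "WebFetch", "WebSearch"] },
  "no-bash": { disallowedTools: ["Bash"] },
  "safe-edit": { allowedTools: ["Read", "Glob", "Grep", "Edit", "WebFetch", "WebSearch"] },
};

// Assumption: an unknown preset name is rejected rather than silently
// widened to "full".
function resolvePreset(name: string): PresetConfig {
  if (!Object.prototype.hasOwnProperty.call(TOOL_PRESETS, name)) {
    throw new Error(`Unknown tool preset: ${name}`);
  }
  return TOOL_PRESETS[name as ToolPreset];
}
```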
Messages from the Go backend arrive as JSON lines on stdin. The agent runner parses each line and routes it based on the type field:
```mermaid
graph TD
    stdin[stdin JSON line] --> parse[Parse JSON]
    parse --> switch{type?}
    switch -->|message| queue[Queue for next turn]
    switch -->|stop| stop[Break main loop]
    switch -->|interrupt| int[queryRef.interrupt]
    switch -->|set_model| model[queryRef.setModel]
    switch -->|set_permission_mode| perm[queryRef.setPermissionMode]
    switch -->|get_supported_models| query1[Query SDK → emit response]
    switch -->|get_supported_commands| query2[Query SDK → emit response]
    switch -->|get_mcp_status| query3[Query SDK → emit response]
    switch -->|get_account_info| query4[Query SDK → emit response]
    switch -->|rewind_files| rewind[queryRef.rewindFiles]
    switch -->|user_question_response| uq[Resolve pending question]
    switch -->|plan_approval_response| pa[Resolve pending approval]
```
The message type is the only one that goes through the queue. A messageQueue array buffers messages, and a messageWaiter callback resolves a pending Promise when a message arrives. If the generator is already waiting, the message is delivered to it immediately; otherwise, it is buffered until the generator asks for the next one.
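The queue-plus-waiter handoff can be sketched like this. The names `messageQueue`, `messageWaiter`, and `waitForNextMessage` come from the text; `enqueueMessage`, `cancelWait`, and the `QueuedMessage` shape are hypothetical, and the real implementation may differ.

```typescript
// Hypothetical message shape; the real one likely carries attachments too.
interface QueuedMessage {
  content: string;
}

const messageQueue: QueuedMessage[] = [];
let messageWaiter: ((msg: QueuedMessage | null) => void) | null = null;

// Called by the stdin router for `message` inputs.
function enqueueMessage(msg: QueuedMessage): void {
  if (messageWaiter) {
    // A consumer is already blocked: hand the message over directly.
    const resolve = messageWaiter;
    messageWaiter = null;
    resolve(msg);
  } else {
    messageQueue.push(msg);
  }
}

// Called by the generator. Resolves immediately if a message is buffered,
// otherwise parks a waiter until one arrives.
function waitForNextMessage(): Promise<QueuedMessage | null> {
  const next = messageQueue.shift();
  if (next !== undefined) return Promise.resolve(next);
  return new Promise<QueuedMessage | null>((resolve) => {
    messageWaiter = resolve;
  });
}

// Unblocks a parked waiter with null, for example when a stop arrives.
function cancelWait(): void {
  if (messageWaiter) {
    const resolve = messageWaiter;
    messageWaiter = null;
    resolve(null);
  }
}
```

This sketch supports a single consumer at a time, which matches the one-generator design.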
The heart of the multi-turn architecture is the async generator:
```typescript
async function* messageStream(): AsyncGenerator<SDKUserMessage> {
  while (mainLoopRunning) {
    const msg = await waitForNextMessage(); // Blocks until stdin has a message
    if (!msg) break;                        // stdin closed or stop received
    turnCount++;
    currentTurnStartTime = Date.now();
    blockBuffer = "";                       // Reset text buffer
    resetRunStats();                        // Clear tool tracking
    yield buildUserMessage(msg);            // Yield to SDK
    // SDK processes, returns for next iteration
  }
}
```

The generator blocks on waitForNextMessage() until a user message arrives (or the stop signal is received). It then yields a single SDKUserMessage to the SDK. The SDK processes the message, streams back results through the for-await loop, and eventually returns control to the generator, which blocks again waiting for the next message.
The SDK is initialized with a single query() call:
```typescript
const result = query({
  prompt: messageStream(),              // Streaming input generator
  options: {
    cwd,                                // Worktree directory
    permissionMode: initialPermissionMode,
    allowDangerouslySkipPermissions: true,
    canUseTool: async () => ({ behavior: "allow" }),
    mcpServers: mergedMcpServers,       // Built-in + .mcp.json + user
    includePartialMessages: true,       // Enable streaming
    tools: { type: "preset", preset: "claude_code" },
    systemPrompt: instructions
      ? { type: "preset", preset: "claude_code", append: instructions }
      : { type: "preset", preset: "claude_code" },
    hooks,                              // All hooks always enabled
    allowedTools: presetConfig.allowedTools,
    disallowedTools: presetConfig.disallowedTools,
    enableFileCheckpointing: enableCheckpointing,
    outputFormat,                       // Structured output schema
    maxBudgetUsd, maxTurns, maxThinkingTokens,
    model, fallbackModel,
    abortController: sessionAbortController,
    resume: resumeSessionId,
    forkSession,
  },
});
```

Key options:
- `includePartialMessages: true` enables streaming text events during generation
- `canUseTool` always returns `allow`; the actual permission logic lives in the PreToolUse hooks
- `allowDangerouslySkipPermissions: true` is required because the permission mode can change at runtime
- The system prompt uses the `claude_code` preset (built into the SDK) with optional appended instructions from conversation summaries
All hooks are always enabled. They provide real-time tracking of every agent activity:
The PreToolUse hook branches on the tool name:

AskUserQuestion (24-hour timeout): When Claude uses the AskUserQuestion tool, the hook intercepts it, emits a user_question_request event to the Go backend, and blocks on a Promise until the user responds. The 24-hour timeout gives users effectively unlimited time to answer.

ExitPlanMode (24-hour timeout): When Claude uses ExitPlanMode, the hook emits a plan_approval_request event and blocks until the user approves or rejects the plan.

Default (all other tools): Emits a hook_pre_tool event for tracking. For sub-agent tools, also tracks the tool start with the agent ID.

The PostToolUse hook tracks tool completion, calculates the duration from the recorded start time, and emits a hook_post_tool event. For sub-agent tools, it also emits tool_end with the agent ID.

The PostToolUseFailure hook tracks failed tools, emits failure events, and cleans up sub-agent tool tracking.

The remaining hooks:
- Notification: Emits `agent_notification` events (title, message, type)
- SessionStart: Updates `currentSessionId`, emits `session_started`
- SessionEnd: Emits `session_ended` with reason
- Stop: Emits `agent_stop` event
- SubagentStart: Maps session ID to agent ID for correlation, finds parent Task tool
- SubagentStop: Cleans up session-to-agent mapping, emits `subagent_stopped`
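The blocking behaviour of the AskUserQuestion and ExitPlanMode hooks can be modelled as a registry of pending Promises with a timeout. The following is a minimal sketch, assuming a hypothetical request ID per question and hypothetical names (`pendingQuestions`, `waitForAnswer`, `resolveAnswer`, `rejectAllPending`); it is not taken from the runner's source.

```typescript
interface PendingRequest<T> {
  resolve: (value: T) => void;
  reject: (reason: Error) => void;
  timer: ReturnType<typeof setTimeout>;
}

const QUESTION_TIMEOUT_MS = 24 * 60 * 60 * 1000; // 24 hours

const pendingQuestions = new Map<string, PendingRequest<string>>();

// Called from the hook after emitting the request event. The returned
// Promise settles when the user answers or the timeout fires.
function waitForAnswer(requestId: string, timeoutMs: number = QUESTION_TIMEOUT_MS): Promise<string> {
  return new Promise<string>((resolve, reject) => {
    const timer = setTimeout(() => {
      pendingQuestions.delete(requestId);
      reject(new Error(`Request ${requestId} timed out`));
    }, timeoutMs);
    pendingQuestions.set(requestId, { resolve, reject, timer });
  });
}

// Called by the stdin router when a response message arrives.
// Returns false if no request with that ID is pending.
function resolveAnswer(requestId: string, answer: string): boolean {
  const pending = pendingQuestions.get(requestId);
  if (!pending) return false;
  clearTimeout(pending.timer);
  pendingQuestions.delete(requestId);
  pending.resolve(answer);
  return true;
}

// Called during shutdown so that no hook stays blocked forever.
function rejectAllPending(reason: string): void {
  pendingQuestions.forEach((pending) => {
    clearTimeout(pending.timer);
    pending.reject(new Error(reason));
  });
  pendingQuestions.clear();
}
```

Clearing the timer on every exit path matters here: a live 24-hour timer would otherwise keep the Node.js event loop from draining.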
All events are emitted as JSON lines on stdout via console.log(JSON.stringify(event)):
- `ready`: Process initialized and ready for messages
- `init`: SDK configuration (model, tools, MCP servers, budget config)
- `session_started` / `session_ended`: Session lifecycle
- `session_id_update`: SDK session ID changed (for resume)
- `turn_complete`: Turn finished, process stays alive for next message
- `complete`: Session ended, process will exit

- `assistant_text`: Streamed text content (paragraph-buffered)
- `thinking_start` / `thinking_delta` / `thinking`: Extended thinking blocks

- `tool_start`: Tool invocation begins (with params)
- `tool_end`: Tool completed (success/failure, summary, duration)
- `tool_progress`: Elapsed time update for long-running tools

- `subagent_started`: Child agent spawned (with agentId, agentType, parentToolUseId)
- `subagent_stopped`: Child agent completed

- `user_question_request`: AskUserQuestion waiting for user input
- `plan_approval_request`: ExitPlanMode waiting for approval
- `todo_update`: TodoWrite tool updated the task list

- `result`: Turn summary with cost, usage, stats, success/error status
- `context_usage`: Per-message token counts
- `context_window_size`: Model context window info
- `compact_boundary`: Context compaction occurred

- `checkpoint_created`: File snapshot created (with UUID)
- `files_rewound`: Files restored to checkpoint

- `model_changed`: Model switched via set_model
- `permission_mode_changed`: Permission mode toggled

- `error`: Fatal error
- `auth_error`: Authentication failure (user-friendly message)
- `interrupted`: Execution interrupted by user
- `shutdown`: Graceful exit with reason
The agent runner uses block-level buffering to emit text in readable paragraphs rather than character-by-character:
- Text chunks from the SDK accumulate in a `blockBuffer` string
- The buffer is split on double newlines (`\n\n`), which mark paragraph breaks
- Complete paragraphs are emitted immediately as `assistant_text` events
- The last incomplete paragraph stays in the buffer
- If the buffer exceeds 4,096 characters, a force-flush occurs at the nearest newline
- At turn boundaries, `flushBlockBuffer()` emits any remaining text
This approach balances streaming responsiveness with readable output — the frontend receives complete paragraphs rather than individual words.
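The buffering rules above can be sketched as follows. `blockBuffer` and `flushBlockBuffer()` are named in the text; `appendText`, `emitText`, and the `emitted` array (a stand-in for writing `assistant_text` events) are hypothetical. The sketch also assumes that "nearest newline" means the last newline in the buffer and that whitespace-only paragraphs are dropped, neither of which is confirmed here.

```typescript
const MAX_BUFFER_CHARS = 4096;

let blockBuffer = "";
const emitted: string[] = []; // stand-in for emitting assistant_text events

function emitText(text: string): void {
  // Assumption: whitespace-only blocks are not worth emitting.
  if (text.trim() !== "") emitted.push(text);
}

function appendText(chunk: string): void {
  blockBuffer += chunk;
  // Emit every complete paragraph and keep the trailing partial one.
  const parts = blockBuffer.split("\n\n");
  blockBuffer = parts.pop() ?? "";
  for (const part of parts) emitText(part);
  // Force-flush an oversized partial paragraph at its last newline, or
  // emit it whole when it contains no newline at all.
  if (blockBuffer.length > MAX_BUFFER_CHARS) {
    const cut = blockBuffer.lastIndexOf("\n");
    if (cut > 0) {
      emitText(blockBuffer.slice(0, cut));
      blockBuffer = blockBuffer.slice(cut + 1);
    } else {
      emitText(blockBuffer);
      blockBuffer = "";
    }
  }
}

// Called at turn boundaries to emit whatever is left.
function flushBlockBuffer(): void {
  emitText(blockBuffer);
  blockBuffer = "";
}
```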
When Claude spawns child agents via the Task tool, the agent runner tracks them for proper event correlation:
- `sessionToAgentId` Map: Maps sub-agent session IDs to agent IDs. Populated by the `SubagentStart` hook, cleaned up by the `SubagentStop` hook
- `subagentActiveTools` Map: Tracks tools currently executing in sub-agent context, with agentId for UI attribution
The SubagentStart hook also identifies the parent Task tool by iterating through active tools to find the one whose session matches. This parentToolUseId lets the frontend nest sub-agent activity under its parent tool call.
Sub-agent text and thinking messages are skipped in the main handleMessage() function to prevent duplicate output — only the parent agent's content is emitted to the frontend.
Per-turn statistics are tracked and emitted in the result event:
| Stat | What It Counts |
|---|---|
| `toolCalls` | Total tool invocations |
| `toolsByType` | Breakdown by tool name (e.g., `{"Read": 5, "Write": 2}`) |
| `subAgents` | Number of Task tool invocations |
| `filesRead` | Read, Glob, and Grep invocations |
| `filesWritten` | Write and Edit invocations |
| `bashCommands` | Bash tool invocations |
| `webSearches` | WebSearch and WebFetch invocations |
| `totalToolDurationMs` | Cumulative time spent in tools |
Statistics are reset at the start of each turn and included in the result event at the end of the turn.
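A compact way to maintain these counters is shown below. The field names follow the table and `resetRunStats()` appears in the generator code; `RunStats`, `createRunStats`, and `recordToolCall` are hypothetical names, and the point at which the real runner records each call (tool start versus tool end) is not specified here.

```typescript
interface RunStats {
  toolCalls: number;
  toolsByType: Record<string, number>;
  subAgents: number;
  filesRead: number;
  filesWritten: number;
  bashCommands: number;
  webSearches: number;
  totalToolDurationMs: number;
}

function createRunStats(): RunStats {
  return {
    toolCalls: 0,
    toolsByType: {},
    subAgents: 0,
    filesRead: 0,
    filesWritten: 0,
    bashCommands: 0,
    webSearches: 0,
    totalToolDurationMs: 0,
  };
}

let runStats = createRunStats();

// Called at the start of each turn.
function resetRunStats(): void {
  runStats = createRunStats();
}

// Hypothetical recorder: updates the totals and the per-category counters
// according to the groupings in the table above.
function recordToolCall(toolName: string, durationMs: number): void {
  runStats.toolCalls++;
  runStats.toolsByType[toolName] = (runStats.toolsByType[toolName] ?? 0) + 1;
  runStats.totalToolDurationMs += durationMs;
  switch (toolName) {
    case "Task": runStats.subAgents++; break;
    case "Read": case "Glob": case "Grep": runStats.filesRead++; break;
    case "Write": case "Edit": runStats.filesWritten++; break;
    case "Bash": runStats.bashCommands++; break;
    case "WebSearch": case "WebFetch": runStats.webSearches++; break;
    default: break; // other tools only affect the totals
  }
}
```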
The agent runner merges MCP servers from three sources:
The chatml MCP server is always available and provides workspace-aware tools:
| Tool | Purpose |
|---|---|
| `get_session_status` | Current branch, git state, Linear issue |
| `get_workspace_diff` | Git diff summary or full diff |
| `get_recent_activity` | Recent git log entries |
| `add_review_comment` | Post a code review comment |
| `list_review_comments` | List review comments (with optional file filter) |
| `get_review_comment_stats` | Per-file comment counts |
| `get_linear_context` | Current Linear issue details |
| `start_linear_issue` | Create branch from Linear issue |
| `update_linear_status` | Update Linear issue status (local only) |
| `get_workspace_scripts_config` | Read `.chatml/config.json` |
| `propose_scripts_config` | Generate a config proposal for user approval |
These tools use a WorkspaceContext class that manages session state: the working directory, workspace ID, session ID, target branch, Linear issue, and git state (current branch, uncommitted changes, ahead/behind counts).
If the repository contains a .mcp.json file, its servers are loaded and merged. This allows projects to define MCP servers that are always available when working on that codebase.
The Go backend writes user-configured MCP servers to a temporary JSON file and passes its path via --mcp-servers-file. These are servers the user has configured through the settings UI. Supported transport types: stdio, sse, and http.
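Merging the three sources can be as simple as spreading name-keyed maps. The sketch below makes two assumptions that this document does not confirm: that later sources win on a name collision (built-in, then `.mcp.json`, then user-configured), and that server configs look like the simplified `McpServerConfig` union shown. `mergeMcpServers` is a hypothetical name.

```typescript
// Simplified config shapes for the three supported transports (assumed).
type McpServerConfig =
  | { type: "stdio"; command: string; args?: string[] }
  | { type: "sse"; url: string }
  | { type: "http"; url: string };

type McpServerMap = Record<string, McpServerConfig>;

// Assumed precedence: on a name collision the later source overrides the
// earlier one, so user-configured servers win over project servers, which
// win over built-in ones.
function mergeMcpServers(
  builtIn: McpServerMap,
  project: McpServerMap,
  user: McpServerMap,
): McpServerMap {
  return { ...builtIn, ...project, ...user };
}
```

If built-in servers must never be shadowed, the spread order would be reversed or colliding names rejected instead.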
The agent runner pattern-matches on common authentication error strings: "authentication_error", "oauth token has expired", "invalid api key", "401 unauthorized", and others. When detected, a user-friendly auth_error event is emitted instead of a generic error.
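A minimal sketch of this detection is shown below. The pattern list contains only the four strings quoted above (the "and others" are not reproduced), case-insensitive matching is an assumption, and the helper names and the user-facing message text are hypothetical.

```typescript
// Only the patterns quoted in the text; the full list is longer.
const AUTH_ERROR_PATTERNS: string[] = [
  "authentication_error",
  "oauth token has expired",
  "invalid api key",
  "401 unauthorized",
];

// Assumption: matching is a case-insensitive substring test.
function isAuthError(message: string): boolean {
  const lower = message.toLowerCase();
  return AUTH_ERROR_PATTERNS.some((pattern) => lower.indexOf(pattern) !== -1);
}

interface ErrorEvent {
  type: "auth_error" | "error";
  message: string;
}

// Maps any thrown value to the event that would be emitted on stdout.
function toErrorEvent(err: unknown): ErrorEvent {
  const message = err instanceof Error ? err.message : String(err);
  if (isAuthError(message)) {
    // Hypothetical user-facing wording.
    return { type: "auth_error", message: "Authentication failed. Please sign in again or check your API key." };
  }
  return { type: "error", message };
}
```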
Graceful shutdown follows a 9-step sequence:
1. Set `mainLoopRunning = false` to break the generator loop
2. Signal the `AbortController` to cancel pending operations
3. Reject all pending user question and plan approval Promises
4. Emit `tool_end` for any in-flight tools (prevents infinite spinners in the UI)
5. Flush the text buffer
6. Call `queryRef.interrupt()` to tear down MCP connections
7. Unblock the message waiter (if blocked on `waitForNextMessage`)
8. Close the readline interface
9. Emit `shutdown` event
- SIGTERM / SIGINT → cleanup() → `process.exit(0)`
- Unhandled rejection → cleanup() → emit error → `process.exit(1)`
- Uncaught exception → cleanup() → detect auth errors → emit error → `process.exit(1)`
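Because several triggers funnel into the same cleanup, it helps if the sequence runs at most once and if one failing step cannot prevent the later ones. The sketch below illustrates that idea only; the run-once guard, the per-step error handling, and the names `CleanupStep`, `cleanupStarted`, and `runCleanup` are assumptions rather than documented behaviour.

```typescript
interface CleanupStep {
  name: string;
  run: () => void;
}

let cleanupStarted = false;

// Runs the steps in order, at most once per process. A failing step is
// reported and skipped so that later steps (such as emitting the final
// shutdown event) still execute. Returns false if cleanup already ran.
function runCleanup(steps: CleanupStep[], report: (line: string) => void): boolean {
  if (cleanupStarted) return false;
  cleanupStarted = true;
  for (const step of steps) {
    try {
      step.run();
    } catch (err) {
      report(`cleanup step "${step.name}" failed: ${String(err)}`);
    }
  }
  return true;
}
```

A signal handler would then call `runCleanup(...)` with the nine steps above and exit with the appropriate code afterwards.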
When --enable-checkpointing is passed, the SDK creates file checkpoints at turn boundaries:
- Each checkpoint has a UUID that's included in `checkpoint_created` events
- The frontend can later call `rewind_files` with a checkpoint UUID to restore all files to that state
- Checkpoints are created for both user messages (during processing) and result messages (at turn end)
- This enables a "rewind" feature where users can undo unwanted changes within a session
| Mode | Behavior |
|---|---|
| `bypassPermissions` | Allow all tools without prompting (default in agent-runner) |
| `default` | Show prompts for sensitive operations |
| `acceptEdits` | Accept file edits automatically, prompt for others |
| `plan` | Agent proposes plans that require user approval |
| `dontAsk` | Never prompt (experimental) |
Permission modes can be changed mid-session via the set_permission_mode input message.
User messages can include file and image attachments:
- Images: Embedded as base64 content blocks with MIME type
- Files: Wrapped in `<attached_file>` XML tags with path and line count metadata
- Closing tags (`</attached_file>`) are escaped to prevent XML injection
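The file-wrapping step might look like the sketch below. The `<attached_file>` tag and the idea of escaping its closing tag come from the text; the attribute names (`path`, `lines`), the entity-based escaping scheme, and the function names are assumptions made for illustration.

```typescript
interface FileAttachment {
  path: string;
  content: string;
}

// Neutralize any closing tag inside the file body so that the content
// cannot terminate the wrapper early. The entity encoding is an assumed scheme.
function escapeClosingTag(text: string): string {
  return text.replace(/<\/attached_file>/gi, "&lt;/attached_file&gt;");
}

// Keep the path from breaking out of its attribute quotes.
function escapeAttribute(value: string): string {
  return value.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
}

// Wrap one file with hypothetical `path` and `lines` attributes.
function wrapAttachment(file: FileAttachment): string {
  const lineCount = file.content === "" ? 0 : file.content.split("\n").length;
  const open = `<attached_file path="${escapeAttribute(file.path)}" lines="${lineCount}">`;
  return `${open}\n${escapeClosingTag(file.content)}\n</attached_file>`;
}
```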
- Overview: See overview.md for the application architecture
- Streaming Events: See claude-sdk-events.md for the event catalog
- WebSocket Streaming: See websocket-streaming.md for the event pipeline
- Session Management: See session-management.md for process spawning