Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
141 changes: 141 additions & 0 deletions docs/codex-mid-turn-native-resume.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,141 @@
# Codex Mid-Turn Interaction: Native Resume Approach

## Problem

Codex currently uses the **interrupt-and-continue fallback** for mid-turn interaction:

1. SIGINT kills the process
2. New process spawned with a hand-crafted continuation prompt containing partial output
3. Agent loses all conversation context except what Maestro manually captures and stuffs into the prompt

This is fragile, lossy, and architecturally different from Claude Code's native stdin-based mid-turn input.

## Research Findings (Codex CLI Reference)

- `codex exec` is explicitly **non-interactive** — designed for "scripted or CI-style runs that should finish without human interaction"
- stdin with `-` only accepts the **initial prompt**, not continuous streaming input
- No documented stdin streaming, IPC, socket, or signal-based mid-turn message injection
- `--json` emits JSONL events **out** but there's no way to send events **back in**
- **Conclusion: True mid-turn stdin injection (like Claude Code's `stream-json`) is impossible with Codex**

## Proposed Approach: Interrupt + Native Resume

Codex supports `codex exec resume <SESSION_ID> "follow-up prompt"`. Instead of reconstructing context manually, leverage Codex's own session persistence.

### Current Flow (Fallback)

```text
User interjects while Codex is working
→ SIGINT sent to process
→ Process exits
→ Maestro collects partial stdout captured so far
→ Maestro builds continuation prompt:
buildContinuationPrompt(partialOutput, userMessage)
wraps partial output in <partial_output> tags
→ Spawn fresh: codex exec -- "giant continuation prompt"
→ Agent sees ONLY what Maestro stuffed in the prompt
→ Context loss: tool calls in progress, reasoning state, earlier turns
```

### Proposed Flow (Native Resume)

```text
User interjects while Codex is working
→ Grab thread_id (already stored as sessionId on the tab/process)
→ SIGINT sent to process
→ Process exits (Codex saves state to ~/.codex/sessions/ JSONL files)
→ Spawn: codex exec resume <thread_id> -- "user's interjection message"
→ Codex loads FULL conversation history from its own session files
→ Agent has complete context of everything that happened
```

### Comparison

| Aspect | Current (Fallback) | Proposed (Native Resume) |
| -------------------------- | ------------------------------------------ | ---------------------------------------- |
| Context preservation | Partial — only captured stdout | Full — Codex's own session files |
| Continuation prompt | Hand-crafted with `<partial_output>` tags | Just the user's interjection |
| Tool call history | Lost | Preserved |
| Reasoning state | Lost | Preserved (in session JSONL) |
| Earlier conversation turns | Lost | Preserved |
| Complexity | High — prompt reconstruction logic | Low — use existing resume infrastructure |
| Reliability | Fragile — depends on stdout capture timing | Robust — Codex manages its own state |

## Infrastructure Already in Place

| Component | File | Status |
| --------------------------- | ----------------------------------------- | ----------------------------------------------- |
| `thread_id` extraction | `src/main/parsers/codex-output-parser.ts` | Done — parsed from `thread.started` JSONL event |
| `resumeArgs` definition | `src/main/agents/definitions.ts` | Done — `(sessionId) => ['resume', sessionId]` |
| `supportsResume` capability | `src/main/agents/capabilities.ts` | Done — `true` |
| Resume arg building | `src/main/utils/agent-args.ts` | Done — inserts `resume <id>` into CLI args |
| Session ID storage | Tab/process state in agentStore | Done — stored when parser emits `init` event |

## Implementation Plan

### Primary Change: `src/renderer/hooks/input/useInputProcessing.ts`

In the interrupt-and-continue fallback path (~line 443-538), replace:

```typescript
// BEFORE: Build continuation prompt with partial output
const continuationPrompt = buildContinuationPrompt(partialOutput, userMessage);
queueExecution({ prompt: continuationPrompt, sessionId: undefined });
```

With:

```typescript
// AFTER: Resume with native session continuation
const threadId = getCurrentSessionId(); // already captured from thread.started
queueExecution({ prompt: userMessage, sessionId: threadId });
```

### Capability Gating

Gate this behavior on agents that support native resume:

```typescript
if (capabilities.supportsResume && sessionId) {
// Use native resume — full context preserved by agent
queueExecution({ prompt: userMessage, sessionId });
} else {
// Fall back to continuation prompt reconstruction
const continuationPrompt = buildContinuationPrompt(partialOutput, userMessage);
queueExecution({ prompt: continuationPrompt });
}
```

### Secondary Changes

1. **`src/main/process-manager/spawners/ChildProcessSpawner.ts`** — Ensure resume args are passed through when `sessionId` is provided on a queued execution
2. **`src/main/utils/agent-args.ts`** — Verify the resume + follow-up prompt combination produces correct CLI: `codex exec resume <id> -- "message"`

## Key Risk: Session State on SIGINT

**Does Codex save session state when interrupted with SIGINT (not just on clean exit)?**

Codex uses incrementally-written `.jsonl` rollout files at `~/.codex/sessions/YYYY/MM/DD/rollout-<timestamp>-<uuid>.jsonl`. Since JSONL files are append-only and typically flushed per-line, partial sessions should be persisted even on interrupt.

**Mitigation:** If SIGINT doesn't reliably save state, two fallback strategies:

1. **Graceful wait** — Let the current turn complete, then resume (queue-and-wait instead of interrupt)
2. **Hybrid** — Try native resume first; if it fails (session not found), fall back to continuation prompt

## Agents This Applies To

| Agent | Supports Resume | Session Persistence | Candidate? |
| ------------- | --------------- | ------------------------- | ------------------ |
| Codex | Yes | JSONL rollout files | Yes |
| Claude Code | N/A | Has native mid-turn stdin | No (already works) |
| OpenCode | Yes | Local session files | Yes |
| Factory Droid | Yes | `~/.factory/sessions/` | Yes |

## References

- Codex CLI reference: https://developers.openai.com/codex/cli/reference.md
- Agent definitions: `src/main/agents/definitions.ts:143-190`
- Agent capabilities: `src/main/agents/capabilities.ts:206-232`
- Codex output parser: `src/main/parsers/codex-output-parser.ts`
- Codex session storage: `src/main/storage/codex-session-storage.ts`
- Interrupt fallback path: `src/renderer/hooks/input/useInputProcessing.ts:443-538`
1 change: 1 addition & 0 deletions src/__tests__/main/agents/capabilities.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -272,6 +272,7 @@ describe('agent-capabilities', () => {
'supportsBatchMode',
'supportsStreaming',
'supportsStreamJsonInput',
'supportsMidTurnInput',
'supportsResultMessages',
'supportsModelSelection',
'requiresPromptToStart',
Expand Down
2 changes: 2 additions & 0 deletions src/__tests__/main/ipc/handlers/process.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -285,9 +285,11 @@ describe('process IPC handlers', () => {
const expectedChannels = [
'process:spawn',
'process:write',
'process:writeInterjection',
'process:interrupt',
'process:kill',
'process:resize',
'process:hasResultEmitted',
'process:getActiveProcesses',
'process:spawnTerminalTab',
'process:runCommand',
Expand Down
1 change: 1 addition & 0 deletions src/__tests__/renderer/hooks/useAgentCapabilities.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@ const baseCapabilities = {
supportsResultMessages: true,
supportsModelSelection: false,
supportsStreamJsonInput: true,
supportsMidTurnInput: false,
supportsThinkingDisplay: false, // Added in Show Thinking feature
supportsContextMerge: false,
supportsContextExport: false,
Expand Down
90 changes: 88 additions & 2 deletions src/__tests__/renderer/hooks/useAgentListeners.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -93,6 +93,7 @@ let onCommandExitHandler: ListenerCallback | undefined;
let onUsageHandler: ListenerCallback | undefined;
let onAgentErrorHandler: ListenerCallback | undefined;
let onThinkingChunkHandler: ListenerCallback | undefined;
let onInterjectionAckHandler: ListenerCallback | undefined;
let onSshRemoteHandler: ListenerCallback | undefined;
let onToolExecutionHandler: ListenerCallback | undefined;

Expand All @@ -105,6 +106,7 @@ const mockUnsubscribeCommandExit = vi.fn();
const mockUnsubscribeUsage = vi.fn();
const mockUnsubscribeAgentError = vi.fn();
const mockUnsubscribeThinkingChunk = vi.fn();
const mockUnsubscribeInterjectionAck = vi.fn();
const mockUnsubscribeSshRemote = vi.fn();
const mockUnsubscribeToolExecution = vi.fn();

Expand Down Expand Up @@ -145,6 +147,10 @@ const mockProcess = {
onThinkingChunkHandler = handler;
return mockUnsubscribeThinkingChunk;
}),
onInterjectionAck: vi.fn((handler: ListenerCallback) => {
onInterjectionAckHandler = handler;
return mockUnsubscribeInterjectionAck;
}),
onSshRemote: vi.fn((handler: ListenerCallback) => {
onSshRemoteHandler = handler;
return mockUnsubscribeSshRemote;
Expand Down Expand Up @@ -207,6 +213,7 @@ beforeEach(() => {
onUsageHandler = undefined;
onAgentErrorHandler = undefined;
onThinkingChunkHandler = undefined;
onInterjectionAckHandler = undefined;
onSshRemoteHandler = undefined;
onToolExecutionHandler = undefined;

Expand Down Expand Up @@ -275,7 +282,7 @@ describe('getErrorTitleForType', () => {

describe('useAgentListeners', () => {
describe('listener registration', () => {
it('registers all 11 IPC listeners on mount', () => {
it('registers all 12 IPC listeners on mount', () => {
const deps = createMockDeps();
renderHook(() => useAgentListeners(deps));

Expand All @@ -288,11 +295,12 @@ describe('useAgentListeners', () => {
expect(mockProcess.onUsage).toHaveBeenCalledTimes(1);
expect(mockProcess.onAgentError).toHaveBeenCalledTimes(1);
expect(mockProcess.onThinkingChunk).toHaveBeenCalledTimes(1);
expect(mockProcess.onInterjectionAck).toHaveBeenCalledTimes(1);
expect(mockProcess.onSshRemote).toHaveBeenCalledTimes(1);
expect(mockProcess.onToolExecution).toHaveBeenCalledTimes(1);
});

it('unsubscribes all 11 listeners on unmount', () => {
it('unsubscribes all 12 listeners on unmount', () => {
const deps = createMockDeps();
const { unmount } = renderHook(() => useAgentListeners(deps));

Expand All @@ -307,6 +315,7 @@ describe('useAgentListeners', () => {
expect(mockUnsubscribeUsage).toHaveBeenCalledTimes(1);
expect(mockUnsubscribeAgentError).toHaveBeenCalledTimes(1);
expect(mockUnsubscribeThinkingChunk).toHaveBeenCalledTimes(1);
expect(mockUnsubscribeInterjectionAck).toHaveBeenCalledTimes(1);
expect(mockUnsubscribeSshRemote).toHaveBeenCalledTimes(1);
expect(mockUnsubscribeToolExecution).toHaveBeenCalledTimes(1);
});
Expand All @@ -324,6 +333,83 @@ describe('useAgentListeners', () => {
});
});

describe('onInterjectionAck', () => {
it('moves queued interjection from executionQueue to tab logs as delivered', () => {
const deps = createMockDeps();
const session = createMockSession({
id: 'sess-1',
aiTabs: [createMockTab({ id: 'tab-1', logs: [] })],
activeTabId: 'tab-1',
executionQueue: [
{
id: 'interjection-1',
timestamp: 1700000000000,
tabId: 'tab-1',
type: 'message' as const,
text: 'follow up question',
images: undefined,
},
],
});
useSessionStore.setState({
sessions: [session],
activeSessionId: 'sess-1',
});

renderHook(() => useAgentListeners(deps));

onInterjectionAckHandler?.('sess-1-ai-tab-1', 'interjection-1');

const updated = useSessionStore.getState().sessions.find((s) => s.id === 'sess-1');
// Should be removed from queue
expect(updated?.executionQueue.length).toBe(0);
// Should now be in tab logs as delivered
const deliveredLog = updated?.aiTabs[0].logs[0];
expect(deliveredLog?.id).toBe('interjection-1');
expect(deliveredLog?.text).toBe('follow up question');
expect(deliveredLog?.interjection).toBe(true);
expect(deliveredLog?.delivered).toBe(true);
expect(deliveredLog?.deliveryFailed).toBe(false);
});

it('falls back to marking existing log entry as delivered (interrupt-resume path)', () => {
const deps = createMockDeps();
const session = createMockSession({
id: 'sess-1',
aiTabs: [
createMockTab({
id: 'tab-1',
logs: [
{
id: 'interjection-1',
timestamp: Date.now(),
source: 'user',
text: 'follow up',
interjection: true,
delivered: false,
deliveryFailed: true,
},
],
}),
],
activeTabId: 'tab-1',
});
useSessionStore.setState({
sessions: [session],
activeSessionId: 'sess-1',
});

renderHook(() => useAgentListeners(deps));

onInterjectionAckHandler?.('sess-1-ai-tab-1', 'interjection-1');

const updated = useSessionStore.getState().sessions.find((s) => s.id === 'sess-1');
const updatedLog = updated?.aiTabs[0].logs[0];
expect(updatedLog?.delivered).toBe(true);
expect(updatedLog?.deliveryFailed).toBe(false);
});
});

// ========================================================================
// onData handler
// ========================================================================
Expand Down
Loading