feat(warm): Warm Agents — deterministic orchestration with blast-radius enforcement#502
feat(warm): Warm Agents — deterministic orchestration with blast-radius enforcement#502Mnehmos wants to merge 9 commits intoKilo-Org:devfrom
Conversation
…riant, audit Introduces the Warm Agents orchestration subsystem in packages/opencode/src/warm/. This adds deterministic dispatch primitives, warmness scoring, blast-radius enforcement, and an append-only audit log — all as additive code with zero changes to existing files. Modules: - agent-state: typed 5-state lifecycle (cold→warming→warm→executing→cooling) - task-state: typed 7-state lifecycle (pending→claimed→...→completed|rolled_back) - scorer: 4-dimension warmness scoring (recency, familiarity, toolMatch, continuity) - invariant: blast-radius enforcement for tool calls + pre/postcondition checks - audit: JSONL append-only audit log for dispatch decisions and state transitions - state-store: durable persistence for agent/task state via Bun.file + Lock - failure-report: structured failure reports answering the 3 quality-bar questions - bus-events: Bus event definitions for warm agent lifecycle All 66 tests passing. Ref: Kilo-Org#455 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…CLI flag Adds the dispatch and integration layer for Warm Agents: - policy: last-wins rule evaluation for dispatch decisions (deny, allow, require_approval, pin_agent), blast-radius ceiling, capability deny-list - scheduler: routes tasks to warmest qualified agent via scorer, with pinned → warmest → cold-spawn fallback chain, writes audit entries - capability-registry: in-memory agent capability index with MCP server tracking, qualified-agent queries, and unhealthy-server detection - warm-session: high-level orchestration API (submitTask, completeTask, toolPreCheck, registerAgent) bridging all subsystems - --warm CLI flag on `kilo run` to opt-in (zero behavior change without it) Seam changes: - cli/cmd/run.ts: additive --warm option + warm context initialization (marked with kilocode_change comments) 85 tests passing (19 new for policy + capability-registry). Ref: Kilo-Org#455 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Completes the Warm Agents prototype with hardening subsystems: - mcp-health: server registration, success/failure tracking with 3-strike threshold, tool schema drift detection (added/removed), recovery marking, healthy/unhealthy queries integrated with CapabilityRegistry - rollback: task-level rollback protocol extending Snapshot contract, reversibility checks, blast-radius-scoped file restoration, failure report generation with full quality-bar coverage - replay: audit log trace builder, structural verification (dispatch determinism, lifecycle integrity, invariant coverage), summary output Test suite: 119 tests across 10 files (34 new): - mcp-health: 14 tests (register, success, drift, failure threshold, recovery) - rollback: 6 tests (non-reversible skip, missing snapshot, file restore, reports) - replay: 7 tests (trace counts, determinism, lifecycle chain, coverage, summary) - integration: 7 tests (full lifecycle, policy denial, blast ceiling, MCP health, postcondition violation, replay verification, cold spawn fallback) Ref: Kilo-Org#455 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add integration bridge module for safe globalThis context access - Wire blast-radius invariant pre-checks into prompt.ts tool execution (both registry tools and MCP tools) - Register warm agent and create default task on --warm CLI startup - Add warm status output to run.ts event loop (tool completion + session end) - Add createDefaultTask for CLI integration with working-directory scope - 136 tests passing (17 new integration bridge tests) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add --warm option to TUI command ($0/default) - Pass KILO_WARM=1 env var to worker thread - Add ensureContext() lazy initialization in integration bridge (auto-creates warm context on first tool call when KILO_WARM=1) - Works seamlessly in both TUI and headless run modes - 136 tests passing Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…cular deps
The dynamic import("../warm") loaded the barrel index.ts which triggered
circular dependencies, causing WarmIntegration to be undefined at runtime.
Switch all dynamic imports to target specific modules directly.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Orchestrators can now spawn sub-agents with narrower scope: - Scope inference from task descriptions (extracts file paths/dirs) - Child scope validated to be within parent's blast-radius - Sub-task tool calls enforced against narrower scope - Parent task restored after sub-agent completes - Full audit trail for scope narrowing decisions - 18 new tests for hierarchical scope (154 total) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Exercises the full warm API surface without requiring an LLM: agent lifecycle, blast-radius enforcement, hierarchical sub-tasks, scope narrowing, postcondition violations, and audit log generation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Thanks for your contribution! This PR doesn't have a linked issue. All PRs must reference an existing issue. Please:
See CONTRIBUTING.md for details. |
| export function matchesGlob(filePath: string, patterns: string[]): boolean { | ||
| for (const pattern of patterns) { | ||
| if (pattern === "**" || pattern === "**/*") return true | ||
| if (filePath.startsWith(pattern.replace("/**", "").replace("/*", ""))) return true |
There was a problem hiding this comment.
WARNING: matchesGlob uses startsWith after stripping glob suffixes, which causes false positives.
For example, the pattern src/routes/** becomes src/routes after .replace("/**", ""), so filePath.startsWith("src/routes") will incorrectly match src/routesExtra/file.ts.
This is a blast-radius enforcement bypass — a tool operating on src/routesExtra/ would pass the check when it shouldn't.
Fix: append a / separator before the startsWith check:
| if (filePath.startsWith(pattern.replace("/**", "").replace("/*", ""))) return true | |
| if (filePath.startsWith(pattern.replace("/**", "/").replace("/*", "/"))) return true |
| tool(part) | ||
| // kilocode_change start - warm agent status after tool completion | ||
| if (args.warm && args.format !== "json") { | ||
| const warmCtx = (globalThis as any).__warmContext |
There was a problem hiding this comment.
WARNING: Direct (globalThis as any).__warmContext access bypasses the centralized WarmIntegration.getContext() that was created specifically to encapsulate this.
Since WarmIntegration is already imported on line 543, consider using WarmIntegration.getContext() here instead for consistency and type safety. The same applies to line 540.
| using _ = await Lock.read(filePath) | ||
| const data = await Bun.file(filePath).json() | ||
| return AgentState.Info.parse(data) | ||
| } catch { |
There was a problem hiding this comment.
WARNING: Empty catch block silently swallows all errors, including unexpected ones (e.g., permission denied, corrupt JSON). Per AGENTS.md: "Never leave a catch block empty."
For getAgent/getTask, ENOENT (file not found) returning undefined is expected, but other errors should be logged:
} catch (e) {
if ((e as NodeJS.ErrnoException).code !== "ENOENT") {
log.warn("getAgent failed", { agentID, error: e })
}
return undefined
}Same issue at lines 68, 87, and 119.
| ): Promise<void> { | ||
| await Audit.append(task.sessionID, { | ||
| type: "dispatch_decision", | ||
| id: `audit_dispatch_${Date.now()}`, |
There was a problem hiding this comment.
WARNING: Audit ID audit_dispatch_${Date.now()} can collide when multiple dispatches happen within the same millisecond (e.g., batch task submission).
Other audit sites in integration.ts already use a random suffix pattern: ${Date.now()}_${Math.random().toString(36).slice(2, 8)}. Consider using the same pattern here for consistency.
| id: `audit_dispatch_${Date.now()}`, | |
| id: `audit_dispatch_${Date.now()}_${Math.random().toString(36).slice(2, 8)}`, |
| ): Promise<void> { | ||
| await Audit.append(sessionID, { | ||
| type: "state_transition", | ||
| id: `audit_transition_${Date.now()}`, |
There was a problem hiding this comment.
WARNING: Same audit ID collision risk as in scheduler.ts — audit_transition_${Date.now()} can produce duplicates for rapid state transitions.
| id: `audit_transition_${Date.now()}`, | |
| id: `audit_transition_${Date.now()}_${Math.random().toString(36).slice(2, 8)}`, |
|
|
||
| // kilocode_change start - warm agents: create scoped sub-task for sub-agent | ||
| let warmSubTask: { taskID: string; parentTaskID: string; narrowed: boolean; scope: string[]; previousTask?: any } | undefined | ||
| if ((globalThis as any).__warmContext?.enabled || process.env.KILO_WARM === "1") { |
There was a problem hiding this comment.
SUGGESTION: This check duplicates the logic in WarmIntegration.isEnabled(). Consider using the centralized helper instead:
const { WarmIntegration } = await import("../warm/integration")
if (WarmIntegration.isEnabled()) {This keeps the warm-context detection logic in one place and avoids direct globalThis access in upstream files.
Code Review SummaryStatus: 11 Issues Found | Recommendation: Address before merge Overview
Issue Details (click to expand)WARNING
SUGGESTION
Other Observations (not in diff)Issues found in unchanged code or patterns that span multiple locations:
Files Reviewed (35 files)
|
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
| * but no context exists yet. This handles the TUI case where the | ||
| * worker thread has the env var but warm init hasn't happened yet. | ||
| */ | ||
| export async function ensureContext(sessionID: string): Promise<WarmSessionType.WarmContext | undefined> { |
There was a problem hiding this comment.
[WARNING]: Race condition in ensureContext — concurrent tool calls can double-initialize
If two tool calls arrive simultaneously (e.g., parallel tool execution), both can pass the existing?.enabled check on line 42 before either call reaches setContext(ctx) on line 51. This would create two separate warm contexts, two agents, and two tasks — with only the last one surviving on globalThis.
Consider adding a guard (e.g., a pending promise) to serialize initialization:
let pending: Promise<WarmSessionType.WarmContext | undefined> | undefined
export function ensureContext(sessionID: string): Promise<WarmSessionType.WarmContext | undefined> {
const existing = getContext()
if (existing?.enabled) return Promise.resolve(existing)
if (process.env.KILO_WARM !== "1") return Promise.resolve(undefined)
if (!pending) pending = doInit(sessionID).finally(() => { pending = undefined })
return pending
}| sessionID: string, | ||
| toolName: string, | ||
| args: Record<string, unknown>, | ||
| durationMs: number, |
There was a problem hiding this comment.
[WARNING]: durationMs parameter is accepted but never used in the function body
The durationMs parameter on line 130 is dead code — it's never referenced inside logToolExecution. The caller in prompt.ts always passes 0. Either measure and pass the actual duration, or remove the parameter.
| type: "invariant_check", | ||
| id: `audit_exec_${Date.now()}_${Math.random().toString(36).slice(2, 8)}`, | ||
| taskID: ctx.activeTask.id, | ||
| phase: "tool_pre", |
There was a problem hiding this comment.
[WARNING]: Audit entry uses phase: "tool_pre" for a post-execution log
This function is called after tool execution (from prompt.ts line 851), but the audit entry records phase: "tool_pre". This makes the audit log misleading — replay/analysis would interpret this as a pre-check rather than a post-execution record. Should be "tool_post" to match the actual semantics.
| sessionID: task.sessionID, | ||
| reason: "postcondition failure", | ||
| filesRestored: restored, | ||
| }).catch(() => {}) |
There was a problem hiding this comment.
[WARNING]: Empty .catch(() => {}) silently swallows Bus publish errors
Per the project's style guide (AGENTS.md): "Never leave a catch block empty." While the comment explains Bus may not have Instance context, at minimum log the error so failures are visible:
| }).catch(() => {}) | |
| }).catch((e) => log.warn("bus publish failed", { error: e })) |
| // kilocode_change start - warm agent audit logging | ||
| if (warmCheck?.logged) { | ||
| const { WarmIntegration } = await import("../warm/integration") | ||
| await WarmIntegration.logToolExecution(ctx.sessionID, item.id, args, 0).catch(() => {}) |
There was a problem hiding this comment.
[WARNING]: Empty .catch(() => {}) silently swallows audit logging errors, and durationMs is hardcoded to 0
Two issues:
- The empty catch violates the project's "no empty catch blocks" rule (AGENTS.md). At minimum, log the error.
durationMs: 0is always passed — if duration tracking isn't needed, remove the parameter fromlogToolExecution. If it is needed, measure the actual tool execution time.
| await WarmIntegration.logToolExecution(ctx.sessionID, item.id, args, 0).catch(() => {}) | |
| await WarmIntegration.logToolExecution(ctx.sessionID, item.id, args, 0).catch((e) => log.warn("warm audit failed", { error: e })) |
Closes #455
Summary
Warm Agents adds a deterministic, stateful orchestration layer to Kilo that answers three questions current agent loops cannot:
Key capabilities
executewhen onlyread/writeallowed) are rejected.$XDG_DATA_HOME/kilo/warm/audit/{sessionID}.jsonl.--warmflag checks and lazyimport(). No performance impact on normal usage.Architecture
Integration touchpoints (4 existing files, surgical changes)
session/prompt.ts__warmContext?.enabledorKILO_WARM=1tool/task.tscli/cmd/run.ts--warmflag, warm init, status display, task completionargs.warmcli/cmd/tui/thread.ts--warmflag forwarded to worker viaKILO_WARM=1envargs.warmAll changes are:
--warmis not passed)import()— no new imports in hot paths// kilocode_changecomments following codebase conventionDemo output
The standalone audit demo (
bun test/warm/demo-audit.ts) exercises the full API surface without an LLM:Test plan
bun run typecheckinpackages/opencode)--warmis not activeFiles added
src/warm/test/warm/docs/Total: ~6,200 lines added, ~160 lines modified across 4 existing files
🤖 Generated with Claude Code