Local MCP and CLI supervisor for spawning external coding agents as managed background jobs.
See Project Boundary and Long-Term Vision before changing the architecture. The current stdio MCP implementation is a hardening phase; the long-term lifecycle owner is a durable local daemon. See Verification Notes for the current Windows, WSL, and real Claude Code baseline. See Service Lifecycle for the current manual daemon start, inspect, and stop workflow.
Claude Code is the frozen compatibility backend. New agent integration work should happen behind backend adapters, starting with OpenCode. Supervisor must remain a lifecycle owner and must not become a provider/model router.
The repository targets a Codex-like lifecycle:
- `claude_run`: start Claude Code and return a job handle quickly
- `claude_status`: inspect current job metadata
- `claude_wait`: wait briefly for a terminal state
- `claude_result`: read stdout, stderr, parsed JSON, and exit metadata
- `claude_continue`: start a new job from a persisted Claude Code session id
- `claude_kill`: kill the process tree
- `claude_cleanup`: remove terminal job directories while preserving running jobs
Install, build, and run the deterministic gate:
pnpm install
pnpm run build
pnpm run typecheck
pnpm test
Run a fake-Claude CLI job before spending real Claude Code quota:
SUPERVISOR_CLAUDE_COMMAND=node SUPERVISOR_CLAUDE_PREFIX_ARGS=tests/fixtures/fake-claude.mjs node dist/cli.js run --cwd . --prompt "hello"
node dist/cli.js wait <jobId> --timeout-ms 30000
node dist/cli.js result <jobId>
PowerShell users can set the same overrides with $env:SUPERVISOR_CLAUDE_COMMAND = "node" and $env:SUPERVISOR_CLAUDE_PREFIX_ARGS = "tests/fixtures/fake-claude.mjs".
The supervisor calls the system claude command by default and does not add permission-bypass flags. It intentionally does not expose Claude Code bypassPermissions / --dangerously-skip-permissions. If local claude is managed by cc-switch, routing and quota remain controlled by that existing Claude Code setup.
Prompts are written to job-local prompt.md and sent to Claude Code over stdin. They are not passed as command-line arguments and are not returned by default in status; metadata stores only promptPath, promptPreview, and promptSha256.
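The metadata fields named above could be derived roughly as follows. This is a minimal sketch, not the project's actual code: the function name `describePrompt`, the 80-character preview length, and the forward-slash path join are assumptions made for illustration.

```typescript
import { createHash } from "node:crypto";

// Shape of the prompt-related metadata fields named in this README.
interface PromptMetadata {
  promptPath: string;
  promptPreview: string;
  promptSha256: string;
}

// Hypothetical helper: builds metadata without storing the full prompt text.
// previewChars = 80 is an assumed default, not a documented value.
function describePrompt(jobDir: string, prompt: string, previewChars = 80): PromptMetadata {
  return {
    promptPath: `${jobDir}/prompt.md`,
    promptPreview: prompt.slice(0, previewChars),
    // Hex-encoded SHA-256 of the UTF-8 prompt bytes.
    promptSha256: createHash("sha256").update(prompt, "utf8").digest("hex"),
  };
}
```

Storing a hash and a short preview lets a client confirm which prompt a job ran without the full text appearing in status output.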
Jobs are stored under:
%LOCALAPPDATA%\supervisor
on Windows, or:
$XDG_STATE_HOME/supervisor
~/.local/state/supervisor
on Linux/WSL. Set SUPERVISOR_STATE_DIR to override this.
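The lookup order above can be sketched as a pure function. This is an illustration under stated assumptions: the name `resolveStateDir` is hypothetical, and it assumes that `$XDG_STATE_HOME/supervisor` is preferred when the variable is set, with `~/.local/state/supervisor` as the fallback.

```typescript
type StateEnv = Record<string, string | undefined>;

// Hypothetical resolver for the state directory.
// Assumed precedence: SUPERVISOR_STATE_DIR, then the platform default.
function resolveStateDir(env: StateEnv, platform: string, homeDir: string): string {
  const override = env["SUPERVISOR_STATE_DIR"];
  if (override) return override;

  if (platform === "win32") {
    const localAppData = env["LOCALAPPDATA"];
    if (!localAppData) throw new Error("LOCALAPPDATA is not set");
    return `${localAppData}\\supervisor`;
  }

  // Linux/WSL: prefer XDG_STATE_HOME, otherwise fall back to ~/.local/state.
  const xdgStateHome = env["XDG_STATE_HOME"];
  return xdgStateHome
    ? `${xdgStateHome}/supervisor`
    : `${homeDir}/.local/state/supervisor`;
}
```

In a Node process the arguments would typically be `process.env`, `process.platform`, and `os.homedir()`.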
pnpm install
pnpm run build
node dist/cli.js run --cwd . --prompt "Reply exactly: OK"
node dist/cli.js status <jobId>
node dist/cli.js wait <jobId> --timeout-ms 30000
node dist/cli.js result <jobId>
node dist/cli.js continue --cwd . --job-id <jobId> --prompt "Follow up"
node dist/cli.js kill <jobId>
node dist/cli.js cleanup --older-than-ms 86400000
For deterministic tests, override the Claude command:
SUPERVISOR_CLAUDE_COMMAND=node
SUPERVISOR_CLAUDE_PREFIX_ARGS=/path/to/fake-claude.mjs
SUPERVISOR_CLAUDE_PREFIX_ARGS can also be a JSON string array.
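One way to accept both forms is sketched below. The function name `parsePrefixArgs` is hypothetical, and treating a non-JSON value as a single argument (rather than splitting it on whitespace) is an assumption; the real parser may behave differently.

```typescript
// Hypothetical parser for SUPERVISOR_CLAUDE_PREFIX_ARGS.
// Accepts either a plain string (one argument) or a JSON array of strings.
function parsePrefixArgs(raw: string | undefined): string[] {
  if (raw === undefined || raw.trim() === "") return [];
  const trimmed = raw.trim();

  if (trimmed.startsWith("[")) {
    const parsed: unknown = JSON.parse(trimmed);
    if (!Array.isArray(parsed) || !parsed.every((item) => typeof item === "string")) {
      throw new Error("SUPERVISOR_CLAUDE_PREFIX_ARGS must be a JSON array of strings");
    }
    return parsed as string[];
  }

  // Assumption: a plain value is passed through as exactly one argument.
  return [trimmed];
}
```

The JSON form is useful when a prefix argument contains spaces, for example `["/path with spaces/fake-claude.mjs", "--flag"]`.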
After pnpm run build, package bins point at:
| Bin | Built file | Purpose |
|---|---|---|
| `supervisor` | `dist/cli.js` | Local CLI for run/status/wait/result/continue/peek/kill/cleanup |
| `supervisor-mcp` | `dist/mcp.js` | Stdio MCP server exposing Claude lifecycle tools |
| `supervisor-daemon` | `dist/daemon.js` | Manual loopback daemon for durable lifecycle ownership |

| Variable | Used by | Purpose |
|---|---|---|
| `SUPERVISOR_STATE_DIR` | CLI, MCP, daemon | Override the state directory for job metadata and artifacts |
| `SUPERVISOR_CLAUDE_COMMAND` | CLI, MCP, daemon | Override the executable, usually for fake-Claude tests |
| `SUPERVISOR_CLAUDE_PREFIX_ARGS` | CLI, MCP, daemon | Add fixed arguments before supervisor's Claude Code arguments |
| `SUPERVISOR_DAEMON_URL` | CLI, MCP | Explicit daemon URL for adapter mode |
| `SUPERVISOR_DAEMON_DISCOVERY` | CLI, MCP | Set to `1` to read `<stateDir>/daemon.json` explicitly |
| `SUPERVISOR_DEFAULT_RUNTIME_TIMEOUT_MS` | CLI, MCP, daemon | Default runtime timeout for jobs that do not pass `timeoutMs` |
| `SUPERVISOR_MAX_CONCURRENT_JOBS` | CLI, MCP, daemon | Limit concurrent running jobs for that supervisor process |
The first daemon milestone is manual and loopback-only by default:
pnpm run build
node dist/daemon.js --host 127.0.0.1 --port 27777
The daemon prints one JSON readiness line and then serves:
GET /health
POST /v1/jobs/run
POST /v1/jobs/status
POST /v1/jobs/wait
POST /v1/jobs/result
POST /v1/jobs/continue
POST /v1/jobs/peek
POST /v1/jobs/kill
POST /v1/jobs/cleanup
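A client could build requests for the job endpoints listed above along these lines. This is a sketch under assumptions: the helper name `buildJobRequest` is hypothetical, and the JSON body with a `jobId` field and a `content-type: application/json` header is inferred from the error example in this README rather than from a documented request schema.

```typescript
type JobAction =
  | "run" | "status" | "wait" | "result"
  | "continue" | "peek" | "kill" | "cleanup";

// Hypothetical helper: returns the URL and fetch options for one job endpoint.
function buildJobRequest(baseUrl: string, action: JobAction, body: Record<string, unknown>) {
  return {
    url: new URL(`/v1/jobs/${action}`, baseUrl).toString(),
    init: {
      method: "POST",
      // Assumption: the daemon expects a JSON request body.
      headers: { "content-type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}
```

The result can be passed to the standard `fetch` function, for example `fetch(req.url, req.init)` with `req = buildJobRequest("http://127.0.0.1:27777", "status", { jobId })`.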
GET /health returns readiness metadata:
{
  "status": "ok",
  "version": "0.1.0",
  "pid": 12345,
  "stateDir": "..."
}
Daemon errors use a stable object shape:
{
  "error": {
    "code": "invalid_request",
    "message": "Missing required jobId"
  }
}
Current daemon error codes are not_found, bad_json, body_too_large, invalid_request, and internal_error. JSON request bodies are limited to 1 MiB by default.
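Because the error shape is stable, a client can narrow a parsed response with a type guard. The sketch below uses the five codes listed above; the names `DaemonErrorBody` and `isDaemonError` are hypothetical and not part of the supervisor package.

```typescript
const DAEMON_ERROR_CODES = [
  "not_found",
  "bad_json",
  "body_too_large",
  "invalid_request",
  "internal_error",
] as const;

type DaemonErrorCode = (typeof DAEMON_ERROR_CODES)[number];

interface DaemonErrorBody {
  error: { code: DaemonErrorCode; message: string };
}

// Hypothetical type guard: true only for the documented error object shape.
function isDaemonError(value: unknown): value is DaemonErrorBody {
  if (typeof value !== "object" || value === null) return false;
  const err = (value as { error?: unknown }).error;
  if (typeof err !== "object" || err === null) return false;
  const { code, message } = err as { code?: unknown; message?: unknown };
  return (
    typeof code === "string" &&
    (DAEMON_ERROR_CODES as readonly string[]).includes(code) &&
    typeof message === "string"
  );
}
```

A guard that rejects unknown codes will also reject any code the daemon adds later, so a looser check on `code` may be preferable for forward compatibility.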
This is not service installation or auto-start. It is the first step toward the long-term architecture where the daemon owns job lifecycle and CLI/MCP become adapters.
The CLI delegates to a running daemon only when configured explicitly:
SUPERVISOR_DAEMON_URL=http://127.0.0.1:27777 node dist/cli.js run --cwd . --prompt "Reply exactly: OK"
node dist/cli.js --daemon-url http://127.0.0.1:27777 status <jobId>
The daemon also writes a discovery file at <stateDir>/daemon.json after it binds. CLI discovery is still explicit:
node dist/cli.js --discover-daemon status <jobId>
SUPERVISOR_DAEMON_DISCOVERY=1 node dist/cli.js status <jobId>
Without SUPERVISOR_DAEMON_URL or --daemon-url, CLI keeps the direct local supervisor path.
After pnpm run build, configure an MCP client to run:
node G:/repository/supervisor/dist/mcp.js
Environment overrides:
SUPERVISOR_DAEMON_URL
SUPERVISOR_DAEMON_DISCOVERY
SUPERVISOR_STATE_DIR
SUPERVISOR_CLAUDE_COMMAND
SUPERVISOR_CLAUDE_PREFIX_ARGS
When SUPERVISOR_DAEMON_URL is set, MCP tools delegate to the running daemon. When SUPERVISOR_DAEMON_DISCOVERY=1 is set, MCP reads <stateDir>/daemon.json and rejects stale daemon metadata. Without either setting, MCP keeps the direct in-process supervisor path for fallback and debugging.
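The three paths described above amount to a small decision function. The sketch below is illustrative only: `selectMode` and its return shape are hypothetical, and giving `SUPERVISOR_DAEMON_URL` priority when both variables are set is an assumption, since the text does not say which wins.

```typescript
type ModeEnv = Record<string, string | undefined>;

type SupervisorMode =
  | { kind: "daemon-url"; url: string }
  | { kind: "daemon-discovery"; discoveryFile: string }
  | { kind: "direct" };

// Hypothetical mode selection for an MCP or CLI adapter.
function selectMode(env: ModeEnv, stateDir: string): SupervisorMode {
  const url = env["SUPERVISOR_DAEMON_URL"];
  // Assumption: an explicit URL takes priority over discovery.
  if (url) return { kind: "daemon-url", url };

  if (env["SUPERVISOR_DAEMON_DISCOVERY"] === "1") {
    return { kind: "daemon-discovery", discoveryFile: `${stateDir}/daemon.json` };
  }

  // Neither setting present: keep the direct in-process supervisor path.
  return { kind: "direct" };
}
```

The stale-metadata check for the discovery file is left out here because the format of `daemon.json` is not described in this document.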
Example MCP configuration using explicit daemon discovery:
{
  "mcpServers": {
    "supervisor": {
      "command": "node",
      "args": ["G:/repository/supervisor/dist/mcp.js"],
      "env": {
        "SUPERVISOR_DAEMON_DISCOVERY": "1"
      }
    }
  }
}
Use SUPERVISOR_DAEMON_URL instead of discovery when the daemon URL is fixed and known.
Inspect a daemon without running a job:
node dist/cli.js daemon-health --daemon-url http://127.0.0.1:27777
node dist/cli.js --discover-daemon daemon-health
claude_result returns bounded stdout/stderr by default, plus stdoutPath, stderrPath, byte counts, and truncation flags. claude_peek also returns bounded stdout/stderr text. The bounds exist for client safety; full local artifacts remain available at stdoutPath and stderrPath, so read those files directly only when a full artifact is needed.
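Bounding output by bytes while reporting the full size and a truncation flag might look like the sketch below. The names `boundText`, `totalBytes`, and `truncated` are hypothetical, and keeping the head of the output (rather than the tail) is an assumption; the actual limits and field names are not stated here.

```typescript
interface BoundedText {
  text: string;        // at most maxBytes of UTF-8 when truncated
  totalBytes: number;  // size of the full, unbounded output
  truncated: boolean;
}

// Hypothetical helper: keeps the first maxBytes bytes of UTF-8 output.
function boundText(full: string, maxBytes: number): BoundedText {
  const bytes = new TextEncoder().encode(full);
  if (bytes.length <= maxBytes) {
    return { text: full, totalBytes: bytes.length, truncated: false };
  }

  // Back up so the cut never lands inside a multi-byte UTF-8 sequence.
  let end = maxBytes;
  while (end > 0 && (bytes[end] & 0xc0) === 0x80) end--;

  return {
    text: new TextDecoder().decode(bytes.subarray(0, end)),
    totalBytes: bytes.length,
    truncated: true,
  };
}
```

A client that sees `truncated: true` would then fall back to reading the file at stdoutPath or stderrPath.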
Clean terminal jobs with:
node dist/cli.js cleanup --older-than-ms 86400000
Cleanup removes terminal job directories and reports removed temp files. It preserves running and abandoned jobs.
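The selection rule can be sketched as a filter over job summaries. Several details here are assumptions rather than documented behavior: the `JobSummary` shape, measuring age from a `finishedAtMs` timestamp, and treating every status other than `running` and `abandoned` as eligible.

```typescript
interface JobSummary {
  jobId: string;
  status: string;
  finishedAtMs?: number; // hypothetical field: when the job reached a terminal state
}

// Statuses that cleanup must never remove, per the text above.
const PRESERVED_STATUSES = new Set(["running", "abandoned"]);

// Hypothetical selector: returns the ids of jobs old enough to remove.
function selectCleanupCandidates(jobs: JobSummary[], olderThanMs: number, nowMs: number): string[] {
  return jobs
    .filter((job) => !PRESERVED_STATUSES.has(job.status))
    .filter((job) => job.finishedAtMs !== undefined && nowMs - job.finishedAtMs >= olderThanMs)
    .map((job) => job.jobId);
}
```

With `--older-than-ms 86400000`, only jobs that finished at least 24 hours before `nowMs` would be selected.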
- Job finalization is handled from the child `close` event so stdout/stderr pipes have closed before metadata is finalized.
- Completed Claude JSON `session_id` is persisted to `meta.json` as `sessionId`.
- New job metadata is written with `schemaVersion: 1`; older metadata without a schema version remains readable.
- Running jobs can be limited with `maxConcurrentJobs` in code and `timeoutMs` per run.
- If a previous MCP process exited and left stale `running` metadata, status reconciliation marks a missing PID as `orphaned`.
- If stale `running` metadata points at a live PID that the current supervisor instance does not own, status reconciliation marks it as `abandoned` rather than reporting normal `running`.
- Cleanup removes terminal job directories and reports temp JSON files removed with those directories; it preserves `running` and `abandoned` job directories.
- Windows and WSL should not share one `node_modules` directory. Run `pnpm install --frozen-lockfile` separately inside WSL before Linux-side tests because packages such as Rollup install OS-specific optional dependencies.
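The reconciliation rules for stale running metadata in the notes above reduce to two checks, sketched here. The function name `reconcileRunning` and its parameters are hypothetical; the liveness probe is injected so the logic can be tested without real processes.

```typescript
type ReconciledStatus = "running" | "orphaned" | "abandoned";

// Hypothetical reconciliation for a job whose metadata still says "running".
// isAlive is injected; in Node it could be implemented with
// process.kill(pid, 0), which throws when the process does not exist.
function reconcileRunning(
  pid: number,
  ownedPids: ReadonlySet<number>,
  isAlive: (pid: number) => boolean,
): ReconciledStatus {
  // Missing PID: the process is gone, so the job is orphaned.
  if (!isAlive(pid)) return "orphaned";
  // Live PID that this supervisor instance did not start: abandoned.
  return ownedPids.has(pid) ? "running" : "abandoned";
}
```

Keeping `abandoned` distinct from `running` signals that this supervisor instance does not own the live process, even though it still exists.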

- `Unknown command: ...`: run `node dist/cli.js <command>` after `pnpm run build`; package bins also require built `dist/` files.
- Stale daemon discovery: stop the old PID if it is still running, or start a new daemon so `<stateDir>/daemon.json` is refreshed.
- `Cannot find module` after moving between Windows and WSL: run `pnpm install --frozen-lockfile` separately in that environment.
- `claude` is missing or routes unexpectedly: check `where.exe claude` on Windows or `which claude` on WSL/Linux. Supervisor does not route providers itself.
- Job output looks truncated: use `stdoutPath` and `stderrPath` from `claude_result` for full artifacts.
pnpm run typecheck
pnpm test
pnpm run build
Manual and opt-in real Claude Code probes are documented in Real Claude Code Probes. They are not part of the default deterministic test suite.