From e8ac7a5e7370cafca3ea7a519eeadf72dd533a34 Mon Sep 17 00:00:00 2001 From: terra tauri Date: Mon, 30 Mar 2026 05:38:44 -0700 Subject: [PATCH] =?UTF-8?q?security:=20harden=20sandbox=20=E2=80=94=20fix?= =?UTF-8?q?=20injection,=20env=20leak,=20cross-nav=20bypass?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit CRITICAL fixes: - Cross-nav queries now inherit target navigator's sandbox config instead of running completely unsandboxed (cross-nav.ts, related-navs.ts) - Shell injection via NONO_FLAGS fixed: flags now passed via temp file (one per line) read with `while read`, not unquoted env var expansion HIGH fixes: - Environment variable filtering: subprocess gets allowlisted vars only, not full process.env (strips plugin tokens, DB creds, etc.) - Path validation: config schema rejects traversal (..), null bytes - Command deny-list: bash, sudo, rm, python, etc. blocked in config schema - Wrapper script uses random UUID filename + mode 0700 (TOCTOU fix) MEDIUM fixes: - isSandboxEnabled() now warns on stderr before silent degradation - Shared buildSandboxConfigForOperation() eliminates duplicated logic Docs: security-model.md fully rewritten to reflect 3-provider model, nono wrapper architecture, env filtering, config validation, and sandbox_query diagnostic tool. --- docs/security-model.md | 227 ++++++++++-------- .../src/harness/claude-code-harness.ts | 46 +++- packages/autonav/src/harness/index.ts | 3 + .../src/harness/sandbox-config-builder.ts | 70 ++++++ packages/autonav/src/harness/sandbox.ts | 69 +++--- packages/autonav/src/tools/cross-nav.ts | 11 +- packages/autonav/src/tools/related-navs.ts | 9 +- packages/communication-layer/src/index.ts | 1 + .../communication-layer/src/schemas/config.ts | 84 +++++-- 9 files changed, 364 insertions(+), 156 deletions(-) create mode 100644 packages/autonav/src/harness/sandbox-config-builder.ts diff --git a/docs/security-model.md b/docs/security-model.md index 64580c7..819920c 100644 --- a/docs/security-model.md +++ b/docs/security-model.md @@ -1,6 +1,6 @@ # Security Model -Autonav uses defense in depth: multiple independent layers restrict what a navigator agent can do. Each layer fails independently, so a bypass in one doesn't compromise the others. Most layers degrade gracefully when their underlying mechanism is unavailable. +Autonav uses defense in depth: multiple independent layers restrict what a navigator agent can do. Each layer fails independently, so a bypass in one doesn't compromise the others. --- @@ -8,23 +8,40 @@ Autonav uses defense in depth: multiple independent layers restrict what a navig Every agent session runs through a **harness** — a universal adapter that translates `AgentConfig` into runtime-specific options. The three implementations are: +- **ClaudeCodeHarness** — uses the Claude Code Agent SDK, delegates sandboxing to [nono](https://github.com/always-further/nono) (default) or the SDK's built-in Seatbelt/bubblewrap sandbox - **ChibiHarness** — runs `chibi-json` as a subprocess, uses [nono](https://github.com/always-further/nono) for kernel-level sandboxing (Landlock/Seatbelt) -- **ClaudeCodeHarness** — uses the Claude Code Agent SDK, passes sandbox settings to the SDK runtime - **OpenCodeHarness** — uses the [OpenCode](https://opencode.ai/) SDK (`@opencode-ai/sdk`), manages a shared server process with SSE event streaming -Callers set `AgentConfig.sandbox` without knowing which harness is active. The harness translates: +### Sandbox Provider Model -| Field | ChibiHarness | ClaudeCodeHarness | OpenCodeHarness | -|---|---|---|---| -| `sandbox.readPaths` | [`nono`](https://github.com/always-further/nono) `--read ` | Not used (sandbox disabled) | Denies `edit` + `bash` permissions | -| `sandbox.writePaths` | [`nono`](https://github.com/always-further/nono) `--allow ` | Not used (sandbox disabled) | Allows all permissions | -| `sandbox.blockNetwork` | [`nono`](https://github.com/always-further/nono) `--net-block` | Not used (sandbox disabled) | Not supported | -| Mechanism | Landlock (Linux), Seatbelt (macOS) | Disabled (cwd/tool/permission layers only) | OpenCode permission system | -| Fallback | Runs unsandboxed if nono not on PATH | N/A | All permissions allowed | +Each navigator selects a sandbox provider in `config.json`: -> **Note**: The OpenCode harness translates sandbox config into OpenCode's permission system (`"allow"` / `"deny"` per tool). This is **application-level**, not kernel-enforced — a determined agent could potentially bypass it. Read-only sandbox profiles deny `edit` and `bash`; read-write profiles allow all permissions. +```json +{ + "sandbox": { + "provider": "nono" + } +} +``` -> **Source**: `src/harness/types.ts` (SandboxConfig, AgentConfig), `src/harness/sandbox.ts` (nono wrapper), `src/harness/claude-code-harness.ts` (configToSdkOptions), `src/harness/opencode-harness.ts` (permission config) +| Provider | Mechanism | Behavior When Missing | +|---|---|---| +| `"nono"` (default) | Kernel-enforced via [nono](https://github.com/always-further/nono) (Seatbelt on macOS, Landlock on Linux) | **Hard error** — refuses to start | +| `"claude-code"` | Claude Code SDK's built-in Seatbelt/bubblewrap sandbox | Always available (bundled with SDK) | +| `"none"` | No sandbox enforcement | N/A | + +There is **no silent fallback**. If nono is configured (the default) but not installed, autonav fails with install instructions rather than running unsandboxed. + +### Harness Sandbox Matrix + +| Field | ClaudeCodeHarness (nono) | ClaudeCodeHarness (claude-code) | ChibiHarness | OpenCodeHarness | +|---|---|---|---|---| +| `sandbox.readPaths` | `nono --read ` via flags file | SDK sandbox | `nono --read ` | Denies `edit` + `bash` | +| `sandbox.writePaths` | `nono --allow ` via flags file | SDK sandbox | `nono --allow ` | Allows all permissions | +| `sandbox.blockNetwork` | `nono --net-block` | Not supported | `nono --net-block` | Not supported | +| Fallback | Hard error if nono missing | N/A | Hard error if nono missing | All permissions allowed | + +> **Source**: `src/harness/types.ts` (SandboxConfig, SandboxProvider), `src/harness/sandbox.ts` (nono wrapper), `src/harness/claude-code-harness.ts` (configToSdkOptions), `src/harness/sandbox-config-builder.ts` --- @@ -36,7 +53,7 @@ Each operation gets a sandbox profile appropriate to its trust level. Defaults a |---|---|---|---| | `query` | enabled | Read-only to navigator directory | Queries should never modify state | | `update` | enabled | Read+write to navigator directory | Updates modify the knowledge base | -| `chat` | enabled | Read-only to navigator directory | Interactive sessions read knowledge | +| `chat` | enabled | Read-only (configurable to readwrite) | Interactive sessions read knowledge | | `standup` | enabled | Report: read-only; Sync: read+write | Reporting reads, syncing writes | | `memento` | **disabled** | Full access | Worker agent needs full code access | @@ -54,62 +71,61 @@ Override in `config.json`: ## Layer 1: Sandboxing -File and network access is restricted at the lowest available level. The mechanism varies by harness: +File and network access is restricted at the lowest available level. -### ChibiHarness — [nono](https://github.com/always-further/nono) +### ClaudeCodeHarness — nono (default provider) -[nono](https://github.com/always-further/nono) is a lightweight, zero-dependency sandbox tool that enforces file and network access at the **kernel level** using Landlock (Linux) and Seatbelt (macOS). It's fast, composable, and wraps any command — you just prefix it with `nono run` and declare what the process is allowed to touch. Everything else is denied by the OS kernel itself, not by application-level checks. +When provider is `"nono"`, the harness creates a wrapper script that runs the Claude Code CLI inside a [nono](https://github.com/always-further/nono) sandbox: -Autonav wraps chibi subprocess commands with nono automatically: +1. `buildNonoFlags()` converts `SandboxConfig` into nono CLI flags (`--read`, `--allow`, `--allow-command`, etc.) +2. `writeNonoFlagsFile()` writes flags to a temp file (one per line) to avoid shell injection from unquoted env var expansion +3. `createSdkWrapper()` generates a bash script that reads flags from the file and execs `nono run --profile claude-code ... -- claude` +4. The SDK spawns this wrapper instead of `claude` directly -``` -nono run --silent --allow-cwd --read /path/to/nav -- chibi-json ... -``` - -- `--read`: read-only access to a path -- `--allow`: read+write access to a path -- `--allow-cwd`: always grants access to the working directory -- `--net-block`: blocks all network access (not used here — chibi makes its own API calls) -- System binary paths (`/bin`, `/usr/bin`, `/opt/homebrew`, etc.) are auto-added as read-only +The wrapper script uses nono's built-in `claude-code` profile as a base (providing `~/.claude`, keychain access, tmp dirs, etc.) and adds navigator-specific paths via the flags file. -**Auto-detection**: Autonav checks for [nono](https://github.com/always-further/nono) on PATH at startup. If installed, kernel sandboxing activates automatically. If not, chibi runs unsandboxed (the other layers still apply). Set `AUTONAV_SANDBOX=0` to force-disable. +**Shell injection prevention**: Flags are passed via a newline-delimited temp file read with `while IFS= read -r`, not via an unquoted env var. The flags file uses a random UUID in its filename (e.g., `nono-flags-a1b2c3d4.txt`). -> **Source**: `src/harness/sandbox.ts` (buildSandboxArgs, isSandboxEnabled, wrapCommand) +**Wrapper script security**: Uses a random UUID in the filename (e.g., `nono-claude-wrapper-a1b2c3d4e5f6.sh`) and is created with mode `0700` (owner-only) to prevent TOCTOU attacks. -### ClaudeCodeHarness — SDK Sandbox (Disabled) +> **Source**: `src/harness/sandbox.ts` (buildNonoFlags, writeNonoFlagsFile, createSdkWrapper), `src/harness/claude-code-harness.ts` (configToSdkOptions) -The SDK sandbox is currently **disabled** (`sandbox: { enabled: false }`). The SDK's Seatbelt/bubblewrap sandbox blocks all network access by default and the `allowedDomains` mechanism can't be reliably wired up through the programmatic API yet. Navigators that need to call external APIs (Linear, GitHub, Slack, etc.) would be completely broken with the SDK sandbox enabled. +### ClaudeCodeHarness — claude-code provider -The ClaudeCodeHarness relies on the other layers (cwd scoping, tool restrictions, permission modes, turn/budget limits) for isolation. +When provider is `"claude-code"`, the SDK's built-in Seatbelt/bubblewrap sandbox is enabled (`sandbox: { enabled: true }`). No nono dependency needed. > **Source**: `src/harness/claude-code-harness.ts` (configToSdkOptions) +### ChibiHarness — nono + +[nono](https://github.com/always-further/nono) wraps chibi subprocess commands automatically: + +``` +nono run --silent --allow-cwd --read /path/to/nav -- chibi-json ... +``` + +Uses `buildCapabilitySet()` from nono-ts to programmatically build profiles including Claude Code infrastructure paths, system binaries, and navigator-specific paths. + +> **Source**: `src/harness/sandbox.ts` (buildCapabilitySet, wrapCommand) + ### OpenCodeHarness — Permission System -OpenCode doesn't have kernel-level sandboxing. Instead, `AgentConfig.sandbox` is translated into OpenCode's per-tool permission system: +OpenCode doesn't have kernel-level sandboxing. `AgentConfig.sandbox` is translated into OpenCode's per-tool permission system: -- **Read-only** (`readPaths` set, no `writePaths`): `edit: "deny"`, `bash: "deny"` — the agent can search and read files but cannot modify them or run shell commands -- **Read-write** (`writePaths` set): all permissions allowed -- **No sandbox**: all permissions allowed (default headless behavior) +- **Read-only**: `edit: "deny"`, `bash: "deny"` +- **Read-write**: all permissions allowed -This is **application-level enforcement** — it relies on OpenCode's permission checks, not OS kernel restrictions. It's weaker than [nono](https://github.com/always-further/nono) or SDK sandboxing but still provides meaningful protection for read-only operations. +This is **application-level enforcement** — weaker than kernel sandboxing but still provides meaningful protection. -> **Source**: `src/harness/opencode-harness.ts` (serverConfig.permission in ensureServer) +> **Source**: `src/harness/opencode-harness.ts` --- ## Layer 2: Working Directory Scoping -Every agent session sets `cwd` to the navigator's directory. This is the most fundamental constraint — the agent starts in and operates on its own directory. - -- **Query/Update**: `cwd` = navigator directory -- **Standup report**: `cwd` = navigator directory -- **Standup sync**: `cwd` = navigator directory -- **Chat**: `cwd` = navigator directory (via `navigatorPath`) +Every agent session sets `cwd` to the navigator's directory. `additionalDirectories` grants explicit access beyond `cwd` when needed. -`additionalDirectories` grants explicit access beyond `cwd` when needed (e.g., standup sync agents accessing working directories of monitored projects). - -> **Source**: `src/adapter/navigator-adapter.ts` (query ~line 441, update ~line 644), `src/standup/loop.ts` (report, sync), `src/conversation/App.tsx` +> **Source**: `src/adapter/navigator-adapter.ts`, `src/conversation/App.tsx`, `src/standup/loop.ts` --- @@ -120,13 +136,13 @@ Operations restrict which tools the agent can use via `disallowedTools`: | Context | Mechanism | Effect | Rationale | |---|---|---|---| | Query | `disallowedTools` | Blocks Write, Edit, NotebookEdit | Read-only — queries should never modify state | -| Update | No restriction | All tools available | Read-write — sandbox provides file-level restriction | -| Chat | No restriction | All tools available | Interactive — user is present to approve actions | -| Memento | No restriction | All tools available | Full access — worker needs unrestricted code access | -| Standup report | No restriction | All tools available | Report phase reads and summarizes | -| Standup sync | No restriction | All tools available | Sync phase may update files | +| Cross-nav query | `disallowedTools` | Blocks Write, Edit, NotebookEdit | Sub-queries are read-only | +| Update | No restriction | All tools available | Sandbox provides file-level restriction | +| Chat | No restriction | All tools available | Interactive — user present | +| Memento | No restriction | All tools available | Full access needed | +| Standup | No restriction | All tools available | Report reads, sync writes | -> **Source**: `src/adapter/navigator-adapter.ts` (query disallowedTools) +> **Source**: `src/adapter/navigator-adapter.ts`, `src/tools/cross-nav.ts`, `src/tools/related-navs.ts` --- @@ -136,8 +152,8 @@ Each operation sets a permission mode controlling what the agent can do without | Mode | Behavior | Used By | |---|---|---| -| `"bypassPermissions"` | All actions auto-approved | query, update, standup, memento (non-interactive automation) | -| `"acceptEdits"` | File edits auto-approved; shell commands prompt | chat (interactive sessions) | +| `"bypassPermissions"` | All actions auto-approved | query, update, memento | +| `"acceptEdits"` | File edits auto-approved; shell commands prompt | chat, standup | > **Source**: `src/adapter/navigator-adapter.ts`, `src/conversation/App.tsx`, `src/standup/loop.ts` @@ -151,11 +167,9 @@ Hard limits prevent runaway execution: |---|---|---| | Query | 50 | Configurable via CLI | | Cross-nav sub-query | 10 | — | -| Standup report (per nav) | 30 | — | +| Standup report (per nav) | 15 | — | | Standup sync (per nav) | 30 | — | -`maxBudgetUsd` provides a hard spending cap per session — the harness terminates the session if the budget is exceeded. - > **Source**: `src/adapter/navigator-adapter.ts`, `src/tools/cross-nav.ts`, `src/standup/loop.ts` --- @@ -168,70 +182,80 @@ When navigators query each other, a depth counter prevents infinite loops: MAX_QUERY_DEPTH = 3 ``` -Each cross-nav query increments the depth via a closure counter. Queries beyond depth 3 are rejected with an error. This applies to both the generic `query_navigator` tool and the per-navigator `ask_` tools. +Each cross-nav query increments the depth via a closure counter. Queries beyond depth 3 are rejected with an error. Cross-nav sub-sessions inherit the target navigator's sandbox configuration via `buildSandboxConfigForOperation()`. -> **Source**: `src/tools/cross-nav.ts:16` (MAX_QUERY_DEPTH), `src/tools/related-navs.ts:15` +> **Source**: `src/tools/cross-nav.ts`, `src/tools/related-navs.ts`, `src/harness/sandbox-config-builder.ts` --- ## Layer 7: Ephemeral Home Directories -Each harness session gets an isolated temporary home directory: +Each harness session can get an isolated temporary home directory: ``` -/tmp/autonav-chibi-/ +/tmp/autonav--/ ``` -This directory is: -- Created fresh for each session -- Auto-cleaned on session close -- Always included in `writePaths` when nono sandboxing is active -- Used to inject custom plugins/tools into the agent's environment - -Override the base location with `AUTONAV__HOME` (e.g., `AUTONAV_CHIBI_HOME=/tmp/my-chibi`). +This directory is created fresh for each session, auto-cleaned on close, and always included in `writePaths` when nono sandboxing is active. -> **Source**: `src/harness/ephemeral-home.ts` (createEphemeralHome) +> **Source**: `src/harness/ephemeral-home.ts` --- ## Layer 8: Credential Sanitization -All plugin output passes through credential detection and masking: +All plugin output passes through credential detection and masking. -**Detected patterns:** -- Slack tokens (`xoxb-`, `xoxp-`) -- GitHub tokens (`ghp_`, `gho_`, `ghs_`) -- OpenAI/Anthropic keys (`sk-`) -- AWS credentials (`AKIA`, `aws_secret_access_key`) -- Bearer tokens, generic API keys - -**Functions:** -| Function | Purpose | -|---|---| -| `sanitizeCredentials(text)` | Mask sensitive tokens in arbitrary text | -| `sanitizeError(error)` | Sanitize error messages before logging | -| `sanitizeConfigForLogging(config)` | Replace sensitive config fields with `***SET***` | -| `createSafeError(error, context)` | Create error with sanitized context | -| `assertNoCredentialsInText(text, field)` | Throw if credentials detected in user-facing content | +**Detected patterns:** Slack tokens, GitHub tokens, OpenAI/Anthropic keys, AWS credentials, Bearer tokens, generic API keys. > **Source**: `src/plugins/utils/security.ts` --- -## Layer 9: File Watcher Path Restrictions +## Layer 9: Environment Variable Filtering -The file watcher plugin has hardcoded forbidden paths that cannot be watched: +The Claude Code subprocess receives a **filtered** environment — only variables needed for operation are forwarded. Plugin tokens, database credentials, and other secrets in the parent environment are stripped. -``` -/etc, /sys, /proc, /dev, /root, /boot, /var/log, -/usr/bin, /usr/sbin, /bin, /sbin, -~/.ssh, ~/.aws, ~/.config, -/Windows, /System, C:\Windows, C:\System -``` +Allowed categories: system basics (`PATH`, `HOME`, `TERM`), Anthropic API authentication, autonav internals, git config, and Node runtime vars. + +> **Source**: `src/harness/claude-code-harness.ts` (buildCleanEnv, ALLOWED_ENV_VARS) + +--- + +## Layer 10: Config Input Validation + +Navigator `config.json` fields are validated at parse time: + +- **Paths**: Must not contain `..` (traversal), null bytes, or be empty +- **Commands**: Must not be shell interpreters (`bash`, `sh`, `zsh`), privilege escalation (`sudo`, `su`), destructive tools (`rm`, `chmod`), network tools (`nc`, `socat`), or language interpreters (`python`, `node`, `ruby`) +- **Commands**: Must be bare names (no `/` paths) — resolved from PATH by the sandbox + +The full deny-list is exported as `DENIED_SANDBOX_COMMANDS` from `@autonav/communication-layer`. + +> **Source**: `packages/communication-layer/src/schemas/config.ts` (SafePathSchema, SafeCommandSchema, DENIED_SANDBOX_COMMANDS) + +--- + +## Layer 11: File Watcher Path Restrictions + +The file watcher plugin has hardcoded forbidden paths that cannot be watched (`/etc`, `/sys`, `/proc`, `/dev`, `/root`, `~/.ssh`, `~/.aws`, etc.). Root-level directories (path depth <= 2) are rejected. + +> **Source**: `src/plugins/implementations/file-watcher/index.ts` + +--- + +## Layer 12: Sandbox Diagnostic Tool + +Every agent session gets a `sandbox_query` MCP tool that enables navigators to check their own sandbox status: + +- Check if a path operation (read/write) would be allowed +- Check if network access is permitted +- Check if a specific CLI command is allowed +- Get a policy summary -Additionally, root-level directories (path depth <= 2) are rejected to prevent watching entire filesystems. +Always registered, even when sandbox is disabled — navigators should always be able to diagnose their permissions. -> **Source**: `src/plugins/implementations/file-watcher/index.ts:42` (FORBIDDEN_PATHS, validateSafePath) +> **Source**: `src/tools/sandbox-query.ts` --- @@ -239,23 +263,24 @@ Additionally, root-level directories (path depth <= 2) are rejected to prevent w | Layer | Protects Against | Mechanism | Harness-Specific? | |---|---|---|---| -| Sandboxing | Unauthorized file/network access | OS syscalls ([nono](https://github.com/always-further/nono), SDK) or app-level permissions (OpenCode) | Yes | +| Sandboxing | Unauthorized file/network access | Kernel (nono/Seatbelt) or app-level (OpenCode) | Yes | | Working directory | Agent escaping its directory | `cwd` scoping | No | | Tool restrictions | Agents using dangerous tools | `disallowedTools` | No | | Permission modes | Unauthorized interactive actions | `permissionMode` | No | | Turn/budget limits | Runaway execution and cost | `maxTurns`, `maxBudgetUsd` | No | | Cycle detection | Infinite nav-to-nav loops | Depth counter (max 3) | No | -| Ephemeral homes | Session state leaking | Temp dir per session | Chibi only | +| Ephemeral homes | Session state leaking | Temp dir per session | Chibi | | Credential sanitization | Secret exposure in output | Regex masking | No | +| Env filtering | Secret leakage to subprocess | Allowlist-based env | ClaudeCode | +| Config validation | Path traversal, dangerous commands | Zod schema validation | No | | File watcher restrictions | Watching sensitive directories | Hardcoded forbidden paths | No | +| Sandbox diagnostics | Opaque sandbox behavior | `sandbox_query` MCP tool | No | --- ## Known Limitations -- **Cross-nav queries don't inherit sandbox profiles.** When navigator A queries navigator B, the sub-session doesn't read B's per-operation sandbox config. The sub-query runs with the cross-nav tool's hardcoded config (model, maxTurns, cwd only). - **Nav-to-nav file isolation is behavioral, not kernel-enforced.** System prompts instruct navigators to stay within their knowledge base, but nothing prevents a sandboxed agent from reading files in another navigator's directory if both are within the sandbox's read paths. - **`blockNetwork` is disabled for chibi.** The chibi subprocess makes its own API calls to OpenRouter, so blocking network would prevent it from functioning. -- **Graceful fallback means no sandbox.** If [nono](https://github.com/always-further/nono) is not installed, ChibiHarness runs without kernel sandboxing. The other layers still apply. -- **OpenCode sandbox is application-level only.** The OpenCode harness uses permission flags (`edit: "deny"`, `bash: "deny"`) rather than kernel enforcement. This prevents casual misuse but is not as strong as Landlock/Seatbelt. -- **SDK sandbox is disabled.** The SDK's Seatbelt/bubblewrap sandbox blocks all network access by default and `allowedDomains` can't be reliably wired through the programmatic API. Re-enabling requires investigation into loading `.claude/settings.json` via `settingSources`. +- **OpenCode sandbox is application-level only.** The OpenCode harness uses permission flags rather than kernel enforcement. Weaker than nono or SDK sandboxing. +- **Cross-nav depth counter is not propagated into sub-sessions.** The counter is a closure variable on the tool handler. If cross-nav tools were added to sub-sessions, the depth would need to be threaded through. diff --git a/packages/autonav/src/harness/claude-code-harness.ts b/packages/autonav/src/harness/claude-code-harness.ts index a7aa72c..38f6183 100644 --- a/packages/autonav/src/harness/claude-code-harness.ts +++ b/packages/autonav/src/harness/claude-code-harness.ts @@ -10,7 +10,7 @@ import { query, tool, createSdkMcpServer, type Query, type SDKMessage, type SDKR import * as os from "node:os"; import type { Harness, HarnessSession, AgentConfig, AgentEvent } from "./types.js"; import type { ToolDefinition } from "./tool-server.js"; -import { resolveSandboxProvider, createSdkWrapper, buildNonoFlags } from "./sandbox.js"; +import { resolveSandboxProvider, createSdkWrapper, buildNonoFlags, writeNonoFlagsFile } from "./sandbox.js"; /** * Flatten an SDK message into zero or more AgentEvents. @@ -106,6 +106,42 @@ function flattenMessage(message: SDKMessage): AgentEvent[] { return events; } +/** Environment variables the Claude Code subprocess needs. Everything else is stripped. */ +const ALLOWED_ENV_VARS = new Set([ + // System + "PATH", "HOME", "USER", "SHELL", "TERM", "LANG", "LC_ALL", "LC_CTYPE", + "TMPDIR", "XDG_CONFIG_HOME", "XDG_DATA_HOME", "XDG_CACHE_HOME", + // macOS + "DEVELOPER_DIR", "SDKROOT", + // Anthropic API (the subprocess needs this to authenticate) + "ANTHROPIC_API_KEY", "ANTHROPIC_BASE_URL", + // Claude Code + "CLAUDE_CODE_MAX_MEMORY", + // Autonav internals + "AUTONAV_DEBUG", "AUTONAV_SANDBOX", "AUTONAV_QUERY_DEPTH", + "AUTONAV_METRICS", "AUTONAV_HARNESS", + // Git + "GIT_AUTHOR_NAME", "GIT_AUTHOR_EMAIL", "GIT_COMMITTER_NAME", "GIT_COMMITTER_EMAIL", + // Node + "NODE_PATH", "NODE_ENV", +]); + +function buildCleanEnv(extra: Record = {}): Record { + const env: Record = {}; + for (const key of ALLOWED_ENV_VARS) { + if (process.env[key] !== undefined) { + env[key] = process.env[key]; + } + } + // Forward AUTONAV_NAV_PATH_* vars (needed for related-nav resolution) + for (const [key, value] of Object.entries(process.env)) { + if (key.startsWith("AUTONAV_NAV_PATH_") && value !== undefined) { + env[key] = value; + } + } + return { ...env, ...extra }; +} + /** * Map AgentConfig to Claude Code SDK Options */ @@ -132,19 +168,17 @@ function configToSdkOptions(config: AgentConfig): Record { if (sandboxResolution.provider === "nono" && sandboxResolution.active && config.sandbox) { // nono: kernel-enforced sandbox via wrapper script. - // Uses --profile claude-code as base + navigator paths via NONO_FLAGS. + // Uses --profile claude-code as base + navigator flags from a temp file. const wrapperDir = os.tmpdir(); if (config.stderr) { config.stderr(`[nono] SandboxConfig: ${JSON.stringify({ provider: "nono", readPaths: config.sandbox.readPaths, writePaths: config.sandbox.writePaths, allowedCommands: config.sandbox.allowedCommands })}\n`); } const wrapperPath = createSdkWrapper("", wrapperDir, config.sandbox); const nonoFlags = buildNonoFlags(config.sandbox); + const flagsFilePath = writeNonoFlagsFile(nonoFlags, wrapperDir); options.pathToClaudeCodeExecutable = wrapperPath; - options.env = { - ...process.env, - NONO_FLAGS: nonoFlags, - }; + options.env = buildCleanEnv({ NONO_FLAGS_FILE: flagsFilePath }); // Disable SDK sandbox — nono is the security boundary. options.sandbox = { enabled: false }; } else if (sandboxResolution.provider === "claude-code" && sandboxResolution.active) { diff --git a/packages/autonav/src/harness/index.ts b/packages/autonav/src/harness/index.ts index 36e2f71..57cfe39 100644 --- a/packages/autonav/src/harness/index.ts +++ b/packages/autonav/src/harness/index.ts @@ -44,6 +44,8 @@ export { type ToolResult, } from "./tool-server.js"; +export { buildSandboxConfigForOperation } from "./sandbox-config-builder.js"; + export { createEphemeralHome, type EphemeralHome, @@ -60,6 +62,7 @@ export { writeProfile, createSdkWrapper, buildNonoFlags, + writeNonoFlagsFile, buildCapabilitySet, querySandbox, querySandboxNetwork, diff --git a/packages/autonav/src/harness/sandbox-config-builder.ts b/packages/autonav/src/harness/sandbox-config-builder.ts new file mode 100644 index 0000000..bae41cf --- /dev/null +++ b/packages/autonav/src/harness/sandbox-config-builder.ts @@ -0,0 +1,70 @@ +/** + * Shared sandbox config builder. + * + * Extracts the per-operation sandbox config building logic that was + * duplicated across navigator-adapter.ts and nav-chat.ts into a + * single function. Also used by cross-nav and related-navs tools + * to ensure sub-sessions inherit the target navigator's sandbox. + */ + +import type { NavigatorConfig } from "@autonav/communication-layer"; +import type { SandboxConfig, SandboxProvider } from "./types.js"; + +type Operation = "query" | "update" | "chat" | "standup" | "memento"; + +/** + * Build a SandboxConfig for a specific operation on a navigator. + * + * Merges top-level permissions with per-operation profile settings. + * Returns undefined if sandbox is disabled for this operation/provider. + */ +export function buildSandboxConfigForOperation( + navConfig: NavigatorConfig, + navigatorPath: string, + knowledgeBasePath: string, + operation: Operation, +): SandboxConfig | undefined { + // Resolve provider + const sandboxSection = navConfig.sandbox; + const provider: SandboxProvider = sandboxSection?.dangerouslyDisableSandbox + ? "none" + : (sandboxSection?.provider ?? "nono"); + + if (provider === "none") return undefined; + + // Check per-operation enable flag + const profile = sandboxSection?.[operation] as + | { enabled?: boolean; accessLevel?: string; blockNetwork?: boolean; allowedCommands?: string[]; extraReadPaths?: string[]; extraWritePaths?: string[] } + | undefined; + if (profile?.enabled === false) return undefined; + + // Merge top-level + per-operation permissions + const topCommands = navConfig.permissions?.allowedCommands ?? []; + const opCommands = profile?.allowedCommands ?? []; + const allCommands = [...topCommands, ...opCommands]; + + const topPaths = navConfig.permissions?.allowedPaths ?? []; + const opReadPaths = profile?.extraReadPaths ?? []; + const opWritePaths = profile?.extraWritePaths ?? []; + + const readPaths = [ + navigatorPath, + knowledgeBasePath, + ...topPaths, + ...opReadPaths, + ]; + + // Only grant write if accessLevel is "readwrite" + const writePaths = profile?.accessLevel === "readwrite" + ? [navigatorPath, ...opWritePaths] + : undefined; + + return { + enabled: true, + provider, + readPaths, + writePaths, + allowedCommands: allCommands.length > 0 ? allCommands : undefined, + blockNetwork: profile?.blockNetwork, + }; +} diff --git a/packages/autonav/src/harness/sandbox.ts b/packages/autonav/src/harness/sandbox.ts index fcfe6d5..11a9659 100644 --- a/packages/autonav/src/harness/sandbox.ts +++ b/packages/autonav/src/harness/sandbox.ts @@ -29,6 +29,7 @@ */ import { createRequire } from "node:module"; +import * as crypto from "node:crypto"; import * as fs from "node:fs"; import * as os from "node:os"; import * as path from "node:path"; @@ -217,14 +218,17 @@ export function resolveSandboxProvider(config?: SandboxConfig): { provider: Sand /** * Resolve whether sandboxing should be active for this session. * - * For backward compatibility — wraps resolveSandboxProvider and catches - * errors (returns false if nono is required but missing). Prefer using - * resolveSandboxProvider directly for better error handling. + * @deprecated Use resolveSandboxProvider() directly for proper error handling. + * This function silently returns false when nono is required but missing, + * which can lead to undetected sandbox bypass. Will be removed in a future version. */ export function isSandboxEnabled(config?: SandboxConfig): boolean { try { return resolveSandboxProvider(config).active; - } catch { + } catch (e) { + process.stderr.write( + `[autonav] WARNING: Sandbox check failed, running without sandbox: ${e instanceof Error ? e.message : String(e)}\n` + ); return false; } } @@ -394,18 +398,14 @@ function commandFlags(config?: SandboxConfig): string[] { } /** - * Build nono CLI flags string for env var passing (NONO_FLAGS). + * Build nono CLI flags as an array. * - * Used by ClaudeCodeHarness where the wrapper script reads NONO_FLAGS - * from the environment (trellis pattern). + * Returns individual flag strings — each is a single argument to nono. + * Use writeNonoFlagsFile() to serialize safely for the wrapper script. */ -export function buildNonoFlags(config: SandboxConfig): string { +export function buildNonoFlags(config: SandboxConfig): string[] { const parts: string[] = ["--allow-cwd"]; - // Navigator-specific paths as --read/--allow flags. - // These stack with --profile claude-code's base paths. - // We can't use --config/profile JSON because nono-ts 0.3.0 format - // is incompatible with nono CLI 0.15.0's profile parser. if (config.readPaths) { for (const p of config.readPaths) { if (fs.existsSync(p)) parts.push("--read", p); @@ -423,7 +423,20 @@ export function buildNonoFlags(config: SandboxConfig): string { parts.push("--net-block"); } - return parts.join(" "); + return parts; +} + +/** + * Write nono flags to a temp file (one per line) for safe shell consumption. + * + * Avoids shell injection from unquoted env var expansion — the wrapper + * script reads this file line-by-line instead of word-splitting a string. + */ +export function writeNonoFlagsFile(flags: string[], dir: string): string { + const id = crypto.randomUUID().slice(0, 8); + const flagsPath = path.join(dir, `nono-flags-${id}.txt`); + fs.writeFileSync(flagsPath, flags.join("\n") + "\n", "utf-8"); + return flagsPath; } /** @@ -477,36 +490,36 @@ export function wrapCommand( * @returns Absolute path to the wrapper script. */ export function createSdkWrapper(_profilePath: string, dir: string, _config?: SandboxConfig): string { - const wrapperPath = path.join(dir, "nono-claude-wrapper.sh"); - // Use nono's built-in claude-code profile as a base — it provides all - // the paths claude needs (config, keychain, tmp dirs, etc.) and is - // maintained by the nono team. Our custom --config adds navigator-specific - // paths on top, and NONO_FLAGS adds --allow-command flags. - // - // Use --profile claude-code as base, then NONO_FLAGS adds navigator - // paths as --read/--allow flags + --allow-command flags. - // We can't use a profile JSON file because nono-ts 0.3.0's JSON format - // is not compatible with nono CLI 0.15.0's profile parser (custom fs - // entries are silently ignored). --read/--allow flags DO stack with - // --profile, so we pass everything via NONO_FLAGS. + const id = crypto.randomUUID().slice(0, 12); + const wrapperPath = path.join(dir, `nono-claude-wrapper-${id}.sh`); + // Use nono's built-in claude-code profile as base + navigator flags + // from a temp file (NONO_FLAGS_FILE). Flags are read line-by-line to + // avoid shell injection from unquoted env var expansion. // // Pre-create optional dirs that claude-code profile references to // suppress WARN messages on stdout (corrupts SDK JSON stream). + // + // Uses `while read` instead of `mapfile` for macOS system bash (3.2) compat. const script = `#!/usr/bin/env bash set -euo pipefail mkdir -p "\${HOME}/.vscode" 2>/dev/null || true mkdir -p "\${HOME}/Library/Application Support/Code" 2>/dev/null || true touch "\${HOME}/.gitignore_global" 2>/dev/null || true +NONO_ARGS=() +if [[ -n "\${NONO_FLAGS_FILE:-}" && -f "\${NONO_FLAGS_FILE}" ]]; then + while IFS= read -r line; do + [[ -n "\$line" ]] && NONO_ARGS+=("\$line") + done < "\${NONO_FLAGS_FILE}" +fi exec nono run \\ --silent \\ --no-diagnostics \\ --profile claude-code \\ --allow "\${HOME}/.claude-personal" \\ - \${NONO_FLAGS:-} \\ + "\${NONO_ARGS[@]}" \\ -- claude "$@" `; - fs.writeFileSync(wrapperPath, script, "utf-8"); - fs.chmodSync(wrapperPath, 0o755); + fs.writeFileSync(wrapperPath, script, { mode: 0o700, encoding: "utf-8" }); return wrapperPath; } diff --git a/packages/autonav/src/tools/cross-nav.ts b/packages/autonav/src/tools/cross-nav.ts index 406ec5b..791ac5b 100644 --- a/packages/autonav/src/tools/cross-nav.ts +++ b/packages/autonav/src/tools/cross-nav.ts @@ -11,7 +11,7 @@ import { z } from "zod"; import { loadNavigator } from "../query-engine/navigator-loader.js"; -import { type Harness, collectText, defineTool } from "../harness/index.js"; +import { type Harness, collectText, defineTool, buildSandboxConfigForOperation } from "../harness/index.js"; const MAX_QUERY_DEPTH = 3; @@ -64,13 +64,20 @@ Specify the navigator by its directory path (relative or absolute).`, // Load target navigator const nav = loadNavigator(args.navigator); - // Run query via harness + // Build sandbox config from target navigator's config + const sandboxConfig = buildSandboxConfigForOperation( + nav.config, nav.navigatorPath, nav.knowledgeBasePath, "query" + ); + + // Run query via harness — inherit target nav's sandbox const session = harness.run( { model: "claude-haiku-4-5", maxTurns: 10, systemPrompt: nav.systemPrompt, cwd: nav.navigatorPath, + disallowedTools: ["Write", "Edit", "NotebookEdit"], + ...(sandboxConfig ? { sandbox: sandboxConfig } : {}), }, args.question ); diff --git a/packages/autonav/src/tools/related-navs.ts b/packages/autonav/src/tools/related-navs.ts index 63c369d..be0f215 100644 --- a/packages/autonav/src/tools/related-navs.ts +++ b/packages/autonav/src/tools/related-navs.ts @@ -10,7 +10,7 @@ import { z } from "zod"; import { loadNavigator } from "../query-engine/navigator-loader.js"; import { resolveNavigatorPath } from "../registry.js"; -import { type Harness, collectText, defineTool, type ToolDefinition } from "../harness/index.js"; +import { type Harness, collectText, defineTool, type ToolDefinition, buildSandboxConfigForOperation } from "../harness/index.js"; const MAX_QUERY_DEPTH = 3; @@ -86,12 +86,19 @@ export function createRelatedNavsMcpServer( try { const target = loadNavigator(navPath); + // Build sandbox config from target navigator's config + const sandboxConfig = buildSandboxConfigForOperation( + target.config, target.navigatorPath, target.knowledgeBasePath, "query" + ); + const session = harness.run( { model: "claude-haiku-4-5", maxTurns: 10, systemPrompt: target.systemPrompt, cwd: target.navigatorPath, + disallowedTools: ["Write", "Edit", "NotebookEdit"], + ...(sandboxConfig ? { sandbox: sandboxConfig } : {}), }, args.question ); diff --git a/packages/communication-layer/src/index.ts b/packages/communication-layer/src/index.ts index 0487b3d..63ec0cc 100644 --- a/packages/communication-layer/src/index.ts +++ b/packages/communication-layer/src/index.ts @@ -57,6 +57,7 @@ export { export { NavigatorConfigSchema, KnowledgePackMetadataSchema, + DENIED_SANDBOX_COMMANDS, type NavigatorConfig, type KnowledgePackMetadata, createNavigatorConfig, diff --git a/packages/communication-layer/src/schemas/config.ts b/packages/communication-layer/src/schemas/config.ts index 8e5dd66..f49669f 100644 --- a/packages/communication-layer/src/schemas/config.ts +++ b/packages/communication-layer/src/schemas/config.ts @@ -1,6 +1,54 @@ import { z } from 'zod'; import { PROTOCOL_VERSION } from '../version.js'; +// ── Security schemas for sandbox config fields ────────────────────────── + +/** Validated filesystem path — rejects traversal and null bytes. */ +const SafePathSchema = z.string() + .min(1, 'Path cannot be empty') + .refine((p) => !p.includes('\0'), 'Path cannot contain null bytes') + .refine( + (p) => !p.includes('..'), + 'Path traversal (..) is not allowed — use absolute paths or paths relative to navigator root' + ); + +/** + * Commands that MUST NOT be allowed in sandbox configs. + * These provide escape hatches that would bypass sandbox enforcement. + */ +export const DENIED_SANDBOX_COMMANDS = new Set([ + // Shells + 'bash', 'sh', 'zsh', 'fish', 'csh', 'tcsh', 'ksh', 'dash', + // Privilege escalation + 'sudo', 'su', 'doas', + // Execution wrappers + 'env', 'nohup', 'strace', 'ltrace', 'dtrace', + // Permission modification + 'chmod', 'chown', 'chgrp', + // Destructive deletion + 'rm', 'rmdir', + // Raw disk / mount + 'dd', 'mount', 'umount', + // Process control + 'pkill', 'kill', 'killall', + // Interpreters (shell escape) + 'python', 'python3', 'node', 'ruby', 'perl', + // Network tools + 'nc', 'ncat', 'netcat', 'socat', +]); + +/** Validated command name — rejects denied commands and paths. */ +const SafeCommandSchema = z.string() + .min(1, 'Command cannot be empty') + .refine( + (cmd) => !DENIED_SANDBOX_COMMANDS.has(cmd.toLowerCase()), + (cmd) => ({ message: `Command "${cmd}" is denied — it could bypass sandbox enforcement` }) + ) + .refine( + (cmd) => !cmd.includes('/'), + 'Commands must be bare names (no paths) — the sandbox resolves them from PATH' + ); + /** * Navigator Configuration Schema * @@ -79,7 +127,7 @@ export const NavigatorConfigSchema = z.object({ * Additional directories this navigator needs access to (absolute or relative to nav root). * Used to sandbox the navigator to only the directories it manages. */ - workingDirectories: z.array(z.string()).optional().describe( + workingDirectories: z.array(SafePathSchema).optional().describe( 'Additional directories this navigator needs access to (absolute or relative to nav root)' ), @@ -104,9 +152,9 @@ export const NavigatorConfigSchema = z.object({ */ permissions: z.object({ /** CLI tools nono should permit (e.g., ["linear", "gh"]) */ - allowedCommands: z.array(z.string()).optional().describe('CLI tools the sandbox permits'), + allowedCommands: z.array(SafeCommandSchema).optional().describe('CLI tools the sandbox permits'), /** Extra paths the navigator needs access to (read-only) */ - allowedPaths: z.array(z.string()).optional().describe('Extra read paths for the sandbox'), + allowedPaths: z.array(SafePathSchema).optional().describe('Extra read paths for the sandbox'), }).optional().describe('Permission grants across all operations'), /** @@ -129,41 +177,41 @@ export const NavigatorConfigSchema = z.object({ enabled: z.boolean(), accessLevel: z.enum(['readonly', 'readwrite']).optional(), blockNetwork: z.boolean().optional(), - allowedCommands: z.array(z.string()).optional(), - extraReadPaths: z.array(z.string()).optional(), - extraWritePaths: z.array(z.string()).optional(), + allowedCommands: z.array(SafeCommandSchema).optional(), + extraReadPaths: z.array(SafePathSchema).optional(), + extraWritePaths: z.array(SafePathSchema).optional(), }).default({ enabled: true }), update: z.object({ enabled: z.boolean(), accessLevel: z.enum(['readonly', 'readwrite']).optional(), blockNetwork: z.boolean().optional(), - allowedCommands: z.array(z.string()).optional(), - extraReadPaths: z.array(z.string()).optional(), - extraWritePaths: z.array(z.string()).optional(), + allowedCommands: z.array(SafeCommandSchema).optional(), + extraReadPaths: z.array(SafePathSchema).optional(), + extraWritePaths: z.array(SafePathSchema).optional(), }).default({ enabled: true }), chat: z.object({ enabled: z.boolean(), accessLevel: z.enum(['readonly', 'readwrite']).optional(), blockNetwork: z.boolean().optional(), - allowedCommands: z.array(z.string()).optional(), - extraReadPaths: z.array(z.string()).optional(), - extraWritePaths: z.array(z.string()).optional(), + allowedCommands: z.array(SafeCommandSchema).optional(), + extraReadPaths: z.array(SafePathSchema).optional(), + extraWritePaths: z.array(SafePathSchema).optional(), }).default({ enabled: true }), memento: z.object({ enabled: z.boolean(), accessLevel: z.enum(['readonly', 'readwrite']).optional(), blockNetwork: z.boolean().optional(), - allowedCommands: z.array(z.string()).optional(), - extraReadPaths: z.array(z.string()).optional(), - extraWritePaths: z.array(z.string()).optional(), + allowedCommands: z.array(SafeCommandSchema).optional(), + extraReadPaths: z.array(SafePathSchema).optional(), + extraWritePaths: z.array(SafePathSchema).optional(), }).default({ enabled: false }), standup: z.object({ enabled: z.boolean(), accessLevel: z.enum(['readonly', 'readwrite']).optional(), blockNetwork: z.boolean().optional(), - allowedCommands: z.array(z.string()).optional(), - extraReadPaths: z.array(z.string()).optional(), - extraWritePaths: z.array(z.string()).optional(), + allowedCommands: z.array(SafeCommandSchema).optional(), + extraReadPaths: z.array(SafePathSchema).optional(), + extraWritePaths: z.array(SafePathSchema).optional(), }).default({ enabled: true }), }).optional().describe('Per-operation sandbox profiles'),