Skip to content

fix(sandbox): add opt-in fail-closed cap-drop enforcement#4707

Merged
cv merged 4 commits into
mainfrom
fix/3280-cap-drop-strict-mode
Jun 3, 2026
Merged

fix(sandbox): add opt-in fail-closed cap-drop enforcement#4707
cv merged 4 commits into
mainfrom
fix/3280-cap-drop-strict-mode

Conversation

@Dongni-Yang

@Dongni-Yang Dongni-Yang commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

Summary

On hosts without CAP_SETPCAP (GitHub runners, Brev shadecloud, Colossus Ubuntu
24.04, Docker Desktop, WSL), drop_capabilities cannot run capsh --drop, so
dangerous bounding-set caps survive and the agent boots with a weaker posture
than the security model assumes. This adds an opt-in fail-closed guard
(NEMOCLAW_REQUIRE_CAP_DROP=1) that verifies the agent process tree's actual
bounding set and refuses to start if dangerous capabilities remain.

Related Issue

Refs #3280

Changes

  • scripts/lib/sandbox-init.sh:
    • enforce_cap_drop_if_required runs on every path out of
      drop_capabilities and verifies the actual CapBnd rather than trusting
      the NEMOCLAW_CAPS_DROPPED sentinel (closes a strict-mode bypass).
    • A single DANGEROUS_CAPS list feeds both the capsh --drop arguments and
      the verifier, so the drop-set and the verify-set (all 10 caps) cannot drift.
    • A non-empty but unparseable CapBnd is treated as unverifiable: refuse in
      strict mode, warn in default mode — never a raw bash arithmetic error.
    • setpriv-based privilege step-down strips the remaining load-bearing caps
      atomically with the setuid transition (falls back to gosu when
      setpriv/CAP_SETPCAP are unavailable).
    • Default stays warn-and-continue (zero regression — the inverse of the
      reverted fix(sandbox): refuse to start when bounding-set cap drop fails (#4264) #4266); operators set NEMOCLAW_REQUIRE_CAP_DROP=1 to refuse-to-start.
  • test/sandbox-init.test.ts: repro + strict-mode + sentinel-bypass +
    malformed-CapBnd coverage, seam-driven (NEMOCLAW_PROC_STATUS) and
    deterministic.
  • docs/security/best-practices.mdx, docs/deployment/sandbox-hardening.mdx:
    document NEMOCLAW_REQUIRE_CAP_DROP and the verified scope.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes

Signed-off-by: Dongni Yang dongniy@nvidia.com

Summary by CodeRabbit

  • New Features

    • Opt-in strict verification via NEMOCLAW_REQUIRE_CAP_DROP=1: agent can refuse to start if dangerous Linux capabilities remain.
  • Behavior

    • More accurate detection and clearer diagnostics of residual bounding-set capabilities; improved sandbox step-down argument handling.
  • Tests

    • Added deterministic tests covering warn-and-continue, strict-mode refusal, and various verification edge cases.
  • Documentation

    • Updated docs explaining capability-drop behavior and the strict verification option.

When a host does not grant CAP_SETPCAP (GitHub runners, Brev shadecloud,
Colossus Ubuntu 24.04, Docker Desktop, WSL), drop_capabilities cannot run
`capsh --drop`, so dangerous bounding-set caps survive and the agent boots
with a weaker posture than the security model assumes.

PR #4266 made this fall-through default fail-closed and was reverted within
hours (#4341) because it broke every CAP_SETPCAP-less host. Reintroduce the
protection as the inverse: the default stays warn-and-continue (zero
regression — no host loses the ability to boot), and operators who require the
hardened posture opt in with NEMOCLAW_REQUIRE_CAP_DROP=1 to refuse start when
the bounding-set drop could not happen.

Scope is the agent process tree only. The `nemoclaw connect` shell inherits
the container's OCI bounding set and is not affected; tightening that needs
cap_drop at sandbox-create, tracked upstream in NVIDIA/OpenShell#1452 — which
is why this is Refs (not Closes) #3280.

Tests drive both fall-through branches with a stubbed capsh (red before this
change, green after), decode the QA-reported CapBnd fixture deterministically,
and assert the default still warns-and-continues (no regression).

Local pre-commit Test (CLI) hook skipped via --no-verify (operator-approved):
it fails only on an offline-blocked messaging-bridge e2e test unrelated to this
change; sandbox-init tests pass 32/32. CI re-runs all hooks on the PR.

Refs #3280

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Dongni Yang <dongniy@nvidia.com>
@coderabbitai

coderabbitai Bot commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

Adds opt-in fail-closed enforcement for Linux capability bounding-set drops: new decoder helper, enforce_cap_drop_if_required() which can refuse startup when NEMOCLAW_REQUIRE_CAP_DROP=1, integrates into drop-capability fallback paths, updates reporting and step-down quoting, and adds tests and docs.

Changes

Capability enforcement in sandbox initialization

Layer / File(s) Summary
Enforce capability drop guard implementation
scripts/lib/sandbox-init.sh
Added DANGEROUS_CAPS and dangerous_caps_drop_list(); drop_capabilities() now builds the capsh --drop args from the table, reports residual CapBnd info on fallback, and always invokes enforce_cap_drop_if_required(); added dangerous_caps_in_capbnd() and enforce_cap_drop_if_required(); report_residual_capabilities() now delegates to the decoder; adjusted setpriv argument quoting in init_step_down_prefixes().
Regression and strict-mode test coverage
test/sandbox-init.test.ts
Added deterministic CapBnd fixtures, a capsh stub and NEMOCLAW_PROC_STATUS seam; tests for non-strict warn-and-continue, unit decoding assertions for dangerous_caps_in_capbnd, and multiple NEMOCLAW_REQUIRE_CAP_DROP=1 scenarios (refusal and continuation, missing capsh, unreadable or malformed ProcStatus, pre-set NEMOCLAW_CAPS_DROPPED).
Documentation: deployment & security guidance
docs/deployment/sandbox-hardening.mdx, docs/security/best-practices.mdx
Document bounding-set drop as best-effort and add the NEMOCLAW_REQUIRE_CAP_DROP=1 opt-in fail-closed behavior and scope notes.

Sequence Diagram(s)

sequenceDiagram
  participant Entrypoint as sandbox-init.sh
  participant capsh as capsh (capability tool)
  participant Enforce as enforce_cap_drop_if_required
  participant ProcStatus as /proc/self/status (or NEMOCLAW_PROC_STATUS)

  Entrypoint->>capsh: run `capsh --drop` built from DANGEROUS_CAPS
  capsh-->>Entrypoint: success or failure (CAP_SETPCAP missing)
  Entrypoint->>Enforce: enforce_cap_drop_if_required()
  Enforce->>ProcStatus: read CapBnd
  ProcStatus-->>Enforce: return CapBnd hex
  Enforce->>Enforce: dangerous_caps_in_capbnd(parse CapBnd)
  Enforce-->>Entrypoint: return OK or stderr refusal + exit(1)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

Suggested labels

fix, Sandbox, v0.0.57

Suggested reviewers

  • cv
  • prekshivyas

Poem

🐰 I nudge the sandbox, ears held tight,
I sniff the caps that hide from light.
One toggle says the gate's now sealed,
Or else a gentle warning pealed.
A tiny hop — keep systems right.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The pull request title accurately describes the main change: adding an opt-in fail-closed enforcement mechanism for Linux capability dropping in the sandbox initialization.
Docstring Coverage ✅ Passed Docstring coverage is 83.33% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/3280-cap-drop-strict-mode

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

github-actions Bot commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

E2E Advisor Recommendation

Required E2E: test-e2e-gateway-isolation, openclaw-onboard-security-posture-e2e, hermes-root-entrypoint-smoke-e2e
Optional E2E: test-non-root-sandbox-smoke, hermes-onboard-security-posture-e2e

Dispatch hint: openclaw-onboard-security-posture-e2e,hermes-root-entrypoint-smoke-e2e

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • test-e2e-gateway-isolation (medium): Most directly covers the changed security boundary: builds/loads the production sandbox image and runs test/e2e-gateway-isolation.sh, including gateway isolation and issue [Nemoclaw] [All Platforms] Sandbox allows dangerous capabilities in bounding set despite empty effective set #3280 dangerous capability bounding-set assertions through the real sandbox-init.sh code path.
  • openclaw-onboard-security-posture-e2e (high): Validates a real OpenClaw onboard and runtime security posture after changing shared entrypoint capability/drop logic. This catches startup regressions in the default warn-and-continue path and verifies the live sandbox still meets security-posture assertions.
  • hermes-root-entrypoint-smoke-e2e (medium): sandbox-init.sh is shared by Hermes entrypoint startup/step-down paths. The Hermes root entrypoint smoke builds the real Hermes image and verifies root entrypoint startup and gateway-user execution, which could regress from the capsh/setpriv and quoting changes.

Optional E2E

  • test-non-root-sandbox-smoke (low): Useful adjacent confidence for entrypoint behavior under --security-opt no-new-privileges, where privilege/capability assumptions differ and setpriv/gosu fallback behavior is relevant. Not merge-blocking because this PR primarily changes the root/cap-drop path and strict mode is opt-in.
  • hermes-onboard-security-posture-e2e (high): Broader live Hermes onboard security-posture validation if additional confidence is desired beyond the Hermes root-entrypoint smoke. Optional because the direct OpenClaw security-posture and Hermes entrypoint smoke jobs already cover the highest-risk paths.

New E2E recommendations

  • strict capability-drop enforcement (high): Existing E2E verifies normal dangerous-cap dropping, but there is no dedicated image-level E2E that launches the production entrypoint with NEMOCLAW_REQUIRE_CAP_DROP=1 and proves both outcomes: fail-closed when dangerous CapBnd bits remain or CapBnd is unverifiable, and successful startup when the bounding set is clean. The new unit tests mock this, but an E2E should exercise the actual container entrypoint and runtime capability environment.
    • Suggested test: Add a strict-capability-drop mode case to test/e2e-gateway-isolation.sh or a new sandbox image E2E that runs the production image with NEMOCLAW_REQUIRE_CAP_DROP=1 under controlled Docker capability settings.

Dispatch hint

  • Workflow: nightly-e2e.yaml
  • jobs input: openclaw-onboard-security-posture-e2e,hermes-root-entrypoint-smoke-e2e

@github-actions

github-actions Bot commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

E2E Scenario Advisor Recommendation

Required scenario E2E: ubuntu-repo-cloud-openclaw
Optional scenario E2E: ubuntu-repo-cloud-hermes, wsl-repo-cloud-openclaw

Dispatch required scenario E2E:

  • gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw

Workflow run

Full scenario advisor summary

E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required scenario E2E

  • ubuntu-repo-cloud-openclaw: scripts/lib/sandbox-init.sh changes sandbox entrypoint capability dropping and privilege step-down behavior. The Ubuntu repo OpenClaw scenario is the smallest primary scenario that builds/starts the sandbox from the PR checkout and exercises sandbox startup, gateway health, and shell access.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw

Optional scenario E2E

  • ubuntu-repo-cloud-hermes: Optional adjacent coverage for the same shared sandbox-init changes on the Hermes agent path, including Hermes-specific startup behavior that may consume the shared privilege step-down prefixes.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-hermes
  • wsl-repo-cloud-openclaw: Optional platform coverage because the capability-drop fallback comments and behavior explicitly call out WSL/CAP_SETPCAP-less hosts; WSL is a special-runner scenario, so it is optional rather than primary.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=wsl-repo-cloud-openclaw

Relevant changed files

  • scripts/lib/sandbox-init.sh

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
test/sandbox-init.test.ts (1)

463-487: ⚡ Quick win

Exercise report_residual_capabilities() directly here.

This test reimplements the decode loop instead of invoking the production helper, so it can still pass if scripts/lib/sandbox-init.sh changes the inspected-cap list or output format. Stubbing awk and calling report_residual_capabilities would keep the regression anchored to the real contract.

💡 Higher-fidelity test shape
-          "cap_bnd_hex=$(awk '/^CapBnd:/{print $2}' \"$TMP/status\")",
-          "val=$((16#$cap_bnd_hex))",
-          "present_caps=''",
-          "for entry in \\",
-          '  "21:cap_sys_admin" \\',
-          '  "19:cap_sys_ptrace" \\',
-          '  "13:cap_net_raw" \\',
-          '  "1:cap_dac_override" \\',
-          '  "10:cap_net_bind_service"; do',
-          '  bit="${entry%%:*}"',
-          '  name="${entry#*:}"',
-          "  if [ $(((val >> bit) & 1)) -ne 0 ]; then",
-          '    present_caps="${present_caps:+$present_caps,}$name"',
-          "  fi",
-          "done",
-          'echo "RESIDUAL: ${present_caps}"',
+          "awk() { printf '00000004a82c35fb\\n'; }",
+          "report_residual_capabilities 2>&1",
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/sandbox-init.test.ts` around lines 463 - 487, The test currently
reimplements the capability decode loop; instead, stub the system helper (awk)
to return the CapBnd fixture and invoke the production helper
report_residual_capabilities from scripts/lib/sandbox-init.sh so the test
exercises the real decoding contract; modify the test to export a TMP status
file with "CapBnd: <hex>", set PATH or mock awk to return that field, call
report_residual_capabilities and assert its output contains the same "RESIDUAL:
..." string, and remove the inline loop so future changes to
report_residual_capabilities or the inspected-cap list will be caught by this
test.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/lib/sandbox-init.sh`:
- Around line 271-284: The refusal banner in enforce_cap_drop_if_required()
always blames "no CAP_SETPCAP" even when the real problem is missing capsh;
update the heredoc to use the passed-in reason variable (Reason: ${reason}) and
adjust the explanatory text to not hard-code "no CAP_SETPCAP" — instead mention
that the drop was skipped and include the specific cause (e.g., capsh
unavailable or lacking CAP_SETPCAP) so strict-mode users see the correct
diagnosis; ensure references to NEMOCLAW_REQUIRE_CAP_DROP and the
enforce_cap_drop_if_required function are preserved.

---

Nitpick comments:
In `@test/sandbox-init.test.ts`:
- Around line 463-487: The test currently reimplements the capability decode
loop; instead, stub the system helper (awk) to return the CapBnd fixture and
invoke the production helper report_residual_capabilities from
scripts/lib/sandbox-init.sh so the test exercises the real decoding contract;
modify the test to export a TMP status file with "CapBnd: <hex>", set PATH or
mock awk to return that field, call report_residual_capabilities and assert its
output contains the same "RESIDUAL: ..." string, and remove the inline loop so
future changes to report_residual_capabilities or the inspected-cap list will be
caught by this test.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 1c4c6b41-75b3-4e2b-b17e-f1a354c1e720

📥 Commits

Reviewing files that changed from the base of the PR and between 325ed77 and fbb1788.

📒 Files selected for processing (2)
  • scripts/lib/sandbox-init.sh
  • test/sandbox-init.test.ts

Comment thread scripts/lib/sandbox-init.sh Outdated
@github-actions

github-actions Bot commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

PR Review Advisor

Findings: 0 needs attention, 3 worth checking, 0 nice ideas
Since last review: 4 prior items resolved, 2 still apply, 0 new items found

Review findings

🛠️ Needs attention

  • None.

🔎 Worth checking

  • Source-of-truth review needed: scripts/lib/sandbox-init.sh capability-drop fallback and strict-mode gate: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: `DANGEROUS_CAPS` is now the runtime source for both `capsh --drop` and verification, but `enforce_cap_drop_if_required()` reads `${NEMOCLAW_PROC_STATUS:-/proc/self/status}`.
  • Strict mode trusts an environment-overridable CapBnd source (scripts/lib/sandbox-init.sh:333): The opt-in fail-closed gate is intended to verify the actual process bounding set, but production code reads `${NEMOCLAW_PROC_STATUS:-/proc/self/status}`. If the entrypoint environment can be influenced, `NEMOCLAW_REQUIRE_CAP_DROP=1` can be paired with `NEMOCLAW_PROC_STATUS` pointing at a fake status file with a clean CapBnd, allowing strict mode to trust attacker-controlled evidence instead of the real `/proc/self/status`.
    • Recommendation: Keep the status-file override out of the production decision path. For example, make `enforce_cap_drop_if_required()` always read `/proc/self/status`, and factor CapBnd parsing into a pure helper that tests call directly, or gate the override behind an explicit test-only condition that cannot be set in real sandbox entrypoints.
    • Evidence: `enforce_cap_drop_if_required()` sets `local status_path="${NEMOCLAW_PROC_STATUS:-/proc/self/status}"` before deciding whether to exit. `report_residual_capabilities()` uses the same override. The tests intentionally use this environment variable as a fixture seam.
  • Capability verification tests do not isolate each dangerous bit or exercise a real runtime path (test/sandbox-init.test.ts:417): The unit tests now use a QA CapBnd fixture whose expected output includes all 10 dangerous capabilities, which improves the prior 5-cap gap. However, the fixture is dense: it contains many bits at once. A wrong bit number can still pass if the mistaken bit is also set in that fixture, and the behavior is not validated in a real sandbox/capsh process tree.
    • Recommendation: Add single-bit CapBnd fixtures for every entry in `DANGEROUS_CAPS` and assert strict mode refuses for each one independently. Also identify a targeted runtime/integration validation that exercises the real entrypoint/capsh/proc-status behavior without relying only on the `NEMOCLAW_PROC_STATUS` unit-test seam.
    • Evidence: `QA_CAPBND = "00000004a82c35fb"` is used for the strict refusal and decoder tests, and `QA_DANGEROUS` lists all 10 names. The test set covers unreadable/malformed status and clean/residual combined fixtures, but not individual one-bit mappings or a real runtime drop.

🌱 Nice ideas

  • None.
Since last review details

Current findings:

  • Source-of-truth review needed: scripts/lib/sandbox-init.sh capability-drop fallback and strict-mode gate: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: `DANGEROUS_CAPS` is now the runtime source for both `capsh --drop` and verification, but `enforce_cap_drop_if_required()` reads `${NEMOCLAW_PROC_STATUS:-/proc/self/status}`.
  • Strict mode trusts an environment-overridable CapBnd source (scripts/lib/sandbox-init.sh:333): The opt-in fail-closed gate is intended to verify the actual process bounding set, but production code reads `${NEMOCLAW_PROC_STATUS:-/proc/self/status}`. If the entrypoint environment can be influenced, `NEMOCLAW_REQUIRE_CAP_DROP=1` can be paired with `NEMOCLAW_PROC_STATUS` pointing at a fake status file with a clean CapBnd, allowing strict mode to trust attacker-controlled evidence instead of the real `/proc/self/status`.
    • Recommendation: Keep the status-file override out of the production decision path. For example, make `enforce_cap_drop_if_required()` always read `/proc/self/status`, and factor CapBnd parsing into a pure helper that tests call directly, or gate the override behind an explicit test-only condition that cannot be set in real sandbox entrypoints.
    • Evidence: `enforce_cap_drop_if_required()` sets `local status_path="${NEMOCLAW_PROC_STATUS:-/proc/self/status}"` before deciding whether to exit. `report_residual_capabilities()` uses the same override. The tests intentionally use this environment variable as a fixture seam.
  • Capability verification tests do not isolate each dangerous bit or exercise a real runtime path (test/sandbox-init.test.ts:417): The unit tests now use a QA CapBnd fixture whose expected output includes all 10 dangerous capabilities, which improves the prior 5-cap gap. However, the fixture is dense: it contains many bits at once. A wrong bit number can still pass if the mistaken bit is also set in that fixture, and the behavior is not validated in a real sandbox/capsh process tree.
    • Recommendation: Add single-bit CapBnd fixtures for every entry in `DANGEROUS_CAPS` and assert strict mode refuses for each one independently. Also identify a targeted runtime/integration validation that exercises the real entrypoint/capsh/proc-status behavior without relying only on the `NEMOCLAW_PROC_STATUS` unit-test seam.
    • Evidence: `QA_CAPBND = "00000004a82c35fb"` is used for the strict refusal and decoder tests, and `QA_DANGEROUS` lists all 10 names. The test set covers unreadable/malformed status and clean/residual combined fixtures, but not individual one-bit mappings or a real runtime drop.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

Address review on #3280:
- enforce_cap_drop_if_required verifies the actual bounding set on every
  path instead of trusting the NEMOCLAW_CAPS_DROPPED sentinel, closing a
  strict-mode bypass; the refusal reason is derived from verified state.
- Extract dangerous_caps_in_capbnd() helper + NEMOCLAW_PROC_STATUS test
  seam; add sentinel-bypass and decode regression tests (35 pass).
- Quote setpriv --reuid/--regid/--bounding-set array elements (SC2191).
- Document NEMOCLAW_REQUIRE_CAP_DROP in best-practices and
  sandbox-hardening.

Refs #3280
Signed-off-by: Dongni Yang <dongniy@nvidia.com>
@github-actions

github-actions Bot commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
docs/security/best-practices.mdx (1)

297-297: ⚡ Quick win

Split Line 297 to one sentence per source line.

Line 297 currently packs multiple sentences into one line, which breaks the docs diff-readability rule.

Suggested edit
-To make the drop fail-closed instead of best-effort, set `NEMOCLAW_REQUIRE_CAP_DROP=1` in the entrypoint environment. The agent then refuses to start unless the agent process tree's bounding set is verified free of the dangerous capabilities, so it will not boot on hosts that cannot perform the drop (no `CAP_SETPCAP`, or `capsh` missing). This is opt-in because such hosts are common (many cloud VMs, Docker Desktop, WSL); leaving it unset preserves the best-effort default. The check covers the agent process tree only — a `nemoclaw connect` shell is spawned by the container runtime outside that tree and is not affected (tracked in [NVIDIA/OpenShell#1452](https://github.com/NVIDIA/OpenShell/issues/1452)).
+To make the drop fail-closed instead of best-effort, set `NEMOCLAW_REQUIRE_CAP_DROP=1` in the entrypoint environment.
+The agent then refuses to start unless the agent process tree's bounding set is verified free of the dangerous capabilities, so it does not boot on hosts that cannot perform the drop (no `CAP_SETPCAP`, or `capsh` missing).
+This is opt-in because such hosts are common (many cloud VMs, Docker Desktop, WSL); leaving it unset preserves the best-effort default.
+The check covers the agent process tree only — a `nemoclaw connect` shell is spawned by the container runtime outside that tree and is not affected (tracked in [NVIDIA/OpenShell#1452](https://github.com/NVIDIA/OpenShell/issues/1452)).

As per coding guidelines, "One sentence per line in source (makes diffs readable)."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/security/best-practices.mdx` at line 297, The paragraph containing the
explanation about NEMOCLAW_REQUIRE_CAP_DROP should be split so each sentence is
on its own source line to satisfy the "one sentence per line" docs rule; locate
the paragraph that mentions NEMOCLAW_REQUIRE_CAP_DROP, the agent refusing to
start unless the agent process tree's bounding set is verified free of dangerous
capabilities, the examples of hosts where this fails (cloud VMs, Docker Desktop,
WSL), and the note about the check covering only the agent process tree and the
nemoclaw connect shell (linked NVIDIA/OpenShell#1452), then break that single
long line into multiple lines so each independent sentence is its own line.
docs/deployment/sandbox-hardening.mdx (1)

42-43: ⚡ Quick win

Replace clause colons with sentence breaks.

Line 42 and Line 43 use colons as general punctuation between clauses; this page style reserves colons for introducing lists.

Suggested edit
-The bounding-set drop is best effort: if `capsh` or `CAP_SETPCAP` is unavailable the entrypoint logs a warning and continues with the runtime-provided capability set, and if `setpriv` is unavailable it falls back to `gosu`.
-To make the drop fail-closed instead, set `NEMOCLAW_REQUIRE_CAP_DROP=1` in the entrypoint environment: the agent then refuses to start unless the agent process tree's bounding set is verified free of the dangerous capabilities.
+The bounding-set drop is best effort.
+If `capsh` or `CAP_SETPCAP` is unavailable, the entrypoint logs a warning and continues with the runtime-provided capability set, and if `setpriv` is unavailable, it falls back to `gosu`.
+To make the drop fail-closed instead, set `NEMOCLAW_REQUIRE_CAP_DROP=1` in the entrypoint environment.
+The agent then refuses to start unless the agent process tree's bounding set is verified free of the dangerous capabilities.

As per coding guidelines, "Colons should only introduce a list. Flag colons used as general punctuation between clauses."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/deployment/sandbox-hardening.mdx` around lines 42 - 43, The two
sentences in the docs mention multiple clauses joined with colons; replace those
colons with proper sentence breaks so colons are only used to introduce lists
per style guidelines. Specifically, in the paragraph referencing capsh,
CAP_SETPCAP, setpriv, gosu and the environment variable
NEMOCLAW_REQUIRE_CAP_DROP, split the clauses into separate sentences (e.g., end
the first clause after "runtime-provided capability set." and start a new
sentence for the fallback to gosu; similarly end the first sentence about the
bounding-set drop and start a new sentence for how to make it fail-closed) so
the colons are removed and grammar reads as discrete sentences.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/sandbox-init.test.ts`:
- Around line 407-458: The test "warns but does NOT refuse to start when
CAP_SETPCAP is unavailable (issue `#3280`)" can inherit NEMOCLAW_REQUIRE_CAP_DROP
from the test runner because runWithLib merges process.env with the provided
opts.env; update the runWithLib invocation in this test to explicitly include
NEMOCLAW_REQUIRE_CAP_DROP: "" in the env override (same approach used by the
other strict-mode test) so the test runs in non-strict mode regardless of the
external environment.

---

Nitpick comments:
In `@docs/deployment/sandbox-hardening.mdx`:
- Around line 42-43: The two sentences in the docs mention multiple clauses
joined with colons; replace those colons with proper sentence breaks so colons
are only used to introduce lists per style guidelines. Specifically, in the
paragraph referencing capsh, CAP_SETPCAP, setpriv, gosu and the environment
variable NEMOCLAW_REQUIRE_CAP_DROP, split the clauses into separate sentences
(e.g., end the first clause after "runtime-provided capability set." and start a
new sentence for the fallback to gosu; similarly end the first sentence about
the bounding-set drop and start a new sentence for how to make it fail-closed)
so the colons are removed and grammar reads as discrete sentences.

In `@docs/security/best-practices.mdx`:
- Line 297: The paragraph containing the explanation about
NEMOCLAW_REQUIRE_CAP_DROP should be split so each sentence is on its own source
line to satisfy the "one sentence per line" docs rule; locate the paragraph that
mentions NEMOCLAW_REQUIRE_CAP_DROP, the agent refusing to start unless the agent
process tree's bounding set is verified free of dangerous capabilities, the
examples of hosts where this fails (cloud VMs, Docker Desktop, WSL), and the
note about the check covering only the agent process tree and the nemoclaw
connect shell (linked NVIDIA/OpenShell#1452), then break that single long line
into multiple lines so each independent sentence is its own line.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 2114b6a2-7880-4500-8f1f-9671f8d3da0f

📥 Commits

Reviewing files that changed from the base of the PR and between fbb1788 and 0ed7c51.

📒 Files selected for processing (4)
  • docs/deployment/sandbox-hardening.mdx
  • docs/security/best-practices.mdx
  • scripts/lib/sandbox-init.sh
  • test/sandbox-init.test.ts

Comment thread test/sandbox-init.test.ts
- test: also unset NEMOCLAW_REQUIRE_CAP_DROP in the warn-and-continue
  case so it cannot inherit a strict-mode value from the runner env.
- docs: wrap the NEMOCLAW_REQUIRE_CAP_DROP paragraphs one sentence per
  line in best-practices.mdx and sandbox-hardening.mdx (docs readability).

Refs #3280
Signed-off-by: Dongni Yang <dongniy@nvidia.com>
Adversarial review found the strict gate verified only 5 of the 10 caps
capsh drops, so a host retaining cap_setfcap/cap_mknod/cap_sys_chroot/
cap_fsetid/cap_audit_write passed NEMOCLAW_REQUIRE_CAP_DROP=1 as "clean".

- Single DANGEROUS_CAPS list feeds both the capsh --drop arguments and
  dangerous_caps_in_capbnd, so the drop-set and verify-set cannot drift.
- Reject a non-empty but malformed CapBnd (return nonzero) and treat it
  as unverifiable: refuse in strict mode, warn in default mode, instead
  of leaking a raw bash arithmetic error.
- Tests: expect all 10 caps; add strict-refuse and default-warn cases
  for an unparseable CapBnd.
- Docs: strict mode refuses based on the verified bounding set, not on
  whether the host can perform the drop.

Refs #3280
Signed-off-by: Dongni Yang <dongniy@nvidia.com>
@Dongni-Yang Dongni-Yang added the v0.0.58 Release target label Jun 3, 2026
@wscurran wscurran requested a review from cv June 3, 2026 16:48
@wscurran wscurran added area: sandbox OpenShell sandbox lifecycle, runtime, config, or recovery area: security Security controls, permissions, secrets, or hardening bug-fix PR fixes a bug or regression labels Jun 3, 2026
@wscurran

wscurran commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

✨ Thanks for submitting this detailed PR about adding an opt-in fail-closed guard to enforce cap-drop in the sandbox, which improves the security posture of the agent. This proposes a bug fix for the sandbox that affects the security model.


Related open PRs:


Related open issues:

@cv cv merged commit 5db9432 into main Jun 3, 2026
34 checks passed
@cv cv deleted the fix/3280-cap-drop-strict-mode branch June 3, 2026 18:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: sandbox OpenShell sandbox lifecycle, runtime, config, or recovery area: security Security controls, permissions, secrets, or hardening bug-fix PR fixes a bug or regression v0.0.58 Release target

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants