fix(sandbox): preserve mutable OpenClaw config perms after doctor --fix (#4538) by yimoj · Pull Request #4610 · NVIDIA/NemoClaw

yimoj · 2026-06-01T08:15:43Z

Summary

openclaw doctor --fix enforces a single-user 700/600 state layout, which silently breaks NemoClaw's mutable config contract: the gateway UID shares the sandbox group, so /sandbox/.openclaw must stay setgid + group-writable (2770) and openclaw.json group-writable (660), or gateway-side config writes (control-UI toggles, etc.) EACCES. This preserves that contract after NemoClaw's own rebuild repair step and adds a detect/repair path for sandboxes a user already tightened.

Related Issue

Fixes #4538

Changes

Add shields.inspectMutableConfigPerms / repairMutableConfigPerms — pure contract logic in src/lib/shields/mutable-config-perms.ts, sandbox-bound wrappers in src/lib/shields/index.ts. Repair reuses unlockAgentConfig and refuses to weaken an active shields-up lock or a corrupt-state sandbox.
rebuild re-normalizes mutable perms right after its post-upgrade openclaw doctor --fix step.
nemoclaw <name> doctor gains a Config permissions check that detects the 700/600 drift and warns; nemoclaw <name> doctor --fix repairs it without a rebuild. --fix is rejected together with --json so JSON automation can't trigger a silent permission mutation.
Document that a sandbox restart self-heals the drift (the startup normalizer already restores the contract), and document the supported gateway write probe (setpriv … --init-groups / gosu gateway, not su).
Tests: contract predicates, inspect/repair orchestration, the doctor check rendering, and a shell-level regression that a tree tightened to 700/600 is restored to 2770/660 on startup.

Type of Change

Code change with doc updates

Verification

npm test passes (all failures isolated to the host umask 027 and the unrelated flaky CLI-dispatch tests; the affected files pass in isolation with plugin deps installed)
Tests added or updated for new or changed behavior
No secrets, API keys, or credentials committed
Docs updated for user-facing behavior changes
Targeted suites green: src/lib/shields/, src/lib/actions/sandbox/doctor-config-perms.test.ts, test/repro-2681-group-writable.test.ts

Signed-off-by: Yimo Jiang yimoj@nvidia.com

Summary by CodeRabbit

New Features
- Added a --fix option to nemoclaw doctor to detect and (when requested) repair OpenClaw sandbox config permission drift; --fix is mutually exclusive with --json.
Bug Fixes
- Doctor and rebuild now detect overly restrictive config permissions, attempt post-restore repair/verification, and surface incomplete steps with remediation hints; sandbox restart re-applies the mutable contract.
Documentation
- Expanded command reference and troubleshooting with usage, warnings, non-rebuild repair flow, and probing guidance (prefer setpriv/gosu).
Tests
- Added unit and regression tests covering detection, repair, verification, unlock/symlink edge cases, and rebuild behavior.

…ix (NVIDIA#4538) OpenClaw's `openclaw doctor --fix` enforces a single-user 700/600 state layout, which silently tightens NemoClaw's mutable config contract: the gateway UID shares the `sandbox` group, so `/sandbox/.openclaw` must stay setgid + group-writable (2770) and `openclaw.json` group-writable (660), or gateway-side config writes (control-UI toggles, etc.) EACCES. - Add `shields.inspectMutableConfigPerms` / `repairMutableConfigPerms` (pure logic in `shields/mutable-config-perms.ts`, sandbox-bound wrappers in `shields/index.ts`). Repair reuses `unlockAgentConfig` and refuses to weaken an active shields-up lock. - Re-normalize mutable perms in `rebuild` right after its post-upgrade `openclaw doctor --fix` step. - Add a `Config permissions` check to `nemoclaw <name> doctor` that detects the 700/600 drift; `nemoclaw <name> doctor --fix` repairs it without a rebuild. `--fix` is rejected with `--json` so JSON automation cannot trigger a silent permission mutation. - Note in the startup normalizer and docs that a restart self-heals the drift, and document the supported gateway write probe (gosu / setpriv --init-groups, not su). Adds unit coverage for the contract predicates, inspect/repair orchestration, and the doctor check, plus a shell-level regression that a tree tightened to 700/600 is restored to 2770/660 on startup. Signed-off-by: Yimo Jiang <yimoj@nvidia.com>

github-actions · 2026-06-01T08:15:53Z

This repository limits contributors to 10 open pull requests. Please close or merge existing PRs before opening new ones.

coderabbitai · 2026-06-01T08:15:57Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: b5fbfc7b-c627-44cd-b588-df56d08981b3

📥 Commits

Reviewing files that changed from the base of the PR and between eb46838 and 534a58a.

📒 Files selected for processing (5)

docs/reference/commands-nemohermes.mdx
docs/reference/commands.mdx
docs/reference/troubleshooting.mdx
scripts/nemoclaw-start.sh
src/lib/cli/public-display-defaults.ts

✅ Files skipped from review due to trivial changes (4)

docs/reference/commands-nemohermes.mdx
src/lib/cli/public-display-defaults.ts
scripts/nemoclaw-start.sh
docs/reference/commands.mdx

🚧 Files skipped from review as they are similar to previous changes (1)

docs/reference/troubleshooting.mdx

📝 Walkthrough

Walkthrough

Adds a mutually-exclusive --fix flag to sandbox doctor, implements OpenClaw mutable-config permission inspection and repair, exposes shields wrappers, wires repair into rebuild/post-restore, adds unit and regression tests, and updates docs and script comments describing detection and normalization of tightened 700/600 → expected 2770/660 permissions.

Changes

Mutable Config Permission Detection and Repair

Layer / File(s)	Summary
Mutable config permission contract and predicates `src/lib/shields/mutable-config-perms.ts`	Defines required directory (setgid + group-writable `2770`) and file (`660`) mode constants, exported types for inspection/repair results, `parseStatModeOwner`, and strict dir/file predicates.
Shields: unlock helper & wrappers `src/lib/shields/index.ts`	Embeds a Python no-symlink-follow unlock script, implements `unlockConfigPathsNoSymlinkFollow`, replaces inline per-file unlock loop, and exposes `inspectMutableConfigPerms` / `repairMutableConfigPerms`.
Shields tests `src/lib/shields/mutable-config-perms.test.ts`	Adds predicate/parsing tests and scenario tests for inspection and repair behavior, posture gating, sensitive files, and error paths.
Doctor check factory `src/lib/actions/sandbox/doctor-config-perms.ts`	Adds `buildConfigPermsCheck` (injected `inspect`/`repair`) with probe-failure handling, `--fix` repair + re-inspect flow, and final-status composition.
Doctor unit tests `src/lib/actions/sandbox/doctor-config-perms.test.ts`	Comprehensive unit tests covering applies=false, probe errors, intact/tightened scenarios, repair success/failure, skipped repair, and re-inspection failure handling.
CLI flag and doctor orchestration `src/commands/sandbox/doctor.ts`, `src/lib/actions/sandbox/doctor.ts`, `src/lib/cli/public-display-defaults.ts`	Adds mutually-exclusive `--fix` flag (exclusive with `--json`), updates usage/examples, parses `--fix` in `runSandboxDoctor`, rejects `--fix + --json`, wires the config-perms check into the doctor checks, and advertises `--fix` in CLI metadata.
Rebuild post-upgrade permission repair `src/lib/actions/sandbox/rebuild.ts`	After post-restore doctor/seed steps, calls `shields.repairMutableConfigPerms` and logs skipped/verified/applied/incomplete outcomes; treats unverified permission repair as an incomplete rebuild and prints remediation hints.
Docs, scripts comments, and regression tests `docs/reference/commands.mdx`, `docs/reference/troubleshooting.mdx`, `docs/reference/commands-nemohermes.mdx`, `scripts/lib/sandbox-init.sh`, `scripts/nemoclaw-start.sh`, `test/*`	Documents `--fix` behavior, expected permission contract, restart/rebuild recovery notes, verification probe guidance (use `setpriv --init-groups`/`gosu`), expands comments about normalization self-healing, adjusts test harness docker fake, and adds regression tests verifying normalization restores setgid/group-writable modes and that unlock refuses symlinked targets.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

NVIDIA/NemoClaw#4155: Touches unlockAgentConfig/shields unlock behavior and overlaps at the same unlock/permission responsibility.
NVIDIA/NemoClaw#4707: Edits init_step_down_prefixes() region in sandbox-init scripts overlapping verification/step-down logic.

Suggested labels

fix, Sandbox

Suggested reviewers

ericksoa
cjagwani

"🐰 A tiny hop, a careful fix,
setgid returns, no permission tricks.
--fix hums softly, scripts stand guard,
symlinks refused, the checks on guard.
Configs restored — the rabbit's glad."

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 45.45% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and specifically describes the main fix: preserving mutable OpenClaw config permissions after running doctor --fix, directly addressing issue `#4538`.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

yimoj · 2026-06-02T02:35:35Z

@coderabbitai full review

coderabbitai · 2026-06-02T02:35:40Z

✅ Actions performed

Full review triggered.

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (1)

docs/reference/troubleshooting.mdx (1)

870-870: ⚡ Quick win

Replace colon with a period or em dash.

This line uses a colon to introduce an explanation rather than a list, which violates the formatting rule that colons should only introduce lists.

📝 Suggested fix

-Do not probe with `su -s /bin/sh gateway ...`: `su` does not initialize the gateway's supplementary groups the same way, so a group-write probe can spuriously report `EACCES` even when the mutable contract is intact.
+Do not probe with `su -s /bin/sh gateway ...`.
+`su` does not initialize the gateway's supplementary groups the same way, so a group-write probe can spuriously report `EACCES` even when the mutable contract is intact.

As per coding guidelines: "Colons should only introduce a list. Flag colons used as general punctuation between clauses."

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/reference/troubleshooting.mdx` at line 870, Replace the colon after the
clause "Do not probe with `su -s /bin/sh gateway ...`" with a period or em dash
so the sentence reads as a standalone statement (e.g., "Do not probe with `su -s
/bin/sh gateway ...`. `su` does not initialize..." or use an em dash), ensuring
the colon is not used as general punctuation in the troubleshooting note.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/actions/sandbox/doctor-config-perms.ts`:
- Around line 84-97: The failure detail string currently ignores after.reason
when inspect(sandboxName) throws and after.reason is set; update the
construction of the detail (the return block using before, after, repairResult)
to include after.reason as a fallback before falling back to "unknown" — e.g.,
when fixed is false and after.applies is true with no repairResult.errors,
append or use after.reason (from MutableConfigPermsInspection) so the message
becomes `repair incomplete: ${after.applies ? (after.issues.join("; ") ||
repairResult.errors.join("; ") || after.reason) : repairResult.errors.join("; ")
|| after.reason || "unknown"}` (adjust formatting to match code style),
referencing the inspect call, the after variable, and repairResult to locate the
change.
- Around line 33-38: The inspect(sandboxName) call currently swallows exceptions
and returns null; change the catch to capture the thrown error (e) and return a
diagnostic check object indicating a warning/failure with the error message so
the probe is surfaced instead of skipped—e.g., in the catch for
inspect(sandboxName) create and return a Check/Inspection result (matching the
surrounding return type used by doctor-config-perms) with status "warn" or
"fail" and include e.message (and e.stack if available) in the details; keep
using MutableConfigPermsInspection/inspect names to locate the code path to
modify.

In `@src/lib/actions/sandbox/rebuild.ts`:
- Around line 853-861: The current handling of
shields.repairMutableConfigPerms(sandboxName) logs skipped or incomplete repairs
but still allows the rebuild to succeed; change the control flow in the
surrounding rebuild logic (where permRepair is computed) so that if
permRepair.applied === false OR permRepair.verified === false you treat the
rebuild as failed/incomplete: surface an explicit error to the user (using
log/console.error) and return or set the overall rebuild result to an error
state instead of continuing to success. Update any result/exit code or the
function that returns the rebuild outcome so callers see a non-success status
when repairMutableConfigPerms does not verify successfully (refer to
permRepair.applied, permRepair.verified, and permRepair.errors and the
surrounding rebuild function that currently prints success).

In `@src/lib/shields/mutable-config-perms.ts`:
- Around line 61-77: The predicates are too permissive; instead of testing
selected bits they must validate the full normalized mode equals the exact
contract (dir: 02770 / octal 0o2770, file: 00660 / octal 0o0660). Normalize the
input with mode.padStart(4,'0') (or parse to an int with
Number.parseInt(mode.padStart(4,'0'),8)) and then compare against the exact
expected value in dirSatisfiesMutableContract and fileSatisfiesMutableContract
(use '2770'/'0660' or numeric 0o2770/0o0660) so only exact matches pass.

---

Nitpick comments:
In `@docs/reference/troubleshooting.mdx`:
- Line 870: Replace the colon after the clause "Do not probe with `su -s /bin/sh
gateway ...`" with a period or em dash so the sentence reads as a standalone
statement (e.g., "Do not probe with `su -s /bin/sh gateway ...`. `su` does not
initialize..." or use an em dash), ensuring the colon is not used as general
punctuation in the troubleshooting note.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: d85c2253-5f6a-4178-87f8-ce03299953ca

📥 Commits

Reviewing files that changed from the base of the PR and between ef1d4af and ef73178.

📒 Files selected for processing (14)

docs/reference/commands.mdx
docs/reference/troubleshooting.mdx
scripts/lib/sandbox-init.sh
scripts/nemoclaw-start.sh
src/commands/sandbox/doctor.ts
src/lib/actions/sandbox/doctor-config-perms.test.ts
src/lib/actions/sandbox/doctor-config-perms.ts
src/lib/actions/sandbox/doctor.ts
src/lib/actions/sandbox/rebuild.ts
src/lib/cli/public-display-defaults.ts
src/lib/shields/index.ts
src/lib/shields/mutable-config-perms.test.ts
src/lib/shields/mutable-config-perms.ts
test/repro-2681-group-writable.test.ts

…utable-openclaw-perms # Conflicts: # src/lib/shields/index.ts

…#4538) - Validate the exact 2770/660 contract in the mutable-config predicates so owner-tightened (2670) and world-widened (2777/666) modes are flagged, not just selected group bits (CodeRabbit). - doctor: surface an unexpected inspection-probe exception as a warn check instead of silently dropping the check; preserve the re-inspection failure reason in the repair-incomplete detail (CodeRabbit). - rebuild: run the mutable-permission repair LAST in the OpenClaw post-restore sequence — after the WeChat seed helper, which atomically rewrites openclaw.json 0600 and would otherwise undo the just-verified repair (Codex). - rebuild: when the final result is incomplete, report a failed state restore and an unverified permission repair independently so neither recovery hint is lost when both occur (Codex). Signed-off-by: Yimo Jiang <yimoj@nvidia.com>

yimoj · 2026-06-02T03:04:58Z

Pushed merge with upstream/main (resolved the shields/index.ts import conflict) plus fixes for all review feedback:

Exact contract validation — dir/fileSatisfiesMutableContract now require the full normalized 2770/660, so owner-tightened (2670) and world-widened (2777/666) modes are flagged, not just group bits.
doctor probe — an unexpected inspection exception now renders a warn check instead of silently disappearing; the re-inspection failure reason is preserved in the repair incomplete detail.
rebuild ordering — the mutable-permission repair now runs after the WeChat seed helper (which atomically rewrites openclaw.json 0600), so the verified repair is no longer undone.
rebuild summary — a failed state restore and an unverified permission repair are now reported independently, so neither recovery hint is lost when both occur.

@coderabbitai full review

coderabbitai · 2026-06-02T03:05:16Z

@yimoj Thanks for the thorough update! The fixes look well-considered — particularly tightening the contract predicates to exact 2770/660 (catching both under-permissioned and over-permissioned modes), surfacing inspection exceptions as warn rather than swallowing them silently, and fixing the rebuild ordering so the post-WeChat-seed repair isn't immediately undone. Triggering a full review now.

(´• ω •`) ♡

✅ Actions performed

Full review triggered.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/actions/sandbox/rebuild.ts`:
- Around line 886-899: Wrap the call to
shields.repairMutableConfigPerms(sandboxName) in a try/catch so unexpected
throws from validateName or getShieldsPosture (which may run
recoverExpiredAutoRestoreInline) do not abort the rebuild; on catch set
mutablePermsRepairUnverified = true and emit a warning/error log that includes
the caught error message/stack, then continue (do not rethrow). Keep existing
success/verified/verified=false handling for the normal permRepair return value
(repairMutableConfigPermsCore / unlockAgentConfig remain unchanged).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 6e1277ae-e963-4508-a222-d429394bc960

📥 Commits

Reviewing files that changed from the base of the PR and between ef73178 and fe70a38.

📒 Files selected for processing (10)

docs/reference/commands.mdx
docs/reference/troubleshooting.mdx
scripts/nemoclaw-start.sh
src/lib/actions/sandbox/doctor-config-perms.test.ts
src/lib/actions/sandbox/doctor-config-perms.ts
src/lib/actions/sandbox/rebuild.ts
src/lib/shields/index.ts
src/lib/shields/mutable-config-perms.test.ts
src/lib/shields/mutable-config-perms.ts
test/repro-2681-group-writable.test.ts

💤 Files with no reviewable changes (1)

test/repro-2681-group-writable.test.ts

✅ Files skipped from review due to trivial changes (2)

scripts/nemoclaw-start.sh
docs/reference/commands.mdx

🚧 Files skipped from review as they are similar to previous changes (4)

src/lib/shields/index.ts
docs/reference/troubleshooting.mdx
src/lib/shields/mutable-config-perms.ts
src/lib/shields/mutable-config-perms.test.ts

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/reference/commands.mdx`:
- Around line 484-491: Update the doctor command synopsis block and flag table
so the example is copyable and accurately shows the flags are mutually
exclusive: replace the console code fence (with `$` prompt) with a bash code
fence containing "nemoclaw my-assistant doctor [--json | --fix]" and update the
`--fix` flag description to append "Mutually exclusive with `--json`"; ensure
this change targets the synopsis and the table entries for the doctor command
(the doctor command example and the `--json`/`--fix` flag rows).

In `@docs/reference/troubleshooting.mdx`:
- Around line 855-868: Change the two "console" fenced blocks to copyable shell
blocks by switching the fences from ```console to ```bash (or ```sh) and remove
the leading "$" prompt characters; specifically update the "nemoclaw <sandbox>
doctor --fix" block and the verification block that uses setpriv/gosu so the
commands are plain, copyable lines (e.g., use "nemoclaw <sandbox> doctor --fix",
"setpriv --reuid=... -- sh -c 'echo ok >> /sandbox/.openclaw/openclaw.json'",
and the gosu variant) and ensure the file path /sandbox/.openclaw/openclaw.json
and the utilities setpriv/gosu are preserved as shown.

In `@src/lib/actions/sandbox/rebuild.ts`:
- Around line 888-891: The current branch treats every permRepair.applied ===
false as a benign skip; update the handling in the rebuild flow so that after
calling repairMutableConfigPerms() you check permRepair.reason and only
log-and-continue for the explicit "shields-up" (intentional locked) reason,
while any other non-applied reasons should mark the rebuild as incomplete (e.g.,
call the existing downgrade/fail path or set the rebuild status to failure).
Concretely, in the block that inspects permRepair from
repairMutableConfigPerms(), replace the unconditional no-op/log with a
conditional that compares permRepair.reason to the explicit shields-up sentinel
used by repairMutableConfigPerms(), logging and continuing only for that
sentinel and invoking the downgrade/error path for all other reasons so the
mutable OpenClaw contract restoration is treated as incomplete.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: ec644f24-c385-45ac-b652-41c5055e0eaf

📥 Commits

Reviewing files that changed from the base of the PR and between 96946a1 and fe70a38.

📒 Files selected for processing (14)

docs/reference/commands.mdx
docs/reference/troubleshooting.mdx
scripts/lib/sandbox-init.sh
scripts/nemoclaw-start.sh
src/commands/sandbox/doctor.ts
src/lib/actions/sandbox/doctor-config-perms.test.ts
src/lib/actions/sandbox/doctor-config-perms.ts
src/lib/actions/sandbox/doctor.ts
src/lib/actions/sandbox/rebuild.ts
src/lib/cli/public-display-defaults.ts
src/lib/shields/index.ts
src/lib/shields/mutable-config-perms.test.ts
src/lib/shields/mutable-config-perms.ts
test/repro-2681-group-writable.test.ts

- rebuild: guard `shields.repairMutableConfigPerms()` in try/catch so an unexpected throw from the wrapper (validateName / getShieldsPosture inline auto-restore) cannot abort the rebuild — treat it as unverified and continue. - rebuild: distinguish skip reasons — a "locked" (shields-up) skip stays benign, but an "unreadable" shields-state skip means the contract may still be broken, so it now downgrades the rebuild result. Add a `skipReason` discriminant to MutableConfigRepairResult to avoid brittle reason-string matching. - docs: make the `doctor` synopsis copyable (`bash` fence, no `$`) and show `--json | --fix` are mutually exclusive; switch the troubleshooting repair and gateway-probe blocks to copyable `bash` fences. Signed-off-by: Yimo Jiang <yimoj@nvidia.com>

yimoj · 2026-06-02T04:09:33Z

Addressed all four review threads in commit 07c1f54ab (replied inline and resolved each):

rebuild — guard against wrapper throws: shields.repairMutableConfigPerms() is now wrapped in try/catch; an unexpected throw is treated as unverified and the rebuild continues.
rebuild — skip-reason handling: added a skipReason discriminant; only the intentional shields-up (locked) skip is benign, while an unreadable corrupt-shields-state skip now downgrades the rebuild.
docs (commands): doctor synopsis is a copyable bash block showing [--json | --fix] are mutually exclusive.
docs (troubleshooting): repair and setpriv/gosu probe blocks switched to copyable bash fences.

@coderabbitai review

coderabbitai · 2026-06-02T04:09:49Z

@yimoj Thanks for the follow-up — all four changes look solid:

Rebuild guard: wrapping shields.repairMutableConfigPerms() in try/catch and treating unexpected throws as unverified keeps the rebuild from hard-failing on a permission-repair edge case, which is the right failure mode here.
skipReason discriminant: distinguishing the intentional locked skip from the unreadable corrupt-state skip is an important correctness improvement — silently treating a corrupt shields state as benign would have hidden real problems.
Docs copyable blocks: switching both the doctor synopsis and the setpriv/gosu probe to fenced bash blocks makes them immediately copy-pasteable, which is a nice UX improvement for an operations-focused doc.

Triggering the review now.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

prekshivyas · 2026-06-02T15:51:12Z

@yimoj thanks for the upstream merge + review fixes here. Heads-up: it's still showing CONFLICTING against main as of the latest head — looks like a fresh conflict landed after your merge (main's been moving quickly during the v0.0.57 push). Mind re-syncing once more? Happy to review as soon as it's clean. Thanks!

…utable-openclaw-perms # Conflicts: # docs/reference/commands.mdx # src/lib/actions/sandbox/doctor.ts

yimoj · 2026-06-03T04:25:39Z

@prekshivyas resolved.

wscurran · 2026-06-03T18:12:25Z

✨
Related open issues:

#4538 [All Platforms][Sandbox] NemoClaw mutable sandbox breaks group-writable openclaw.json contract after openclaw doctor --fix (gateway cannot persist config)

…utable-openclaw-perms # Conflicts: # src/commands/sandbox/doctor.ts

@cv

…4628 review) Address @cv's PR NVIDIA#4628 review feedback: the 700/600 heal was too generic. If a future caller routes a mutable-sandbox OpenClaw config dir (which uses the 2770/660 contract per NVIDIA#4538 / PR NVIDIA#4610) through `ensureConfigDir`, the walker would silently normalize it to 700/600 — exactly the NVIDIA#4538 EACCES regression. Add three predicates: - `hostNemoclawDir()` — canonical `${HOME}/.nemoclaw` - `isHostNemoclawRoot(dirPath)` — gate for the sibling sweep - `isMutableSandboxConfigPath(targetPath)` — gate for the file-level heal Wire them through: - `ensureConfigDir`: skip dir-mode chmod when path is mutable-sandbox. Run `healRootLevelFiles` only when path IS the host nemoclaw root. - `readConfigFile`: skip per-file chmod when path is mutable-sandbox. Tests: - 2 new scope-boundary tests (negative: arbitrary dir, no heal; positive: host nemoclaw root, heal fires). - Existing heal tests (multi-sibling + symlink positive control) updated to use the new `withHome` helper so they place files under `<HOME>/.nemoclaw` instead of an arbitrary tmp dir — preserves the NVIDIA#4546 acceptance while respecting the new scope boundary. 18/18 tests pass. Co-authored-by: kagura-agent <kagura.agent.ai@gmail.com> Signed-off-by: Charan Jagwani <cjagwani@nvidia.com>

Signed-off-by: Yimo Jiang <yimoj@nvidia.com>

yimoj · 2026-06-04T04:26:07Z

@coderabbitai resume

coderabbitai · 2026-06-04T04:26:14Z

✅ Action performed

Reviews resumed.

Signed-off-by: Yimo Jiang <yimoj@nvidia.com>

NVIDIA#4538) Two acceptance gaps from NVIDIA#4538 not closed by the original PR: - Troubleshooting: contrast the three relevant perm states (mutable default 2770/660, shields-up locked 444/755 root, the 700/600 drift) so issue ask NVIDIA#3, "docs cover both mutable and locked-down", is actually answered. Prior copy only documented mutable. - mutable-config-perms: explain why NemoClaw uses 2770/660 vs the issue's expected 2775/664 (gateway shares the sandbox group, so the "other" bit is intentionally dropped). The predicate test already rejects 664; the rationale belonged in code. Signed-off-by: Charan Jagwani <cjagwani@nvidia.com>

…VIDIA#4538) Third acceptance gap from NVIDIA#4538 audit: inspect/repair asymmetry. `unlockAgentConfig` (the repair callback) re-permissions both `openclaw.json` AND the sensitive-file set (`.config-hash`, etc.). `inspectMutableConfigPerms` only stat'd dir + main config, so the user-facing `nemoclaw <name> doctor` reported "Config permissions OK" even when `.config-hash` had drifted to `644 root:root`. Running `--fix` would then silently mutate a file the check never named. - Add optional `sensitiveFiles` to `MutableConfigTarget`. - Iterate sensitive files in inspect, applying the same file contract; tolerate missing files (e.g. `.config-hash` before the first shields-up cycle). - `resolveAgentConfig` already populates `sensitiveFiles`, so the wrapper in `shields/index.ts` needs no change — structural typing carries the field through. - Two new tests: drift detection + missing-file tolerance. Signed-off-by: Charan Jagwani <cjagwani@nvidia.com>

cjagwani

Approving. Pushed two follow-ups to close the audit gaps: a docs paragraph on the three perm states (mutable 2770/660, shields-up 444/755 root, the 700/600 drift) plus a comment on why 660 not 664, and a fix so inspect actually catches .config-hash drift since --fix was silently mutating it. Two new tests cover the inspect side.

Verified on a brev T4 box: 2770/660 fresh, drifts to 700/600 after openclaw doctor --fix, comes back to 2770/660 after nemoclaw doctor --fix.

## Summary - Adds the `v0.0.60` section to `docs/about/release-notes.mdx` using the dev announcement from discussion #4877. - Fills the source-doc gaps found during release-prep review across inference, policy tiers, command behavior, security boundaries, Hermes dashboard/tooling, runtime context, and troubleshooting. - Refreshes generated agent skills under `.agents/skills/` from the current Fern docs output and upgrades Fern from `5.44.3` to `5.45.0`. ## Source summary - #4037 -> `docs/reference/architecture.mdx`, `docs/about/how-it-works.mdx`, `docs/about/release-notes.mdx`: Documents system-only runtime context that stays out of visible chat. - #4875 -> `docs/reference/architecture.mdx`, `docs/about/how-it-works.mdx`, `docs/about/release-notes.mdx`: Documents try-first sandbox network/filesystem guidance and clearer failure classification. - #4788 -> `docs/security/best-practices.mdx`, `docs/about/release-notes.mdx`: Documents shared OpenClaw device-approval policy for startup and connect. - #4768 -> `docs/reference/network-policies.mdx`, `docs/network-policy/integration-policy-examples.mdx`, `docs/get-started/quickstart.mdx`, `docs/get-started/quickstart-hermes.mdx`, `docs/reference/commands.mdx`: Documents `weather`, `public-reference`, and Hermes managed-tool gateway preset behavior. - #3788 and #4864 -> `docs/reference/network-policies.mdx`, `docs/reference/commands.mdx`: Documents non-interactive policy-tier fail-fast behavior and interactive prompt fallback. - #4756 and #4866 -> `docs/reference/commands.mdx`: Documents env-aware default sandbox resolution for `list`, `status`, and `tunnel` commands. - #4320 -> `docs/reference/commands.mdx`: Documents `$$nemoclaw tunnel status` behavior. - #4328 -> `docs/reference/commands.mdx`: Documents line-scoped policy preset descriptions in `policy-list`. - #4580 and #4748 -> `docs/reference/architecture.mdx`: Documents package-managed OpenShell gateway service and Docker-driver gateway-marker behavior. - #4598 -> `docs/manage-sandboxes/lifecycle.mdx`: Documents concurrent gateway/dashboard cleanup isolation by sandbox name and port. - #4777 -> `docs/reference/troubleshooting.mdx`: Documents Docker GPU patch rollback behavior. - #4610 -> `docs/reference/troubleshooting.mdx`, `docs/reference/commands.mdx`: Keeps mutable OpenClaw config permission guidance aligned and removes skipped experimental wording. - #4868 -> `docs/reference/commands.mdx`: Keeps `.dockerignore` handling for custom `onboard --from <Dockerfile>` contexts in generated skills. - #4870 -> `docs/reference/commands.mdx`, `docs/manage-sandboxes/runtime-controls.mdx`: Documents `NEMOCLAW_MINIMAL_BOOTSTRAP` and generated skill coverage. - #4641 -> `docs/inference/inference-options.mdx`, `docs/reference/troubleshooting.mdx`: Documents local NVIDIA NIM platform-digest pulls and served-model id adoption. - #4810 and #4867 -> `docs/inference/inference-options.mdx`: Documents stable NGC managed-vLLM image lineage and DGX Station DeepSeek V4 Flash coverage. - #4852 -> `docs/inference/use-local-inference.mdx`, `docs/reference/troubleshooting.mdx`: Documents Ollama model fit filtering, 16K context floor, cold-load retry, and failed-model exclusion. - #4847 -> `docs/inference/switch-inference-providers.mdx`: Documents API-family sync, Hermes `api_mode`, and Bedrock Runtime exception. - #4800 -> `docs/inference/tool-calling-reliability.mdx`: Documents Nemotron managed-inference native tool-search fallback. - #4333 -> `docs/inference/switch-inference-providers.mdx`: Documents interactive multimodal input prompting. - #4086 -> `docs/reference/troubleshooting.mdx`: Keeps proxy bypass normalization in generated troubleshooting coverage. - #4811 and #4855 -> `docs/get-started/quickstart-hermes.mdx`: Documents prebuilt Hermes dashboard assets and TUI recovery without runtime rebuilds. - #4854 -> `docs/inference/switch-inference-providers.mdx`, `docs/reference/commands.mdx`: Documents Hermes proxy API-key placeholder preservation during inference switches. - #4248 -> `docs/manage-sandboxes/messaging-channels.mdx`, `.agents/skills/`: Keeps messaging enrollment behavior aligned with manifest-hook implementation. - #4771 -> `docs/security/best-practices.mdx`, `docs/security/credential-storage.mdx`: Documents Hermes placeholder-only secret boundary for sandbox-visible runtime files. - #4787 -> `docs/security/best-practices.mdx`, `docs/about/release-notes.mdx`: Documents expanded memory scanner examples for OpenAI project keys and Slack app-level tokens. - #4848 -> `docs/reference/commands.mdx`: Documents OpenClaw skill install mirroring into the agent home directory. - #4790 -> `docs/about/release-notes.mdx`: Uses the prior release-prep structure and generated `.agents/skills/` refresh as the template for this release. ## Verification - `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix nemoclaw-user --doc-platform fern-mdx` - `python3 scripts/docs-to-skills.py docs/ .agents/skills/ skills/ --prefix nemoclaw-user --doc-platform fern-mdx --dry-run` - `npm run docs` - `git diff --check` - skip-term scan across `docs/`, `.agents/skills/`, and `skills/` - `npm run build:cli` - `npm run typecheck:cli` - Commit and pre-push hook suites, including markdownlint, gitleaks, env-var docs gate, docs-to-skills verification, and skills YAML tests  ## Summary by CodeRabbit ## Release Notes * **New Features** * DeepSeek-V4-Flash now available as default inference model for DGX Station. * Hermes dashboard improved with dedicated port and OAuth-authenticated tool gateway selection. * Added weather and public-reference policy presets for expanded agent capabilities. * Enhanced Ollama model selection with GPU memory filtering and automatic retry for timeouts. * **Bug Fixes** * Improved policy tier validation to prevent invalid configurations. * Better sandbox cleanup scoping by port to prevent conflicts across deployments. * Added GPU patch failure recovery with automatic rollback. * **Documentation** * Expanded troubleshooting guides for inference, security, and sandbox lifecycle. * Added .dockerignore best practices for custom deployments.  --------- Co-authored-by: Carlos Villela <cvillela@nvidia.com>

github-actions Bot closed this Jun 1, 2026

yimoj reopened this Jun 2, 2026

coderabbitai Bot reviewed Jun 2, 2026

View reviewed changes

Comment thread src/lib/actions/sandbox/doctor-config-perms.ts

Comment thread src/lib/actions/sandbox/doctor-config-perms.ts Outdated

Comment thread src/lib/actions/sandbox/rebuild.ts Outdated

Comment thread src/lib/shields/mutable-config-perms.ts

yimoj added 2 commits June 2, 2026 02:50

Merge remote-tracking branch 'upstream/main' into fix/4538-preserve-m…

b4140af

…utable-openclaw-perms # Conflicts: # src/lib/shields/index.ts

coderabbitai Bot reviewed Jun 2, 2026

View reviewed changes

Comment thread src/lib/actions/sandbox/rebuild.ts

coderabbitai Bot reviewed Jun 2, 2026

View reviewed changes

Comment thread docs/reference/commands.mdx Outdated

Comment thread docs/reference/troubleshooting.mdx Outdated

Comment thread src/lib/actions/sandbox/rebuild.ts Outdated

yimoj added the v0.0.57 Release target label Jun 2, 2026

cv assigned cjagwani Jun 2, 2026

cv added v0.0.58 Release target and removed v0.0.57 Release target labels Jun 3, 2026

Merge remote-tracking branch 'upstream/main' into fix/4538-preserve-m…

570ec7e

…utable-openclaw-perms # Conflicts: # docs/reference/commands.mdx # src/lib/actions/sandbox/doctor.ts

wscurran added area: sandbox OpenShell sandbox lifecycle, runtime, config, or recovery area: security Security controls, permissions, secrets, or hardening bug-fix PR fixes a bug or regression labels Jun 3, 2026

Merge remote-tracking branch 'upstream/main' into fix/4538-preserve-m…

578fa85

…utable-openclaw-perms # Conflicts: # src/commands/sandbox/doctor.ts

Merge branch 'main' into fix/4538-preserve-mutable-openclaw-perms

50bfc7f

cv added v0.0.59 Release target and removed v0.0.58 Release target labels Jun 4, 2026

Merge upstream/main into fix/4538-preserve-mutable-openclaw-perms

6385501

Signed-off-by: Yimo Jiang <yimoj@nvidia.com>

yimoj and others added 4 commits June 4, 2026 05:07

chore: remove generated Claude lock file

865bab3

Signed-off-by: Yimo Jiang <yimoj@nvidia.com>

Merge branch 'main' into fix/4538-preserve-mutable-openclaw-perms

eb46838

cjagwani requested a review from cv June 4, 2026 20:51

Merge branch 'main' into fix/4538-preserve-mutable-openclaw-perms

4dc9156

cv added v0.0.60 Release target and removed v0.0.59 Release target labels Jun 4, 2026

Merge branch 'main' into fix/4538-preserve-mutable-openclaw-perms

534a58a

cjagwani approved these changes Jun 5, 2026

View reviewed changes

cv merged commit 9eb020e into NVIDIA:main Jun 5, 2026
29 checks passed

miyoungc mentioned this pull request Jun 6, 2026

docs: refresh v0.0.60 release notes #4879

Merged

coderabbitai Bot mentioned this pull request Jun 8, 2026

refactor(messaging): move build setup to manifest hooks #4949

Draft

12 tasks

Conversation

yimoj commented Jun 1, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Related Issue

Changes

Type of Change

Verification

Summary by CodeRabbit

Uh oh!

github-actions Bot commented Jun 1, 2026

Uh oh!

coderabbitai Bot commented Jun 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

❌ Failed checks (1 warning)

Uh oh!

yimoj commented Jun 2, 2026

Uh oh!

coderabbitai Bot commented Jun 2, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

yimoj commented Jun 2, 2026

Uh oh!

coderabbitai Bot commented Jun 2, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

yimoj commented Jun 2, 2026

Uh oh!

coderabbitai Bot commented Jun 2, 2026

Uh oh!

prekshivyas commented Jun 2, 2026

Uh oh!

yimoj commented Jun 3, 2026

Uh oh!

wscurran commented Jun 3, 2026

Uh oh!

yimoj commented Jun 4, 2026

Uh oh!

coderabbitai Bot commented Jun 4, 2026

Uh oh!

cjagwani left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

yimoj commented Jun 1, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 1, 2026 •

edited

Loading