fix(sandbox): preserve mutable OpenClaw config perms after doctor --fix (#4538)#4610
Conversation
…ix (NVIDIA#4538) OpenClaw's `openclaw doctor --fix` enforces a single-user 700/600 state layout, which silently tightens NemoClaw's mutable config contract: the gateway UID shares the `sandbox` group, so `/sandbox/.openclaw` must stay setgid + group-writable (2770) and `openclaw.json` group-writable (660), or gateway-side config writes (control-UI toggles, etc.) EACCES. - Add `shields.inspectMutableConfigPerms` / `repairMutableConfigPerms` (pure logic in `shields/mutable-config-perms.ts`, sandbox-bound wrappers in `shields/index.ts`). Repair reuses `unlockAgentConfig` and refuses to weaken an active shields-up lock. - Re-normalize mutable perms in `rebuild` right after its post-upgrade `openclaw doctor --fix` step. - Add a `Config permissions` check to `nemoclaw <name> doctor` that detects the 700/600 drift; `nemoclaw <name> doctor --fix` repairs it without a rebuild. `--fix` is rejected with `--json` so JSON automation cannot trigger a silent permission mutation. - Note in the startup normalizer and docs that a restart self-heals the drift, and document the supported gateway write probe (gosu / setpriv --init-groups, not su). Adds unit coverage for the contract predicates, inspect/repair orchestration, and the doctor check, plus a shell-level regression that a tree tightened to 700/600 is restored to 2770/660 on startup. Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
|
This repository limits contributors to 10 open pull requests. Please close or merge existing PRs before opening new ones. |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (5)
✅ Files skipped from review due to trivial changes (4)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughAdds a mutually-exclusive ChangesMutable Config Permission Detection and Repair
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (1)
docs/reference/troubleshooting.mdx (1)
870-870: ⚡ Quick winReplace colon with a period or em dash.
This line uses a colon to introduce an explanation rather than a list, which violates the formatting rule that colons should only introduce lists.
📝 Suggested fix
-Do not probe with `su -s /bin/sh gateway ...`: `su` does not initialize the gateway's supplementary groups the same way, so a group-write probe can spuriously report `EACCES` even when the mutable contract is intact. +Do not probe with `su -s /bin/sh gateway ...`. +`su` does not initialize the gateway's supplementary groups the same way, so a group-write probe can spuriously report `EACCES` even when the mutable contract is intact.As per coding guidelines: "Colons should only introduce a list. Flag colons used as general punctuation between clauses."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/reference/troubleshooting.mdx` at line 870, Replace the colon after the clause "Do not probe with `su -s /bin/sh gateway ...`" with a period or em dash so the sentence reads as a standalone statement (e.g., "Do not probe with `su -s /bin/sh gateway ...`. `su` does not initialize..." or use an em dash), ensuring the colon is not used as general punctuation in the troubleshooting note.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/lib/actions/sandbox/doctor-config-perms.ts`:
- Around line 84-97: The failure detail string currently ignores after.reason
when inspect(sandboxName) throws and after.reason is set; update the
construction of the detail (the return block using before, after, repairResult)
to include after.reason as a fallback before falling back to "unknown" — e.g.,
when fixed is false and after.applies is true with no repairResult.errors,
append or use after.reason (from MutableConfigPermsInspection) so the message
becomes `repair incomplete: ${after.applies ? (after.issues.join("; ") ||
repairResult.errors.join("; ") || after.reason) : repairResult.errors.join("; ")
|| after.reason || "unknown"}` (adjust formatting to match code style),
referencing the inspect call, the after variable, and repairResult to locate the
change.
- Around line 33-38: The inspect(sandboxName) call currently swallows exceptions
and returns null; change the catch to capture the thrown error (e) and return a
diagnostic check object indicating a warning/failure with the error message so
the probe is surfaced instead of skipped—e.g., in the catch for
inspect(sandboxName) create and return a Check/Inspection result (matching the
surrounding return type used by doctor-config-perms) with status "warn" or
"fail" and include e.message (and e.stack if available) in the details; keep
using MutableConfigPermsInspection/inspect names to locate the code path to
modify.
In `@src/lib/actions/sandbox/rebuild.ts`:
- Around line 853-861: The current handling of
shields.repairMutableConfigPerms(sandboxName) logs skipped or incomplete repairs
but still allows the rebuild to succeed; change the control flow in the
surrounding rebuild logic (where permRepair is computed) so that if
permRepair.applied === false OR permRepair.verified === false you treat the
rebuild as failed/incomplete: surface an explicit error to the user (using
log/console.error) and return or set the overall rebuild result to an error
state instead of continuing to success. Update any result/exit code or the
function that returns the rebuild outcome so callers see a non-success status
when repairMutableConfigPerms does not verify successfully (refer to
permRepair.applied, permRepair.verified, and permRepair.errors and the
surrounding rebuild function that currently prints success).
In `@src/lib/shields/mutable-config-perms.ts`:
- Around line 61-77: The predicates are too permissive; instead of testing
selected bits they must validate the full normalized mode equals the exact
contract (dir: 02770 / octal 0o2770, file: 00660 / octal 0o0660). Normalize the
input with mode.padStart(4,'0') (or parse to an int with
Number.parseInt(mode.padStart(4,'0'),8)) and then compare against the exact
expected value in dirSatisfiesMutableContract and fileSatisfiesMutableContract
(use '2770'/'0660' or numeric 0o2770/0o0660) so only exact matches pass.
---
Nitpick comments:
In `@docs/reference/troubleshooting.mdx`:
- Line 870: Replace the colon after the clause "Do not probe with `su -s /bin/sh
gateway ...`" with a period or em dash so the sentence reads as a standalone
statement (e.g., "Do not probe with `su -s /bin/sh gateway ...`. `su` does not
initialize..." or use an em dash), ensuring the colon is not used as general
punctuation in the troubleshooting note.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: d85c2253-5f6a-4178-87f8-ce03299953ca
📒 Files selected for processing (14)
docs/reference/commands.mdxdocs/reference/troubleshooting.mdxscripts/lib/sandbox-init.shscripts/nemoclaw-start.shsrc/commands/sandbox/doctor.tssrc/lib/actions/sandbox/doctor-config-perms.test.tssrc/lib/actions/sandbox/doctor-config-perms.tssrc/lib/actions/sandbox/doctor.tssrc/lib/actions/sandbox/rebuild.tssrc/lib/cli/public-display-defaults.tssrc/lib/shields/index.tssrc/lib/shields/mutable-config-perms.test.tssrc/lib/shields/mutable-config-perms.tstest/repro-2681-group-writable.test.ts
…utable-openclaw-perms # Conflicts: # src/lib/shields/index.ts
…#4538) - Validate the exact 2770/660 contract in the mutable-config predicates so owner-tightened (2670) and world-widened (2777/666) modes are flagged, not just selected group bits (CodeRabbit). - doctor: surface an unexpected inspection-probe exception as a warn check instead of silently dropping the check; preserve the re-inspection failure reason in the repair-incomplete detail (CodeRabbit). - rebuild: run the mutable-permission repair LAST in the OpenClaw post-restore sequence — after the WeChat seed helper, which atomically rewrites openclaw.json 0600 and would otherwise undo the just-verified repair (Codex). - rebuild: when the final result is incomplete, report a failed state restore and an unverified permission repair independently so neither recovery hint is lost when both occur (Codex). Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
|
Pushed merge with upstream/main (resolved the
@coderabbitai full review |
|
(´• ω •`) ♡ ✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/lib/actions/sandbox/rebuild.ts`:
- Around line 886-899: Wrap the call to
shields.repairMutableConfigPerms(sandboxName) in a try/catch so unexpected
throws from validateName or getShieldsPosture (which may run
recoverExpiredAutoRestoreInline) do not abort the rebuild; on catch set
mutablePermsRepairUnverified = true and emit a warning/error log that includes
the caught error message/stack, then continue (do not rethrow). Keep existing
success/verified/verified=false handling for the normal permRepair return value
(repairMutableConfigPermsCore / unlockAgentConfig remain unchanged).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 6e1277ae-e963-4508-a222-d429394bc960
📒 Files selected for processing (10)
docs/reference/commands.mdxdocs/reference/troubleshooting.mdxscripts/nemoclaw-start.shsrc/lib/actions/sandbox/doctor-config-perms.test.tssrc/lib/actions/sandbox/doctor-config-perms.tssrc/lib/actions/sandbox/rebuild.tssrc/lib/shields/index.tssrc/lib/shields/mutable-config-perms.test.tssrc/lib/shields/mutable-config-perms.tstest/repro-2681-group-writable.test.ts
💤 Files with no reviewable changes (1)
- test/repro-2681-group-writable.test.ts
✅ Files skipped from review due to trivial changes (2)
- scripts/nemoclaw-start.sh
- docs/reference/commands.mdx
🚧 Files skipped from review as they are similar to previous changes (4)
- src/lib/shields/index.ts
- docs/reference/troubleshooting.mdx
- src/lib/shields/mutable-config-perms.ts
- src/lib/shields/mutable-config-perms.test.ts
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/reference/commands.mdx`:
- Around line 484-491: Update the doctor command synopsis block and flag table
so the example is copyable and accurately shows the flags are mutually
exclusive: replace the console code fence (with `$` prompt) with a bash code
fence containing "nemoclaw my-assistant doctor [--json | --fix]" and update the
`--fix` flag description to append "Mutually exclusive with `--json`"; ensure
this change targets the synopsis and the table entries for the doctor command
(the doctor command example and the `--json`/`--fix` flag rows).
In `@docs/reference/troubleshooting.mdx`:
- Around line 855-868: Change the two "console" fenced blocks to copyable shell
blocks by switching the fences from ```console to ```bash (or ```sh) and remove
the leading "$" prompt characters; specifically update the "nemoclaw <sandbox>
doctor --fix" block and the verification block that uses setpriv/gosu so the
commands are plain, copyable lines (e.g., use "nemoclaw <sandbox> doctor --fix",
"setpriv --reuid=... -- sh -c 'echo ok >> /sandbox/.openclaw/openclaw.json'",
and the gosu variant) and ensure the file path /sandbox/.openclaw/openclaw.json
and the utilities setpriv/gosu are preserved as shown.
In `@src/lib/actions/sandbox/rebuild.ts`:
- Around line 888-891: The current branch treats every permRepair.applied ===
false as a benign skip; update the handling in the rebuild flow so that after
calling repairMutableConfigPerms() you check permRepair.reason and only
log-and-continue for the explicit "shields-up" (intentional locked) reason,
while any other non-applied reasons should mark the rebuild as incomplete (e.g.,
call the existing downgrade/fail path or set the rebuild status to failure).
Concretely, in the block that inspects permRepair from
repairMutableConfigPerms(), replace the unconditional no-op/log with a
conditional that compares permRepair.reason to the explicit shields-up sentinel
used by repairMutableConfigPerms(), logging and continuing only for that
sentinel and invoking the downgrade/error path for all other reasons so the
mutable OpenClaw contract restoration is treated as incomplete.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: ec644f24-c385-45ac-b652-41c5055e0eaf
📒 Files selected for processing (14)
docs/reference/commands.mdxdocs/reference/troubleshooting.mdxscripts/lib/sandbox-init.shscripts/nemoclaw-start.shsrc/commands/sandbox/doctor.tssrc/lib/actions/sandbox/doctor-config-perms.test.tssrc/lib/actions/sandbox/doctor-config-perms.tssrc/lib/actions/sandbox/doctor.tssrc/lib/actions/sandbox/rebuild.tssrc/lib/cli/public-display-defaults.tssrc/lib/shields/index.tssrc/lib/shields/mutable-config-perms.test.tssrc/lib/shields/mutable-config-perms.tstest/repro-2681-group-writable.test.ts
- rebuild: guard `shields.repairMutableConfigPerms()` in try/catch so an unexpected throw from the wrapper (validateName / getShieldsPosture inline auto-restore) cannot abort the rebuild — treat it as unverified and continue. - rebuild: distinguish skip reasons — a "locked" (shields-up) skip stays benign, but an "unreadable" shields-state skip means the contract may still be broken, so it now downgrades the rebuild result. Add a `skipReason` discriminant to MutableConfigRepairResult to avoid brittle reason-string matching. - docs: make the `doctor` synopsis copyable (`bash` fence, no `$`) and show `--json | --fix` are mutually exclusive; switch the troubleshooting repair and gateway-probe blocks to copyable `bash` fences. Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
|
Addressed all four review threads in commit
@coderabbitai review |
|
Triggering the review now. ✅ Actions performedReview triggered.
|
|
@yimoj thanks for the upstream merge + review fixes here. Heads-up: it's still showing |
…utable-openclaw-perms # Conflicts: # docs/reference/commands.mdx # src/lib/actions/sandbox/doctor.ts
|
@prekshivyas resolved. |
…utable-openclaw-perms # Conflicts: # src/commands/sandbox/doctor.ts
…4628 review) Address @cv's PR NVIDIA#4628 review feedback: the 700/600 heal was too generic. If a future caller routes a mutable-sandbox OpenClaw config dir (which uses the 2770/660 contract per NVIDIA#4538 / PR NVIDIA#4610) through `ensureConfigDir`, the walker would silently normalize it to 700/600 — exactly the NVIDIA#4538 EACCES regression. Add three predicates: - `hostNemoclawDir()` — canonical `${HOME}/.nemoclaw` - `isHostNemoclawRoot(dirPath)` — gate for the sibling sweep - `isMutableSandboxConfigPath(targetPath)` — gate for the file-level heal Wire them through: - `ensureConfigDir`: skip dir-mode chmod when path is mutable-sandbox. Run `healRootLevelFiles` only when path IS the host nemoclaw root. - `readConfigFile`: skip per-file chmod when path is mutable-sandbox. Tests: - 2 new scope-boundary tests (negative: arbitrary dir, no heal; positive: host nemoclaw root, heal fires). - Existing heal tests (multi-sibling + symlink positive control) updated to use the new `withHome` helper so they place files under `<HOME>/.nemoclaw` instead of an arbitrary tmp dir — preserves the NVIDIA#4546 acceptance while respecting the new scope boundary. 18/18 tests pass. Co-authored-by: kagura-agent <kagura.agent.ai@gmail.com> Signed-off-by: Charan Jagwani <cjagwani@nvidia.com>
Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
|
@coderabbitai resume |
✅ Action performedReviews resumed. |
Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
NVIDIA#4538) Two acceptance gaps from NVIDIA#4538 not closed by the original PR: - Troubleshooting: contrast the three relevant perm states (mutable default 2770/660, shields-up locked 444/755 root, the 700/600 drift) so issue ask NVIDIA#3, "docs cover both mutable and locked-down", is actually answered. Prior copy only documented mutable. - mutable-config-perms: explain why NemoClaw uses 2770/660 vs the issue's expected 2775/664 (gateway shares the sandbox group, so the "other" bit is intentionally dropped). The predicate test already rejects 664; the rationale belonged in code. Signed-off-by: Charan Jagwani <cjagwani@nvidia.com>
…VIDIA#4538) Third acceptance gap from NVIDIA#4538 audit: inspect/repair asymmetry. `unlockAgentConfig` (the repair callback) re-permissions both `openclaw.json` AND the sensitive-file set (`.config-hash`, etc.). `inspectMutableConfigPerms` only stat'd dir + main config, so the user-facing `nemoclaw <name> doctor` reported "Config permissions OK" even when `.config-hash` had drifted to `644 root:root`. Running `--fix` would then silently mutate a file the check never named. - Add optional `sensitiveFiles` to `MutableConfigTarget`. - Iterate sensitive files in inspect, applying the same file contract; tolerate missing files (e.g. `.config-hash` before the first shields-up cycle). - `resolveAgentConfig` already populates `sensitiveFiles`, so the wrapper in `shields/index.ts` needs no change — structural typing carries the field through. - Two new tests: drift detection + missing-file tolerance. Signed-off-by: Charan Jagwani <cjagwani@nvidia.com>
cjagwani
left a comment
There was a problem hiding this comment.
Approving. Pushed two follow-ups to close the audit gaps: a docs paragraph on the three perm states (mutable 2770/660, shields-up 444/755 root, the 700/600 drift) plus a comment on why 660 not 664, and a fix so inspect actually catches .config-hash drift since --fix was silently mutating it. Two new tests cover the inspect side.
Verified on a brev T4 box: 2770/660 fresh, drifts to 700/600 after openclaw doctor --fix, comes back to 2770/660 after nemoclaw doctor --fix.
## Summary - Adds the `v0.0.60` section to `docs/about/release-notes.mdx` using the dev announcement from discussion #4877. - Fills the source-doc gaps found during release-prep review across inference, policy tiers, command behavior, security boundaries, Hermes dashboard/tooling, runtime context, and troubleshooting. - Refreshes generated agent skills under `.agents/skills/` from the current Fern docs output and upgrades Fern from `5.44.3` to `5.45.0`. ## Source summary - #4037 -> `docs/reference/architecture.mdx`, `docs/about/how-it-works.mdx`, `docs/about/release-notes.mdx`: Documents system-only runtime context that stays out of visible chat. - #4875 -> `docs/reference/architecture.mdx`, `docs/about/how-it-works.mdx`, `docs/about/release-notes.mdx`: Documents try-first sandbox network/filesystem guidance and clearer failure classification. - #4788 -> `docs/security/best-practices.mdx`, `docs/about/release-notes.mdx`: Documents shared OpenClaw device-approval policy for startup and connect. - #4768 -> `docs/reference/network-policies.mdx`, `docs/network-policy/integration-policy-examples.mdx`, `docs/get-started/quickstart.mdx`, `docs/get-started/quickstart-hermes.mdx`, `docs/reference/commands.mdx`: Documents `weather`, `public-reference`, and Hermes managed-tool gateway preset behavior. - #3788 and #4864 -> `docs/reference/network-policies.mdx`, `docs/reference/commands.mdx`: Documents non-interactive policy-tier fail-fast behavior and interactive prompt fallback. - #4756 and #4866 -> `docs/reference/commands.mdx`: Documents env-aware default sandbox resolution for `list`, `status`, and `tunnel` commands. - #4320 -> `docs/reference/commands.mdx`: Documents `$$nemoclaw tunnel status` behavior. - #4328 -> `docs/reference/commands.mdx`: Documents line-scoped policy preset descriptions in `policy-list`. - #4580 and #4748 -> `docs/reference/architecture.mdx`: Documents package-managed OpenShell gateway service and Docker-driver gateway-marker behavior. - #4598 -> `docs/manage-sandboxes/lifecycle.mdx`: Documents concurrent gateway/dashboard cleanup isolation by sandbox name and port. - #4777 -> `docs/reference/troubleshooting.mdx`: Documents Docker GPU patch rollback behavior. - #4610 -> `docs/reference/troubleshooting.mdx`, `docs/reference/commands.mdx`: Keeps mutable OpenClaw config permission guidance aligned and removes skipped experimental wording. - #4868 -> `docs/reference/commands.mdx`: Keeps `.dockerignore` handling for custom `onboard --from <Dockerfile>` contexts in generated skills. - #4870 -> `docs/reference/commands.mdx`, `docs/manage-sandboxes/runtime-controls.mdx`: Documents `NEMOCLAW_MINIMAL_BOOTSTRAP` and generated skill coverage. - #4641 -> `docs/inference/inference-options.mdx`, `docs/reference/troubleshooting.mdx`: Documents local NVIDIA NIM platform-digest pulls and served-model id adoption. - #4810 and #4867 -> `docs/inference/inference-options.mdx`: Documents stable NGC managed-vLLM image lineage and DGX Station DeepSeek V4 Flash coverage. - #4852 -> `docs/inference/use-local-inference.mdx`, `docs/reference/troubleshooting.mdx`: Documents Ollama model fit filtering, 16K context floor, cold-load retry, and failed-model exclusion. - #4847 -> `docs/inference/switch-inference-providers.mdx`: Documents API-family sync, Hermes `api_mode`, and Bedrock Runtime exception. - #4800 -> `docs/inference/tool-calling-reliability.mdx`: Documents Nemotron managed-inference native tool-search fallback. - #4333 -> `docs/inference/switch-inference-providers.mdx`: Documents interactive multimodal input prompting. - #4086 -> `docs/reference/troubleshooting.mdx`: Keeps proxy bypass normalization in generated troubleshooting coverage. - #4811 and #4855 -> `docs/get-started/quickstart-hermes.mdx`: Documents prebuilt Hermes dashboard assets and TUI recovery without runtime rebuilds. - #4854 -> `docs/inference/switch-inference-providers.mdx`, `docs/reference/commands.mdx`: Documents Hermes proxy API-key placeholder preservation during inference switches. - #4248 -> `docs/manage-sandboxes/messaging-channels.mdx`, `.agents/skills/`: Keeps messaging enrollment behavior aligned with manifest-hook implementation. - #4771 -> `docs/security/best-practices.mdx`, `docs/security/credential-storage.mdx`: Documents Hermes placeholder-only secret boundary for sandbox-visible runtime files. - #4787 -> `docs/security/best-practices.mdx`, `docs/about/release-notes.mdx`: Documents expanded memory scanner examples for OpenAI project keys and Slack app-level tokens. - #4848 -> `docs/reference/commands.mdx`: Documents OpenClaw skill install mirroring into the agent home directory. - #4790 -> `docs/about/release-notes.mdx`: Uses the prior release-prep structure and generated `.agents/skills/` refresh as the template for this release. ## Verification - `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix nemoclaw-user --doc-platform fern-mdx` - `python3 scripts/docs-to-skills.py docs/ .agents/skills/ skills/ --prefix nemoclaw-user --doc-platform fern-mdx --dry-run` - `npm run docs` - `git diff --check` - skip-term scan across `docs/`, `.agents/skills/`, and `skills/` - `npm run build:cli` - `npm run typecheck:cli` - Commit and pre-push hook suites, including markdownlint, gitleaks, env-var docs gate, docs-to-skills verification, and skills YAML tests <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes * **New Features** * DeepSeek-V4-Flash now available as default inference model for DGX Station. * Hermes dashboard improved with dedicated port and OAuth-authenticated tool gateway selection. * Added weather and public-reference policy presets for expanded agent capabilities. * Enhanced Ollama model selection with GPU memory filtering and automatic retry for timeouts. * **Bug Fixes** * Improved policy tier validation to prevent invalid configurations. * Better sandbox cleanup scoping by port to prevent conflicts across deployments. * Added GPU patch failure recovery with automatic rollback. * **Documentation** * Expanded troubleshooting guides for inference, security, and sandbox lifecycle. * Added .dockerignore best practices for custom deployments. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Carlos Villela <cvillela@nvidia.com>
Summary
openclaw doctor --fixenforces a single-user700/600state layout, which silently breaks NemoClaw's mutable config contract: the gateway UID shares thesandboxgroup, so/sandbox/.openclawmust stay setgid + group-writable (2770) andopenclaw.jsongroup-writable (660), or gateway-side config writes (control-UI toggles, etc.)EACCES. This preserves that contract after NemoClaw's own rebuild repair step and adds a detect/repair path for sandboxes a user already tightened.Related Issue
Fixes #4538
Changes
shields.inspectMutableConfigPerms/repairMutableConfigPerms— pure contract logic insrc/lib/shields/mutable-config-perms.ts, sandbox-bound wrappers insrc/lib/shields/index.ts. Repair reusesunlockAgentConfigand refuses to weaken an active shields-up lock or a corrupt-state sandbox.rebuildre-normalizes mutable perms right after its post-upgradeopenclaw doctor --fixstep.nemoclaw <name> doctorgains aConfig permissionscheck that detects the700/600drift and warns;nemoclaw <name> doctor --fixrepairs it without a rebuild.--fixis rejected together with--jsonso JSON automation can't trigger a silent permission mutation.setpriv … --init-groups/gosu gateway, notsu).doctorcheck rendering, and a shell-level regression that a tree tightened to700/600is restored to2770/660on startup.Type of Change
Verification
npm testpasses (all failures isolated to the host umask027and the unrelated flaky CLI-dispatch tests; the affected files pass in isolation with plugin deps installed)src/lib/shields/,src/lib/actions/sandbox/doctor-config-perms.test.ts,test/repro-2681-group-writable.test.tsSigned-off-by: Yimo Jiang yimoj@nvidia.com
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests