fix(sandbox): recover late dashboard tool-scope approvals via doctor (#4616) by yimoj · Pull Request #4936 · NVIDIA/NemoClaw

yimoj · 2026-06-08T04:01:06Z

Summary

Dashboard-only OpenClaw users can't create cron jobs (and other agent tool calls fail) because a late device scope upgrade never gets approved: the agent tool subprocess connects to ws://127.0.0.1:<dashboardPort>, the gateway closes with 1006 abnormal closure, and OpenShell logs a policy denial to 127.0.0.1:<dashboardPort>. The existing allowlisted approval pass only runs during nemoclaw <sandbox> connect, which dashboard users never invoke — so there is no recovery path. This adds a bounded recovery + diagnostic through doctor.

Related Issue

Fixes #4616

Changes

Reusable recovery pass. Extracted the connect-time allowlisted approval pass into a shared auto-pair-approval module and reuse it from doctor --fix, so dashboard-only users can approve pending tool-scope upgrades without an SSH connect. The allowlist is unchanged and narrow (openclaw-control-ui + webchat/cli, operator.pairing/read/write only); unknown clients are never approved.
doctor tool-scope diagnostic (read-only). Surfaces pending allowlisted upgrades, the 1006 / scope upgrade pending approval / loopback 127.0.0.1:<dashboardPort> policy-denial log signatures, and whether the in-sandbox auto-pair watcher is alive — so the failure reads as device-scope, not as a cron/package problem. Current device-list state is authoritative over the never-truncated logs; logs are read bounded from the end. Gated to OpenClaw sandboxes (Hermes has device_pairing: false).
Newline-safe in-sandbox exec. OpenShell sandbox exec rejects multi-line arguments, which had silently broken the existing connect-time recovery too. Both passes now base64-wrap their payload and decode it inside the sandbox, so the recovery actually runs.
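
The newline-safe exec change can be illustrated with a minimal sketch. It assumes a POSIX shell and a `base64 -d` binary are available inside the sandbox. `wrapScriptForSandboxExec` and `decodeWrappedScript` are hypothetical names for illustration, not the actual exports of the auto-pair-approval module.

```typescript
// Illustrative only: helper names and the exact decode pipeline are assumptions.

/** Turn a multi-line script into a single-line command that decodes and runs it. */
function wrapScriptForSandboxExec(script: string): string {
  // Node's base64 encoder emits only [A-Za-z0-9+/=] with no line breaks,
  // so the payload is safe inside single quotes and contains no newline.
  const encoded = Buffer.from(script, "utf8").toString("base64");
  return `printf %s '${encoded}' | base64 -d | sh`;
}

/** Recover the original script from a wrapped command (handy in tests). */
function decodeWrappedScript(command: string): string {
  const match = command.match(/^printf %s '([A-Za-z0-9+/=]*)' \| base64 -d \| sh$/);
  const encoded = match?.[1];
  if (encoded === undefined) {
    throw new Error("not a wrapped sandbox script");
  }
  return Buffer.from(encoded, "base64").toString("utf8");
}
```

Because the wrapped command is a single line, it can be passed as one argument to an exec call that rejects embedded newlines, and a test can decode it again to inspect the script that would run.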

Type of Change

Code change (feature, bug fix, or refactor)

Verification

npm test passes (7413 passed, 25 skipped, 0 failed)
Tests added or updated for new or changed behavior
No secrets, API keys, or credentials committed
codex review --uncommitted clean (no actionable findings)

Verified end-to-end on a live dashboard-only OpenClaw sandbox. An openclaw agent tool call triggered a stuck pending allowlisted scope upgrade (the agent fell back to embedded/fallbackFrom: gateway). Read-only doctor reported "[fail] Tool-call device scope: 1 pending allowlisted tool-scope upgrade(s) blocking OpenClaw tool calls; the in-sandbox auto-pair watcher is not running", together with a doctor --fix hint. Running doctor --fix then cleared it (pending 1 → 0, [fail] → [ok]).

Signed-off-by: Yimo Jiang <yimoj@nvidia.com>

Summary by CodeRabbit

Release Notes

New Features
- Added OpenClaw device-scope diagnostics to detect stuck approval workflows and suggest remediation paths.
- sandbox doctor --fix now automatically approves pending allowlisted tool-scope upgrades.
Improvements
- Sandbox connect flow now includes device approval diagnostics during initialization.
Tests
- Added comprehensive test coverage for device-scope approval and diagnostic workflows.

…VIDIA#4616)

Dashboard-only OpenClaw users hit "gateway closed (1006 abnormal closure)" and policy denials to 127.0.0.1:<dashboardPort> when an agent tool call (cron creation, exec, code search) triggers a device scope upgrade that the in-sandbox auto-pair watcher never approves, typically because the watcher has already exited. The existing recovery only runs during `nemoclaw <sandbox> connect`, which dashboard users never invoke, so the pending upgrade sticks and tools fall back to embedded mode or fail outright.

- Extract the connect-time allowlisted approval pass into a shared `auto-pair-approval` module and reuse it from `doctor --fix`, giving dashboard-only users a bounded recovery path without an SSH connect. The allowlist is unchanged (openclaw-control-ui + webchat/cli, narrow scopes).
- Add a read-only `doctor` tool-scope diagnostic that surfaces pending allowlisted upgrades, the gateway 1006 / scope-upgrade-pending / loopback policy-denial log signatures, and whether the auto-pair watcher is alive, so the failure reads as device-scope, not as a cron/package problem. Current device-list state is authoritative over (never-truncated) log signatures; logs are read bounded from the end. Gated to OpenClaw sandboxes.
- Base64-wrap in-sandbox exec payloads: OpenShell `sandbox exec` rejects multi-line arguments, which had silently broken the connect-time recovery too. This makes both the connect and doctor recovery passes actually run.

Verified end-to-end on a live dashboard-only OpenClaw sandbox: triggered a stuck pending allowlisted scope upgrade (agent fell back to embedded), saw `doctor` report it as a failure with a `doctor --fix` hint, and confirmed `doctor --fix` cleared it (pending 1 -> 0, [fail] -> [ok]).

Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
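
The two log-handling rules in the commit message (logs are read bounded from the end, and device-list state is authoritative over log signatures) might look roughly like the sketch below. The regular expressions are guesses at log wording based only on the phrases quoted in this PR, and every function name and the 64 KiB default are assumptions.

```typescript
// Illustrative only: patterns, names, and the byte budget are assumptions.
const SIGNATURES = {
  gatewayClosed1006: /\b1006\b.*abnormal closure/i,
  scopeUpgradePending: /scope upgrade pending approval/i,
  loopbackPolicyDenial: /den(y|ied|ial).*127\.0\.0\.1:\d+/i,
} as const;

type SignatureName = keyof typeof SIGNATURES;

/** Keep only the last maxBytes of a log so an ever-growing file stays cheap to scan. */
function tailBytes(log: Buffer, maxBytes: number): string {
  const start = Math.max(0, log.length - maxBytes);
  // Cutting at a byte offset may split one multi-byte character at the start;
  // that is acceptable here because only ASCII phrases are matched.
  return log.subarray(start).toString("utf8");
}

/** Report which failure signatures appear in the bounded tail of the log. */
function scanLogTail(log: Buffer, maxBytes: number = 64 * 1024): SignatureName[] {
  const lines = tailBytes(log, maxBytes).split("\n");
  return (Object.keys(SIGNATURES) as SignatureName[]).filter((name) =>
    lines.some((line) => SIGNATURES[name].test(line)),
  );
}

/** Log hits only add context; with no pending upgrade they are treated as stale. */
function explainFailure(pendingAllowlisted: number, hits: SignatureName[]): string | null {
  if (pendingAllowlisted === 0) return null;
  const context = hits.length > 0 ? ` (log signatures: ${hits.join(", ")})` : "";
  return `${pendingAllowlisted} pending allowlisted tool-scope upgrade(s)${context}`;
}
```

In this reading, an old 1006 line in a log that is never truncated cannot produce a failure by itself; it only annotates a failure that the current device list already shows.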

coderabbitai · 2026-06-08T04:01:19Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: dfaa6218-9d1d-4497-a9fa-4c18b30b19b4

📥 Commits

Reviewing files that changed from the base of the PR and between c8be25d and 3080a0f.

📒 Files selected for processing (9)

src/commands/sandbox/doctor.ts
src/lib/actions/sandbox/auto-pair-approval.test.ts
src/lib/actions/sandbox/auto-pair-approval.ts
src/lib/actions/sandbox/connect.ts
src/lib/actions/sandbox/doctor-tool-scope.test.ts
src/lib/actions/sandbox/doctor-tool-scope.ts
src/lib/actions/sandbox/doctor.ts
test/sandbox-connect-inference/auto-pair-approval.test.ts
test/sandbox-connect-inference/helpers.ts

📝 Walkthrough

This PR introduces a bounded auto-pair approval pass for pending OpenClaw device-scope upgrades and integrates diagnostic/repair logic into the sandbox doctor command. It extracts connect's inline approval logic into a reusable module, adds tool-scope probe building and interpretation for doctor diagnostics, and conditionally runs the approval pass during doctor --fix when pending allowlisted requests are detected. Test infrastructure is updated to handle base64-wrapped approval scripts.
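
A bounded, allowlist-only approval pass can be sketched as below. The real policy lives in a separate approval policy module, so this is not its implementation: the request shape, the way the allowlist is keyed (client id plus client mode), the expanded scope names, and the cap of 5 are all assumptions made for illustration.

```typescript
// Illustrative only: field names, allowlist keys, and MAX_APPROVALS are assumptions.
interface PendingRequest {
  requestId: string;
  clientId: string;
  clientMode: string;
  scopes: string[];
}

const ALLOWED_CLIENT_IDS = new Set(["openclaw-control-ui"]);
const ALLOWED_CLIENT_MODES = new Set(["webchat", "cli"]);
const ALLOWED_SCOPES = new Set(["operator.pairing", "operator.read", "operator.write"]);
const MAX_APPROVALS = 5; // hypothetical bound on approvals per pass

/** A request qualifies only if the client is known and every requested scope is narrow. */
function isAllowlisted(req: PendingRequest): boolean {
  return (
    ALLOWED_CLIENT_IDS.has(req.clientId) &&
    ALLOWED_CLIENT_MODES.has(req.clientMode) &&
    req.scopes.length > 0 &&
    req.scopes.every((scope) => ALLOWED_SCOPES.has(scope))
  );
}

/** Pick at most `max` allowlisted requests; everything else is left pending. */
function selectApprovals(pending: PendingRequest[], max: number = MAX_APPROVALS): PendingRequest[] {
  return pending.filter(isAllowlisted).slice(0, max);
}
```

Unknown clients and requests that ask for any scope outside the narrow set are never selected, which matches the stated intent that the pass stays bounded and never approves unknown clients.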

Changes

Device-Scope Auto-Pair Approval and Doctor Integration

| Layer / File(s) | Summary |
| --- | --- |
| Auto-pair Approval Module Foundation `src/lib/actions/sandbox/auto-pair-approval.ts` | Exposes configuration bounds (`AUTO_PAIR_MAX_APPROVALS`, `AUTO_PAIR_APPROVAL_TIMEOUT_MS`), result typing (`AutoPairApprovalResult`), and functions to wrap shell scripts (base64 encoding), read approval policy modules, generate in-sandbox approval scripts with optional summary markers, and execute them via openshell sandbox exec with non-throwing error handling and capture-mode parsing. |
| Auto-pair Approval Tests `src/lib/actions/sandbox/auto-pair-approval.test.ts` | Validates script generation (device-approve flow, policy embedding, conditional summary markers), sandbox wrapping (newline removal, round-trip execution), policy module reading, and end-to-end integration with stubbed openclaw CLI (allowlist filtering, approval counts, env var stripping). |
| Doctor Tool-Scope Diagnostics Module `src/lib/actions/sandbox/doctor-tool-scope.ts` | Probe builder assembles in-sandbox script to list devices, classify pending requests via embedded policy, scan logs for failure signatures, detect watcher liveness, and emit marked JSON. Parser extracts and normalizes probe data. Interpreter converts probe into prioritized doctor checks: fail (with `--fix` hint) for allowlisted pending, warn for non-allowlisted pending or unreadable device list, ok when healthy. Orchestrator conditionally runs approval pass on `--fix` with readable backlog, re-probes, and computes repair-based checks. |
| Doctor Tool-Scope Tests `src/lib/actions/sandbox/doctor-tool-scope.test.ts` | Validates probe script generation (device listing, log scanning, liveness detection), marker-based JSON parsing, probe interpretation across unavailable/readable/unreadable states, and repair flow (conditional approval pass and re-probe under `--fix`). |
| Connect Inline-to-Module Refactoring `src/lib/actions/sandbox/connect.ts` | Removes inline auto-pair approval helper and policy-reading logic, delegates to imported `runSandboxAutoPairApprovalPass`, updates imports (removes shellQuote, adds auto-pair-approval), preserves approval timing in connect flow. |
| Doctor Command Tool-Scope Integration `src/lib/actions/sandbox/doctor.ts`, `src/commands/sandbox/doctor.ts` | Doctor imports auto-pair approval and tool-scope check builders. Adds `sandboxReachable` state flag gating live-only probes, expands `--fix` help text to mention device-scope upgrades, conditionally runs OpenClaw-only tool-scope diagnostics and optional approval pass when sandbox is reachable and agent is openclaw, appends resulting checks to report. |
| Test Infrastructure: Base64-Wrapped Script Support `test/sandbox-connect-inference/helpers.ts`, `test/sandbox-connect-inference/auto-pair-approval.test.ts` | Adds `decodeWrappedSandboxScript` utility, updates sandbox-connect inference tests to decode and match approval scripts within base64-wrapped payloads, updates OpenShell exec stub to tolerate wrapped approval-pass commands. |
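
The parser and interpreter rows above describe marker-based JSON extraction followed by prioritized checks. A minimal sketch of that idea follows. The marker strings, the probe field names, and the check messages are hypothetical; only the fail / warn / ok priorities are taken from the table.

```typescript
// Illustrative only: markers, field names, and messages are assumptions.
const PROBE_START = "__PROBE_JSON_START__";
const PROBE_END = "__PROBE_JSON_END__";

interface ToolScopeProbe {
  deviceListReadable: boolean;
  allowlistedPending: number;
  otherPending: number;
  watcherAlive: boolean;
}

interface Check {
  status: "ok" | "warn" | "fail";
  message: string;
}

/** Coerce an untrusted JSON value into a non-negative integer count. */
function toCount(value: unknown): number {
  return typeof value === "number" && Number.isInteger(value) && value > 0 ? value : 0;
}

/** Extract the JSON between the markers, ignoring any surrounding shell noise. */
function parseProbe(output: string): ToolScopeProbe | null {
  const start = output.indexOf(PROBE_START);
  if (start < 0) return null;
  const bodyStart = start + PROBE_START.length;
  const end = output.indexOf(PROBE_END, bodyStart);
  if (end < 0) return null;
  try {
    const raw = JSON.parse(output.slice(bodyStart, end)) as Record<string, unknown>;
    return {
      deviceListReadable: raw["deviceListReadable"] === true,
      allowlistedPending: toCount(raw["allowlistedPending"]),
      otherPending: toCount(raw["otherPending"]),
      watcherAlive: raw["watcherAlive"] === true,
    };
  } catch {
    return null; // malformed or non-object JSON is treated as an unavailable probe
  }
}

/** Map a probe to one check, highest-priority condition first. */
function interpretProbe(probe: ToolScopeProbe | null): Check[] {
  if (probe === null || !probe.deviceListReadable) {
    return [{ status: "warn", message: "device list could not be read" }];
  }
  if (probe.allowlistedPending > 0) {
    const watcher = probe.watcherAlive ? "" : "; the auto-pair watcher is not running";
    return [{ status: "fail", message: `${probe.allowlistedPending} pending allowlisted upgrade(s)${watcher}; try doctor --fix` }];
  }
  if (probe.otherPending > 0) {
    return [{ status: "warn", message: `${probe.otherPending} pending non-allowlisted request(s)` }];
  }
  return [{ status: "ok", message: "no pending tool-scope upgrades" }];
}
```

Wrapping the JSON in markers lets the parser tolerate banner text or warnings printed by the in-sandbox shell before or after the payload.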

Sequence Diagram

sequenceDiagram
  participant DoctorCLI as doctor --fix
  participant DoctorCmd as doctor.ts
  participant ToolScope as buildToolScopeChecks
  participant Sandbox as Sandbox
  participant Approval as Auto-pair Approval
  DoctorCLI->>DoctorCmd: Run with --fix
  DoctorCmd->>ToolScope: buildToolScopeChecks(sandboxName, wantsFix=true)
  ToolScope->>Sandbox: Execute probe script (list devices, scan logs)
  Sandbox-->>ToolScope: Marked JSON with pending count
  ToolScope->>ToolScope: parseToolScopeProbe → ToolScopeProbe
  alt Allowlisted pending backlog detected
    ToolScope->>Approval: runSandboxAutoPairApprovalPass(capture=true)
    Approval->>Sandbox: Execute approval script
    Sandbox-->>Approval: Summary marker with approved count
    Approval-->>ToolScope: AutoPairApprovalResult
    ToolScope->>Sandbox: Execute probe script again (re-probe)
    Sandbox-->>ToolScope: Updated marked JSON
    ToolScope->>ToolScope: parseToolScopeProbe → ToolScopeProbe (delta)
  end
  ToolScope->>ToolScope: interpretToolScopeProbe → DoctorToolScopeCheck[]
  ToolScope-->>DoctorCmd: Checks (fail→fixed, warn, ok)
  DoctorCmd-->>DoctorCLI: Report with checks
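
The probe, approve, re-probe flow in the diagram can be sketched with injected dependencies, which also makes it easy to test without a sandbox. This is a simplified, synchronous illustration: `probeAndMaybeFix`, the `Deps` interface, and the probe fields are hypothetical and do not mirror the real `buildToolScopeChecks` signature.

```typescript
// Illustrative only: names, shapes, and the synchronous style are assumptions.
interface ProbeResult {
  readable: boolean;
  allowlistedPending: number;
}

interface Deps {
  runProbe(): ProbeResult;
  runApprovalPass(): { approved: number };
}

interface RepairOutcome {
  before: ProbeResult;
  after: ProbeResult;
  approvalRan: boolean;
}

/** Probe once; only when a fix is requested and a readable backlog exists, approve and re-probe. */
function probeAndMaybeFix(deps: Deps, wantsFix: boolean): RepairOutcome {
  const before = deps.runProbe();
  const shouldFix = wantsFix && before.readable && before.allowlistedPending > 0;
  if (!shouldFix) {
    // Read-only path: nothing is approved and the single probe is reported as-is.
    return { before, after: before, approvalRan: false };
  }
  deps.runApprovalPass();
  // The second probe is the source of truth for whether the repair worked.
  const after = deps.runProbe();
  return { before, after, approvalRan: true };
}
```

Reporting the result from the second probe rather than from the approval pass's own count keeps the final check tied to the current device-list state.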

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

NVIDIA/NemoClaw#4907: Updates sandbox-connect inference test harness to decode base64-wrapped approval-pass scripts, directly supporting the new auto-pair approval script wrapping in this PR.
NVIDIA/NemoClaw#4788: Introduces centralized approval policy module (scripts/lib/openclaw_device_approval_policy.py) with approval_request_decision and gateway_approval_env helpers that this PR embeds and executes in sandbox probes and approval scripts.
NVIDIA/NemoClaw#4786: Modifies the same in-sandbox auto-pair approval flow (allowlist filtering and env var stripping), with this PR implementing it via the new reusable auto-pair-approval.ts module.

Suggested labels

fix, Sandbox, area: cli, integration: openclaw

Suggested reviewers

prekshivyas
cjagwani

Poem

🐰 Pending approvals blocked the way,
But doctor's probe can save the day!
Wrap scripts in base64's embrace,
Fix device scopes, find their place,
Tool calls flow where gateways meet,
OpenClaw's repair is now complete! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 47.37% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly describes the main change: adding recovery for dashboard tool-scope approvals via the doctor command, directly addressing issue `#4616`. |
| Linked Issues check | ✅ Passed | Changes align with `#4616` objectives: extract approval logic into reusable module, add doctor diagnostics for pending allowlisted upgrades, enable dashboard users to approve pending tool-scope upgrades via doctor --fix. |
| Out of Scope Changes check | ✅ Passed | All changes directly support the `#4616` objectives: auto-pair-approval module, doctor-tool-scope diagnostics, connect refactoring, doctor integration, test infrastructure, and base64-wrapping for shell safety. |

wscurran · 2026-06-08T14:26:11Z

✨
Related open issues:

#4616 Cron job creation by agent does not work on DGX Spark

cv added the v0.0.61 (Release target) label Jun 8, 2026

wscurran added the labels area: sandbox (OpenShell sandbox lifecycle, runtime, config, or recovery), bug-fix (PR fixes a bug or regression), and integration: openclaw (OpenClaw integration behavior) Jun 8, 2026

cv approved these changes Jun 8, 2026

cv merged commit 3a7a60d into NVIDIA:main Jun 8, 2026
39 checks passed

coderabbitai Bot mentioned this pull request Jun 9, 2026

fix(onboard): pre-approve gateway scope upgrades after onboard and recover #4763

Open

cv merged 1 commit into NVIDIA:main from yimoj:fix/4616-dashboard-tool-scope-recovery

3 participants
