fix(sandbox): recover late dashboard tool-scope approvals via doctor (#4616) #4936

Merged
cv merged 1 commit into NVIDIA:main from yimoj:fix/4616-dashboard-tool-scope-recovery
Jun 8, 2026

Conversation

@yimoj commented Jun 8, 2026

Contributor

Summary

Dashboard-only OpenClaw users can't create cron jobs (and other agent tool calls fail) because a late device scope upgrade never gets approved: the agent tool subprocess connects to ws://127.0.0.1:<dashboardPort>, the gateway closes with 1006 abnormal closure, and OpenShell logs a policy denial to 127.0.0.1:<dashboardPort>. The existing allowlisted approval pass only runs during nemoclaw <sandbox> connect, which dashboard users never invoke — so there is no recovery path. This adds a bounded recovery + diagnostic through doctor.

Related Issue

Fixes #4616

Changes

  • Reusable recovery pass. Extracted the connect-time allowlisted approval pass into a shared auto-pair-approval module and reuse it from doctor --fix, so dashboard-only users can approve pending tool-scope upgrades without an SSH connect. The allowlist is unchanged and narrow (openclaw-control-ui + webchat/cli, operator.pairing/read/write only); unknown clients are never approved.
  • doctor tool-scope diagnostic (read-only). Surfaces pending allowlisted upgrades, the 1006 / scope upgrade pending approval / loopback 127.0.0.1:<dashboardPort> policy-denial log signatures, and whether the in-sandbox auto-pair watcher is alive — so the failure reads as device-scope, not as a cron/package problem. Current device-list state is authoritative over the never-truncated logs; logs are read bounded from the end. Gated to OpenClaw sandboxes (Hermes has device_pairing: false).
  • Newline-safe in-sandbox exec. OpenShell sandbox exec rejects multi-line arguments, which had silently broken the existing connect-time recovery too. Both passes now base64-wrap their payload and decode it inside the sandbox, so the recovery actually runs.
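
The base64 wrapping in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the names `wrapScriptForSandboxExec` and `decodeWrappedScript` and the `printf | base64 -d | sh` pipeline are assumptions, and it presumes a POSIX shell and a `base64` binary that accepts `-d` inside the sandbox.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical helper (name and pipeline are assumptions, not the PR's API).
// Turns a multi-line shell script into a single-line command argument.
export function wrapScriptForSandboxExec(script: string): string {
  // Standard base64 output only contains [A-Za-z0-9+/=], so the encoded
  // payload has no newlines and no quote characters to escape.
  const encoded = Buffer.from(script, "utf8").toString("base64");
  // Decode inside the sandbox and feed the original script to sh.
  return `printf %s '${encoded}' | base64 -d | sh`;
}

// Hypothetical inverse, useful when a test needs to inspect the wrapped script.
export function decodeWrappedScript(command: string): string | null {
  const match = /'([A-Za-z0-9+/=]+)'/.exec(command);
  return match ? Buffer.from(match[1], "base64").toString("utf8") : null;
}
```

Because the wrapped command is a single line regardless of what the script contains, the same wrapper could serve both the connect and doctor passes; a test harness would decode the payload before matching on script contents.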

Type of Change

  • Code change (feature, bug fix, or refactor)

Verification

  • npm test passes (7413 passed, 25 skipped, 0 failed)
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • codex review --uncommitted clean (no actionable findings)

Verified end-to-end on a live dashboard-only OpenClaw sandbox: an openclaw agent tool call triggered a stuck pending allowlisted scope upgrade (agent fell back to embedded/fallbackFrom: gateway); read-only doctor reported "[fail] Tool-call device scope: 1 pending allowlisted tool-scope upgrade(s) blocking OpenClaw tool calls; the in-sandbox auto-pair watcher is not running" with a doctor --fix hint; and doctor --fix cleared it (pending 1 → 0, [fail] → [ok]).


Signed-off-by: Yimo Jiang yimoj@nvidia.com

Summary by CodeRabbit

Release Notes

  • New Features

    • Added OpenClaw device-scope diagnostics to detect stuck approval workflows and suggest remediation paths.
    • sandbox doctor --fix now automatically approves pending allowlisted tool-scope upgrades.
  • Improvements

    • Sandbox connect flow now includes device approval diagnostics during initialization.
  • Tests

    • Added comprehensive test coverage for device-scope approval and diagnostic workflows.

…VIDIA#4616)

Dashboard-only OpenClaw users hit "gateway closed (1006 abnormal closure)"
and policy denials to 127.0.0.1:<dashboardPort> when an agent tool call
(cron creation, exec, code search) triggers a device scope upgrade that the
in-sandbox auto-pair watcher never approves — typically because the watcher
has already exited. The existing recovery only runs during
`nemoclaw <sandbox> connect`, which dashboard users never invoke, so the
pending upgrade sticks and tools fall back to embedded mode or fail outright.

- Extract the connect-time allowlisted approval pass into a shared
  `auto-pair-approval` module and reuse it from `doctor --fix`, giving
  dashboard-only users a bounded recovery path without an SSH connect. The
  allowlist is unchanged (openclaw-control-ui + webchat/cli, narrow scopes).
- Add a read-only `doctor` tool-scope diagnostic that surfaces pending
  allowlisted upgrades, the gateway 1006 / scope-upgrade-pending / loopback
  policy-denial log signatures, and whether the auto-pair watcher is alive,
  so the failure reads as device-scope, not as a cron/package problem.
  Current device-list state is authoritative over (never-truncated) log
  signatures; logs are read bounded from the end. Gated to OpenClaw sandboxes.
- Base64-wrap in-sandbox exec payloads: OpenShell `sandbox exec` rejects
  multi-line arguments, which had silently broken the connect-time recovery
  too. This makes both the connect and doctor recovery passes actually run.
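
To illustrate the "logs are read bounded from the end" point, here is a rough sketch of scanning only the tail of a log for the three signatures named above. The function names, the result shape, and the matching patterns are assumptions made for illustration; the real log line formats are not shown in this PR.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical result shape; field names are assumptions.
export interface LogSignatures {
  gatewayAbnormalClosure: boolean;
  scopeUpgradePending: boolean;
  loopbackPolicyDenial: boolean;
}

// Keep only the last maxBytes of a log so a never-truncated file stays cheap to scan.
export function tailBytes(log: Buffer, maxBytes: number): string {
  const start = Math.max(0, log.length - maxBytes);
  return log.subarray(start).toString("utf8");
}

// Look for the three failure signatures in the tail (patterns are assumed).
export function scanLogTail(tail: string, dashboardPort: number): LogSignatures {
  const lines = tail.split("\n");
  // (?!\d) keeps port 1878 from matching inside 18789.
  const loopback = new RegExp(`127\\.0\\.0\\.1:${dashboardPort}(?!\\d)`);
  return {
    gatewayAbnormalClosure: lines.some((l) => l.includes("1006") && /abnormal closure/i.test(l)),
    scopeUpgradePending: lines.some((l) => /scope upgrade pending approval/i.test(l)),
    loopbackPolicyDenial: lines.some((l) => loopback.test(l) && /den(y|ied|ial)/i.test(l)),
  };
}
```

Since old entries never age out of the logs, a match here would only be supporting context; the current device list decides whether anything is actually pending.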

Verified end-to-end on a live dashboard-only OpenClaw sandbox: triggered a
stuck pending allowlisted scope upgrade (agent fell back to embedded), saw
`doctor` report it as a failure with a `doctor --fix` hint, and confirmed
`doctor --fix` cleared it (pending 1 -> 0, [fail] -> [ok]).

Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

coderabbitai Bot commented Jun 8, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: dfaa6218-9d1d-4497-a9fa-4c18b30b19b4

📥 Commits

Reviewing files that changed from the base of the PR and between c8be25d and 3080a0f.

📒 Files selected for processing (9)
  • src/commands/sandbox/doctor.ts
  • src/lib/actions/sandbox/auto-pair-approval.test.ts
  • src/lib/actions/sandbox/auto-pair-approval.ts
  • src/lib/actions/sandbox/connect.ts
  • src/lib/actions/sandbox/doctor-tool-scope.test.ts
  • src/lib/actions/sandbox/doctor-tool-scope.ts
  • src/lib/actions/sandbox/doctor.ts
  • test/sandbox-connect-inference/auto-pair-approval.test.ts
  • test/sandbox-connect-inference/helpers.ts

📝 Walkthrough

This PR introduces a bounded auto-pair approval pass for pending OpenClaw device-scope upgrades and integrates diagnostic/repair logic into the sandbox doctor command. It extracts connect's inline approval logic into a reusable module, adds tool-scope probe building and interpretation for doctor diagnostics, and conditionally runs the approval pass during doctor --fix when pending allowlisted requests are detected. Test infrastructure is updated to handle base64-wrapped approval scripts.

Changes

Device-Scope Auto-Pair Approval and Doctor Integration

| Layer / File(s) | Summary |
|---|---|
| Auto-pair Approval Module Foundation: src/lib/actions/sandbox/auto-pair-approval.ts | Exposes configuration bounds (AUTO_PAIR_MAX_APPROVALS, AUTO_PAIR_APPROVAL_TIMEOUT_MS), result typing (AutoPairApprovalResult), and functions to wrap shell scripts (base64 encoding), read approval policy modules, generate in-sandbox approval scripts with optional summary markers, and execute them via openshell sandbox exec with non-throwing error handling and capture-mode parsing. |
| Auto-pair Approval Tests: src/lib/actions/sandbox/auto-pair-approval.test.ts | Validates script generation (device-approve flow, policy embedding, conditional summary markers), sandbox wrapping (newline removal, round-trip execution), policy module reading, and end-to-end integration with stubbed openclaw CLI (allowlist filtering, approval counts, env var stripping). |
| Doctor Tool-Scope Diagnostics Module: src/lib/actions/sandbox/doctor-tool-scope.ts | Probe builder assembles in-sandbox script to list devices, classify pending requests via embedded policy, scan logs for failure signatures, detect watcher liveness, and emit marked JSON. Parser extracts and normalizes probe data. Interpreter converts probe into prioritized doctor checks: fail (with --fix hint) for allowlisted pending, warn for non-allowlisted pending or unreadable device list, ok when healthy. Orchestrator conditionally runs approval pass on --fix with readable backlog, re-probes, and computes repair-based checks. |
| Doctor Tool-Scope Tests: src/lib/actions/sandbox/doctor-tool-scope.test.ts | Validates probe script generation (device listing, log scanning, liveness detection), marker-based JSON parsing, probe interpretation across unavailable/readable/unreadable states, and repair flow (conditional approval pass and re-probe under --fix). |
| Connect Inline-to-Module Refactoring: src/lib/actions/sandbox/connect.ts | Removes inline auto-pair approval helper and policy-reading logic, delegates to imported runSandboxAutoPairApprovalPass, updates imports (removes shellQuote, adds auto-pair-approval), preserves approval timing in connect flow. |
| Doctor Command Tool-Scope Integration: src/lib/actions/sandbox/doctor.ts, src/commands/sandbox/doctor.ts | Doctor imports auto-pair approval and tool-scope check builders. Adds sandboxReachable state flag gating live-only probes, expands --fix help text to mention device-scope upgrades, conditionally runs OpenClaw-only tool-scope diagnostics and optional approval pass when sandbox is reachable and agent is openclaw, appends resulting checks to report. |
| Test Infrastructure: Base64-Wrapped Script Support: test/sandbox-connect-inference/helpers.ts, test/sandbox-connect-inference/auto-pair-approval.test.ts | Adds decodeWrappedSandboxScript utility, updates sandbox-connect inference tests to decode and match approval scripts within base64-wrapped payloads, updates OpenShell exec stub to tolerate wrapped approval-pass commands. |
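
The interpreter priorities described for doctor-tool-scope.ts (fail for allowlisted pending, warn for non-allowlisted pending or an unreadable device list, ok when healthy) might look roughly like the sketch below. The probe fields, the single-check return value, and the message wording are assumptions; per the summary above, the real interpreter emits a prioritized list of checks.

```typescript
export type CheckStatus = "ok" | "warn" | "fail";

// Hypothetical, simplified probe shape; field names are assumptions.
export interface ToolScopeProbeSketch {
  deviceListReadable: boolean;
  pendingAllowlisted: number;
  pendingOther: number;
  watcherAlive: boolean;
}

export interface CheckSketch {
  status: CheckStatus;
  detail: string;
  hint?: string;
}

// Map a probe to one check, following the priority order described above.
export function interpretProbe(probe: ToolScopeProbeSketch): CheckSketch {
  if (!probe.deviceListReadable) {
    return { status: "warn", detail: "device list could not be read" };
  }
  if (probe.pendingAllowlisted > 0) {
    const watcher = probe.watcherAlive ? "is running" : "is not running";
    return {
      status: "fail",
      detail:
        `${probe.pendingAllowlisted} pending allowlisted tool-scope upgrade(s) ` +
        `blocking OpenClaw tool calls; the in-sandbox auto-pair watcher ${watcher}`,
      hint: "run doctor --fix to approve them",
    };
  }
  if (probe.pendingOther > 0) {
    // Non-allowlisted clients are reported but never auto-approved.
    return { status: "warn", detail: `${probe.pendingOther} pending non-allowlisted request(s)` };
  }
  return { status: "ok", detail: "no pending tool-scope upgrades" };
}
```

Only the allowlisted case carries a fix hint, which mirrors the rule that unknown clients are never approved automatically.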

Sequence Diagram

sequenceDiagram
  participant DoctorCLI as doctor --fix
  participant DoctorCmd as doctor.ts
  participant ToolScope as buildToolScopeChecks
  participant Sandbox as Sandbox
  participant Approval as Auto-pair Approval
  DoctorCLI->>DoctorCmd: Run with --fix
  DoctorCmd->>ToolScope: buildToolScopeChecks(sandboxName, wantsFix=true)
  ToolScope->>Sandbox: Execute probe script (list devices, scan logs)
  Sandbox-->>ToolScope: Marked JSON with pending count
  ToolScope->>ToolScope: parseToolScopeProbe → ToolScopeProbe
  alt Allowlisted pending backlog detected
    ToolScope->>Approval: runSandboxAutoPairApprovalPass(capture=true)
    Approval->>Sandbox: Execute approval script
    Sandbox-->>Approval: Summary marker with approved count
    Approval-->>ToolScope: AutoPairApprovalResult
    ToolScope->>Sandbox: Execute probe script again (re-probe)
    Sandbox-->>ToolScope: Updated marked JSON
    ToolScope->>ToolScope: parseToolScopeProbe → ToolScopeProbe (delta)
  end
  ToolScope->>ToolScope: interpretToolScopeProbe → DoctorToolScopeCheck[]
  ToolScope-->>DoctorCmd: Checks (fail→fixed, warn, ok)
  DoctorCmd-->>DoctorCLI: Report with checks
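
The probe, approve, re-probe loop in the diagram can be sketched with injected functions so it runs without a sandbox. The `RepairDeps` interface and `probeAndMaybeRepair` name are hypothetical stand-ins for the probe and approval pass, not the PR's actual signatures.

```typescript
// Hypothetical shapes; the real probe and approval results carry more fields.
export interface ProbeResult {
  readable: boolean;
  pendingAllowlisted: number;
}

export interface RepairDeps {
  probe: () => ProbeResult;
  approve: () => { approved: number };
}

export interface RepairOutcome {
  before: ProbeResult;
  after: ProbeResult;
  approvalRan: boolean;
}

// Probe once; only under --fix, with a readable allowlisted backlog,
// run the approval pass and probe again to measure the delta.
export function probeAndMaybeRepair(deps: RepairDeps, wantsFix: boolean): RepairOutcome {
  const before = deps.probe();
  const shouldApprove = wantsFix && before.readable && before.pendingAllowlisted > 0;
  if (!shouldApprove) {
    return { before, after: before, approvalRan: false };
  }
  deps.approve();
  const after = deps.probe();
  return { before, after, approvalRan: true };
}
```

Re-probing rather than trusting the approval count keeps the reported state tied to the device list, so a check would flip from fail to ok only when the backlog is actually gone.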

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#4907: Updates sandbox-connect inference test harness to decode base64-wrapped approval-pass scripts, directly supporting the new auto-pair approval script wrapping in this PR.
  • NVIDIA/NemoClaw#4788: Introduces centralized approval policy module (scripts/lib/openclaw_device_approval_policy.py) with approval_request_decision and gateway_approval_env helpers that this PR embeds and executes in sandbox probes and approval scripts.
  • NVIDIA/NemoClaw#4786: Modifies the same in-sandbox auto-pair approval flow (allowlist filtering and env var stripping), with this PR implementing it via the new reusable auto-pair-approval.ts module.

Suggested labels

fix, Sandbox, area: cli, integration: openclaw

Suggested reviewers

  • prekshivyas
  • cjagwani

Poem

🐰 Pending approvals blocked the way,
But doctor's probe can save the day!
Wrap scripts in base64's embrace,
Fix device scopes, find their place,
Tool calls flow where gateways meet,
OpenClaw's repair is now complete!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 47.37% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly describes the main change: adding recovery for dashboard tool-scope approvals via the doctor command, directly addressing issue #4616. |
| Linked Issues check | ✅ Passed | Changes align with #4616 objectives: extract approval logic into reusable module, add doctor diagnostics for pending allowlisted upgrades, enable dashboard users to approve pending tool-scope upgrades via doctor --fix. |
| Out of Scope Changes check | ✅ Passed | All changes directly support the #4616 objectives: auto-pair-approval module, doctor-tool-scope diagnostics, connect refactoring, doctor integration, test infrastructure, and base64-wrapping for shell safety. |


@cv added the v0.0.61 (Release target) label Jun 8, 2026
@wscurran added the area: sandbox (OpenShell sandbox lifecycle, runtime, config, or recovery), bug-fix (PR fixes a bug or regression), and integration: openclaw (OpenClaw integration behavior) labels Jun 8, 2026
wscurran commented Jun 8, 2026

Contributor


Related open issues:

@cv merged commit 3a7a60d into NVIDIA:main Jun 8, 2026
39 checks passed

Development

Successfully merging this pull request may close these issues.

Cron job creation by agent does not work on DGX Spark
