Skip to content

fix(openclaw): recover replaced scope upgrade approvals#5210

Merged
cv merged 10 commits into
mainfrom
fix/issue-4462-manifest-hook-approval
Jun 11, 2026
Merged

fix(openclaw): recover replaced scope upgrade approvals#5210
cv merged 10 commits into
mainfrom
fix/issue-4462-manifest-hook-approval

Conversation

@sandl99

@sandl99 sandl99 commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Summary

This PR narrows the openclaw devices approve compatibility recovery so a failed scope-upgrade approval can recover when OpenClaw replaces the original pending request with a same-device request that includes operator.admin. PR #4949 is not treated as the root cause: it changed the build/startup path enough to expose this latent approval replacement behavior, while the actual bug is the approval wrapper returning failure after OpenClaw leaves only the replacement request pending.

Changes

  • Pass the failed approve command output into the existing post-failure recovery path.
  • Detect only same-device replacement requests reported by the CLI as scope upgrade pending approval or pairing required.
  • Match replacement request IDs with token boundaries instead of substring matching.
  • Persist pending.json and paired.json through same-directory atomic replaces.
  • Merge existing approvedScopes and scopes, approve only the expected operator scope closure, strip operator.admin, and keep token scopes aligned.
  • Keep the focused guard regression in test/nemoclaw-start.test.ts and ratchet its legacy size budget down after removing comment-only scaffolding.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Additional verification run locally:

  • bash -n scripts/nemoclaw-start.sh
  • npx @biomejs/biome check test/nemoclaw-start.test.ts ci/test-file-size-budget.json
  • npm run test-size:check
  • npx vitest --run test/nemoclaw-start.test.ts

Note: npx prek run --files ci/test-file-size-budget.json scripts/nemoclaw-start.sh test/nemoclaw-start.test.ts passed formatting, shellcheck, repository checks, and test-size budget, but its CLI hook also launched broader live E2E tests and failed on unrelated live sandbox/onboard cleanup/timeouts. It is not marked as passing above.


Signed-off-by: San Dang sdang@nvidia.com

Summary by CodeRabbit

  • Bug Fixes
    • More reliable device approval handling with deterministic recovery of pairing/pending state, stricter scope allowlist enforcement, and safer replacement-selection logic.
  • Tests
    • Updated tests to simulate approval failure and verify guarded recovery behavior; strengthened startup/entrypoint assertions (tokens, file locks, stderr).
  • Chores
    • Reduced test file-size budget for a specific test.

sandl99 added 5 commits June 11, 2026 13:15
Remove OpenClaw device identity and pending approval state after build-time plugin and doctor commands so runtime sandboxes start with fresh device auth state. This preserves the existing gateway token/proxy cleanup and adds a provisioning regression test.
Recover allowlisted OpenClaw scope-upgrade approvals when the approve CLI call replaces the original request with a broader gateway request. The wrapper now applies only the original captured same-device request, adds the read/write closure without operator.admin, removes the replacement pending request, and preserves gateway env for the caller.
@sandl99 sandl99 self-assigned this Jun 11, 2026
@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Captures full stdout/stderr from openclaw devices approve into NEMOCLAW_APPROVE_OUTPUT and refactors the embedded Python recovery to use that output plus persisted paired.json/pending.json to deterministically reconcile and persist device-scoped approvals.

Changes

Approval command scope deadlock recovery

Layer / File(s) Summary
Output wiring and env
scripts/nemoclaw-start.sh
Captured openclaw devices approve stdout/stderr is exported into the Python heredoc as NEMOCLAW_APPROVE_OUTPUT; the script reads this into an approve_output variable for later checks.
Deterministic reconciliation and persistence
scripts/nemoclaw-start.sh
Recovery script adds an atomic save(name,value) helper, derives requested from the before snapshot and current granted scopes from paired.json (considers approvedScopes or scopes), supports early-success when already satisfied, enforces operator allowlist/subset rules and requires operator.pairing present, selects a compatible replacement pending entry by deviceId/operator.admin/scope-subset rules and disambiguates via approve_output mention, normalizes approved scopes (operator.pairing/read/write), updates paired_entry["scopes"]/["approvedScopes"] (and operator token scopes when present), removes impacted pending entries, and persists pending.json and paired.json.
Tests, direct-root wrapper, and CI budget
test/nemoclaw-start.test.ts, scripts/nemoclaw-start.sh, ci/test-file-size-budget.json
Adds Vitest scenario-driven case exercising guarded devices approve recovery across CASE_REPLACEMENT_ID variations; adjusts direct-root wrapper simulation (index-based extraction of write_runtime_shell_env, adds lock_rc_files that chmods .bashrc/.profile to 0444, explicit stubs, and ordered helper calls), strengthens post-run assertions for continuation, stderr contents, .config-hash mode 660, generated proxy env export of OPENCLAW_GATEWAY_TOKEN, and updates legacyMaxLines budget.

Sequence Diagram

sequenceDiagram
  participant CLI as openclaw_CLI
  participant ApproveCmd as approve_wrapper
  participant Recon as PYAPPROVEAFTER
  participant Pending as pending.json
  participant Paired as paired.json
  CLI->>ApproveCmd: run openclaw devices approve <requestId>
  ApproveCmd->>ApproveCmd: execute approval and capture combined stdout stderr
  ApproveCmd->>Recon: invoke PYAPPROVEAFTER with NEMOCLAW_APPROVE_OUTPUT
  Recon->>Pending: read pending.json to load before and candidates
  Recon->>Paired: read paired.json for current paired_entry
  Recon->>Recon: parse approve_output and derive requested and current scopes
  Recon->>Recon: apply allowlist and subset validation
  Recon->>Pending: select and validate replacement pending by deviceId scopes approve_output
  Recon->>Paired: update paired_entry scopes and approvedScopes and tokens if present
  Recon->>Pending: remove original and replacement pending entries
  Recon->>Recon: save persist pending.json and paired.json
  Recon->>CLI: return success or failure
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#5203: e2e test change detecting OpenClaw scope-upgrade/pairing prompts and invoking nemoclaw ... recover.

Suggested labels

bug-fix

Suggested reviewers

  • cv
  • ericksoa

Poem

🐰 I watched the CLI hiccup and stall,
The scopes would block the pairing call.
I piped the output into Python’s care,
It matched the pending note and fixed the pair.
Now tokens hum and devices scurry on their way.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'fix(openclaw): recover replaced scope upgrade approvals' directly and clearly summarizes the main change—enabling recovery of failed scope-upgrade approvals when OpenClaw replaces the original request.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/issue-4462-manifest-hook-approval

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

E2E Advisor Recommendation

Required E2E: issue-4462-scope-upgrade-approval-e2e
Optional E2E: issue-4462-gateway-pinned-approval-characterization-e2e, device-auth-health-e2e

Dispatch hint: issue-4462-scope-upgrade-approval-e2e

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

Optional E2E

  • issue-4462-gateway-pinned-approval-characterization-e2e (high (~60 min; Docker/OpenShell sandbox; NVIDIA_API_KEY)): Useful diagnostic adjacency for the newly changed compatibility boundary: it characterizes legacy gateway-pinned approval behavior and recovers through the proxy-env guard if needed, but the workflow comments describe it as diagnostic rather than the primary fix gate.
  • device-auth-health-e2e (medium-high (~30 min; Docker/OpenShell sandbox; NVIDIA_API_KEY)): Adjacent confidence for device-auth/gateway health behavior after a change to in-sandbox device approval and gateway-token handling.

New E2E recommendations

  • deterministic OpenClaw CLI scope-upgrade approval deadlocks and forces openclaw agent into embedded fallback #4462 replacement-request recovery (high): Existing E2E coverage validates real scope-upgrade approval and includes a diagnostic characterization path, but the newly added repair branch depends on a precise nonzero approve result with exactly one same-device admin-shaped replacement pending request. A deterministic live-sandbox E2E or scenario should force that state and assert approved scopes/tokens exclude operator.admin and both pending requests are removed.
    • Suggested test: Add a deterministic E2E scenario for constrained replacement-request recovery after openclaw devices approve returns nonzero.

Dispatch hint

  • Workflow: nightly-e2e.yaml
  • jobs input: issue-4462-scope-upgrade-approval-e2e

@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Vitest E2E Scenario Recommendation

Required Vitest E2E scenarios: ubuntu-repo-cloud-openclaw
Optional Vitest E2E scenarios: ubuntu-repo-docker-post-reboot-recovery

Dispatch required Vitest E2E scenarios:

  • gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw

Workflow run

Full Vitest E2E advisor summary

Vitest E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required Vitest E2E scenarios

  • ubuntu-repo-cloud-openclaw: scripts/nemoclaw-start.sh changes the in-sandbox OpenClaw proxy-env guard and gateway/device approval compatibility path. The canonical live-supported Ubuntu repo/current Docker cloud OpenClaw scenario is the smallest typed Vitest scenario that onboards an OpenClaw sandbox from this script path and validates the resulting gateway/sandbox state.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw

Optional Vitest E2E scenarios

  • ubuntu-repo-docker-post-reboot-recovery: Adjacent live-supported OpenClaw scenario that exercises the same Ubuntu repo/current Docker onboarding surface with an additional lifecycle recovery/status phase. Useful if reviewers want extra confidence that the startup/proxy-env changes do not regress recovery-oriented state preservation, but it is not the primary target of this PR.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-docker-post-reboot-recovery

Relevant changed files

  • scripts/nemoclaw-start.sh

@sandl99 sandl99 added the nightly-e2e Nightly E2E test failures label Jun 11, 2026
@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

PR Review Advisor

Findings: 0 needs attention, 5 worth checking, 1 nice ideas
Since last review: 1 prior item resolved, 6 still apply, 0 new items found

Review findings

🛠️ Needs attention

  • None.

🔎 Worth checking

  • Source-of-truth review needed: openclaw devices approve nonzero-after-applied compatibility recovery: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: scripts/nemoclaw-start.sh treats nonzero approve as success when the original request is no longer pending and paired scopes include the requested scopes, then emits openclaw-approve-applied-after-nonzero.
  • Source-of-truth review needed: openclaw devices approve replacement-request state-file recovery: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: scripts/nemoclaw-start.sh filters replacement candidates, strips operator.admin, rewrites pending.json and paired.json, and emits openclaw-approve-recovered-replacement.
  • Approval-state recovery still needs authorization-negative coverage (scripts/nemoclaw-start.sh:2366): The replacement recovery directly rewrites OpenClaw device authorization state after a failed approve. The implementation has useful guardrails, including same-device checks, existing operator.pairing, allowlisted final scopes, request-id boundary matching, and operator.admin stripping, but the denial cases most relevant to authorization bypass risk are not covered by the changed test.
    • Recommendation: Add focused guard tests where the replacement has a different deviceId, the original request includes operator.admin or another out-of-allowlist scope, the paired device lacks existing operator.pairing, paired/token state is malformed, multiple replacement candidates exist with opaque output, and the replacement contains an extra non-allowlisted scope. Assert no pending-entry removal, no paired/token scope grant, and the original nonzero approve result is returned.
    • Evidence: scripts/nemoclaw-start.sh filters replacement candidates and mutates pending.json and paired.json. test/nemoclaw-start.test.ts covers the positive replacement path, replacement-10 substring mismatch, and opaque single-candidate recovery, but not different-device, disallowed-scope, missing-pairing, malformed-state, ambiguous-candidate, or extra-scope negatives.
  • Two-file device-state repair can leave partial recovery on write failure (scripts/nemoclaw-start.sh:2402): The recovery removes the original and replacement pending requests, then saves pending.json before saving paired.json. If the pending write succeeds and the paired write fails, OpenClaw can be left without the pending approval evidence and without durable paired/token scopes.
    • Recommendation: Prefer an ordering or rollback strategy that only removes pending entries after paired state is durably persisted, or add failure-injection coverage proving the state remains recoverable when paired.json cannot be written after pending.json was updated.
    • Evidence: The Python recovery builds updated pending and paired objects, runs pending.pop(...), updates paired_entry, then calls save("pending.json", pending) followed by save("paired.json", paired). The per-file atomic replace does not make the two-file state transition transactional.
  • Nonzero-after-applied compatibility path remains under-tested (scripts/nemoclaw-start.sh:2354): The branch that treats a nonzero approve as success when OpenClaw already removed the pending request and persisted the requested paired scopes changes command success semantics in an authorization path. The changed unit test does not exercise this branch.
    • Recommendation: Add a proxy-env guard fixture where failed approve output returns nonzero, the original pending request is absent, and paired.json already contains the requested scopes. Verify the wrapper returns success with openclaw-approve-applied-after-nonzero, does not mutate unrelated pending or paired entries, and leaves token scopes consistent.
    • Evidence: scripts/nemoclaw-start.sh emits the compatibility marker openclaw-approve-applied-after-nonzero. The visible changed test only asserts openclaw-approve-recovered-replacement.

🌱 Nice ideas

  • Preserve lightweight unit coverage that approve does not poison later gateway commands (test/nemoclaw-start.test.ts:916): The prior OpenClaw CLI scope-upgrade approval deadlocks and forces openclaw agent into embedded fallback #4462 guard sequence asserted that after devices approve, the parent shell still had OPENCLAW_GATEWAY_URL, OPENCLAW_GATEWAY_PORT, and OPENCLAW_GATEWAY_TOKEN for a later openclaw agent command. The current focused replacement test only checks the approve subprocess env is unset. Existing E2E coverage checks the runtime path, but the local shell-wrapper contract is no longer pinned in this unit test.
    • Recommendation: Extend the guard fixture to run a subsequent openclaw agent --agent main after the failed or recovered approve and assert it receives OPENCLAW_GATEWAY_URL, OPENCLAW_GATEWAY_PORT, and OPENCLAW_GATEWAY_TOKEN.
    • Evidence: The changed test checks ARGS=devices approve request-1 --json URL=unset PORT=unset TOKEN=unset; it no longer includes the earlier shell-env and later-agent assertions.
Consider writing more tests for
  • **Runtime validation** — Add a proxy-env guard test where the replacement request has a different deviceId; verify pending.json and paired.json remain unchanged and the original nonzero approve result is returned.. The changed behavior lives in sandbox/bootstrap shell and directly affects OpenClaw authorization state. Existing repository E2E coverage maps to the linked OpenClaw CLI scope-upgrade approval deadlocks and forces openclaw agent into embedded fallback #4462 runtime path, and the changed unit test is useful for focused recovery logic, but security-critical denial and failure-injection branches still need deterministic coverage.
  • **Runtime validation** — Add a proxy-env guard test where the original requested scopes include operator.admin or another out-of-allowlist scope; verify no paired/token scope grant and no pending-entry removal.. The changed behavior lives in sandbox/bootstrap shell and directly affects OpenClaw authorization state. Existing repository E2E coverage maps to the linked OpenClaw CLI scope-upgrade approval deadlocks and forces openclaw agent into embedded fallback #4462 runtime path, and the changed unit test is useful for focused recovery logic, but security-critical denial and failure-injection branches still need deterministic coverage.
  • **Runtime validation** — Add a proxy-env guard test where the paired device lacks existing operator.pairing; verify recovery fails without mutating pending.json or paired.json.. The changed behavior lives in sandbox/bootstrap shell and directly affects OpenClaw authorization state. Existing repository E2E coverage maps to the linked OpenClaw CLI scope-upgrade approval deadlocks and forces openclaw agent into embedded fallback #4462 runtime path, and the changed unit test is useful for focused recovery logic, but security-critical denial and failure-injection branches still need deterministic coverage.
  • **Runtime validation** — Add a proxy-env guard test where paired.json or token state is malformed; verify the wrapper does not crash after partial mutation and returns the original nonzero result.. The changed behavior lives in sandbox/bootstrap shell and directly affects OpenClaw authorization state. Existing repository E2E coverage maps to the linked OpenClaw CLI scope-upgrade approval deadlocks and forces openclaw agent into embedded fallback #4462 runtime path, and the changed unit test is useful for focused recovery logic, but security-critical denial and failure-injection branches still need deterministic coverage.
  • **Runtime validation** — Add a proxy-env guard test with multiple same-device admin-shaped replacement candidates and opaque output; verify recovery is rejected without mutation.. The changed behavior lives in sandbox/bootstrap shell and directly affects OpenClaw authorization state. Existing repository E2E coverage maps to the linked OpenClaw CLI scope-upgrade approval deadlocks and forces openclaw agent into embedded fallback #4462 runtime path, and the changed unit test is useful for focused recovery logic, but security-critical denial and failure-injection branches still need deterministic coverage.
  • **Nonzero-after-applied compatibility path remains under-tested** — Add a proxy-env guard fixture where failed approve output returns nonzero, the original pending request is absent, and paired.json already contains the requested scopes. Verify the wrapper returns success with openclaw-approve-applied-after-nonzero, does not mutate unrelated pending or paired entries, and leaves token scopes consistent.
  • **Preserve lightweight unit coverage that approve does not poison later gateway commands** — Extend the guard fixture to run a subsequent openclaw agent --agent main after the failed or recovered approve and assert it receives OPENCLAW_GATEWAY_URL, OPENCLAW_GATEWAY_PORT, and OPENCLAW_GATEWAY_TOKEN.
  • **openclaw devices approve nonzero-after-applied compatibility recovery** — Missing deterministic coverage for this exact branch in test/nemoclaw-start.test.ts; existing E2E characterization partly covers failure-after-apply behavior but does not pin the local marker and non-mutation contract.. scripts/nemoclaw-start.sh treats nonzero approve as success when the original request is no longer pending and paired scopes include the requested scopes, then emits openclaw-approve-applied-after-nonzero.
Since last review details

Current findings:

  • Source-of-truth review needed: openclaw devices approve nonzero-after-applied compatibility recovery: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: scripts/nemoclaw-start.sh treats nonzero approve as success when the original request is no longer pending and paired scopes include the requested scopes, then emits openclaw-approve-applied-after-nonzero.
  • Source-of-truth review needed: openclaw devices approve replacement-request state-file recovery: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: scripts/nemoclaw-start.sh filters replacement candidates, strips operator.admin, rewrites pending.json and paired.json, and emits openclaw-approve-recovered-replacement.
  • Approval-state recovery still needs authorization-negative coverage (scripts/nemoclaw-start.sh:2366): The replacement recovery directly rewrites OpenClaw device authorization state after a failed approve. The implementation has useful guardrails, including same-device checks, existing operator.pairing, allowlisted final scopes, request-id boundary matching, and operator.admin stripping, but the denial cases most relevant to authorization bypass risk are not covered by the changed test.
    • Recommendation: Add focused guard tests where the replacement has a different deviceId, the original request includes operator.admin or another out-of-allowlist scope, the paired device lacks existing operator.pairing, paired/token state is malformed, multiple replacement candidates exist with opaque output, and the replacement contains an extra non-allowlisted scope. Assert no pending-entry removal, no paired/token scope grant, and the original nonzero approve result is returned.
    • Evidence: scripts/nemoclaw-start.sh filters replacement candidates and mutates pending.json and paired.json. test/nemoclaw-start.test.ts covers the positive replacement path, replacement-10 substring mismatch, and opaque single-candidate recovery, but not different-device, disallowed-scope, missing-pairing, malformed-state, ambiguous-candidate, or extra-scope negatives.
  • Two-file device-state repair can leave partial recovery on write failure (scripts/nemoclaw-start.sh:2402): The recovery removes the original and replacement pending requests, then saves pending.json before saving paired.json. If the pending write succeeds and the paired write fails, OpenClaw can be left without the pending approval evidence and without durable paired/token scopes.
    • Recommendation: Prefer an ordering or rollback strategy that only removes pending entries after paired state is durably persisted, or add failure-injection coverage proving the state remains recoverable when paired.json cannot be written after pending.json was updated.
    • Evidence: The Python recovery builds updated pending and paired objects, runs pending.pop(...), updates paired_entry, then calls save("pending.json", pending) followed by save("paired.json", paired). The per-file atomic replace does not make the two-file state transition transactional.
  • Nonzero-after-applied compatibility path remains under-tested (scripts/nemoclaw-start.sh:2354): The branch that treats a nonzero approve as success when OpenClaw already removed the pending request and persisted the requested paired scopes changes command success semantics in an authorization path. The changed unit test does not exercise this branch.
    • Recommendation: Add a proxy-env guard fixture where failed approve output returns nonzero, the original pending request is absent, and paired.json already contains the requested scopes. Verify the wrapper returns success with openclaw-approve-applied-after-nonzero, does not mutate unrelated pending or paired entries, and leaves token scopes consistent.
    • Evidence: scripts/nemoclaw-start.sh emits the compatibility marker openclaw-approve-applied-after-nonzero. The visible changed test only asserts openclaw-approve-recovered-replacement.
  • Preserve lightweight unit coverage that approve does not poison later gateway commands (test/nemoclaw-start.test.ts:916): The prior OpenClaw CLI scope-upgrade approval deadlocks and forces openclaw agent into embedded fallback #4462 guard sequence asserted that after devices approve, the parent shell still had OPENCLAW_GATEWAY_URL, OPENCLAW_GATEWAY_PORT, and OPENCLAW_GATEWAY_TOKEN for a later openclaw agent command. The current focused replacement test only checks the approve subprocess env is unset. Existing E2E coverage checks the runtime path, but the local shell-wrapper contract is no longer pinned in this unit test.
    • Recommendation: Extend the guard fixture to run a subsequent openclaw agent --agent main after the failed or recovered approve and assert it receives OPENCLAW_GATEWAY_URL, OPENCLAW_GATEWAY_PORT, and OPENCLAW_GATEWAY_TOKEN.
    • Evidence: The changed test checks ARGS=devices approve request-1 --json URL=unset PORT=unset TOKEN=unset; it no longer includes the earlier shell-env and later-agent assertions.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

@sandl99 sandl99 requested a review from cv June 11, 2026 08:58
@sandl99 sandl99 added the VRDC Issues and PRs submitted by NVIDIA VRDC test team. label Jun 11, 2026
@cv

cv commented Jun 11, 2026

Copy link
Copy Markdown
Collaborator

@sandl99 anything in #5210 (comment) worth addressing?

@github-actions

Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 27335657020
Target ref: fix/issue-4462-manifest-hook-approval
Requested jobs: issue-4462-scope-upgrade-approval-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job Result
issue-4462-scope-upgrade-approval-e2e ✅ success

@sandl99

sandl99 commented Jun 11, 2026

Copy link
Copy Markdown
Contributor Author

@sandl99 anything in #5210 (comment) worth addressing?

Checking

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/nemoclaw-start.sh`:
- Around line 2331-2333: The save() helper currently writes JSON in-place with
Path.write_text(), risking truncated files; change it to write to a
same-directory temporary file and atomically replace the target with
os.replace() to avoid data loss (e.g., create temp file next to (root / name),
write JSON, fsync if possible, close, then os.replace(temp_path, root/name));
update calls that persist pending.json and paired.json (save and any direct
writes around lines referencing pending/paired) to use this atomic-replace
pattern so incomplete writes cannot leave broken JSON.
- Around line 2346-2377: The code that builds paired_scopes currently picks
either "approvedScopes" or "scopes" and discards the other, which can drop valid
scopes and later overwrite persisted fields; change the initial paired_scopes
assignment (the call to scope_set in the paired_scopes variable) to merge both
fields (e.g., union of scope_set(paired_entry, "approvedScopes") and
scope_set(paired_entry, "scopes") or equivalent) so all existing approved and
current scopes are preserved, then use that merged paired_scopes for the
"operator.pairing" gate and subsequent approved computation and writes to
paired_entry and tokens.operator.scopes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5f35aecd-96c7-4b83-8164-2a5400527683

📥 Commits

Reviewing files that changed from the base of the PR and between 463d59f and 7f3edff.

📒 Files selected for processing (1)
  • scripts/nemoclaw-start.sh

Comment thread scripts/nemoclaw-start.sh
Comment thread scripts/nemoclaw-start.sh Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/nemoclaw-start.test.ts`:
- Around line 958-972: The test currently only asserts approvedScopes; extend it
to verify the full persisted recovery state by asserting that pending (from
pendingFile) no longer contains the replacement entry (i.e., pending.replacement
is removed or pending is empty), and that paired["device-1"].scopes and
paired["device-1"].tokens.operator.scopes match the expected full scope list
(the same list you assert for approvedScopes when shouldRecover is true,
otherwise unchanged), ensuring the replacement requestId check remains. Locate
these checks around the existing assertions using the variables paired, pending,
pairedFile, and pendingFile and add expectations to validate pending was cleared
and all persisted scope lists are aligned.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 52f37ca2-70db-40a6-ab4d-e10b06d84da3

📥 Commits

Reviewing files that changed from the base of the PR and between 9237256 and e8ae2b2.

📒 Files selected for processing (3)
  • ci/test-file-size-budget.json
  • scripts/nemoclaw-start.sh
  • test/nemoclaw-start.test.ts
✅ Files skipped from review due to trivial changes (1)
  • ci/test-file-size-budget.json
🚧 Files skipped from review as they are similar to previous changes (1)
  • scripts/nemoclaw-start.sh

Comment thread test/nemoclaw-start.test.ts
@github-actions

Copy link
Copy Markdown
Contributor

Selective E2E Results — ❌ Some jobs failed

Run: 27336846822
Target ref: fix/issue-4462-manifest-hook-approval
Requested jobs: issue-4462-scope-upgrade-approval-e2e,issue-4462-gateway-pinned-approval-characterization-e2e
Summary: 1 passed, 1 failed, 0 skipped

Job Result
issue-4462-gateway-pinned-approval-characterization-e2e ✅ success
issue-4462-scope-upgrade-approval-e2e ❌ failure

Failed jobs: issue-4462-scope-upgrade-approval-e2e. Check run artifacts for logs.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
test/nemoclaw-start.test.ts (1)

941-950: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep the no-requestId case gated by the real approve-failure signatures.

Line 941 emits only gateway connect failed: G when CASE_REPLACEMENT_ID is empty, but Line 950 still expects recovery. That lets this test pass even if the implementation starts recovering from arbitrary devices approve failures instead of only the intended scope upgrade pending approval / pairing required failures. If you want a missing-requestId recovery case, keep one of those phrases and just omit the (requestId: …) suffix; otherwise this branch should stay non-recovering.

Suggested test tightening
-if [ -n "${CASE_REPLACEMENT_ID:-}" ]; then echo "gateway connect failed: GatewayClientRequestError: scope upgrade pending approval (requestId: ${CASE_REPLACEMENT_ID})" >&2; else echo "gateway connect failed: G" >&2; fi
+if [ -n "${CASE_REPLACEMENT_ID:-}" ]; then
+  echo "gateway connect failed: GatewayClientRequestError: scope upgrade pending approval (requestId: ${CASE_REPLACEMENT_ID})" >&2
+else
+  echo "gateway connect failed: GatewayClientRequestError: scope upgrade pending approval" >&2
+fi
...
-        ["", true],
+        ["", true],
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/nemoclaw-start.test.ts` around lines 941 - 950, The shell stub that
emits error output when CASE_REPLACEMENT_ID is empty currently prints a short
"gateway connect failed: G" which lets the test (the loop over
replacementId/shouldRecover) incorrectly treat the empty-ID case as recoverable;
change that branch to emit one of the real approve-failure signatures without
the "(requestId: …)" suffix (e.g. "gateway connect failed:
GatewayClientRequestError: scope upgrade pending approval") so the empty
replacementId case still matches the intended non-requestId variant, or
alternatively change the test data so the "" case has shouldRecover=false;
update the echo in the CASE_REPLACEMENT_ID-empty branch accordingly to match the
approve-failure wording used by the test.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@test/nemoclaw-start.test.ts`:
- Around line 941-950: The shell stub that emits error output when
CASE_REPLACEMENT_ID is empty currently prints a short "gateway connect failed:
G" which lets the test (the loop over replacementId/shouldRecover) incorrectly
treat the empty-ID case as recoverable; change that branch to emit one of the
real approve-failure signatures without the "(requestId: …)" suffix (e.g.
"gateway connect failed: GatewayClientRequestError: scope upgrade pending
approval") so the empty replacementId case still matches the intended
non-requestId variant, or alternatively change the test data so the "" case has
shouldRecover=false; update the echo in the CASE_REPLACEMENT_ID-empty branch
accordingly to match the approve-failure wording used by the test.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 17d878ef-4875-42e5-92b8-073182906344

📥 Commits

Reviewing files that changed from the base of the PR and between 01b1407 and e9075a0.

📒 Files selected for processing (2)
  • scripts/nemoclaw-start.sh
  • test/nemoclaw-start.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • scripts/nemoclaw-start.sh

@github-actions

Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 27337599055
Target ref: fix/issue-4462-manifest-hook-approval
Requested jobs: issue-4462-scope-upgrade-approval-e2e,issue-4462-gateway-pinned-approval-characterization-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job Result
issue-4462-gateway-pinned-approval-characterization-e2e ✅ success
issue-4462-scope-upgrade-approval-e2e ✅ success

@sandl99

sandl99 commented Jun 11, 2026

Copy link
Copy Markdown
Contributor Author

@sandl99 anything in #5210 (comment) worth addressing?

Done. This PR to fix
issue-4462-gateway-pinned-approval-characterization-e2e and issue-4462-scope-upgrade-approval-e2e failure, some hidden bug which is exposed after #4949 @cv

@cv cv merged commit 3d5dab6 into main Jun 11, 2026
110 of 115 checks passed
@cv cv deleted the fix/issue-4462-manifest-hook-approval branch June 11, 2026 09:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

nightly-e2e Nightly E2E test failures VRDC Issues and PRs submitted by NVIDIA VRDC test team.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants