fix(openclaw): recover replaced scope upgrade approvals#5210
Conversation
Remove OpenClaw device identity and pending approval state after build-time plugin and doctor commands so runtime sandboxes start with fresh device auth state. This preserves the existing gateway token/proxy cleanup and adds a provisioning regression test.
Recover allowlisted OpenClaw scope-upgrade approvals when the approve CLI call replaces the original request with a broader gateway request. The wrapper now applies only the original captured same-device request, adds the read/write closure without operator.admin, removes the replacement pending request, and preserves gateway env for the caller.
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughCaptures full stdout/stderr from ChangesApproval command scope deadlock recovery
Sequence DiagramsequenceDiagram
participant CLI as openclaw_CLI
participant ApproveCmd as approve_wrapper
participant Recon as PYAPPROVEAFTER
participant Pending as pending.json
participant Paired as paired.json
CLI->>ApproveCmd: run openclaw devices approve <requestId>
ApproveCmd->>ApproveCmd: execute approval and capture combined stdout stderr
ApproveCmd->>Recon: invoke PYAPPROVEAFTER with NEMOCLAW_APPROVE_OUTPUT
Recon->>Pending: read pending.json to load before and candidates
Recon->>Paired: read paired.json for current paired_entry
Recon->>Recon: parse approve_output and derive requested and current scopes
Recon->>Recon: apply allowlist and subset validation
Recon->>Pending: select and validate replacement pending by deviceId scopes approve_output
Recon->>Paired: update paired_entry scopes and approvedScopes and tokens if present
Recon->>Pending: remove original and replacement pending entries
Recon->>Recon: save persist pending.json and paired.json
Recon->>CLI: return success or failure
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
E2E Advisor RecommendationRequired E2E: Dispatch hint: Full advisor summaryE2E Recommendation AdvisorBase: Required E2E
Optional E2E
New E2E recommendations
Dispatch hint
|
Vitest E2E Scenario RecommendationRequired Vitest E2E scenarios: Dispatch required Vitest E2E scenarios:
Full Vitest E2E advisor summaryVitest E2E Scenario AdvisorBase: Required Vitest E2E scenarios
Optional Vitest E2E scenarios
Relevant changed files
|
PR Review AdvisorFindings: 0 needs attention, 5 worth checking, 1 nice ideas Review findings🛠️ Needs attention
🔎 Worth checking
🌱 Nice ideas
Consider writing more tests for
Since last review detailsCurrent findings:
This is an automated advisory review. A human maintainer must make the final merge decision. |
|
@sandl99 anything in #5210 (comment) worth addressing? |
Selective E2E Results — ✅ All requested jobs passedRun: 27335657020
|
Checking |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scripts/nemoclaw-start.sh`:
- Around line 2331-2333: The save() helper currently writes JSON in-place with
Path.write_text(), risking truncated files; change it to write to a
same-directory temporary file and atomically replace the target with
os.replace() to avoid data loss (e.g., create temp file next to (root / name),
write JSON, fsync if possible, close, then os.replace(temp_path, root/name));
update calls that persist pending.json and paired.json (save and any direct
writes around lines referencing pending/paired) to use this atomic-replace
pattern so incomplete writes cannot leave broken JSON.
- Around line 2346-2377: The code that builds paired_scopes currently picks
either "approvedScopes" or "scopes" and discards the other, which can drop valid
scopes and later overwrite persisted fields; change the initial paired_scopes
assignment (the call to scope_set in the paired_scopes variable) to merge both
fields (e.g., union of scope_set(paired_entry, "approvedScopes") and
scope_set(paired_entry, "scopes") or equivalent) so all existing approved and
current scopes are preserved, then use that merged paired_scopes for the
"operator.pairing" gate and subsequent approved computation and writes to
paired_entry and tokens.operator.scopes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 5f35aecd-96c7-4b83-8164-2a5400527683
📒 Files selected for processing (1)
scripts/nemoclaw-start.sh
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/nemoclaw-start.test.ts`:
- Around line 958-972: The test currently only asserts approvedScopes; extend it
to verify the full persisted recovery state by asserting that pending (from
pendingFile) no longer contains the replacement entry (i.e., pending.replacement
is removed or pending is empty), and that paired["device-1"].scopes and
paired["device-1"].tokens.operator.scopes match the expected full scope list
(the same list you assert for approvedScopes when shouldRecover is true,
otherwise unchanged), ensuring the replacement requestId check remains. Locate
these checks around the existing assertions using the variables paired, pending,
pairedFile, and pendingFile and add expectations to validate pending was cleared
and all persisted scope lists are aligned.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 52f37ca2-70db-40a6-ab4d-e10b06d84da3
📒 Files selected for processing (3)
ci/test-file-size-budget.jsonscripts/nemoclaw-start.shtest/nemoclaw-start.test.ts
✅ Files skipped from review due to trivial changes (1)
- ci/test-file-size-budget.json
🚧 Files skipped from review as they are similar to previous changes (1)
- scripts/nemoclaw-start.sh
Selective E2E Results — ❌ Some jobs failedRun: 27336846822
|
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
test/nemoclaw-start.test.ts (1)
941-950:⚠️ Potential issue | 🟠 Major | ⚡ Quick winKeep the no-
requestIdcase gated by the real approve-failure signatures.Line 941 emits only
gateway connect failed: GwhenCASE_REPLACEMENT_IDis empty, but Line 950 still expects recovery. That lets this test pass even if the implementation starts recovering from arbitrarydevices approvefailures instead of only the intendedscope upgrade pending approval/pairing requiredfailures. If you want a missing-requestIdrecovery case, keep one of those phrases and just omit the(requestId: …)suffix; otherwise this branch should stay non-recovering.Suggested test tightening
-if [ -n "${CASE_REPLACEMENT_ID:-}" ]; then echo "gateway connect failed: GatewayClientRequestError: scope upgrade pending approval (requestId: ${CASE_REPLACEMENT_ID})" >&2; else echo "gateway connect failed: G" >&2; fi +if [ -n "${CASE_REPLACEMENT_ID:-}" ]; then + echo "gateway connect failed: GatewayClientRequestError: scope upgrade pending approval (requestId: ${CASE_REPLACEMENT_ID})" >&2 +else + echo "gateway connect failed: GatewayClientRequestError: scope upgrade pending approval" >&2 +fi ... - ["", true], + ["", true],🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/nemoclaw-start.test.ts` around lines 941 - 950, The shell stub that emits error output when CASE_REPLACEMENT_ID is empty currently prints a short "gateway connect failed: G" which lets the test (the loop over replacementId/shouldRecover) incorrectly treat the empty-ID case as recoverable; change that branch to emit one of the real approve-failure signatures without the "(requestId: …)" suffix (e.g. "gateway connect failed: GatewayClientRequestError: scope upgrade pending approval") so the empty replacementId case still matches the intended non-requestId variant, or alternatively change the test data so the "" case has shouldRecover=false; update the echo in the CASE_REPLACEMENT_ID-empty branch accordingly to match the approve-failure wording used by the test.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@test/nemoclaw-start.test.ts`:
- Around line 941-950: The shell stub that emits error output when
CASE_REPLACEMENT_ID is empty currently prints a short "gateway connect failed:
G" which lets the test (the loop over replacementId/shouldRecover) incorrectly
treat the empty-ID case as recoverable; change that branch to emit one of the
real approve-failure signatures without the "(requestId: …)" suffix (e.g.
"gateway connect failed: GatewayClientRequestError: scope upgrade pending
approval") so the empty replacementId case still matches the intended
non-requestId variant, or alternatively change the test data so the "" case has
shouldRecover=false; update the echo in the CASE_REPLACEMENT_ID-empty branch
accordingly to match the approve-failure wording used by the test.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 17d878ef-4875-42e5-92b8-073182906344
📒 Files selected for processing (2)
scripts/nemoclaw-start.shtest/nemoclaw-start.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- scripts/nemoclaw-start.sh
Selective E2E Results — ✅ All requested jobs passedRun: 27337599055
|
Done. This PR to fix |
Summary
This PR narrows the
openclaw devices approvecompatibility recovery so a failed scope-upgrade approval can recover when OpenClaw replaces the original pending request with a same-device request that includesoperator.admin. PR #4949 is not treated as the root cause: it changed the build/startup path enough to expose this latent approval replacement behavior, while the actual bug is the approval wrapper returning failure after OpenClaw leaves only the replacement request pending.Changes
scope upgrade pending approvalorpairing required.pending.jsonandpaired.jsonthrough same-directory atomic replaces.approvedScopesandscopes, approve only the expected operator scope closure, stripoperator.admin, and keep token scopes aligned.test/nemoclaw-start.test.tsand ratchet its legacy size budget down after removing comment-only scaffolding.Type of Change
Verification
npx prek run --all-filespassesnpm testpassesnpm run docsbuilds without warnings (doc changes only)Additional verification run locally:
bash -n scripts/nemoclaw-start.shnpx @biomejs/biome check test/nemoclaw-start.test.ts ci/test-file-size-budget.jsonnpm run test-size:checknpx vitest --run test/nemoclaw-start.test.tsNote:
npx prek run --files ci/test-file-size-budget.json scripts/nemoclaw-start.sh test/nemoclaw-start.test.tspassed formatting, shellcheck, repository checks, and test-size budget, but its CLI hook also launched broader live E2E tests and failed on unrelated live sandbox/onboard cleanup/timeouts. It is not marked as passing above.Signed-off-by: San Dang sdang@nvidia.com
Summary by CodeRabbit