Skip to content

fix(state): preserve reporter-owned model metadata on openclaw.json rebuild restore#5205

Open
yimoj wants to merge 2 commits into
NVIDIA:mainfrom
yimoj:fix/5202-openclaw-config-restore-metadata
Open

fix(state): preserve reporter-owned model metadata on openclaw.json rebuild restore#5205
yimoj wants to merge 2 commits into
NVIDIA:mainfrom
yimoj:fix/5202-openclaw-config-restore-metadata

Conversation

@yimoj

@yimoj yimoj commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Summary

After a NemoClaw upgrade + nemoclaw <name> rebuild, an OpenClaw sandbox's reporter-tuned openclaw.json model metadata reset to regenerated defaults. Fixing this required two layers: (1) the rebuild restore merge replaced matching models.providers entries wholesale, and (2) even after a correct merge, OpenClaw's own config-integrity protection reverted the restored config to its .last-good baseline. This PR fixes both, verified end-to-end through the real nemoclaw CLI on a live sandbox.

Related Issue

Fixes #5202

Changes

  • src/lib/state/openclaw-config-merge.ts: reconcile matching models.providers entries by ownership instead of replacing them wholesale. The fresh rebuild keeps runtime-owned routing/credentials (baseUrl, api, apiKey) and a model's identity (id, name); the backup restores user-owned non-secret tuning (reasoning, cost, contextWindow, maxTokens, compat, input). Fresh rebuild stays authoritative for the provider/model set and order; backup-only/id-less stale models are not resurrected; backup-only providers are inherited. No raw secrets restored.
  • src/lib/state/sandbox.ts: buildStateFileRestoreCommand now refreshes OpenClaw's .last-good recovery anchor for the openclaw.json merge restore — staged through a temp + atomic rename and failing closed before the live config is swapped. Without this, OpenClaw archives the restored config as openclaw.json.clobbered.<ts> and reverts to the freshly generated baseline.
  • Tests: reporter-shaped unit regression (openclaw-config-merge.test.ts), a command-string unit test for the anchor refresh + ordering + fail-closed (state-file-restore-command.test.ts), and an extended backup/restore integration test through the real backupSandboxState()/restoreSandboxState() path asserting both the merged metadata, mcp.servers retention, and the .last-good anchor (openclaw-config-snapshot.test.ts).

Notes on mcp.servers

The reporter also saw mcp.servers become {}. The fresh generated config emits no mcp section, so the backed-up mcp survives the generic object merge — confirmed live (fresh in-sandbox config showed mcp: null) and locked in by the tests. The merge layer needed no change for it; the .last-good revert was what dropped it end-to-end, which layer 2 fixes.

Type of Change

  • Code change (feature, bug fix, or refactor)

Verification

  • Reporter-workflow E2E on a live OpenClaw sandbox via the worktree CLI (no mocks):
    • node ./bin/nemoclaw.js onboard --non-interactive --yes --no-gpu --name i5202e2e — created a real OpenClaw sandbox (cloud inference, no GPU).
    • node ./bin/nemoclaw.js i5202e2e exec -- ... — tuned the in-sandbox /sandbox/.openclaw/openclaw.json to the reporter state (reasoning:true, custom cost, maxTokens:32768, compat, mcp.servers.filesystem).
    • node ./bin/nemoclaw.js i5202e2e rebuild --yes — the user-facing operation under test (backup → recreate → restore-merge → .last-good refresh).
    • Post-rebuild inspection of /sandbox/.openclaw/openclaw.json: kept reasoning:true, maxTokens:32768, cost {0.5,1.5,0.1,0.2}, compat, input ["text","image"], mcp.servers.filesystem; runtime fields fresh (baseUrl https://inference.local/v1, apiKey unused, fresh id/name); zero openclaw.json.clobbered.* files. Survived a subsequent container restart (OpenClaw integrity check re-ran). Reproduced cleanly across two consecutive fixed rebuilds.
  • Pre-fix, the same workflow showed the bug: restored config reverted to reasoning:false/maxTokens:4096/no compat/no mcp, with the correctly-merged config archived as openclaw.json.clobbered.* — proving the merge worked but OpenClaw reverted it.
  • npx vitest run on the affected unit + integration tests — passing.
  • tsc -p tsconfig.cli.json typecheck passes.
  • npx @biomejs/biome check clean on changed files.
  • codex review --uncommitted clean.
  • Tests added for new/changed behavior.
  • No secrets, API keys, or credentials committed; only the already-sanitized backup is merged.

Signed-off-by: Yimo Jiang yimoj@nvidia.com

Summary by CodeRabbit

  • Bug Fixes

    • Enhanced configuration restoration to better preserve user-configured model metadata, tuning settings, and MCP server configurations across system rebuilds and recovery operations.
    • Improved handling of credentials and secrets during backup and restore cycles to maintain security while preserving non-sensitive configuration.
  • Tests

    • Added comprehensive test coverage for configuration merge and state file restoration processes.

…ebuild restore

OpenClaw's openclaw.json is backed up and selectively merged on rebuild
(NVIDIA#5027), but matching `models.providers` entries were replaced wholesale by
the freshly generated provider block. After a version upgrade + rebuild the
user's tuned model metadata reset to regenerated defaults: `reasoning` flipped
to false, `cost` zeroed, `maxTokens` fell back to 4096, and `compat`
disappeared.

Reconcile matching provider entries by ownership instead: the fresh rebuild
keeps runtime-owned routing/credential fields (baseUrl, api, apiKey) and a
model's routing identity (id/name), while backed-up non-secret model tuning
(reasoning, cost, contextWindow, maxTokens, compat, input, …) is restored.
Provider/model sets and order stay authoritative from the fresh rebuild;
backup-only and id-less stale models are not resurrected. No raw secrets are
restored — only the already-sanitized backup is merged.

Adds a reporter-shaped regression at the merge-helper level and through the
real backupSandboxState/restoreSandboxState path (the functions
`nemoclaw <name> rebuild` invokes), covering provider model metadata and
top-level `mcp.servers`. The regression confirms `mcp.servers` is already
preserved by the generic object merge (the fresh config emits no mcp section),
so no restore/sanitization change was needed for it.

Fixes NVIDIA#5202

Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: a76623c9-b203-4036-8969-71517e80f701

📥 Commits

Reviewing files that changed from the base of the PR and between 80cdc3f and df06f5a.

📒 Files selected for processing (3)
  • src/lib/state/sandbox.ts
  • test/openclaw-config-snapshot.test.ts
  • test/state-file-restore-command.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/openclaw-config-snapshot.test.ts

📝 Walkthrough

Walkthrough

Refines OpenClaw config restore: updates ownership contract, adds provider/model reconciliation that enforces fresh runtime-owned fields while restoring backup-held model tuning and durable MCP server settings, exports/adjusts the state restore command for safe .last-good staging, and adds unit/integration tests and sandbox test helpers.

Changes

OpenClaw Config Restore Ownership and Merge

Layer / File(s) Summary
Ownership contract and runtime field definitions
src/lib/state/openclaw-config-merge.ts
OPENCLAW_CONFIG_RESTORE_OWNERSHIP updated with revised runtimeSections/managedChannels/currentGeneratedEntryMaps/backupDurableSections and new providerRuntimeOwnedFields/modelRuntimeOwnedFields.
Provider and model reconciliation merge helpers
src/lib/state/openclaw-config-merge.ts
New helpers reconcile provider and model entries by id, enforce runtime-owned fields from the fresh config, and merge provider models arrays; mergeOpenClawModels now uses provider reconciliation.
Config merge unit tests
src/lib/state/openclaw-config-merge.test.ts
Vitest case verifying merged result preserves fresh provider routing/credentials, keeps fresh model identity while restoring tuning/metadata from backup, and retains durable MCP server settings.
State restore command adjustments
src/lib/state/sandbox.ts, test/state-file-restore-command.test.ts
Exported buildStateFileRestoreCommand and updated its OpenClaw path to stage and atomically install dst.last-good from a temp file; added tests validating staging, symlink refusal, atomic install ordering, and hash refresh behavior.
Integration tests & sandbox helper
test/openclaw-config-snapshot.test.ts
Added writeFakeSandboxBins to emit fake openshell/ssh scripts; refactored existing test to use it and added #5202 integration test validating backup sanitization and restore precedence across rebuild.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#5174: Related ownership-aware merge updates for mergeOpenClawRestoredConfig.
  • NVIDIA/NemoClaw#5177: Changes to mergeOpenClawRestoredConfig addressing provider/model ownership semantics.

Suggested labels

integration: openclaw, area: sandbox, bug-fix, v0.0.63

Suggested reviewers

  • cv
  • prekshivyas

Poem

🐰 I hopped through backups, with whiskers all keen,
Kept the fresh keys bright and the tuning unseen,
MCP paths remembered, secrets stayed clean,
Rebuilds stay spry while the old wisdom's seen,
A gentle restore — snug, safe, and serene.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 35.71% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly and accurately describes the main change: preserving reporter-owned model metadata during openclaw.json rebuild/restore.
Linked Issues check ✅ Passed The PR addresses the core coding requirement from #5202: implement ownership-based merge logic to restore reporter-tuned model metadata (reasoning, cost, maxTokens, compat, etc.) while preserving fresh runtime fields.
Out of Scope Changes check ✅ Passed Changes are focused on model metadata restoration and .last-good anchor refresh required by #5202; MCP server preservation and backup/restore helpers support the core fix without introducing unrelated functionality.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@yimoj yimoj added v0.0.64 Release target and removed v0.0.64 Release target labels Jun 11, 2026
…ld restore

Real-CLI E2E (onboard -> tune in-sandbox config -> `nemoclaw <name> rebuild`)
revealed a second layer of NVIDIA#5202 that the merge fix alone did not cover: even
when the restore writes the correctly merged config, OpenClaw's own
config-integrity protection reverts it.

OpenClaw guards openclaw.json with a `.last-good` recovery anchor. On its
integrity check it archives any live config that differs from `.last-good` as
`openclaw.json.clobbered.<ts>` and reverts to `.last-good`. The rebuild restore
wrote openclaw.json (and `.config-hash`) but never refreshed `.last-good`, so
OpenClaw reverted the restored user config to the freshly generated baseline
captured at first boot. The archived `.clobbered` file held the correct merged
config, confirming the merge worked but was undone downstream.

`buildStateFileRestoreCommand` now refreshes the `.last-good` anchor for the
openclaw.json merge restore, staged through a temp + atomic rename and failing
closed BEFORE the live config is swapped, so OpenClaw's integrity watcher never
observes a config that disagrees with its recovery anchor and a partial/failed
anchor write can never leave a stale revert target.

Verified through the real worktree CLI on a live OpenClaw sandbox: after
`node ./bin/nemoclaw.js <name> rebuild`, /sandbox/.openclaw/openclaw.json keeps
the reporter's reasoning/cost/maxTokens/compat/input and mcp.servers with no
`.clobbered` revert, and survives a subsequent container restart. Adds a
command-string unit test (anchor refresh + ordering + fail-closed) and extends
the backup/restore integration test to assert the `.last-good` anchor.

Fixes NVIDIA#5202

Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
@yimoj yimoj added the v0.0.64 Release target label Jun 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

v0.0.64 Release target

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Openclaw config partially restored after nemoclaw <name> rebuild after upgrading nemoclaw version

1 participant