Skip to content

refactor(messaging): persist session registry state as plans#4945

Open
sandl99 wants to merge 46 commits into
mainfrom
feat/messaging-plan-session-registry
Open

refactor(messaging): persist session registry state as plans#4945
sandl99 wants to merge 46 commits into
mainfrom
feat/messaging-plan-session-registry

Conversation

@sandl99

@sandl99 sandl99 commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Summary

Migrate onboard session and sandbox registry messaging persistence from legacy flat fields to the manifest-backed messaging plan. This removes legacy conflict backfill paths and keeps channel config, active/disabled state, and rebuild/session resume behavior plan-driven.

Related Issue

Fixes #4947
Parent: #3896

Changes

  • Store session messaging state only as messagingPlan and registry messaging state only as messaging.plan.
  • Derive configured, active, disabled, and config values from plans across onboard, rebuild, inventory, doctor, status, and channel lifecycle flows.
  • Remove legacy conflict-detection backfill code and tests for messagingChannels, disabledChannels, and messagingChannelConfig.
  • Update unit, integration, and e2e fixtures to assert plan-backed registry/session state.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

Passed: npm run typecheck:cli, npm run test-size:check, npm run source-shape:check, npm run build:cli, shell syntax checks for edited e2e scripts, and focused Vitest coverage for the messaging/session/registry paths (23 files, 426 tests).

Attempted npm test, npx prek run --all-files, commit hooks, and push hooks. They fail in unrelated CLI tests: auto-pair timing in test/nemoclaw-start.test.ts, debug wrapper timeouts in test/secret-redaction.test.ts, policy disclosure timeout in test/policies.test.ts, and intermittent doctor gateway timeout tests in test/cli/doctor-gateway-token.test.ts.

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: San Dang sdang@nvidia.com

Summary by CodeRabbit

  • Refactor

    • Messaging moved from legacy channel arrays to a single schema-versioned messaging plan; registry and session persist structured messaging state and selection flows now use the plan.
    • Channel add/remove and rebuild flows updated to use plan-based state and simplified conflict handling.
  • Bug Fixes

    • Status/overlap reporting and conflict checks now rely on stored plans and run best-effort without failing the command.
  • Tests

    • Test suites, E2E scripts, and fixtures updated to the new messaging plan format.

sandl99 and others added 30 commits June 7, 2026 11:49
Stores the compiled SandboxMessagingPlan in the onboard session so that
resume runs can restore the plan to env without re-running enrollment
hooks (token paste, QR pairing). Fixes the gap where the registry entry
lost its `messaging` field on rebuild because the plan was only held in
a process env var that didn't survive across process restarts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…el-setup

Exports readMessagingPlanFromEnv and writePlanToEnv from
messaging-channel-setup.ts (which already owns MessagingSetupApplier)
to keep src/lib/onboard.ts from growing. Collapses the one-name
messaging-channel-setup require into a single line to free headroom.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…an persistence

- Extract parseSandboxMessagingPlan to messaging-plan-session.ts to
  keep onboard-session.ts growth under the monolith threshold
- Guard plan restoration with sandbox-name + agent identity check so
  stale plans from renamed sandboxes or agent switches are not reused
- Add three behavior assertions in sandbox.test.ts: fresh setup
  persists env plan to session; matching plan is restored to env on
  non-interactive resume; mismatched sandbox name skips restoration

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tive resume

rebuild.ts stages an authoritative plan from the registry (which reflects
post-stop/-start channel mutations) before calling onboard --resume.
Previously, the session plan restoration was unconditionally overwriting
that staged plan, causing stopped channels to reappear as active after
rebuild.

Now the handler checks the env first: if a plan is already staged
(rebuild path), it is used as-is. The session plan is only restored when
the env is empty, covering the plain process-restart resume case this PR
was originally targeting.

Also adds a test asserting the rebuild-path preference.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- On non-interactive resume, restore plan from registry (always current
  after stop/start/add/remove) instead of stale session snapshot;
  env-first priority preserved so rebuild.ts staging still wins

- In rebuild.ts, persist the staged plan to the session alongside
  messagingChannels/disabledChannels/messagingChannelConfig so the
  session is fully consistent during the rebuild window

- Add getChannelsFromPlan, getDisabledChannelsFromPlan, and
  getMessagingChannelConfigFromPlan helpers in messaging-plan-session.ts
  so the next PR can replace the three individual session fields with
  plan-derived reads

- Move MessagingHostStateApplier re-export to messaging-channel-setup
  and getRegistrySandboxMessagingPlan helper to keep onboard.ts net-neutral

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…itecture (phase 4a)

- Add src/lib/messaging/applier/conflict-detection.ts with all core logic:
  ConflictRegistryEntry minimal interface, createMessagingConflictProbe factory
  (consolidates the duplicated tri-state probe pattern from onboard.ts,
  policy-channel.ts, and status-command-deps.ts), plan-to-request helpers
  (getActiveChannelIdsFromPlan, getCredentialHashesFromPlan,
  planToConflictChannelRequests), plan-preferred/legacy-fallback entry
  resolution (resolveActiveChannelsFromEntry, resolveCredentialHashesFromEntry),
  and pure detection functions (findConflictsInEntries, detectAllOverlapsInEntries,
  backfillLegacyEntryChannels)

- Rewrite messaging-conflict.ts as a thin public adapter: no detection logic
  inside, all three public exports delegate to applier functions, re-exports
  createMessagingConflictProbe so callers don't need to import from applier
  directly; removes getChannelDef/getChannelTokenKeys import (stored hashes
  are self-describing — no channel-constant lookup needed)

- Update onboard.ts conflict-check block: remove makeConflictProbe() and the
  MESSAGING_CHANNELS/hashCredential block; use createMessagingConflictProbe
  with injected openshell deps and findChannelConflictsFromPlan from plan

- Add plan-path tests: planToConflictChannelRequests grouping and exclusions,
  findChannelConflictsFromPlan matching/disabled/no-hash cases, plan-only
  registry entries in findChannelConflicts and findAllOverlaps

Closes #4392

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
- Remove redundant createMessagingConflictProbe named import from
  messaging-conflict.ts (symbol is already re-exported on the line below;
  flagged by CodeQL unused-import rule)
- Fix checkGatewayLiveness in onboard.ts to use runOpenshell exit status
  instead of stdout length; a healthy gateway with no sandboxes returns
  empty output with status 0, which the previous check incorrectly treated
  as "down"

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
- Fix credentialHash gate: gate on credentialAvailable (set by the compiler
  for all real onboarding flows) instead of credentialHash (not yet populated),
  which previously caused the conflict check to be silently skipped for all
  manifest-plan onboarding; planToConflictChannelRequests now includes channels
  with credentialAvailable=true and no hash, falling through to unknown-token
  conservative detection
- Fix cross-channel hash contamination: getCredentialHashesFromPlan now accepts
  an optional channelId filter; conflictReasonForRequest and
  conflictReasonForPair use the new resolveChannelHashesFromEntry helper so
  Slack or Discord hashes cannot produce false positive unknown-token results
  when checking a Telegram request
- Fix legacy hash fallback: when a plan-backed entry exists but carries no
  credentialHash for the requested channel (migration in flight), fall back to
  providerCredentialHashes so matching-token detection is not silently lost
- Align active-channel semantics with plan-filter.ts: getActiveChannelIdsFromPlan
  now also checks channel.active && !channel.disabled, not only disabledChannels
- Split plan-driven tests into src/lib/messaging/applier/conflict-detection.test.ts
  to offset the messaging-conflict.test.ts monolith growth; adds WhatsApp no-op
  test, cross-channel isolation, credentialAvailable-gate, and legacy-fallback cases

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…e and populate credentialHash in plan bindings

Remove hardcoded PROVIDER_SUFFIXES/KNOWN_CHANNEL_IDS from conflict-detection.ts; backfillLegacyEntryChannels now receives the suffix map as an injected parameter. The adapter layer (messaging-conflict.ts) derives it from BUILT_IN_CHANNEL_MANIFESTS credentials so adding a channel to the manifest registry propagates automatically.

Also fix credential-binding-engine to hash the env var value into credentialHash when credentialAvailable is true, enabling matching-token conflict detection on the plan-driven path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove legacy providerCredentialHashes fallback from resolveChannelHashesFromEntry — entries without a plan now return {} and fall through to unknown-token (conservative). Fix PROVIDER_SUFFIXES to collect all credential suffixes per manifest rather than only credentials[0], so multi-credential channels like Slack probe all their provider names during backfill (channel active if any present). Split 473-line conflict-detection.test.ts monolith into conflict-detection-plan.test.ts (plan helpers) and conflict-detection-entry.test.ts (detection functions) to stay under the monolith growth threshold.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…omments

Reject the env-staged plan when its sandboxName does not match the current
sandbox being created, preventing a stale plan from a prior run from gating
or bypassing conflict detection for a different sandbox.

Update stale comments that described the credential-binding engine as not yet
emitting credentialHash — it does now. Rephrase to reflect the remaining
no-hash case (registry-only resume without a compiler re-run).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… test fixtures

Restore the providerCredentialHashes fallback in resolveChannelHashesFromEntry
for entries whose plan carries no channel hashes (e.g. registry-only resume
without a compiler re-run), preserving safety coverage during migration.

Fix tgChannel(false, true) → tgChannel(true, true) in disabled-channel test
cases so assertions isolate plan.disabledChannels as the only gate rather than
passing via the inactive-channel path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…dings with active channels

Address PR Review Advisor findings on phase-4a conflict detection:

- resolveChannelHashesFromEntry now filters legacy providerCredentialHashes to
  only the providerEnvKey values declared for the requested channel in
  BUILT_IN_CHANNEL_MANIFESTS, preventing unrelated Slack hashes from producing
  false Telegram conflicts and vice versa. Unknown channels retain the full map.

- planToConflictChannelRequests now intersects credentialBindings with
  getActiveChannelIdsFromPlan so stale bindings for inactive or absent channels
  cannot generate false conflict requests.

- Fix misleading JSDoc on findChannelConflictsFromPlan: available bindings with
  no credentialHash are included (not excluded) and fall through to conservative
  unknown-token detection.

- Add regression tests: cross-channel legacy hash scoping, active-channel binding
  intersection (absent channel, active=false), and slackChannel fixture for
  plan test file.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…rison and add no-plan fallback

Address second round of PR Review Advisor findings:

- conflictReasonForRequest and conflictReasonForPair now use manifest-declared
  CHANNEL_CREDENTIAL_ENV_KEYS as the primary comparison set. For multi-credential
  channels like Slack (SLACK_BOT_TOKEN + SLACK_APP_TOKEN), a differing bot token
  with a missing app token previously returned null; it now returns unknown-token
  because the missing manifest key marks the comparison as incomplete.

- onboard.ts conflict gate: when hasPlanCredentials is false but token-backed
  channels were selected (enabledChannels contains non-QR channels), fall back to
  findChannelConflicts with just the channel names to preserve conservative
  unknown-token warnings even when the compiled plan has no credential data
  (e.g. credential-store-backed availability or stale env plan).

- policy-channel.ts: add explicit comment scoping the add-channel path as a
  known limitation — no compiled plan is available at channels-add time, so
  findChannelConflicts is used with caller-built hashes; plan-driven migration
  is follow-up work.

- Tests: four Slack partial-hash suppression cases covering the false-negative
  regression (conflictReasonForRequest with missing app token, both differ,
  conflictReasonForPair with missing app token from both sides, both differ).

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…nifest-driven hashing

Address third round of PR Review Advisor findings:

- messaging-conflict.ts: add findChannelConflictsForOnboarding() that unions
  the plan-based hash-precise check with a channel-name-only fallback for any
  selectedTokenChannels not covered by the plan's credentialAvailable bindings.
  This ensures a partial or stale same-sandbox plan cannot silently skip channels
  it omits. onboard.ts now delegates to this single function rather than
  inlining the union logic.

- onboard.ts conflict gate: replace the multi-branch plan/fallback block with a
  single call to findChannelConflictsForOnboarding(sandboxName, currentPlan,
  selectedTokenChannels, registry), keeping all conflict logic in messaging-conflict.ts.

- policy-channel.ts checkChannelAddConflict: replace raw acquired-map iteration
  with manifest-driven hash building via createBuiltInChannelManifestRegistry().
  Iterates channelManifest.credentials[].providerEnvKey to build credentialHashes,
  mirroring planToConflictChannelRequests without requiring a compiled plan.
  QR-only channels (WhatsApp: credentials=[]) exit early with no conflict possible.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…tection

Legacy entries no longer carry hash metadata into conflict comparisons.
resolveChannelHashesFromEntry returns {} for entries without a plan, which
falls through to conservative unknown-token detection in all callers.

- ConflictRegistryEntry: drop providerCredentialHashes field
- resolveChannelHashesFromEntry: plan path only; legacy entries → {}
- CHANNEL_CREDENTIAL_ENV_KEYS comment updated (now solely for manifest-key
  comparison set, not legacy scoping)
- Section comment: "plan-preferred, legacy-fallback" → "Entry resolution"

- Remove findChannelConflictsForOnboarding and fallback logic; onboard.ts
  calls findChannelConflictsFromPlan directly (rationale in issue #4392)

Test updates:
- conflict-detection-entry.test.ts: remove legacy cross-channel hash scoping
  describe block; migrate Slack pair tests from providerCredentialHashes
  fixtures to plan-backed entries
- messaging-conflict.test.ts: remove hash-comparison tests (covered by
  plan-entry tests); update remaining fixtures to drop providerCredentialHashes;
  update file header comment

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…aths

Completes the providerCredentialHashes deprecation started in the phase
4a conflict-detection migration. The field was written by onboard.ts,
policy-channel.ts, and rebuild.ts, and read by messaging-credentials.ts
and workflow-planner.ts for rotation detection and credential-availability
inference.

All write paths are removed. Reads are replaced with plan-backed
equivalents: detectMessagingCredentialRotation now reads
credentialBindings.credentialHash from the compiled SandboxMessagingPlan,
and credentialAvailabilityFromSandboxEntry reads
plan.credentialBindings[].credentialAvailable. The field is removed from
SandboxEntry and MessagingWorkflowPlannerSandboxEntry entirely; existing
sandboxes.json files with stale providerCredentialHashes entries are
unaffected at runtime (unknown JSON fields are ignored).

Tests updated to use plan-backed fixtures for matching-token conflict
scenarios; credential-rotation tests use messaging.plan.credentialBindings
fixtures.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…-messaging

Both files shrank after removing providerCredentialHashes fixtures and
assertions in the preceding refactor commit.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Take nemoclaw-start budget reduction from main (5310→5300) and keep
onboard-messaging reduction from this branch (2122→2110).

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
sandl99 and others added 2 commits June 8, 2026 17:03
…ort, function or class'

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
@sandl99 sandl99 added refactor PR restructures code without intended behavior change area: messaging Messaging channels, bridges, manifests, or channel lifecycle labels Jun 8, 2026
Base automatically changed from feat/phase-4a-messaging-conflict-plan-driven to main June 8, 2026 15:04
…ession-registry

# Conflicts:
#	ci/test-file-size-budget.json
#	src/lib/actions/sandbox/policy-channel-conflict.test.ts
#	src/lib/actions/sandbox/policy-channel.ts
#	src/lib/inventory/index.ts
#	src/lib/messaging/applier/conflict-detection-legacy.test.ts
#	src/lib/messaging/compiler/workflow-planner.test.ts
#	src/lib/messaging/compiler/workflow-planner.ts
#	src/lib/onboard.ts
#	src/lib/state/registry.ts
#	src/lib/status-command-deps.ts
#	test/channels-add-preset.test.ts
#	test/channels-remove-full-teardown.test.ts
#	test/cli/status-health.test.ts
#	test/onboard-messaging.test.ts
#	test/rebuild-credential-preflight.test.ts
@copy-pr-bot

copy-pr-bot Bot commented Jun 8, 2026

Copy link
Copy Markdown

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@NVIDIA NVIDIA deleted a comment from github-actions Bot Jun 9, 2026
Signed-off-by: San Dang <san1201.bkhn@gmail.com>
@NVIDIA NVIDIA deleted a comment from github-actions Bot Jun 9, 2026
sandl99 and others added 3 commits June 9, 2026 10:25
…ession-registry

Signed-off-by: San Dang <san1201.bkhn@gmail.com>

# Conflicts:
#	ci/test-file-size-budget.json
#	test/cli/status-health.test.ts
Signed-off-by: San Dang <san1201.bkhn@gmail.com>
@sandl99 sandl99 marked this pull request as ready for review June 9, 2026 03:52

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/lib/onboard/machine/handlers/sandbox.ts (1)

265-302: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve existing session plan when no fresher plan source exists.

At Line 265, messagingPlan is initialized to null, then persisted at Lines 299-301 and Line 345. If both env and registry return no plan, this overwrites an existing session.messagingPlan with null during resume, dropping persisted channel/disabled state for later resume/rebuild paths.

Suggested fix
-    let messagingPlan: SandboxMessagingPlan | null = null;
+    let messagingPlan: SandboxMessagingPlan | null = session?.messagingPlan ?? null;
...
     session = deps.updateSession((current) => {
-      current.messagingPlan = messagingPlan;
+      current.messagingPlan = messagingPlan ?? current.messagingPlan ?? null;
       return current;
     });

Also applies to: 337-346

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard/machine/handlers/sandbox.ts` around lines 265 - 302, The code
currently initializes messagingPlan to null and then writes it into session via
deps.updateSession, which can erase an existing session.messagingPlan when
neither env nor registry provide a fresher plan; change the logic so
messagingPlan is initialized from the current session (e.g., messagingPlan =
session?.messagingPlan) and only overwrite it when
deps.readMessagingPlanFromEnv() or
deps.getRegistrySandboxMessagingPlan(sandboxName) return a non-null plan; ensure
the final deps.updateSession((current) => { current.messagingPlan =
messagingPlan ?? current.messagingPlan; return current; }) preserves the prior
session.messagingPlan when no new plan is found.
src/lib/messaging/applier/host-state-applier.test.ts (1)

53-75: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Strengthen this test to actually prove legacy-field cleanup on existing entries.

Line 54 seeds an entry without legacy keys, so Lines 74-75 pass even if cleanup regresses. Seed legacy fields first, then assert all three are removed (messagingChannels, messagingChannelConfig, disabledChannels).

Suggested test adjustment
   it("stores only the new messaging state on an existing sandbox entry", () => {
     registryMock.__setSandbox("demo", {
       name: "demo",
+      messagingChannels: ["telegram"],
+      messagingChannelConfig: { telegram: { requireMention: true } },
+      disabledChannels: ["telegram"],
     });
     const plan = makePlan(["telegram"]);
@@
     expect(registryMock.__getSandbox("demo")).not.toHaveProperty("messagingChannels");
+    expect(registryMock.__getSandbox("demo")).not.toHaveProperty("messagingChannelConfig");
     expect(registryMock.__getSandbox("demo")).not.toHaveProperty("disabledChannels");
   });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/messaging/applier/host-state-applier.test.ts` around lines 53 - 75,
The test currently seeds a sandbox without legacy keys so it can't prove
cleanup; update the setup for registryMock.__setSandbox("demo", ...) to include
legacy fields messagingChannels, messagingChannelConfig, and disabledChannels
with sample values, then call
MessagingHostStateApplier.applyPlanToRegistry("demo", plan) and assert it
returns true, assert registry.updateSandbox was called with the new messaging
object (schemaVersion and plan), and assert registryMock.__getSandbox("demo") no
longer has properties messagingChannels, messagingChannelConfig, or
disabledChannels while still matching the new messaging structure.
src/lib/onboard.ts (1)

368-381: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use configured plan channels here, not only active ones.

getRecordedMessagingChannelsForResumeFromState() is the resume seed for channel selection, so passing getActiveChannelsFromPlan(...) drops plan entries that are configured but currently disabled. On rebuild/resume that can turn a stopped channel into “not selected”, which loses the state channels start needs to reattach cached credentials.

Suggested fix
-const { getActiveChannelsFromPlan } = messagingPlanSession;
+const { getChannelsFromPlan } = messagingPlanSession;
...
-    sessionMessagingChannels: getActiveChannelsFromPlan(session?.messagingPlan),
+    sessionMessagingChannels: getChannelsFromPlan(session?.messagingPlan),

As per coding guidelines, src/lib/onboard.ts changes affect channel disable/enable lifecycle across rebuild with cached credentials, and the PR objective here is to keep configured, active, and disabled plan state distinct.

Also applies to: 5198-5206

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard.ts` around lines 368 - 381, The code is using
getActiveChannelsFromPlan(...) as the seed for
getRecordedMessagingChannelsForResumeFromState(), which drops plan entries that
are configured but currently disabled; change the call so the resume seed uses
the plan's configured channels instead of only active ones — i.e., pass the
configured channel list (obtainable from the plan/config: use
messagingPlanSession or
collectMessagingBuildConfig/getStoredMessagingChannelConfig outputs) into
getRecordedMessagingChannelsForResumeFromState rather than
getActiveChannelsFromPlan, so disabled-but-configured channels are preserved for
resume and credential reattachment.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/policy/index.ts`:
- Around line 779-782: The console.log that reports "Widening sandbox egress —
adding: ..." is currently executed right after const endpoints =
getPresetEndpoints(presetContent) and can fire even when the subsequent
precondition checks abort (lines around the apply-preconditions block). Move the
log statement so it runs only after the precondition pass/abort logic (i.e.,
after the block that currently decides to abort on failure), ensuring you still
reference the same endpoints variable returned by getPresetEndpoints; keep the
log position immediately before the code that actually applies the egress
changes so operators only see the "adding" message when changes will be applied.

In `@test/e2e/docs/parity-inventory.generated.json`:
- Line 7989: The generated JSON fixture parity-inventory.generated.json is
missing SPDX metadata; update the generation routine that produces this file so
the output includes SPDX fields (e.g., a top-level "$comment" object or entries)
containing SPDX-FileCopyrightText and SPDX-License-Identifier like other
fixtures; modify the generator (the code that writes
parity-inventory.generated.json) to prepend or embed the SPDX "$comment"
metadata before/alongside the root object (preserving keys such as "text") so
the produced JSON matches the SPDX format used by other test fixtures.

---

Outside diff comments:
In `@src/lib/messaging/applier/host-state-applier.test.ts`:
- Around line 53-75: The test currently seeds a sandbox without legacy keys so
it can't prove cleanup; update the setup for registryMock.__setSandbox("demo",
...) to include legacy fields messagingChannels, messagingChannelConfig, and
disabledChannels with sample values, then call
MessagingHostStateApplier.applyPlanToRegistry("demo", plan) and assert it
returns true, assert registry.updateSandbox was called with the new messaging
object (schemaVersion and plan), and assert registryMock.__getSandbox("demo") no
longer has properties messagingChannels, messagingChannelConfig, or
disabledChannels while still matching the new messaging structure.

In `@src/lib/onboard.ts`:
- Around line 368-381: The code is using getActiveChannelsFromPlan(...) as the
seed for getRecordedMessagingChannelsForResumeFromState(), which drops plan
entries that are configured but currently disabled; change the call so the
resume seed uses the plan's configured channels instead of only active ones —
i.e., pass the configured channel list (obtainable from the plan/config: use
messagingPlanSession or
collectMessagingBuildConfig/getStoredMessagingChannelConfig outputs) into
getRecordedMessagingChannelsForResumeFromState rather than
getActiveChannelsFromPlan, so disabled-but-configured channels are preserved for
resume and credential reattachment.

In `@src/lib/onboard/machine/handlers/sandbox.ts`:
- Around line 265-302: The code currently initializes messagingPlan to null and
then writes it into session via deps.updateSession, which can erase an existing
session.messagingPlan when neither env nor registry provide a fresher plan;
change the logic so messagingPlan is initialized from the current session (e.g.,
messagingPlan = session?.messagingPlan) and only overwrite it when
deps.readMessagingPlanFromEnv() or
deps.getRegistrySandboxMessagingPlan(sandboxName) return a non-null plan; ensure
the final deps.updateSession((current) => { current.messagingPlan =
messagingPlan ?? current.messagingPlan; return current; }) preserves the prior
session.messagingPlan when no new plan is found.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 68f84691-37d0-4121-9c42-9e97991dc3f2

📥 Commits

Reviewing files that changed from the base of the PR and between d9db199 and a30c537.

📒 Files selected for processing (65)
  • ci/test-file-size-budget.json
  • src/lib/actions/inference-route-api.test.ts
  • src/lib/actions/inference-set.test.ts
  • src/lib/actions/sandbox/channel-status.test.ts
  • src/lib/actions/sandbox/channel-status.ts
  • src/lib/actions/sandbox/doctor.ts
  • src/lib/actions/sandbox/policy-channel-conflict.test.ts
  • src/lib/actions/sandbox/policy-channel.ts
  • src/lib/actions/sandbox/rebuild.ts
  • src/lib/inventory/index.test.ts
  • src/lib/inventory/index.ts
  • src/lib/messaging/applier/conflict-detection-legacy.test.ts
  • src/lib/messaging/applier/conflict-detection/backfill.ts
  • src/lib/messaging/applier/conflict-detection/entries.ts
  • src/lib/messaging/applier/conflict-detection/index.ts
  • src/lib/messaging/applier/conflict-detection/probe.ts
  • src/lib/messaging/applier/conflict-detection/types.ts
  • src/lib/messaging/applier/host-state-applier.test.ts
  • src/lib/messaging/compiler/workflow-planner.test.ts
  • src/lib/messaging/compiler/workflow-planner.ts
  • src/lib/onboard.ts
  • src/lib/onboard/channel-state.test.ts
  • src/lib/onboard/channel-state.ts
  • src/lib/onboard/machine/events.ts
  • src/lib/onboard/machine/handlers/policies.test.ts
  • src/lib/onboard/machine/handlers/policies.ts
  • src/lib/onboard/machine/handlers/sandbox-messaging-plan.test.ts
  • src/lib/onboard/machine/handlers/sandbox.test.ts
  • src/lib/onboard/machine/handlers/sandbox.ts
  • src/lib/onboard/messaging-config.ts
  • src/lib/onboard/messaging-credentials.ts
  • src/lib/onboard/messaging-plan-session.test.ts
  • src/lib/onboard/messaging-plan-session.ts
  • src/lib/onboard/messaging-reuse.test.ts
  • src/lib/onboard/messaging-reuse.ts
  • src/lib/onboard/session-updates.ts
  • src/lib/policy/index.ts
  • src/lib/sandbox/whatsapp-diagnostics.ts
  • src/lib/state/onboard-session.test.ts
  • src/lib/state/onboard-session.ts
  • src/lib/state/registry-messaging.ts
  • src/lib/state/registry.ts
  • src/lib/status-command-deps.ts
  • test/channels-add-preset.test.ts
  • test/channels-remove-full-teardown.test.ts
  • test/cli/connect-recovery.test.ts
  • test/cli/doctor-gateway-token.test.ts
  • test/cli/list-share-live-inference.test.ts
  • test/cli/logs.test.ts
  • test/e2e/docs/parity-inventory.generated.json
  • test/e2e/test-channels-add-remove.sh
  • test/e2e/test-channels-stop-start.sh
  • test/e2e/test-messaging-providers.sh
  • test/e2e/test-rebuild-hermes.sh
  • test/helpers/messaging-plan-fixtures.ts
  • test/helpers/sandbox-handler-fixtures.ts
  • test/nemoclaw-start.test.ts
  • test/onboard-messaging.test.ts
  • test/onboard.test.ts
  • test/policies.test.ts
  • test/rebuild-credential-preflight.test.ts
  • test/rebuild-shields-auto-unlock.test.ts
  • test/registry.test.ts
  • test/repro-2201.test.ts
  • test/secret-redaction.test.ts
💤 Files with no reviewable changes (9)
  • src/lib/messaging/applier/conflict-detection/probe.ts
  • src/lib/messaging/applier/conflict-detection-legacy.test.ts
  • src/lib/messaging/applier/conflict-detection/backfill.ts
  • src/lib/onboard/session-updates.ts
  • test/onboard.test.ts
  • src/lib/messaging/applier/conflict-detection/types.ts
  • src/lib/messaging/compiler/workflow-planner.test.ts
  • src/lib/messaging/applier/conflict-detection/index.ts
  • src/lib/state/onboard-session.ts

Comment thread src/lib/policy/index.ts Outdated
Comment thread test/e2e/docs/parity-inventory.generated.json
sandl99 added 3 commits June 9, 2026 11:09
Signed-off-by: San Dang <san1201.bkhn@gmail.com>
…stry' into feat/messaging-plan-session-registry
Signed-off-by: San Dang <san1201.bkhn@gmail.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
test/policies.test.ts (1)

735-735: 💤 Low value

Consider simplifying the mock spy implementation.

The mockImplementation throws a sentinel error when the disclosure string is logged, but the test at Line 747-Line 752 still verifies the log by checking mock.calls. The throw is caught by the inner try-catch at Line 742-Line 745, making it redundant. Either:

  • Remove the throw and keep the no-op mock with the mock.calls assertion, or
  • Catch and verify the sentinel error explicitly instead of checking mock.calls.

As written, the throw stops applyPreset mid-execution but the test doesn't rely on that behavior.

♻️ Proposed simplification
-      const logSpy = vi.spyOn(console, "log").mockImplementation((message) => { if (typeof message === "string" && message.includes("Widening sandbox egress")) throw new Error("__disclosure_logged__"); });
+      const logSpy = vi.spyOn(console, "log").mockImplementation(() => {});

The test assertion at Line 747-Line 752 already verifies the log was called with the correct message.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/policies.test.ts` at line 735, The console.log spy implementation
currently throws a sentinel error which short-circuits applyPreset; change the
mock to a no-op recorder instead (i.e., vi.spyOn(console,
"log").mockImplementation(() => {}) so it doesn't throw) and keep the assertion
that inspects logSpy.mock.calls to verify the "Widening sandbox egress" message;
reference the existing logSpy and the applyPreset call so you update the spy
used by that test.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@test/policies.test.ts`:
- Line 735: The console.log spy implementation currently throws a sentinel error
which short-circuits applyPreset; change the mock to a no-op recorder instead
(i.e., vi.spyOn(console, "log").mockImplementation(() => {}) so it doesn't
throw) and keep the assertion that inspects logSpy.mock.calls to verify the
"Widening sandbox egress" message; reference the existing logSpy and the
applyPreset call so you update the spy used by that test.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 149e3ac7-9734-4d70-8aec-193ce65eed01

📥 Commits

Reviewing files that changed from the base of the PR and between a30c537 and 6a79070.

📒 Files selected for processing (3)
  • src/lib/policy/index.ts
  • test/e2e/docs/parity-inventory.generated.json
  • test/policies.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/e2e/docs/parity-inventory.generated.json

@NVIDIA NVIDIA deleted a comment from github-actions Bot Jun 9, 2026
@NVIDIA NVIDIA deleted a comment from github-actions Bot Jun 9, 2026
@NVIDIA NVIDIA deleted a comment from github-actions Bot Jun 9, 2026
@cv cv added the v0.0.62 Release target label Jun 9, 2026
@NVIDIA NVIDIA deleted a comment from github-actions Bot Jun 9, 2026
@github-actions

github-actions Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 27188836892
Target ref: feat/messaging-plan-session-registry
Requested jobs: all (no filter)
Summary: 64 passed, 0 failed, 2 skipped

Job Result
agent-turn-latency-e2e ✅ success
bedrock-runtime-compatible-anthropic-e2e ✅ success
brave-search-e2e ✅ success
channels-add-remove-e2e ✅ success
channels-stop-start-e2e ✅ success
cloud-e2e ✅ success
cloud-inference-e2e ✅ success
cloud-onboard-e2e ✅ success
common-egress-agent-e2e ✅ success
concurrent-gateway-ports-e2e ✅ success
credential-migration-e2e ✅ success
credential-sanitization-e2e ✅ success
device-auth-health-e2e ✅ success
diagnostics-e2e ✅ success
docs-validation-e2e ✅ success
double-onboard-e2e ✅ success
gpu-double-onboard-e2e ⏭️ skipped
gpu-e2e ⏭️ skipped
hermes-anthropic-inference-switch-e2e ✅ success
hermes-dashboard-e2e ✅ success
hermes-discord-e2e ✅ success
hermes-e2e ✅ success
hermes-inference-switch-e2e ✅ success
hermes-onboard-security-posture-e2e ✅ success
hermes-root-entrypoint-smoke-e2e ✅ success
hermes-secret-boundary-e2e ✅ success
hermes-slack-e2e ✅ success
inference-routing-e2e ✅ success
issue-2478-crash-loop-recovery-e2e ✅ success
issue-3600-gpu-proof-optional-e2e ✅ success
issue-4434-tui-unreachable-inference-e2e ✅ success
issue-4462-gateway-pinned-approval-characterization-e2e ✅ success
issue-4462-scope-upgrade-approval-e2e ✅ success
kimi-inference-compat-e2e ✅ success
launchable-smoke-e2e ✅ success
messaging-compatible-endpoint-e2e ✅ success
messaging-providers-e2e ✅ success
network-policy-e2e ✅ success
onboard-negative-paths-e2e ✅ success
onboard-repair-e2e ✅ success
onboard-resume-e2e ✅ success
openclaw-anthropic-inference-switch-e2e ✅ success
openclaw-discord-pairing-e2e ✅ success
openclaw-inference-switch-e2e ✅ success
openclaw-onboard-security-posture-e2e ✅ success
openclaw-skill-cli-e2e ✅ success
openclaw-slack-pairing-e2e ✅ success
openclaw-tui-chat-correlation-e2e ✅ success
openshell-gateway-upgrade-e2e ✅ success
overlayfs-autofix-e2e ✅ success
rebuild-hermes-e2e ✅ success
rebuild-hermes-stale-base-e2e ✅ success
rebuild-openclaw-e2e ✅ success
runtime-overrides-e2e ✅ success
sandbox-operations-e2e ✅ success
sandbox-survival-e2e ✅ success
sessions-agents-cli-e2e ✅ success
shields-config-e2e ✅ success
skill-agent-e2e ✅ success
snapshot-commands-e2e ✅ success
state-backup-restore-e2e ✅ success
telegram-injection-e2e ✅ success
token-rotation-e2e ✅ success
tunnel-lifecycle-e2e ✅ success
upgrade-stale-sandbox-e2e ✅ success
vm-driver-privileged-exec-routing-e2e ✅ success

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: messaging Messaging channels, bridges, manifests, or channel lifecycle refactor PR restructures code without intended behavior change v0.0.62 Release target

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Phase 4c: migrate session and registry messaging state to plans

3 participants