Skip to content

refactor(messaging): use plan-backed channel state#5123

Open
sandl99 wants to merge 9 commits into
mainfrom
fix/4947-plan-backed-messaging-state
Open

refactor(messaging): use plan-backed channel state#5123
sandl99 wants to merge 9 commits into
mainfrom
fix/4947-plan-backed-messaging-state

Conversation

@sandl99

@sandl99 sandl99 commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Summary

Migrates onboard session and registry messaging state to the compiled messaging plan so lifecycle, status, and rebuild flows no longer depend on legacy flat channel fields. This removes the old conflict backfill/probe compatibility path and updates coverage for plan-backed channel state across the migration surface.

Related Issue

Fixes #4947

Changes

  • Added shared messaging plan validation and derivation helpers for session and registry state.
  • Replaced flat registry/session messaging fields with messaging.plan accessors across onboard, rebuild, channel lifecycle, status, doctor, inventory, and policy flows.
  • Removed legacy messaging conflict backfill/probe code and its legacy compatibility test.
  • Updated unit, integration, and edited E2E script assertions to validate plan-backed configured/active/disabled channel state.
  • Ratcheted test file size budgets after moving repeated spawned-fixture helpers into shared test helpers.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Local checks run:

  • npm run typecheck:cli passes
  • npm run build:cli passes
  • npm run test-size:check passes
  • npx @biomejs/biome check test/onboard-messaging.test.ts test/channels-add-preset.test.ts src/lib/onboard/machine/final-flow-phases.test.ts passes
  • Focused migration suite passes: 24 files, 355 tests
  • bash -n passes for the four edited E2E shell scripts
  • npx prek run --all-files was attempted; it fails in this local environment on unrelated full CLI/E2E tests, so it is not checked above.
  • npm test was attempted earlier; it timed out/fails in this local environment on unrelated full CLI/E2E tests, so it is not checked above.

Signed-off-by: San Dang sdang@nvidia.com

Summary by CodeRabbit

  • Refactor
    • Messaging state now uses a validated plan model (messaging.plan / messagingPlan) throughout registry, onboarding/resume, status, channel add/remove, rebuild and disable/enable flows.
  • New Features
    • Added plan parsing/validation and helpers for deriving configured/active/disabled channels and channel configs; registry exposes safe plan helpers and mutation for channel disable.
  • Tests
    • Unit, integration and E2E tests and fixtures migrated to plan-backed shapes and updated assertions.

Signed-off-by: San Dang <sdang@nvidia.com>
@sandl99 sandl99 self-assigned this Jun 10, 2026
@coderabbitai

coderabbitai Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Migrates messaging state to structured SandboxMessagingPlan: adds plan validation and registry messaging helpers, converts session/registry schemas and channel flows to plan-backed logic, removes legacy backfill/probe, and updates tests and e2e fixtures to assert messaging.plan usage.

Changes

Plan Validation Foundation

Layer / File(s) Summary
Plan validation and derivation
src/lib/messaging/plan-validation.ts, src/lib/messaging/plan-validation.test.ts
New parsing, clone, and channel-derivation helpers: parseSandboxMessagingPlan(), cloneSandboxMessagingPlan(), getConfigured/Active/DisabledChannelIdsFromPlan(), and getMessagingChannelConfigFromPlan() with tests.

Session State Migration

Layer / File(s) Summary
Session messaging plan
src/lib/state/onboard-session.ts, src/lib/state/onboard-session.test.ts
Replace legacy session messaging fields with single messagingPlan; update create/normalize/filter logic and tests for plan persistence and safe updates.
Session update normalization
src/lib/onboard/session-updates.ts
Remove legacy messaging fields from OnboardSessionUpdateInput and mapping logic.

Registry Messaging Architecture

Layer / File(s) Summary
Registry messaging module
src/lib/state/registry-messaging.ts
New module exposing SandboxMessagingState, parse/clone helpers, configured/active/disabled channel getters, and setChannelDisabled() with locking semantics.
Registry entry schema
src/lib/state/registry.ts
Replace legacy registry messaging fields with messaging?: SandboxMessagingState; delegate reads/writes to registry-messaging and re-export types/helpers.

Conflict Detection Simplification

Layer / File(s) Summary
Conflict detection simplification
src/lib/messaging/applier/conflict-detection/*
Remove probe/backfill modules and related types; restrict conflict detection to plan-backed entries and shrink ConflictRegistry to read-only contract.

Channel Add/Remove Mutations

Layer / File(s) Summary
Channel add/remove with plan persistence
src/lib/actions/sandbox/policy-channel.ts, src/lib/actions/sandbox/policy-channel-conflict.test.ts
Stop mutating legacy registry lists directly; validate channel presence in plan, persist plan changes with boolean success, and update tests to use plan/no-plan fixtures.

Sandbox Lifecycle

Layer / File(s) Summary
Rebuild and sandbox state
src/lib/actions/sandbox/rebuild.ts, src/lib/onboard/machine/handlers/sandbox.ts, src/lib/onboard/machine/handlers/sandbox.test.ts
Stage messagingPlan for resume/create flows, derive disabled channels from staged plan, and stop persisting legacy messaging fields.
Disabled channel resolution
src/lib/onboard/channel-state.ts, src/lib/onboard/channel-state.test.ts
resolveDisabledChannels() reads env/session/registry messaging plans and derives disabled channels via plan helpers; added optional env hook and tests.

Onboard Messaging Integration

Layer / File(s) Summary
Onboard flow messaging updates
src/lib/onboard.ts, src/lib/onboard/messaging-plan-session.ts, src/lib/onboard/messaging-config.ts, src/lib/onboard/messaging-credentials.ts, src/lib/onboard/machine/events.ts, src/lib/onboard/machine/handlers/policies.ts
Compute active channels from plans (getActiveChannelsFromPlan), remove legacy env hydration/persistence hooks, update policy handling and events to use plans, and change messaging reuse to accept getConfiguredChannels() callback.

Inventory and Status

Layer / File(s) Summary
Messaging bridge health
src/lib/inventory/index.ts, src/lib/inventory/index.test.ts
Compute bridge channels from entry.messaging?.plan via getActiveChannelIdsFromPlan() and update tests to use messagingState() helper.
Status command dependencies
src/lib/status-command-deps.ts
Replace backfill/probe overlap flow with findStoredMessagingOverlaps() that reads stored plans and fails safely.

Channel Status and Doctor

Layer / File(s) Summary
Channel status and doctor
src/lib/actions/sandbox/channel-status.ts, src/lib/actions/sandbox/channel-status.test.ts, src/lib/actions/sandbox/doctor.ts
Read configured/disabled channels from registry helpers; update default channel selection, paused detection, and tests to use plan-backed entry fixtures.

Messaging Host State

Layer / File(s) Summary
Host state and workflow planner
src/lib/messaging/applier/host-state-applier.test.ts, src/lib/messaging/compiler/workflow-planner.ts
Adapt host-state tests and planner helpers to expect messaging: { schemaVersion, plan } and derive disabled channels from plan.

Diagnostics

Layer / File(s) Summary
WhatsApp diagnostics
src/lib/sandbox/whatsapp-diagnostics.ts
Docs and diagnostic messages updated to reference "messaging plan" wording.

Unit and Integration Test Fixtures

Layer / File(s) Summary
Session and registry fixtures
many test files (e.g., src/lib/actions/*, test/*)
Add makeMessagingPlan(), makeEmptyEntry(), sessionWithPlan(), messagingState() helpers; update fixtures and assertions to use messagingPlan / messaging.plan.
Integration test harnesses and parsing helpers
test/onboard-messaging.test.ts, test/channels-add-preset.test.ts, etc.
Add robust stdout JSON parsing, inline plan helpers, and update mocks & assertions to validate plan channels and disabledChannels.

E2E Test Infrastructure

Layer / File(s) Summary
E2E helpers and assertions
test/e2e/*
Add registry_plan_channel_contains() and registry_plan_disabled_contains() helpers and update E2E assertions to validate messaging.plan channel presence, disabled state, and per-channel inputs.

Test Size Budgets

Layer / File(s) Summary
Test file budgets
ci/test-file-size-budget.json
Adjust legacy max-lines budgets for large updated tests to account for fixture expansions.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

  • #3802 — overlaps with onboarding/messaging state refactor touching similar FSM files and data contracts.

Possibly related PRs

Suggested labels

enhancement: messaging

"I hopped through sandboxes, tidy and spry,
Replaced loose lists with a structured sky.
Plans in the registry, sessions made neat,
Tests hummed in order — carrots for a treat.
🐇📦"

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/4947-plan-backed-messaging-state

@sandl99 sandl99 added area: messaging Messaging channels, bridges, manifests, or channel lifecycle refactor PR restructures code without intended behavior change labels Jun 10, 2026
@github-actions

github-actions Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

E2E Advisor Recommendation

Required E2E: cloud-onboard-e2e, channels-add-remove-e2e, channels-stop-start-e2e, messaging-providers-e2e, rebuild-openclaw-e2e, rebuild-hermes-e2e, diagnostics-e2e, network-policy-e2e
Optional E2E: cloud-e2e, openclaw-inference-switch-e2e, state-backup-restore-e2e, token-rotation-e2e

Dispatch hint: cloud-onboard-e2e,channels-add-remove-e2e,channels-stop-start-e2e,messaging-providers-e2e,rebuild-openclaw-e2e,rebuild-hermes-e2e,diagnostics-e2e,network-policy-e2e

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • cloud-onboard-e2e (medium): Required because onboarding/session/registry/sandbox-registration code changed; validates a clean install and real OpenClaw onboarding path with persisted sandbox state.
  • channels-add-remove-e2e (high): Required because channel add/remove runtime code and the corresponding E2E script changed; validates plan-backed durable channel state, provider reuse, policy application, rebuild after add/remove, and registry updates.
  • channels-stop-start-e2e (high): Required because disabled/enabled channel state moved to plan-backed registry helpers and stop/start behavior is directly affected across OpenClaw/Hermes and multiple channels.
  • messaging-providers-e2e (high): Required because messaging credential/provider planning, host state application, registry bindings, and provider config generation changed; validates OpenShell provider isolation and placeholder/token wiring for messaging channels.
  • rebuild-openclaw-e2e (high): Required because rebuild.ts and durable messaging plan persistence can affect rebuilding an existing OpenClaw sandbox and regenerating runtime config from registry state.
  • rebuild-hermes-e2e (high): Required because rebuild.ts and the Hermes rebuild E2E script changed; validates Hermes-specific rebuild and plan/state migration behavior against a real sandbox.
  • diagnostics-e2e (medium): Required because doctor and channels status now derive configured/disabled channel state through registry helpers; validates real operator diagnostics and health-reporting behavior.
  • network-policy-e2e (medium): Required because channel policy application and messaging plan validation can affect generated network-policy boundaries for channel-enabled sandboxes.

Optional E2E

  • cloud-e2e (high): Useful broader confidence because the full install → onboard → sandbox verify → live inference journey may catch integration fallout from session/registry changes not covered by cloud-onboard alone.
  • openclaw-inference-switch-e2e (medium): Optional adjacent coverage for onboard-session compatibility with inference switching; inference source does not appear to be changed, but session shape changes are touched.
  • state-backup-restore-e2e (medium): Optional because registry and onboard-session schema/state migration changes may affect backup/restore fidelity, but the PR focus is messaging plan durability rather than backup tooling.
  • token-rotation-e2e (high): Optional higher-cost confidence for credential rotation through Telegram/Discord/Slack after the provider/credential binding changes; required messaging-providers and channel lifecycle jobs cover the main fake-token paths.

New E2E recommendations

  • registry schema migration for messaging plans (high): This PR removes reliance on legacy messagingChannels/disabledChannels fields. Existing channel E2Es cover fresh plan-backed flows, but there should be an end-to-end stale-registry upgrade test that starts from a legacy registry entry with messagingChannels/disabledChannels and verifies doctor, channels status, rebuild, add/remove, and stop/start behavior after migration.
    • Suggested test: Add an E2E that seeds a legacy sandboxes.json messagingChannels/disabledChannels entry, upgrades to the PR build, then runs channels status, doctor, rebuild, channels add/remove, and channels stop/start against the migrated plan-backed registry state.
  • diagnostics for paused WhatsApp plan state (medium): Channel status has special handling to skip deep WhatsApp probing when a channel is paused through plan disabledChannels. Existing diagnostics E2E may not assert paused WhatsApp JSON/text output from a plan-backed registry.
    • Suggested test: Add a focused E2E or scenario that enables WhatsApp, pauses it, rebuilds, then verifies channels status --channel whatsapp and doctor report the intentional paused state without false unhealthy bridge failures.

Dispatch hint

  • Workflow: .github/workflows/nightly-e2e.yaml
  • jobs input: cloud-onboard-e2e,channels-add-remove-e2e,channels-stop-start-e2e,messaging-providers-e2e,rebuild-openclaw-e2e,rebuild-hermes-e2e,diagnostics-e2e,network-policy-e2e

@github-actions

github-actions Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Vitest E2E Scenario Recommendation

Required Vitest E2E scenarios: ubuntu-repo-cloud-openclaw, ubuntu-repo-docker-post-reboot-recovery
Optional Vitest E2E scenarios: None

Dispatch required Vitest E2E scenarios:

  • gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw
  • gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-docker-post-reboot-recovery

Workflow run

Full Vitest E2E advisor summary

Vitest E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: medium

Required Vitest E2E scenarios

  • ubuntu-repo-cloud-openclaw: Core onboarding, registry/session persistence, inventory, messaging-plan plumbing, and sandbox action changes can affect the primary live-supported Ubuntu Docker OpenClaw path. This is the smallest trusted-main live Vitest scenario that exercises baseline onboarding and post-onboard state validation for those surfaces.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw
  • ubuntu-repo-docker-post-reboot-recovery: Registry and sandbox state changes also affect the live-supported post-reboot recovery lifecycle, which validates local registry preservation and Docker sandbox container recovery after a host-observable status/recovery action.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-docker-post-reboot-recovery

Optional Vitest E2E scenarios

  • None.

Relevant changed files

  • src/lib/actions/sandbox/channel-status.ts
  • src/lib/actions/sandbox/doctor.ts
  • src/lib/actions/sandbox/policy-channel.ts
  • src/lib/actions/sandbox/rebuild.ts
  • src/lib/inventory/index.ts
  • src/lib/messaging/compiler/workflow-planner.ts
  • src/lib/messaging/plan-validation.ts
  • src/lib/onboard.ts
  • src/lib/onboard/channel-state.ts
  • src/lib/onboard/machine/events.ts
  • src/lib/onboard/machine/handlers/policies.ts
  • src/lib/onboard/machine/handlers/sandbox.ts
  • src/lib/onboard/messaging-config.ts
  • src/lib/onboard/messaging-credentials.ts
  • src/lib/onboard/messaging-plan-session.ts
  • src/lib/onboard/messaging-reuse.ts
  • src/lib/onboard/sandbox-registration.ts
  • src/lib/onboard/session-updates.ts
  • src/lib/sandbox/whatsapp-diagnostics.ts
  • src/lib/state/onboard-session.ts
  • src/lib/state/registry-messaging.ts
  • src/lib/state/registry.ts

@github-actions

github-actions Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

PR Review Advisor

Findings: 5 needs attention, 10 worth checking, 0 nice ideas
Since last review: 0 prior items resolved, 9 still apply, 0 new items found

Review findings

🛠️ Needs attention

  • Remove remaining legacy flat messaging state compatibility paths (src/lib/messaging/applier/conflict-detection/backfill.ts:61): Issue Phase 4c: migrate session and registry messaging state to plans #4947 explicitly says this phase is intentionally not backward-compatible and that messagingPlan/messaging.plan are the only supported persisted messaging state. The branch still reads, writes, and tests legacy flat registry fields, so the plan-only acceptance criteria are not met.
    • Recommendation: Remove the remaining flat-field reads/writes and legacy tests, or narrow the linked issue before merging. Rebuild, conflict detection, registration, and registry types should treat absent messaging.plan as no supported persisted messaging state rather than deriving state from messagingChannels, messagingChannelConfig, or disabledChannels.
    • Evidence: backfillMessagingChannels writes registry.updateSandbox(name, { messagingChannels: channels }); conflict-detection/entries.ts reads entry.messagingChannels and entry.disabledChannels; workflow-planner.ts still falls back to context.sandboxEntry?.messagingChannels; rebuild.ts hydrates sandboxEntry.messagingChannelConfig; registry.ts still exposes top-level messagingChannels, messagingChannelConfig, and disabledChannels; sandbox-registration.ts writes those fallback fields when plannedMessagingState is absent.
  • Make channel side effects atomic with messaging.plan persistence (src/lib/actions/sandbox/policy-channel.ts:1042): Channel add/remove paths can still mutate OpenShell providers, tokens, policy presets, or sandbox QR state before the durable plan write is known to have succeeded. A registry persistence failure can leave egress widened or a bridge registered without durable state, or remove providers/policy while the registry still records the channel.
    • Recommendation: Persist and validate the next plan before provider/policy mutations where possible. Where ordering cannot be inverted, add rollback for provider upsert/delete, preset apply/remove, token/session writes, QR add, and remove cleanup. Check the in-sandbox QR applyPlanToRegistry return before printing success or prompting rebuild.
    • Evidence: The in-sandbox QR branch calls MessagingHostStateApplier.applyPlanToRegistry(sandboxName, plan) without checking the boolean return. The token add path checks applyPlanToRegistry only after gateway provider registration and policy preset application. removeSandboxChannel deletes providers and removes policy before persistManifestChannelRemovePlan reports whether the plan persisted.
  • Fail closed on present-but-malformed messaging.plan state (src/lib/state/registry-messaging.ts:37): A registry entry with messaging.schemaVersion === 1 but a malformed plan is collapsed to null, and channel helpers convert that to empty channel lists. That makes a corrupt or tampered plan indistinguishable from no messaging state and can hide Telegram, Slack, Discord, WeChat, or WhatsApp state from conflict checks, status, doctor, inventory, and lifecycle mutations.
    • Recommendation: Distinguish an absent plan from a present invalid plan. Surface corrupt-plan diagnostics in status, doctor, inventory, and lifecycle commands, and fail closed for conflict checks and channel mutations unless an explicit force path is used.
    • Evidence: getMessagingPlanFromEntry returns parseSandboxMessagingPlan(entry.messaging.plan); getConfiguredMessagingChannelsFromEntry, getActiveMessagingChannelsFromEntry, and getDisabledMessagingChannelsFromEntry return [] when parsing returns null. Session update code similarly ignores malformed messagingPlan values instead of surfacing corrupt-present state.
  • Add negative-path coverage for persistence failures and corrupt plan state (src/lib/actions/sandbox/policy-channel.ts:1042): The changed paths have broad plan-backed happy-path coverage, but the highest-risk failure modes still lack targeted tests across registry/session persistence, provider upsert/delete, network policy apply/remove, status/doctor rendering, and rebuild resume.
    • Recommendation: Add behavior-specific tests that force MessagingHostStateApplier.applyPlanToRegistry or the registry update it uses to fail in QR add, token add, remove, start, and stop branches. Add corrupt messaging.plan tests proving status, doctor, conflict detection, and lifecycle mutations fail closed with clear diagnostics.
    • Evidence: Direct applyPlanToRegistry coverage exists in host-state-applier.test.ts, and conflict-check throw handling exists in policy-channel-conflict.test.ts, but no changed policy-channel tests were found for QR add persistence failure, token-add rollback after persistence failure, remove persistence failure after cleanup, or corrupt-present-plan status/doctor/lifecycle behavior.
  • Large test hotspots grew past the repository guardrail (src/lib/onboard/machine/handlers/sandbox.test.ts:1): The PR reduces several implementation files, but three existing large test hotspots moved in the wrong direction and exceed the repository file-size growth guardrail.
    • Recommendation: Extract repeated plan-backed messaging fixtures/helpers or split focused cases so these hotspot files do not grow, or offset the growth in the same files before merge.
    • Evidence: Deterministic monolith analysis reported src/lib/onboard/machine/handlers/sandbox.test.ts baseLines=473 headLines=598 delta=125, src/lib/actions/sandbox/channel-status.test.ts baseLines=470 headLines=508 delta=38, and src/lib/inventory/index.test.ts baseLines=956 headLines=993 delta=37.

🔎 Worth checking

  • Source-of-truth review needed: Legacy entries without messaging plans in conflict detection: The advisor marked localized patch analysis as missing.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: policy-channel-conflict.test.ts uses makeEmptyEntry("legacy"); conflict-detection/entries.ts supports messagingChannels/disabledChannels; backfill.ts writes messagingChannels.
  • Source-of-truth review needed: Malformed messaging.plan parsing and registry/session helpers: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: registry-messaging.ts returns parseSandboxMessagingPlan(entry.messaging.plan), and channel helpers return [] when parsing returns null.
  • Source-of-truth review needed: Status overlap catch-all fallback: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: status-command-deps.ts backfillAndFindOverlaps catches all errors and returns [].
  • Source-of-truth review needed: Channel lifecycle side effects versus messaging.plan persistence: The advisor marked localized patch analysis as missing.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: QR add ignores applyPlanToRegistry return; token add checks after provider/policy mutation; remove persists after provider/policy cleanup.
  • Source-of-truth review needed: Messaging plan environment and Docker ARG serialization: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: MessagingSetupApplier.encodePlan writes the full SandboxMessagingPlan to NEMOCLAW_MESSAGING_PLAN_B64, and dockerfile-patch.ts writes that value to ARG NEMOCLAW_MESSAGING_PLAN_B64.
  • Source-of-truth review needed: Strict plan schema at persistence and env boundaries: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: plan-validation.ts and setup-applier.ts check top-level arrays and a few channel fields but not most nested entry shapes.
  • Tighten plan schema validation before using plans as blueprints (src/lib/messaging/plan-validation.ts:13): The messaging plan now drives provider bindings, network policy, render targets, build-file outputs, status, conflict behavior, and rebuild, but validation remains shallow. Nested credential bindings, policy entries, render targets, build steps, state updates, and health checks can pass registry/session/env boundaries without strict shape validation.
    • Recommendation: Use one strict SandboxMessagingPlan schema at registry, session, and environment boundaries. Validate nested credential bindings, network policy entries, render target paths, build-file paths/modes, state updates, and health checks before storing or applying a plan.
    • Evidence: parseSandboxMessagingPlan checks top-level fields and a few channel booleans/arrays. MessagingSetupApplier.assertSandboxMessagingPlan has the same top-level checks. Later appliers contain some path/mode guards, but the durable plan blueprint can already be accepted.
  • Minimize credential metadata exposed through plan env and Docker ARG surfaces (src/lib/messaging/applier/setup-applier.ts:39): The serialized messaging plan can carry provider names, provider env keys, placeholders, channel metadata, and unsalted credential hashes. Writing the full plan into process.env and then into a Docker ARG can expose sensitive operational metadata through environment dumps, build logs, image history, or local artifacts.
    • Recommendation: Define a minimal build-time payload for NEMOCLAW_MESSAGING_PLAN_B64, omit credential hashes and host-only provider metadata where the build does not need them, and add tests proving raw credentials are never serialized and hashes are treated as sensitive metadata.
    • Evidence: MessagingSetupApplier.writePlanToEnv serializes the full SandboxMessagingPlan. onboard/dockerfile-patch.ts rewrites ARG NEMOCLAW_MESSAGING_PLAN_B64 with MessagingSetupApplier.encodePlan(messagingPlan). SandboxMessagingCredentialBindingPlan includes credentialHash.
  • Do not render messaging conflict detection failures as no overlaps (src/lib/status-command-deps.ts:103): Status remains usable when overlap detection throws, but returning an empty overlap list makes a failed safety check indistinguishable from no conflicts. That can suppress warnings for shared bot tokens or Slack Socket Mode gateway conflicts.
    • Recommendation: Return an explicit degraded or unknown messaging-conflict status and render a diagnostic instead of [] when backfill, overlap detection, or Slack gateway overlap detection fails.
    • Evidence: backfillAndFindOverlaps wraps backfillMessagingChannels, findAllOverlaps, and Slack gateway overlap detection in a catch-all and returns [] on any error.
  • Document the intentionally non-backward-compatible messaging state cutover: Issue Phase 4c: migrate session and registry messaging state to plans #4947 makes this migration intentionally non-backward-compatible and removes support for pre-plan onboard-session or sandbox-registry messaging fields. That is user-facing for operators with older sandboxes or session files, but the PR body marks docs unchecked and no operator-facing repair guidance was found in the changed files.
    • Recommendation: Add release-note or docs guidance explaining that legacy flat messaging state is no longer supported, what symptoms operators will see, and the supported repair path for old sandboxes/session files.
    • Evidence: The linked issue says no migration/backfill and no compatibility shim for legacy flat messaging fields. The PR verification checklist leaves docs updated unchecked.

🌱 Nice ideas

  • None.
Consider writing more tests for
  • **Runtime validation** — Add a test proving QR channels add aborts and does not print success or prompt rebuild when MessagingHostStateApplier.applyPlanToRegistry returns false.. Runtime/sandbox/infrastructure paths changed across channel lifecycle, status, doctor, rebuild, registry/session persistence, plan validation, and Docker/env plan transport. Unit mocks cover many happy paths, but they do not prove real provider/policy/registry ordering or corrupt-state behavior.
  • **Runtime validation** — Add a test proving token channels add rolls back provider registration, token persistence, and policy preset application when plan persistence fails.. Runtime/sandbox/infrastructure paths changed across channel lifecycle, status, doctor, rebuild, registry/session persistence, plan validation, and Docker/env plan transport. Unit mocks cover many happy paths, but they do not prove real provider/policy/registry ordering or corrupt-state behavior.
  • **Runtime validation** — Add a test proving channels remove does not delete gateway providers or unapply policy unless the remove plan can be persisted, or rolls back those side effects on persistence failure.. Runtime/sandbox/infrastructure paths changed across channel lifecycle, status, doctor, rebuild, registry/session persistence, plan validation, and Docker/env plan transport. Unit mocks cover many happy paths, but they do not prove real provider/policy/registry ordering or corrupt-state behavior.
  • **Runtime validation** — Add tests proving channels start and channels stop reject a configured channel when messaging.plan is present but malformed, with a clear diagnostic.. Runtime/sandbox/infrastructure paths changed across channel lifecycle, status, doctor, rebuild, registry/session persistence, plan validation, and Docker/env plan transport. Unit mocks cover many happy paths, but they do not prove real provider/policy/registry ordering or corrupt-state behavior.
  • **Runtime validation** — Add status, doctor, and inventory tests proving a present malformed messaging.plan renders degraded or corrupt state instead of no messaging channels registered or no overlaps.. Runtime/sandbox/infrastructure paths changed across channel lifecycle, status, doctor, rebuild, registry/session persistence, plan validation, and Docker/env plan transport. Unit mocks cover many happy paths, but they do not prove real provider/policy/registry ordering or corrupt-state behavior.
  • **Add negative-path coverage for persistence failures and corrupt plan state** — Add behavior-specific tests that force MessagingHostStateApplier.applyPlanToRegistry or the registry update it uses to fail in QR add, token add, remove, start, and stop branches. Add corrupt messaging.plan tests proving status, doctor, conflict detection, and lifecycle mutations fail closed with clear diagnostics.
  • **Acceptance clause:** Phase 4c of Refactor messaging integrations into a manifest-first planning architecture #3896 flips onboard-session and sandbox registry messaging state to the manifest-backed `SandboxMessagingPlan`. — add test evidence or identify existing coverage. Session state now exposes messagingPlan and registry exposes messaging?: SandboxMessagingState, with many consumers moved to plan helpers. However registry registration, conflict detection, backfill, and rebuild still support or write flat messaging fields.
  • **Acceptance clause:** This migration is intentionally not backward-compatible: after this phase, `messagingPlan` in onboard sessions and `messaging.plan` in the sandbox registry are the only supported persisted messaging state. — add test evidence or identify existing coverage. Legacy persisted registry state is still supported through messagingChannels, messagingChannelConfig, disabledChannels, conflict backfill, and legacy compatibility tests.
Since last review details

Current findings:

  • Source-of-truth review needed: Legacy entries without messaging plans in conflict detection: The advisor marked localized patch analysis as missing.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: policy-channel-conflict.test.ts uses makeEmptyEntry("legacy"); conflict-detection/entries.ts supports messagingChannels/disabledChannels; backfill.ts writes messagingChannels.
  • Source-of-truth review needed: Malformed messaging.plan parsing and registry/session helpers: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: registry-messaging.ts returns parseSandboxMessagingPlan(entry.messaging.plan), and channel helpers return [] when parsing returns null.
  • Source-of-truth review needed: Status overlap catch-all fallback: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: status-command-deps.ts backfillAndFindOverlaps catches all errors and returns [].
  • Source-of-truth review needed: Channel lifecycle side effects versus messaging.plan persistence: The advisor marked localized patch analysis as missing.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: QR add ignores applyPlanToRegistry return; token add checks after provider/policy mutation; remove persists after provider/policy cleanup.
  • Source-of-truth review needed: Messaging plan environment and Docker ARG serialization: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: MessagingSetupApplier.encodePlan writes the full SandboxMessagingPlan to NEMOCLAW_MESSAGING_PLAN_B64, and dockerfile-patch.ts writes that value to ARG NEMOCLAW_MESSAGING_PLAN_B64.
  • Source-of-truth review needed: Strict plan schema at persistence and env boundaries: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: plan-validation.ts and setup-applier.ts check top-level arrays and a few channel fields but not most nested entry shapes.
  • Remove remaining legacy flat messaging state compatibility paths (src/lib/messaging/applier/conflict-detection/backfill.ts:61): Issue Phase 4c: migrate session and registry messaging state to plans #4947 explicitly says this phase is intentionally not backward-compatible and that messagingPlan/messaging.plan are the only supported persisted messaging state. The branch still reads, writes, and tests legacy flat registry fields, so the plan-only acceptance criteria are not met.
    • Recommendation: Remove the remaining flat-field reads/writes and legacy tests, or narrow the linked issue before merging. Rebuild, conflict detection, registration, and registry types should treat absent messaging.plan as no supported persisted messaging state rather than deriving state from messagingChannels, messagingChannelConfig, or disabledChannels.
    • Evidence: backfillMessagingChannels writes registry.updateSandbox(name, { messagingChannels: channels }); conflict-detection/entries.ts reads entry.messagingChannels and entry.disabledChannels; workflow-planner.ts still falls back to context.sandboxEntry?.messagingChannels; rebuild.ts hydrates sandboxEntry.messagingChannelConfig; registry.ts still exposes top-level messagingChannels, messagingChannelConfig, and disabledChannels; sandbox-registration.ts writes those fallback fields when plannedMessagingState is absent.
  • Make channel side effects atomic with messaging.plan persistence (src/lib/actions/sandbox/policy-channel.ts:1042): Channel add/remove paths can still mutate OpenShell providers, tokens, policy presets, or sandbox QR state before the durable plan write is known to have succeeded. A registry persistence failure can leave egress widened or a bridge registered without durable state, or remove providers/policy while the registry still records the channel.
    • Recommendation: Persist and validate the next plan before provider/policy mutations where possible. Where ordering cannot be inverted, add rollback for provider upsert/delete, preset apply/remove, token/session writes, QR add, and remove cleanup. Check the in-sandbox QR applyPlanToRegistry return before printing success or prompting rebuild.
    • Evidence: The in-sandbox QR branch calls MessagingHostStateApplier.applyPlanToRegistry(sandboxName, plan) without checking the boolean return. The token add path checks applyPlanToRegistry only after gateway provider registration and policy preset application. removeSandboxChannel deletes providers and removes policy before persistManifestChannelRemovePlan reports whether the plan persisted.
  • Fail closed on present-but-malformed messaging.plan state (src/lib/state/registry-messaging.ts:37): A registry entry with messaging.schemaVersion === 1 but a malformed plan is collapsed to null, and channel helpers convert that to empty channel lists. That makes a corrupt or tampered plan indistinguishable from no messaging state and can hide Telegram, Slack, Discord, WeChat, or WhatsApp state from conflict checks, status, doctor, inventory, and lifecycle mutations.
    • Recommendation: Distinguish an absent plan from a present invalid plan. Surface corrupt-plan diagnostics in status, doctor, inventory, and lifecycle commands, and fail closed for conflict checks and channel mutations unless an explicit force path is used.
    • Evidence: getMessagingPlanFromEntry returns parseSandboxMessagingPlan(entry.messaging.plan); getConfiguredMessagingChannelsFromEntry, getActiveMessagingChannelsFromEntry, and getDisabledMessagingChannelsFromEntry return [] when parsing returns null. Session update code similarly ignores malformed messagingPlan values instead of surfacing corrupt-present state.
  • Tighten plan schema validation before using plans as blueprints (src/lib/messaging/plan-validation.ts:13): The messaging plan now drives provider bindings, network policy, render targets, build-file outputs, status, conflict behavior, and rebuild, but validation remains shallow. Nested credential bindings, policy entries, render targets, build steps, state updates, and health checks can pass registry/session/env boundaries without strict shape validation.
    • Recommendation: Use one strict SandboxMessagingPlan schema at registry, session, and environment boundaries. Validate nested credential bindings, network policy entries, render target paths, build-file paths/modes, state updates, and health checks before storing or applying a plan.
    • Evidence: parseSandboxMessagingPlan checks top-level fields and a few channel booleans/arrays. MessagingSetupApplier.assertSandboxMessagingPlan has the same top-level checks. Later appliers contain some path/mode guards, but the durable plan blueprint can already be accepted.
  • Minimize credential metadata exposed through plan env and Docker ARG surfaces (src/lib/messaging/applier/setup-applier.ts:39): The serialized messaging plan can carry provider names, provider env keys, placeholders, channel metadata, and unsalted credential hashes. Writing the full plan into process.env and then into a Docker ARG can expose sensitive operational metadata through environment dumps, build logs, image history, or local artifacts.
    • Recommendation: Define a minimal build-time payload for NEMOCLAW_MESSAGING_PLAN_B64, omit credential hashes and host-only provider metadata where the build does not need them, and add tests proving raw credentials are never serialized and hashes are treated as sensitive metadata.
    • Evidence: MessagingSetupApplier.writePlanToEnv serializes the full SandboxMessagingPlan. onboard/dockerfile-patch.ts rewrites ARG NEMOCLAW_MESSAGING_PLAN_B64 with MessagingSetupApplier.encodePlan(messagingPlan). SandboxMessagingCredentialBindingPlan includes credentialHash.
  • Do not render messaging conflict detection failures as no overlaps (src/lib/status-command-deps.ts:103): Status remains usable when overlap detection throws, but returning an empty overlap list makes a failed safety check indistinguishable from no conflicts. That can suppress warnings for shared bot tokens or Slack Socket Mode gateway conflicts.
    • Recommendation: Return an explicit degraded or unknown messaging-conflict status and render a diagnostic instead of [] when backfill, overlap detection, or Slack gateway overlap detection fails.
    • Evidence: backfillAndFindOverlaps wraps backfillMessagingChannels, findAllOverlaps, and Slack gateway overlap detection in a catch-all and returns [] on any error.
  • Add negative-path coverage for persistence failures and corrupt plan state (src/lib/actions/sandbox/policy-channel.ts:1042): The changed paths have broad plan-backed happy-path coverage, but the highest-risk failure modes still lack targeted tests across registry/session persistence, provider upsert/delete, network policy apply/remove, status/doctor rendering, and rebuild resume.
    • Recommendation: Add behavior-specific tests that force MessagingHostStateApplier.applyPlanToRegistry or the registry update it uses to fail in QR add, token add, remove, start, and stop branches. Add corrupt messaging.plan tests proving status, doctor, conflict detection, and lifecycle mutations fail closed with clear diagnostics.
    • Evidence: Direct applyPlanToRegistry coverage exists in host-state-applier.test.ts, and conflict-check throw handling exists in policy-channel-conflict.test.ts, but no changed policy-channel tests were found for QR add persistence failure, token-add rollback after persistence failure, remove persistence failure after cleanup, or corrupt-present-plan status/doctor/lifecycle behavior.
  • Large test hotspots grew past the repository guardrail (src/lib/onboard/machine/handlers/sandbox.test.ts:1): The PR reduces several implementation files, but three existing large test hotspots moved in the wrong direction and exceed the repository file-size growth guardrail.
    • Recommendation: Extract repeated plan-backed messaging fixtures/helpers or split focused cases so these hotspot files do not grow, or offset the growth in the same files before merge.
    • Evidence: Deterministic monolith analysis reported src/lib/onboard/machine/handlers/sandbox.test.ts baseLines=473 headLines=598 delta=125, src/lib/actions/sandbox/channel-status.test.ts baseLines=470 headLines=508 delta=38, and src/lib/inventory/index.test.ts baseLines=956 headLines=993 delta=37.
  • Document the intentionally non-backward-compatible messaging state cutover: Issue Phase 4c: migrate session and registry messaging state to plans #4947 makes this migration intentionally non-backward-compatible and removes support for pre-plan onboard-session or sandbox-registry messaging fields. That is user-facing for operators with older sandboxes or session files, but the PR body marks docs unchecked and no operator-facing repair guidance was found in the changed files.
    • Recommendation: Add release-note or docs guidance explaining that legacy flat messaging state is no longer supported, what symptoms operators will see, and the supported repair path for old sandboxes/session files.
    • Evidence: The linked issue says no migration/backfill and no compatibility shim for legacy flat messaging fields. The PR verification checklist leaves docs updated unchecked.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/lib/actions/sandbox/policy-channel.ts (1)

970-977: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Enforce consistent rollback on messaging plan persistence failure.

Line 976 ignores applyPlanToRegistry failure, and Lines 1027-1033 exit after side effects have already run. That can leave gateway/policy state changed without durable messaging.plan state.

Suggested fix
-    MessagingHostStateApplier.applyPlanToRegistry(sandboxName, plan);
+    if (!MessagingHostStateApplier.applyPlanToRegistry(sandboxName, plan)) {
+      removeChannelPresetIfPresent(sandboxName, canonical);
+      console.error(`  ${YW}⚠${R} Could not persist messaging plan for '${sandboxName}'.`);
+      process.exit(1);
+    }
@@
   if (!MessagingHostStateApplier.applyPlanToRegistry(sandboxName, plan)) {
+    removeChannelPresetIfPresent(sandboxName, canonical);
+    await rollbackChannelAdd(sandboxName, channelDef, canonical, {
+      wasAlreadyEnabled,
+      priorCreds,
+    });
     console.error(`  ${YW}⚠${R} Could not persist messaging plan for '${sandboxName}'.`);
     console.error(
       "  Earlier gateway or policy side effects may already have run, but durable channel state was not saved.",
     );
     process.exit(1);
   }

Also applies to: 1027-1033

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/actions/sandbox/policy-channel.ts` around lines 970 - 977, The code
persists manifest/gateway changes before calling
MessagingHostStateApplier.applyPlanToRegistry and currently ignores failures;
update the flow so that when applyPlanToRegistry(sandboxName, plan) throws or
rejects (both at the shown block and the similar 1027-1033 block) you perform a
rollback: undo applyChannelAddToGatewayAndRegistry side effects and revert
persistManifestAddState(sandboxName, manifest) (or mark it failed) and then exit
with non-zero; locate usages around applyChannelPresetIfAvailable,
applyChannelAddToGatewayAndRegistry, persistManifestAddState, and
MessagingHostStateApplier.applyPlanToRegistry and wrap the applyPlanToRegistry
call in a try/catch that triggers the rollback steps on error and
re-throws/exits with a logged error.
🧹 Nitpick comments (2)
src/lib/state/registry-messaging.ts (1)

31-37: 💤 Low value

Consider adding a clarifying comment for the undefined return.

The function returns undefined (not null) for invalid input, which is intentional to allow optional field omission when spreading into registry entries. A brief comment explaining this choice would help future maintainers.

💡 Suggested comment
+// Returns undefined (not null) for invalid input so spreading the result into
+// a registry entry will omit the field rather than set it to null.
 export function cloneSandboxMessagingState(
   messaging: SandboxMessagingState | null | undefined,
 ): SandboxMessagingState | undefined {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/state/registry-messaging.ts` around lines 31 - 37, Add a brief
clarifying comment inside the cloneSandboxMessagingState function explaining
that the function intentionally returns undefined (not null) for invalid or
absent input so callers can omit the optional field when spreading into registry
entries; reference the behavior around the null/undefined check and the final
return so future maintainers understand why undefined is used and not null.
src/lib/messaging/compiler/workflow-planner.ts (1)

235-240: 💤 Low value

Consider removing the unused _sandboxEntry parameter.

The function disabledChannelsFromSandboxEntry no longer reads from the sandboxEntry parameter after migrating to plan-only state. While the underscore prefix correctly marks it as unused per coding guidelines, the parameter could be removed entirely to simplify the signature. The only call site (line 104) already has access to existingPlan and doesn't need the extra parameter.

♻️ Proposed simplification
-function disabledChannelsFromSandboxEntry(
-  _sandboxEntry: MessagingWorkflowPlannerSandboxEntry | null | undefined,
-  fallbackPlan: SandboxMessagingPlan | null,
-): MessagingChannelId[] {
+function disabledChannelsFromFallbackPlan(
+  fallbackPlan: SandboxMessagingPlan | null,
+): MessagingChannelId[] {
   return uniqueChannels(fallbackPlan?.disabledChannels ?? []);
 }

And update the call site:

   return setPlanDisabledChannels(
     existingPlan,
-    disabledChannelsFromSandboxEntry(context.sandboxEntry, existingPlan),
+    disabledChannelsFromFallbackPlan(existingPlan),
     "rebuild",
   );
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/messaging/compiler/workflow-planner.ts` around lines 235 - 240, The
function disabledChannelsFromSandboxEntry still accepts an unused _sandboxEntry
parameter; remove that parameter from the signature and update its usages so it
only takes (fallbackPlan: SandboxMessagingPlan | null) and returns
uniqueChannels(fallbackPlan?.disabledChannels ?? []); update the call site(s)
that currently pass a sandbox entry (e.g., the caller that has existingPlan) to
pass only the fallback plan (existingPlan) and adjust any type
annotations/imports accordingly to keep types consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/onboard.ts`:
- Around line 5409-5412: The call to
getRecordedMessagingChannelsForResumeFromState is seeding from only active
channels via getActiveChannelsFromPlan(session?.messagingPlan), which drops
configured-but-stopped channels and loses persisted stop state; change the
argument to use the plan's configured channels (e.g., pass the configured
channels list from session?.messagingPlan or a new helper like
getConfiguredChannelsFromPlan(session?.messagingPlan)) so
configured-but-disabled entries are included, and ensure the disabled/started
state remains tracked separately (do not derive enabled/disabled by filtering
here). Update any related call sites or helpers so
getRecordedMessagingChannelsForResumeFromState receives the full configured
channels list while the resume logic still respects each channel's disabled
flag.

In `@src/lib/onboard/channel-state.ts`:
- Around line 20-28: The env-plan reader passed via deps is never used by
default callers, so resolveDisabledChannels (the function containing
getDisabledChannelsFromPlan, readMessagingPlanFromEnv,
onboardSession.loadSession, and registry.getDisabledChannels) should default to
the real env reader instead of only using deps.readMessagingPlanFromEnv when
provided; fix by importing the actual readMessagingPlanFromEnv implementation
into channel-state.ts and use it when deps?.readMessagingPlanFromEnv is
undefined (i.e., set const envPlan = (deps?.readMessagingPlanFromEnv ??
readMessagingPlanFromEnv)()), so callers that do not pass deps will observe the
in-flight env plan rather than falling back to stale session/registry state.

In `@test/e2e/docs/parity-inventory.generated.json`:
- Around line 7986-8002: The generated JSON parity-inventory.generated.json is
missing SPDX metadata; either add SPDX fields to the JSON root (e.g., top-level
"SPDX-License-Identifier" and "SPDX-FileCopyrightText" properties with the
correct license/copyright text) so the file contains the required license
metadata, or update the repository's licensing exceptions/config to explicitly
exclude/generated JSON artifacts; locate parity-inventory.generated.json and
either insert those top-level SPDX properties or register the file pattern as an
allowed generated artifact.

In `@test/e2e/test-channels-add-remove.sh`:
- Around line 178-183: The test currently requires a telegram entry in
messaging.plan.channels (variables plan and channel) even when asserting a
removed state; update the assertion logic in test-channels-add-remove.sh so that
when testing post-remove you assert that no channel with channelId ===
"telegram" exists in plan.channels (i.e., channel === undefined/null) instead of
failing if it's missing, and ensure any removed-state checks rely on plan as the
single source of truth rather than expecting a removed marker entry; locate the
block that computes const plan = entry.messaging?.plan and the channel lookup
and change the expectation to assert absence of the telegram channel.

In `@test/e2e/test-rebuild-hermes.sh`:
- Line 311: The test fixture seeds the session by setting sess['messagingPlan']
= plan but leaves legacy keys (messagingChannels, messagingChannelConfig,
disabledChannels) from the loaded onboard-session.json, which can mask failures;
update the seeding code that sets sess (the sess object where messagingPlan is
assigned) to explicitly remove/delete those legacy keys before or after
assigning sess['messagingPlan'] so the session only contains messagingPlan
(e.g., delete sess['messagingChannels'], delete sess['messagingChannelConfig'],
delete sess['disabledChannels']).

---

Outside diff comments:
In `@src/lib/actions/sandbox/policy-channel.ts`:
- Around line 970-977: The code persists manifest/gateway changes before calling
MessagingHostStateApplier.applyPlanToRegistry and currently ignores failures;
update the flow so that when applyPlanToRegistry(sandboxName, plan) throws or
rejects (both at the shown block and the similar 1027-1033 block) you perform a
rollback: undo applyChannelAddToGatewayAndRegistry side effects and revert
persistManifestAddState(sandboxName, manifest) (or mark it failed) and then exit
with non-zero; locate usages around applyChannelPresetIfAvailable,
applyChannelAddToGatewayAndRegistry, persistManifestAddState, and
MessagingHostStateApplier.applyPlanToRegistry and wrap the applyPlanToRegistry
call in a try/catch that triggers the rollback steps on error and
re-throws/exits with a logged error.

---

Nitpick comments:
In `@src/lib/messaging/compiler/workflow-planner.ts`:
- Around line 235-240: The function disabledChannelsFromSandboxEntry still
accepts an unused _sandboxEntry parameter; remove that parameter from the
signature and update its usages so it only takes (fallbackPlan:
SandboxMessagingPlan | null) and returns
uniqueChannels(fallbackPlan?.disabledChannels ?? []); update the call site(s)
that currently pass a sandbox entry (e.g., the caller that has existingPlan) to
pass only the fallback plan (existingPlan) and adjust any type
annotations/imports accordingly to keep types consistent.

In `@src/lib/state/registry-messaging.ts`:
- Around line 31-37: Add a brief clarifying comment inside the
cloneSandboxMessagingState function explaining that the function intentionally
returns undefined (not null) for invalid or absent input so callers can omit the
optional field when spreading into registry entries; reference the behavior
around the null/undefined check and the final return so future maintainers
understand why undefined is used and not null.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: cf133ce8-7327-4f24-a9dc-7abc3a590c41

📥 Commits

Reviewing files that changed from the base of the PR and between 82499f8 and fdec10a.

📒 Files selected for processing (60)
  • ci/test-file-size-budget.json
  • src/lib/actions/inference-route-api.test.ts
  • src/lib/actions/inference-set.test.ts
  • src/lib/actions/sandbox/channel-status.test.ts
  • src/lib/actions/sandbox/channel-status.ts
  • src/lib/actions/sandbox/doctor.ts
  • src/lib/actions/sandbox/policy-channel-conflict.test.ts
  • src/lib/actions/sandbox/policy-channel.ts
  • src/lib/actions/sandbox/rebuild.ts
  • src/lib/inventory/index.test.ts
  • src/lib/inventory/index.ts
  • src/lib/messaging/applier/conflict-detection-legacy.test.ts
  • src/lib/messaging/applier/conflict-detection/backfill.ts
  • src/lib/messaging/applier/conflict-detection/entries.ts
  • src/lib/messaging/applier/conflict-detection/index.ts
  • src/lib/messaging/applier/conflict-detection/probe.ts
  • src/lib/messaging/applier/conflict-detection/types.ts
  • src/lib/messaging/applier/host-state-applier.test.ts
  • src/lib/messaging/compiler/workflow-planner.test.ts
  • src/lib/messaging/compiler/workflow-planner.ts
  • src/lib/messaging/plan-validation.test.ts
  • src/lib/messaging/plan-validation.ts
  • src/lib/onboard.ts
  • src/lib/onboard/channel-state.test.ts
  • src/lib/onboard/channel-state.ts
  • src/lib/onboard/machine/core-flow-phases.test.ts
  • src/lib/onboard/machine/events.ts
  • src/lib/onboard/machine/final-flow-phases.test.ts
  • src/lib/onboard/machine/handlers/policies.test.ts
  • src/lib/onboard/machine/handlers/policies.ts
  • src/lib/onboard/machine/handlers/sandbox.test.ts
  • src/lib/onboard/machine/handlers/sandbox.ts
  • src/lib/onboard/messaging-config.ts
  • src/lib/onboard/messaging-credentials.ts
  • src/lib/onboard/messaging-plan-session.ts
  • src/lib/onboard/messaging-reuse.test.ts
  • src/lib/onboard/messaging-reuse.ts
  • src/lib/onboard/session-updates.ts
  • src/lib/sandbox/whatsapp-diagnostics.ts
  • src/lib/state/onboard-session.test.ts
  • src/lib/state/onboard-session.ts
  • src/lib/state/registry-messaging.ts
  • src/lib/state/registry.ts
  • src/lib/status-command-deps.ts
  • test/channels-add-preset.test.ts
  • test/channels-remove-full-teardown.test.ts
  • test/cli/logs.test.ts
  • test/cli/status-root-json.test.ts
  • test/e2e/docs/parity-inventory.generated.json
  • test/e2e/test-channels-add-remove.sh
  • test/e2e/test-channels-stop-start.sh
  • test/e2e/test-messaging-providers.sh
  • test/e2e/test-rebuild-hermes.sh
  • test/onboard-messaging.test.ts
  • test/onboard.test.ts
  • test/rebuild-credential-preflight.test.ts
  • test/rebuild-shields-auto-unlock.test.ts
  • test/rebuild-stale-recovery.test.ts
  • test/registry.test.ts
  • test/repro-2201.test.ts
💤 Files with no reviewable changes (10)
  • src/lib/messaging/applier/conflict-detection-legacy.test.ts
  • test/onboard.test.ts
  • src/lib/messaging/applier/conflict-detection/probe.ts
  • src/lib/onboard/machine/core-flow-phases.test.ts
  • src/lib/messaging/compiler/workflow-planner.test.ts
  • src/lib/onboard/session-updates.ts
  • src/lib/messaging/applier/conflict-detection/index.ts
  • src/lib/messaging/applier/conflict-detection/backfill.ts
  • src/lib/messaging/applier/conflict-detection/types.ts
  • src/lib/messaging/applier/host-state-applier.test.ts

Comment thread src/lib/onboard.ts
Comment thread src/lib/onboard/channel-state.ts Outdated
Comment thread test/e2e/docs/parity-inventory.generated.json Outdated
Comment thread test/e2e/test-channels-add-remove.sh
Comment thread test/e2e/test-rebuild-hermes.sh
@sandl99

sandl99 commented Jun 10, 2026

Copy link
Copy Markdown
Contributor Author

Responding to the PR Review Advisor “Needs attention” items:

I am intentionally not taking additional action on these three advisory items in this migration PR.

  1. Channel lifecycle side-effect atomicity

This is valid hardening, but it is not required for migration parity. The pre-migration channel lifecycle was not fully atomic either: add/remove paths could already mutate OpenShell providers, policy, credentials, or session state without a full rollback contract.

Adding full rollback now would introduce new lifecycle semantics across provider upsert/delete, policy apply/remove, token/session changes, QR state, and registry writes. That is security-sensitive behavior and should be designed/tested as a separate hardening follow-up rather than folded into this source-of-truth migration.

  1. Malformed messaging.plan handling

This is also valid validation hardening, but not a blocker for this PR. The migration intentionally makes messaging.plan the only supported persisted messaging state and removes legacy fallback behavior. A missing legacy state and a malformed current plan can be distinguished later, but corrupt local registry contents were not a supported normal flow before this migration either.

I agree a follow-up should make registry readers distinguish “absent plan” from “present but malformed plan,” especially for conflict detection and status/doctor diagnostics. For this PR, the acceptance bar is preserving supported normal flows with plan-backed state.

  1. Large test hotspots

I am not splitting the large test files in this migration PR. The repository’s enforced guardrail is the test-size budget check, and the current branch passes npm run test-size:check. Additional test-file extraction would be useful cleanup, but it is orthogonal to the messaging-state migration and would increase review scope.

For this PR, I am keeping the scope focused on replacing legacy persisted messaging fields with messaging.plan and ensuring normal add/remove/start/stop, rebuild, status, doctor, inventory, and E2E flows use that plan-backed state consistently. The three advisor items above should be tracked as follow-up hardening/cleanup, not blockers for this migration.

@github-actions

Copy link
Copy Markdown
Contributor

Selective E2E Results — ❌ Some jobs failed

Run: 27262121242
Target ref: fix/4947-plan-backed-messaging-state
Requested jobs: messaging-providers-e2e,channels-add-remove-e2e,channels-stop-start-e2e,cloud-onboard-e2e,rebuild-openclaw-e2e,rebuild-hermes-e2e
Summary: 4 passed, 2 failed, 0 skipped

Job Result
channels-add-remove-e2e ❌ failure
channels-stop-start-e2e ❌ failure
cloud-onboard-e2e ✅ success
messaging-providers-e2e ✅ success
rebuild-hermes-e2e ✅ success
rebuild-openclaw-e2e ✅ success

Failed jobs: channels-add-remove-e2e, channels-stop-start-e2e. Check run artifacts for logs.

Comment thread src/lib/onboard.ts Fixed
sandl99 and others added 2 commits June 10, 2026 16:01
…ort, function or class'

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
@github-actions

Copy link
Copy Markdown
Contributor

Selective E2E Results — ❌ Some jobs failed

Run: 27264726061
Target ref: fix/4947-plan-backed-messaging-state
Requested jobs: messaging-providers-e2e,channels-add-remove-e2e,channels-stop-start-e2e,cloud-onboard-e2e,rebuild-openclaw-e2e,rebuild-hermes-e2e
Summary: 5 passed, 1 failed, 0 skipped

Job Result
channels-add-remove-e2e ❌ failure
channels-stop-start-e2e ✅ success
cloud-onboard-e2e ✅ success
messaging-providers-e2e ✅ success
rebuild-hermes-e2e ✅ success
rebuild-openclaw-e2e ✅ success

Failed jobs: channels-add-remove-e2e. Check run artifacts for logs.

@sandl99 sandl99 added v0.0.62 Release target v0.0.63 Release target and removed v0.0.62 Release target labels Jun 10, 2026
@github-actions

Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 27266759162
Target ref: fix/4947-plan-backed-messaging-state
Requested jobs: messaging-providers-e2e,channels-add-remove-e2e,channels-stop-start-e2e,cloud-onboard-e2e,rebuild-openclaw-e2e,rebuild-hermes-e2e
Summary: 5 passed, 0 failed, 0 skipped

Job Result
channels-add-remove-e2e ✅ success
channels-stop-start-e2e ⚠️ cancelled
cloud-onboard-e2e ✅ success
messaging-providers-e2e ✅ success
rebuild-hermes-e2e ✅ success
rebuild-openclaw-e2e ✅ success

@github-actions

Copy link
Copy Markdown
Contributor

Selective E2E Results — ❌ Some jobs failed

Run: 27267840227
Target ref: fix/4947-plan-backed-messaging-state
Requested jobs: all (no filter)
Summary: 62 passed, 2 failed, 3 skipped

Job Result
agent-turn-latency-e2e ✅ success
bedrock-runtime-compatible-anthropic-e2e ✅ success
brave-search-e2e ✅ success
channels-add-remove-e2e ✅ success
channels-stop-start-e2e ✅ success
cloud-e2e ✅ success
cloud-inference-e2e ✅ success
cloud-onboard-e2e ✅ success
common-egress-agent-e2e ✅ success
concurrent-gateway-ports-e2e ✅ success
credential-migration-e2e ✅ success
credential-sanitization-e2e ✅ success
device-auth-health-e2e ✅ success
diagnostics-e2e ✅ success
docs-validation-e2e ✅ success
double-onboard-e2e ✅ success
gpu-double-onboard-e2e ⏭️ skipped
gpu-e2e ⏭️ skipped
gpu-jetson-nvmap-e2e ⏭️ skipped
hermes-anthropic-inference-switch-e2e ✅ success
hermes-dashboard-e2e ✅ success
hermes-discord-e2e ✅ success
hermes-e2e ✅ success
hermes-inference-switch-e2e ✅ success
hermes-onboard-security-posture-e2e ✅ success
hermes-root-entrypoint-smoke-e2e ✅ success
hermes-secret-boundary-e2e ✅ success
hermes-slack-e2e ✅ success
inference-routing-e2e ✅ success
issue-2478-crash-loop-recovery-e2e ✅ success
issue-3600-gpu-proof-optional-e2e ✅ success
issue-4434-tui-unreachable-inference-e2e ✅ success
issue-4462-gateway-pinned-approval-characterization-e2e ✅ success
issue-4462-scope-upgrade-approval-e2e ✅ success
kimi-inference-compat-e2e ✅ success
launchable-smoke-e2e ✅ success
messaging-compatible-endpoint-e2e ✅ success
messaging-providers-e2e ✅ success
network-policy-e2e ✅ success
onboard-negative-paths-e2e ✅ success
onboard-repair-e2e ✅ success
onboard-resume-e2e ✅ success
openclaw-anthropic-inference-switch-e2e ✅ success
openclaw-discord-pairing-e2e ✅ success
openclaw-inference-switch-e2e ✅ success
openclaw-onboard-security-posture-e2e ✅ success
openclaw-skill-cli-e2e ✅ success
openclaw-slack-pairing-e2e ✅ success
openclaw-tui-chat-correlation-e2e ✅ success
openshell-gateway-upgrade-e2e ✅ success
overlayfs-autofix-e2e ✅ success
rebuild-hermes-e2e ✅ success
rebuild-hermes-stale-base-e2e ✅ success
rebuild-openclaw-e2e ✅ success
runtime-overrides-e2e ✅ success
sandbox-operations-e2e ✅ success
sandbox-survival-e2e ✅ success
sessions-agents-cli-e2e ❌ failure
shields-config-e2e ✅ success
skill-agent-e2e ✅ success
snapshot-commands-e2e ✅ success
state-backup-restore-e2e ✅ success
telegram-injection-e2e ✅ success
token-rotation-e2e ❌ failure
tunnel-lifecycle-e2e ✅ success
upgrade-stale-sandbox-e2e ✅ success
vm-driver-privileged-exec-routing-e2e ✅ success

Failed jobs: sessions-agents-cli-e2e, token-rotation-e2e. Check run artifacts for logs.

@sandl99 sandl99 requested a review from cv June 10, 2026 10:51
@github-actions

Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 27271049670
Target ref: fix/4947-plan-backed-messaging-state
Requested jobs: token-rotation-e2e,sessions-agents-cli-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job Result
sessions-agents-cli-e2e ✅ success
token-rotation-e2e ✅ success

@jyaunches jyaunches added v0.0.64 Release target and removed v0.0.63 Release target labels Jun 11, 2026
@github-actions

Copy link
Copy Markdown
Contributor

Selective E2E Results — ❌ Some jobs failed

Run: 27321293497
Target ref: fix/4947-plan-backed-messaging-state
Requested jobs: all (no filter)
Summary: 40 passed, 25 failed, 3 skipped

Job Result
agent-turn-latency-e2e ❌ failure
bedrock-runtime-compatible-anthropic-e2e ❌ failure
brave-search-e2e ✅ success
channels-add-remove-e2e ✅ success
channels-stop-start-e2e ✅ success
cloud-e2e ❌ failure
cloud-inference-e2e ❌ failure
cloud-onboard-e2e ❌ failure
common-egress-agent-e2e ❌ failure
concurrent-gateway-ports-e2e ✅ success
credential-migration-e2e ✅ success
credential-sanitization-e2e ✅ success
cron-preflight-inference-local-e2e ✅ success
device-auth-health-e2e ✅ success
diagnostics-e2e ✅ success
docs-validation-e2e ✅ success
double-onboard-e2e ✅ success
gpu-double-onboard-e2e ⏭️ skipped
gpu-e2e ⏭️ skipped
gpu-jetson-nvmap-e2e ⏭️ skipped
hermes-anthropic-inference-switch-e2e ✅ success
hermes-dashboard-e2e ✅ success
hermes-discord-e2e ✅ success
hermes-e2e ✅ success
hermes-inference-switch-e2e ✅ success
hermes-onboard-security-posture-e2e ✅ success
hermes-root-entrypoint-smoke-e2e ✅ success
hermes-secret-boundary-e2e ✅ success
hermes-slack-e2e ✅ success
inference-routing-e2e ✅ success
issue-2478-crash-loop-recovery-e2e ❌ failure
issue-3600-gpu-proof-optional-e2e ✅ success
issue-4434-tui-unreachable-inference-e2e ❌ failure
issue-4462-gateway-pinned-approval-characterization-e2e ❌ failure
issue-4462-scope-upgrade-approval-e2e ❌ failure
kimi-inference-compat-e2e ❌ failure
launchable-smoke-e2e ❌ failure
messaging-compatible-endpoint-e2e ❌ failure
messaging-providers-e2e ✅ success
network-policy-e2e ❌ failure
onboard-negative-paths-e2e ✅ success
onboard-repair-e2e ✅ success
onboard-resume-e2e ✅ success
openclaw-anthropic-inference-switch-e2e ❌ failure
openclaw-discord-pairing-e2e ✅ success
openclaw-inference-switch-e2e ❌ failure
openclaw-onboard-security-posture-e2e ❌ failure
openclaw-skill-cli-e2e ❌ failure
openclaw-slack-pairing-e2e ✅ success
openclaw-tui-chat-correlation-e2e ❌ failure
openshell-gateway-upgrade-e2e ✅ success
overlayfs-autofix-e2e ✅ success
rebuild-hermes-e2e ❌ failure
rebuild-hermes-stale-base-e2e ❌ failure
rebuild-openclaw-e2e ✅ success
runtime-overrides-e2e ✅ success
sandbox-operations-e2e ❌ failure
sandbox-survival-e2e ❌ failure
sessions-agents-cli-e2e ❌ failure
shields-config-e2e ✅ success
skill-agent-e2e ✅ success
snapshot-commands-e2e ✅ success
state-backup-restore-e2e ✅ success
telegram-injection-e2e ✅ success
token-rotation-e2e ✅ success
tunnel-lifecycle-e2e ❌ failure
upgrade-stale-sandbox-e2e ✅ success
vm-driver-privileged-exec-routing-e2e ✅ success

Failed jobs: agent-turn-latency-e2e, bedrock-runtime-compatible-anthropic-e2e, cloud-e2e, cloud-inference-e2e, cloud-onboard-e2e, common-egress-agent-e2e, issue-2478-crash-loop-recovery-e2e, issue-4434-tui-unreachable-inference-e2e, issue-4462-gateway-pinned-approval-characterization-e2e, issue-4462-scope-upgrade-approval-e2e, kimi-inference-compat-e2e, launchable-smoke-e2e, messaging-compatible-endpoint-e2e, network-policy-e2e, openclaw-anthropic-inference-switch-e2e, openclaw-inference-switch-e2e, openclaw-onboard-security-posture-e2e, openclaw-skill-cli-e2e, openclaw-tui-chat-correlation-e2e, rebuild-hermes-e2e, rebuild-hermes-stale-base-e2e, sandbox-operations-e2e, sandbox-survival-e2e, sessions-agents-cli-e2e, tunnel-lifecycle-e2e. Check run artifacts for logs.

…messaging-state

# Conflicts:
#	ci/test-file-size-budget.json
#	src/lib/onboard.ts
@sandl99 sandl99 removed the v0.0.64 Release target label Jun 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: messaging Messaging channels, bridges, manifests, or channel lifecycle refactor PR restructures code without intended behavior change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Phase 4c: migrate session and registry messaging state to plans

4 participants