Skip to content

feat(messaging): persist channel manifest plans#4536

Merged
sandl99 merged 4 commits into
u/sdang/3896-phase-2-messaging-enrollmentfrom
u/sdang/build-messaging-manifest-plan-in-rebuild
Jun 4, 2026
Merged

feat(messaging): persist channel manifest plans#4536
sandl99 merged 4 commits into
u/sdang/3896-phase-2-messaging-enrollmentfrom
u/sdang/build-messaging-manifest-plan-in-rebuild

Conversation

@sandl99
Copy link
Copy Markdown
Contributor

@sandl99 sandl99 commented May 29, 2026

Summary

Persist manifest messaging plans through channel lifecycle operations so add, stop, start, remove, and rebuild can carry the new architecture state in SandboxEntry while legacy registry fields continue to work.

Related Issue

Fixes #4535
Refs #3896

Changes

  • Added MessagingWorkflowPlanner helpers that merge a compiled add-channel plan into a stored sandbox plan and mutate stored plans for stop/start/remove/rebuild.
  • Updated channels add, channels stop, channels start, and channels remove to write SandboxEntry.messaging.plan without removing legacy registry updates.
  • Staged stored manifest plans during rebuild through the existing messaging plan env path.
  • Added planner tests for add merge, stop/start mutation, remove pruning, rebuild staging from stored plans, and no-compile behavior when no stored plan exists.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: San Dang sdang@nvidia.com

Summary by CodeRabbit

Release Notes

  • New Features

    • Improved messaging channel management with manifest-driven configuration for Discord, Telegram, Slack, WeChat, and WhatsApp
    • Support for multiple authentication modes including token-based and QR code enrollment
    • Enhanced channel validation and reachability checks during setup
  • Bug Fixes

    • More reliable credential handling and credential binding resolution
    • Better error messaging and validation for channel configuration
  • Tests

    • Expanded test coverage for channel enrollment workflows and manifest validation

@sandl99 sandl99 self-assigned this May 29, 2026
@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented May 29, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 29, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 81263166-044d-47d4-a561-365171734f45

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Introduces manifest-driven messaging: adds manifests and hooks, compiler/planner, and appliers; updates onboarding to select/apply plans and persist them; integrates manifest lifecycle into sandbox actions and rebuild; extends registry to store messaging plans; updates tests and E2E.

Changes

Core messaging manifests, hooks, compiler, and appliers

Layer / File(s) Summary
Manifests and messaging plan/types
src/lib/messaging/channels/*/manifest.ts, src/lib/messaging/channels/index.ts, src/lib/messaging/manifest/types.ts, src/lib/messaging/index.ts
Adds built-in manifests and overhauls manifest/plan contracts; provides built-in registry and messaging barrel.
Hooks system: types, registry, runner, and common/built-in hooks
src/lib/messaging/hooks/**/*, src/lib/messaging/channels/telegram/hooks/*, src/lib/messaging/channels/wechat/hooks/*
Implements hook infrastructure and shared/built-in hooks (token-paste, config-prompt, reachability, QR login, seed, health-check).
Compiler, engines, planner, utils, and barrels
src/lib/messaging/compiler/**/*, src/lib/messaging/utils.ts
Compiles manifests into plans via engines and planner; adds helper utils and barrels.
Applier layer: types, setup/env encoding, host-state, agent-config, credentials, policy, filters
src/lib/messaging/applier/**/*
Encodes plans to env, applies to sandbox via OpenShell (agent config, credentials, policy), filters enabled entries, and persists/merges host state.

Onboarding integration and messaging selection flow

Layer / File(s) Summary
Onboarding messaging setup entrypoint and utilities
src/lib/onboard/messaging-channel-setup.ts
Replaces per-channel prompts with manifest-driven planning and env plan write; supports interactive selection and QR help.
Onboard entrypoints, machine handler contract changes, and state helpers
src/lib/onboard.ts, src/lib/onboard/machine/handlers/sandbox.ts, src/lib/onboard/messaging-state.ts
Threads sandboxName; stages env plan on register; updates handler signature; removes legacy helpers.
Onboarding tests for selection and plan emission
src/lib/onboard/*test.ts, test/onboard-messaging.test.ts
Adjusts tests to validate selection, env plan presence, and new handler calls.

Sandbox actions: policy-channel and rebuild plan staging

Layer / File(s) Summary
policy-channel.ts manifest-driven add/enable/disable/remove
src/lib/actions/sandbox/policy-channel.ts
Uses manifests and planner to add channels, persist plans/disabled state, and apply to registry.
rebuild.ts stages messaging manifest plan
src/lib/actions/sandbox/rebuild.ts
Stages and writes rebuild plan to env; aborts safely on failures.
CLI tests for channels add and E2E provider/reachability adjustments
test/channels-add-preset.test.ts, test/e2e/*
Captures planner invocations and adjusts Telegram reachability skip in E2E.

State registry messaging persistence

Layer / File(s) Summary
registry types and clone helper
src/lib/state/registry.ts
Adds SandboxMessagingState and persists normalized plans into registry entries.

Sequence Diagram(s)

sequenceDiagram
  participant CLI
  participant Planner
  participant SetupApplier
  participant Registry
  CLI->>Planner: buildPlan(workflow: onboard/add/rebuild)
  Planner-->>CLI: SandboxMessagingPlan
  CLI->>SetupApplier: writePlanToEnv(plan)
  CLI->>Registry: applyPlanToRegistry(plan, mode)
  Registry-->>CLI: messaging state stored/merged
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related issues

Possibly related PRs

  • NVIDIA/NemoClaw#4003 — Earlier introduction of manifest registry/types that this PR builds upon.
  • NVIDIA/NemoClaw#4001 — Touches the same policy-channel teardown flow now extended to persist manifest remove plans.
  • NVIDIA/NemoClaw#3925 — Modifies rebuild logic in the same area where this PR stages manifest plans.

Suggested labels

enhancement: messaging, refactor, NemoClaw CLI

Suggested reviewers

  • ericksoa
  • cv
  • jyaunches

Poem

A rabbit taps keys in a moonlit den,
Manifests bloom where hardcodes had been;
Hooks hop in sequence, plans ride the breeze,
Env carries secrets with careful ease.
Rebuilds remember, sandboxes sing—
Policies nest where credentials spring. 🐇✨

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch u/sdang/build-messaging-manifest-plan-in-rebuild

@sandl99 sandl99 changed the base branch from main to u/sdang/3896-phase-2-messaging-enrollment May 29, 2026 17:21
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 29, 2026

E2E Advisor Recommendation

Required E2E: channels-add-remove-e2e, channels-stop-start-e2e
Optional E2E: messaging-providers-e2e, network-policy-e2e, rebuild-openclaw-e2e

Dispatch hint: channels-add-remove-e2e,channels-stop-start-e2e

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/u/sdang/3896-phase-2-messaging-enrollment
Head: HEAD
Confidence: high

Required E2E

  • channels-add-remove-e2e (high; timeout 75 minutes): Direct coverage for the modified channels add and channels remove paths, provider registration/removal, policy preset application/removal, rebuild after add/remove, and the newly asserted host registry messaging.plan persistence.
  • channels-stop-start-e2e (high; timeout 120 minutes): Direct coverage for the modified channels stop/channels start registry-plan persistence paths and rebuild behavior across multiple channels and agents, including disabled/active plan state assertions added by this PR.

Optional E2E

  • messaging-providers-e2e (high; timeout 75 minutes): Useful adjacent confidence for messaging credential providers, placeholder/L7 proxy routing, and multi-provider channel setup. Not required because the main changed runtime paths are add/remove/stop/start after sandbox creation and are covered by the required channel lifecycle jobs.
  • network-policy-e2e (medium/high): Optional broader check for network policy behavior because channel plan entries and preset application/removal are touched; the required add/remove job already covers the channel-specific policy regression path.
  • rebuild-openclaw-e2e (medium/high; timeout 60 minutes): Optional general rebuild smoke for the changed rebuild preflight path. The required channel lifecycle jobs exercise rebuild with messaging state, so this is supplemental rather than merge-blocking.

New E2E recommendations

  • messaging rebuild from stored plan without host env (high): The new rebuild planner is intended to restore messaging manifest input values from the stored registry plan. Existing channel lifecycle scripts appear to keep channel env vars exported during rebuild, so they may not prove rebuild works after host env values are absent.
    • Suggested test: Add a focused E2E that onboards or adds a channel with config inputs, verifies registry messaging.plan, unsets channel config/token environment variables, runs nemoclaw <sandbox> rebuild --yes, and asserts the sandbox-rendered channel config still comes from the stored plan without leaking secrets.
  • multi-channel add/remove plan merge (medium): mergeSandboxMessagingPlans and removePlanChannel are new logic for preserving existing plan entries while adding/removing one channel. Current required add/remove coverage is Telegram-only from an initially empty messaging state.
    • Suggested test: Add or extend a channels lifecycle E2E to start with one configured channel, add a second channel, then remove one channel and assert the remaining channel's credential bindings, network policy entries, agent render entries, and disabled state are preserved.

Dispatch hint

  • Workflow: .github/workflows/nightly-e2e.yaml
  • jobs input: channels-add-remove-e2e,channels-stop-start-e2e

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 29, 2026

E2E Scenario Advisor Recommendation

Required scenario E2E: ubuntu-repo-cloud-openclaw-telegram, ubuntu-repo-cloud-openclaw-slack
Optional scenario E2E: ubuntu-repo-cloud-hermes-slack, ubuntu-repo-cloud-openclaw-token-rotation

Dispatch required scenario E2E:

  • gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw-telegram
  • gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw-slack

Workflow run

Full scenario advisor summary

E2E Scenario Advisor

Base: origin/u/sdang/3896-phase-2-messaging-enrollment
Head: HEAD
Confidence: medium

Required scenario E2E

  • ubuntu-repo-cloud-openclaw-telegram: Messaging planner and sandbox channel mutation/rebuild code changed; the OpenClaw Telegram scenario is the cheapest routed scenario that exercises OpenClaw messaging manifest enrollment, provider attachment, placeholder rendering, policy/config wiring, and bridge reachability.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw-telegram
  • ubuntu-repo-cloud-openclaw-slack: The planner changes include credential-availability merging and multi-secret channel behavior; Slack exercises the multi-credential messaging surface and validates provider state on the primary OpenClaw onboarding path.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw-slack

Optional scenario E2E

  • ubuntu-repo-cloud-hermes-slack: Optional adjacent coverage for the same Slack messaging suite on the Hermes agent, useful because the workflow planner emits agent-specific render state and rebuild staging now loads the agent before staging messaging plans.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-hermes-slack
  • ubuntu-repo-cloud-openclaw-token-rotation: Optional adjacent messaging lifecycle coverage for credential/provider isolation, relevant to the changed messaging plan credential-binding and sandbox registry persistence logic.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw-token-rotation

Relevant changed files

  • src/lib/actions/sandbox/policy-channel.ts
  • src/lib/actions/sandbox/rebuild.ts
  • src/lib/messaging/compiler/workflow-planner.ts

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 29, 2026

PR Review Advisor

Findings: 2 needs attention, 8 worth checking, 0 nice ideas
Since last review: 0 prior items resolved, 6 still apply, 0 new items found

Review findings

🛠️ Needs attention

  • Stored messaging plans are trusted as rebuild/apply blueprints without manifest revalidation (src/lib/messaging/compiler/workflow-planner.ts:237): The durable SandboxEntry.messaging.plan is accepted when only schemaVersion, sandboxName, and agent match, then reused for channel lifecycle mutations and rebuild staging. Nested security authority such as networkPolicy.entries, credentialBindings, agentRender, hook references, build-file data, provider names, providerEnvKey values, and policy preset names is not recompiled or checked against the current built-in manifests. A corrupted, stale, or manually edited registry plan can therefore widen messaging network policy, alter rendered agent config, run unintended hook/build-file behavior within the applier's allowed roots, or make the credential applier read and upsert an unintended host environment variable.
    • Recommendation: Treat stored plans as cache/state, not authority. Recompile rebuild and lifecycle plans from trusted manifests plus persisted channel input/config state, or add deep validation that every stored channel, credential binding, policy entry, render target, hook reference, build step, state update, and health check matches the registered manifest contract before staging or applying it. Add negative tests that tampered policy keys, providerEnvKey/providerName, render/build/hook entries, and stale/unsupported channels are rejected or recompiled.
    • Evidence: readSandboxEntryPlan() only checks plan.schemaVersion, plan.sandboxName, and plan.agent before clonePlan(plan). buildRebuildPlanFromSandboxEntry() returns setPlanDisabledChannels(existingPlan, ...), planForSandboxEntryMutation() clones existingPlan with a new workflow, and stageMessagingManifestPlanForRebuild() writes that plan into the messaging setup env. The appliers later consume plan.networkPolicy.entries in applyPolicyAtOpenShell() and binding.providerEnvKey/providerName in applyCredentialsAtOpenShell().
  • Adding a channel to a legacy no-plan sandbox can persist a partial manifest plan (src/lib/messaging/compiler/workflow-planner.ts:63): The linked issue requires keeping legacy messagingChannels and disabledChannels behavior intact for sandboxes without a stored manifest plan. For a legacy sandbox that already has messagingChannels but no messaging.plan, buildChannelAddPlanFromSandboxEntry() compiles only the newly added channel and persists that as the stored plan. Existing legacy channels are not seeded from sandboxEntry.messagingChannels or disabledChannels. After this conversion, rebuild sees a stored plan and can stage only the newly added channel through the manifest path, omitting previously configured legacy channels.
    • Recommendation: When adding to a no-plan legacy sandbox, either keep the sandbox planless until a complete migration/backfill exists, or build a complete manifest plan that includes all existing supported sandboxEntry.messagingChannels and disabledChannels plus the newly added channel. Add a regression test for a legacy sandbox with one existing channel and no messaging.plan, then channels add a second channel and rebuild/persist, proving both channels are preserved.
    • Evidence: buildChannelAddPlanFromSandboxEntry() calls buildPlan({ configuredChannels: [context.channelId], disabledChannels: [] }) and returns compiledPlan when existingPlan is null. The new test only asserts that rebuild returns null when no stored plan exists; it does not cover add-channel on a no-plan legacy sandbox with pre-existing messagingChannels.

🔎 Worth checking

  • Source-of-truth review needed: SandboxEntry.messaging.plan as durable messaging source: The advisor marked localized patch analysis as missing.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: readSandboxEntryPlan() only checks schemaVersion, sandboxName, and agent, then buildRebuildPlanFromSandboxEntry() and planForSandboxEntryMutation() reuse the cloned plan.
  • Source-of-truth review needed: No stored plan fallback during rebuild and lifecycle mutation: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: buildRebuildPlanFromSandboxEntry() returns null for no stored plan, while buildChannelAddPlanFromSandboxEntry() persists a compiledPlan containing only [context.channelId] when existingPlan is null.
  • Source-of-truth review needed: Credential availability fallback from stored plans and providerCredentialHashes: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: credentialAvailabilityFromPlan() reads stored input/binding flags; credentialAvailabilityFromSandboxEntry() trusts providerCredentialHashes; applyCredentialsAtOpenShell() reads env[binding.providerEnvKey].
  • Source-of-truth review needed: Best-effort onboard-session persistence in channel add: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: safeLoadOnboardSession() catches load errors and returns null; nearby helpers swallow session persistence failures with best-effort comments.
  • Credential availability fallback lacks a source-of-truth contract (src/lib/messaging/compiler/workflow-planner.ts:170): The planner infers future credential availability from stored plan flags and providerCredentialHashes. That supports deferred rebuild flows, but stale or tampered availability can make a channel appear startable and can influence which credential binding the later applier uses. This is especially risky because stored credentialBindings are not validated against manifests before the applier reads env[binding.providerEnvKey].
    • Recommendation: Document and enforce which source owns credential availability after enrollment. Validate stored credential bindings against the manifest's credential declarations, and add negative tests for stale hashes, substituted providerEnvKey/providerName, and stored plan flags that claim credentials are available when the trusted manifest/registry state does not support that claim.
    • Evidence: credentialAvailabilityFromSandboxEntry() marks manifest credential inputs available when providerCredentialHashes contains credential.providerEnvKey. credentialAvailabilityFromPlan() trusts secret input credentialAvailable flags and credentialBindings from the stored plan. applyCredentialsAtOpenShell() later reads the host env var named by binding.providerEnvKey and creates or updates binding.providerName.
  • Lifecycle persistence is mostly covered by planner tests and E2E assertions, not focused production-boundary tests (src/lib/actions/sandbox/policy-channel.ts:702): The new planner tests and E2E registry assertions cover important happy paths, but the high-risk production command handlers still lack focused tests around registry writes, gateway provider updates, policy preset updates, prompt/rebuild env staging, failure ordering, missing stored plans, and tampered stored plans. E2E coverage is useful but too coarse to prove these security-sensitive lifecycle branches fail closed.
    • Recommendation: Add focused command-handler or integration tests with mocked registry, session, OpenShell, policy, and rebuild dependencies. Cover add writing both legacy fields and messaging.plan, stop/start/remove updating both legacy state and stored plan, rebuild writing and clearing NEMOCLAW_MESSAGING_PLAN_B64, no-plan sandboxes preserving legacy behavior, and tampered stored plans being rejected or recompiled.
    • Evidence: The direct tests added are in workflow-planner.test.ts. The changed production callers persistManifestChannelDisabledPlan(), persistManifestChannelRemovePlan(), planSandboxChannelAdd(), and stageMessagingManifestPlanForRebuild() are not directly exercised by focused unit/integration tests. The deterministic test-depth signal recommends runtime/integration validation for policy-channel.ts, rebuild.ts, and workflow-planner.ts.
  • Large-file growth adds new messaging/sandbox lifecycle monolith hotspots (src/lib/messaging/compiler/workflow-planner.ts:1): This PR adds broad planner persistence, merge, source-of-truth fallback, and lifecycle mutation behavior into already central messaging and sandbox lifecycle files. That makes future security review of policy, credential, rebuild, and registry state transitions harder, and duplicates merge behavior already present in MessagingHostStateApplier.
    • Recommendation: Split stored-plan reading/validation, lifecycle mutation, merge behavior, and source-of-truth migration logic into focused modules with targeted tests, or offset the growth with extraction in this PR. Avoid duplicating merge semantics between the planner and host-state applier unless one is clearly authoritative and covered by shared tests.
    • Evidence: Deterministic monolith data flags workflow-planner.ts +358 lines, workflow-planner.test.ts +257 lines, rebuild.ts +48 lines, and policy-channel.ts +41 lines. workflow-planner.ts now has mergeSandboxMessagingPlans() similar to host-state-applier.ts merge logic.
  • Best-effort onboard-session fallback still lacks tests and removal criteria (src/lib/actions/sandbox/policy-channel.ts:891): The touched channel lifecycle file still tolerates missing or unreadable onboard-session state while relying on registry state or immediate rebuild behavior for deferred channel setup. The source-of-truth questions are not fully answered: which durable source wins for corrupted, missing, or wrong-sandbox sessions; why the source cannot be consolidated now; what regression test proves deferred rebuild behavior remains correct; and when this workaround can be removed.
    • Recommendation: Add tests for corrupted, missing, and wrong-sandbox onboard sessions during channels add, including deferred rebuild behavior for WeChat metadata/config. Document whether registry state, messagingChannelConfig, or onboard-session owns each value and define a removal condition for the best-effort session fallback.
    • Evidence: safeLoadOnboardSession() catches loadSession() failures and returns null. Nearby persistence helpers swallow session update failures with comments that immediate rebuild remains usable or deferred rebuilds can be recovered by re-running channels add.

🌱 Nice ideas

  • None.
Since last review details

Current findings:

  • Source-of-truth review needed: SandboxEntry.messaging.plan as durable messaging source: The advisor marked localized patch analysis as missing.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: readSandboxEntryPlan() only checks schemaVersion, sandboxName, and agent, then buildRebuildPlanFromSandboxEntry() and planForSandboxEntryMutation() reuse the cloned plan.
  • Source-of-truth review needed: No stored plan fallback during rebuild and lifecycle mutation: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: buildRebuildPlanFromSandboxEntry() returns null for no stored plan, while buildChannelAddPlanFromSandboxEntry() persists a compiledPlan containing only [context.channelId] when existingPlan is null.
  • Source-of-truth review needed: Credential availability fallback from stored plans and providerCredentialHashes: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: credentialAvailabilityFromPlan() reads stored input/binding flags; credentialAvailabilityFromSandboxEntry() trusts providerCredentialHashes; applyCredentialsAtOpenShell() reads env[binding.providerEnvKey].
  • Source-of-truth review needed: Best-effort onboard-session persistence in channel add: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: safeLoadOnboardSession() catches load errors and returns null; nearby helpers swallow session persistence failures with best-effort comments.
  • Stored messaging plans are trusted as rebuild/apply blueprints without manifest revalidation (src/lib/messaging/compiler/workflow-planner.ts:237): The durable SandboxEntry.messaging.plan is accepted when only schemaVersion, sandboxName, and agent match, then reused for channel lifecycle mutations and rebuild staging. Nested security authority such as networkPolicy.entries, credentialBindings, agentRender, hook references, build-file data, provider names, providerEnvKey values, and policy preset names is not recompiled or checked against the current built-in manifests. A corrupted, stale, or manually edited registry plan can therefore widen messaging network policy, alter rendered agent config, run unintended hook/build-file behavior within the applier's allowed roots, or make the credential applier read and upsert an unintended host environment variable.
    • Recommendation: Treat stored plans as cache/state, not authority. Recompile rebuild and lifecycle plans from trusted manifests plus persisted channel input/config state, or add deep validation that every stored channel, credential binding, policy entry, render target, hook reference, build step, state update, and health check matches the registered manifest contract before staging or applying it. Add negative tests that tampered policy keys, providerEnvKey/providerName, render/build/hook entries, and stale/unsupported channels are rejected or recompiled.
    • Evidence: readSandboxEntryPlan() only checks plan.schemaVersion, plan.sandboxName, and plan.agent before clonePlan(plan). buildRebuildPlanFromSandboxEntry() returns setPlanDisabledChannels(existingPlan, ...), planForSandboxEntryMutation() clones existingPlan with a new workflow, and stageMessagingManifestPlanForRebuild() writes that plan into the messaging setup env. The appliers later consume plan.networkPolicy.entries in applyPolicyAtOpenShell() and binding.providerEnvKey/providerName in applyCredentialsAtOpenShell().
  • Adding a channel to a legacy no-plan sandbox can persist a partial manifest plan (src/lib/messaging/compiler/workflow-planner.ts:63): The linked issue requires keeping legacy messagingChannels and disabledChannels behavior intact for sandboxes without a stored manifest plan. For a legacy sandbox that already has messagingChannels but no messaging.plan, buildChannelAddPlanFromSandboxEntry() compiles only the newly added channel and persists that as the stored plan. Existing legacy channels are not seeded from sandboxEntry.messagingChannels or disabledChannels. After this conversion, rebuild sees a stored plan and can stage only the newly added channel through the manifest path, omitting previously configured legacy channels.
    • Recommendation: When adding to a no-plan legacy sandbox, either keep the sandbox planless until a complete migration/backfill exists, or build a complete manifest plan that includes all existing supported sandboxEntry.messagingChannels and disabledChannels plus the newly added channel. Add a regression test for a legacy sandbox with one existing channel and no messaging.plan, then channels add a second channel and rebuild/persist, proving both channels are preserved.
    • Evidence: buildChannelAddPlanFromSandboxEntry() calls buildPlan({ configuredChannels: [context.channelId], disabledChannels: [] }) and returns compiledPlan when existingPlan is null. The new test only asserts that rebuild returns null when no stored plan exists; it does not cover add-channel on a no-plan legacy sandbox with pre-existing messagingChannels.
  • Credential availability fallback lacks a source-of-truth contract (src/lib/messaging/compiler/workflow-planner.ts:170): The planner infers future credential availability from stored plan flags and providerCredentialHashes. That supports deferred rebuild flows, but stale or tampered availability can make a channel appear startable and can influence which credential binding the later applier uses. This is especially risky because stored credentialBindings are not validated against manifests before the applier reads env[binding.providerEnvKey].
    • Recommendation: Document and enforce which source owns credential availability after enrollment. Validate stored credential bindings against the manifest's credential declarations, and add negative tests for stale hashes, substituted providerEnvKey/providerName, and stored plan flags that claim credentials are available when the trusted manifest/registry state does not support that claim.
    • Evidence: credentialAvailabilityFromSandboxEntry() marks manifest credential inputs available when providerCredentialHashes contains credential.providerEnvKey. credentialAvailabilityFromPlan() trusts secret input credentialAvailable flags and credentialBindings from the stored plan. applyCredentialsAtOpenShell() later reads the host env var named by binding.providerEnvKey and creates or updates binding.providerName.
  • Lifecycle persistence is mostly covered by planner tests and E2E assertions, not focused production-boundary tests (src/lib/actions/sandbox/policy-channel.ts:702): The new planner tests and E2E registry assertions cover important happy paths, but the high-risk production command handlers still lack focused tests around registry writes, gateway provider updates, policy preset updates, prompt/rebuild env staging, failure ordering, missing stored plans, and tampered stored plans. E2E coverage is useful but too coarse to prove these security-sensitive lifecycle branches fail closed.
    • Recommendation: Add focused command-handler or integration tests with mocked registry, session, OpenShell, policy, and rebuild dependencies. Cover add writing both legacy fields and messaging.plan, stop/start/remove updating both legacy state and stored plan, rebuild writing and clearing NEMOCLAW_MESSAGING_PLAN_B64, no-plan sandboxes preserving legacy behavior, and tampered stored plans being rejected or recompiled.
    • Evidence: The direct tests added are in workflow-planner.test.ts. The changed production callers persistManifestChannelDisabledPlan(), persistManifestChannelRemovePlan(), planSandboxChannelAdd(), and stageMessagingManifestPlanForRebuild() are not directly exercised by focused unit/integration tests. The deterministic test-depth signal recommends runtime/integration validation for policy-channel.ts, rebuild.ts, and workflow-planner.ts.
  • Large-file growth adds new messaging/sandbox lifecycle monolith hotspots (src/lib/messaging/compiler/workflow-planner.ts:1): This PR adds broad planner persistence, merge, source-of-truth fallback, and lifecycle mutation behavior into already central messaging and sandbox lifecycle files. That makes future security review of policy, credential, rebuild, and registry state transitions harder, and duplicates merge behavior already present in MessagingHostStateApplier.
    • Recommendation: Split stored-plan reading/validation, lifecycle mutation, merge behavior, and source-of-truth migration logic into focused modules with targeted tests, or offset the growth with extraction in this PR. Avoid duplicating merge semantics between the planner and host-state applier unless one is clearly authoritative and covered by shared tests.
    • Evidence: Deterministic monolith data flags workflow-planner.ts +358 lines, workflow-planner.test.ts +257 lines, rebuild.ts +48 lines, and policy-channel.ts +41 lines. workflow-planner.ts now has mergeSandboxMessagingPlans() similar to host-state-applier.ts merge logic.
  • Best-effort onboard-session fallback still lacks tests and removal criteria (src/lib/actions/sandbox/policy-channel.ts:891): The touched channel lifecycle file still tolerates missing or unreadable onboard-session state while relying on registry state or immediate rebuild behavior for deferred channel setup. The source-of-truth questions are not fully answered: which durable source wins for corrupted, missing, or wrong-sandbox sessions; why the source cannot be consolidated now; what regression test proves deferred rebuild behavior remains correct; and when this workaround can be removed.
    • Recommendation: Add tests for corrupted, missing, and wrong-sandbox onboard sessions during channels add, including deferred rebuild behavior for WeChat metadata/config. Document whether registry state, messagingChannelConfig, or onboard-session owns each value and define a removal condition for the best-effort session fallback.
    • Evidence: safeLoadOnboardSession() catches loadSession() failures and returns null. Nearby persistence helpers swallow session update failures with comments that immediate rebuild remains usable or deferred rebuilds can be recovered by re-running channels add.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

@wscurran
Copy link
Copy Markdown
Contributor

@cv cv added the integration: whatsapp WhatsApp integration or channel behavior label May 30, 2026
@wscurran wscurran added area: messaging Messaging channels, bridges, manifests, or channel lifecycle bug-fix PR fixes a bug or regression feature PR adds or expands user-visible functionality and removed enhancement: feature labels Jun 3, 2026
Signed-off-by: San Dang <sdang@nvidia.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 4, 2026

Selective E2E Results — ✅ All requested jobs passed

Run: 26940918197
Target ref: u/sdang/build-messaging-manifest-plan-in-rebuild
Requested jobs: all (no filter)
Summary: 56 passed, 0 failed, 2 skipped

Job Result
bedrock-runtime-compatible-anthropic-e2e ✅ success
brave-search-e2e ✅ success
channels-add-remove-e2e ✅ success
channels-stop-start-e2e ✅ success
cloud-e2e ✅ success
cloud-inference-e2e ✅ success
cloud-onboard-e2e ✅ success
credential-migration-e2e ✅ success
credential-sanitization-e2e ✅ success
device-auth-health-e2e ✅ success
diagnostics-e2e ✅ success
docs-validation-e2e ✅ success
double-onboard-e2e ✅ success
gpu-double-onboard-e2e ⏭️ skipped
gpu-e2e ⏭️ skipped
hermes-dashboard-e2e ✅ success
hermes-discord-e2e ✅ success
hermes-e2e ✅ success
hermes-inference-switch-e2e ✅ success
hermes-onboard-security-posture-e2e ✅ success
hermes-root-entrypoint-smoke-e2e ✅ success
hermes-slack-e2e ✅ success
inference-routing-e2e ✅ success
issue-2478-crash-loop-recovery-e2e ✅ success
issue-3600-gpu-proof-optional-e2e ✅ success
issue-4462-gateway-pinned-approval-characterization-e2e ✅ success
issue-4462-scope-upgrade-approval-e2e ✅ success
kimi-inference-compat-e2e ✅ success
launchable-smoke-e2e ✅ success
messaging-compatible-endpoint-e2e ✅ success
messaging-providers-e2e ✅ success
network-policy-e2e ✅ success
onboard-negative-paths-e2e ✅ success
onboard-repair-e2e ✅ success
onboard-resume-e2e ✅ success
openclaw-discord-pairing-e2e ✅ success
openclaw-inference-switch-e2e ✅ success
openclaw-onboard-security-posture-e2e ✅ success
openclaw-slack-pairing-e2e ✅ success
openclaw-tui-chat-correlation-e2e ✅ success
openshell-gateway-upgrade-e2e ✅ success
overlayfs-autofix-e2e ✅ success
rebuild-hermes-e2e ✅ success
rebuild-hermes-stale-base-e2e ✅ success
rebuild-openclaw-e2e ✅ success
runtime-overrides-e2e ✅ success
sandbox-operations-e2e ✅ success
sandbox-survival-e2e ✅ success
sessions-agents-cli-e2e ✅ success
shields-config-e2e ✅ success
skill-agent-e2e ✅ success
snapshot-commands-e2e ✅ success
state-backup-restore-e2e ✅ success
telegram-injection-e2e ✅ success
token-rotation-e2e ✅ success
tunnel-lifecycle-e2e ✅ success
upgrade-stale-sandbox-e2e ✅ success
vm-driver-privileged-exec-routing-e2e ✅ success

Signed-off-by: San Dang <sdang@nvidia.com>
Comment thread test/e2e/test-channels-add-remove.sh Fixed
Comment thread test/e2e/test-channels-add-remove.sh Fixed
Comment thread test/e2e/test-channels-stop-start.sh Fixed
Comment thread test/e2e/test-channels-stop-start.sh Fixed
Signed-off-by: San Dang <sdang@nvidia.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 4, 2026

Selective E2E Results — ✅ All requested jobs passed

Run: 26943037345
Target ref: u/sdang/build-messaging-manifest-plan-in-rebuild
Workflow ref: u/sdang/build-messaging-manifest-plan-in-rebuild
Requested jobs: channels-add-remove-e2e,channels-stop-start-e2e
Summary: 0 passed, 0 failed, 0 skipped

Job Result
channels-add-remove-e2e ⚠️ cancelled
channels-stop-start-e2e ⚠️ cancelled

@sandl99 sandl99 merged commit 752ace3 into u/sdang/3896-phase-2-messaging-enrollment Jun 4, 2026
84 of 85 checks passed
@sandl99 sandl99 deleted the u/sdang/build-messaging-manifest-plan-in-rebuild branch June 4, 2026 09:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: messaging Messaging channels, bridges, manifests, or channel lifecycle bug-fix PR fixes a bug or regression feature PR adds or expands user-visible functionality integration: whatsapp WhatsApp integration or channel behavior

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Phase 2b: persist manifest plans for channel lifecycle and rebuild

4 participants