fix(e2e): shard channel stop-start nightly #5212

Open
sandl99 wants to merge 1 commit into main from fix/e2e-nightly-shards

Conversation

@sandl99 sandl99 commented Jun 11, 2026

Contributor

Summary

Shard the channel stop/start nightly E2E by agent so OpenClaw and Hermes coverage can run independently. This also keeps the overlayfs autofix E2E aligned with the refactored Docker-driver platform gate so it skips the legacy k3s path on Linux.

Changes

  • Add NEMOCLAW_CHANNELS_STOP_START_AGENT support to test/e2e/test-channels-stop-start.sh for openclaw, hermes, or the existing combined all mode.
  • Split the nightly channels-stop-start-e2e workflow job into channels-stop-start-openclaw-e2e and channels-stop-start-hermes-e2e with agent-specific env and artifacts.
  • Point the overlayfs E2E applicability check at src/lib/onboard/docker-driver-platform.ts after the Docker-driver gate moved there.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: San Dang <sdang@nvidia.com>

Summary by CodeRabbit

  • Tests
    • E2E test infrastructure refactored for improved scalability and execution efficiency
    • Nightly test workflow now runs channel stop/start tests as independent shards by agent type
    • Enhanced test parameterization for flexible test configuration across different environments

Signed-off-by: San Dang <sdang@nvidia.com>
@sandl99 sandl99 self-assigned this Jun 11, 2026
@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Contributor

Review Change Stack

Walkthrough

This PR splits the channels stop-start E2E test into two sharded jobs by agent, parameterizes the test script to run OpenClaw, Hermes, or both via environment variable, and updates CI orchestration dependencies accordingly. A minor fix also redirects the overlayfs applicability check to a different source file.

Changes

Channels Stop-Start E2E Test Sharding

| Layer / File(s) | Summary |
| --- | --- |
| Test Script Parameterization (test/e2e/test-channels-stop-start.sh) | Test script now reads NEMOCLAW_CHANNELS_STOP_START_AGENT to select which agent(s) to run (all, openclaw, or hermes). Variables REQUESTED_AGENT and SELECTED_AGENT_SCENARIOS are computed; teardown registration and main execution loop over selected scenarios instead of calling OpenClaw and Hermes unconditionally. |
| Workflow Job Sharding and Orchestration (.github/workflows/nightly-e2e.yaml) | The single channels-stop-start-e2e job is replaced with channels-stop-start-openclaw-e2e and channels-stop-start-hermes-e2e. Workflow dispatch input allowlist is updated to list the new job IDs. Downstream jobs notify-on-failure, report-to-pr, and scorecard now depend on both sharded jobs instead of the single parent job. |
| Overlayfs Test Applicability Check (test/e2e/test-overlayfs-autofix.sh) | Linux applicability skip condition now looks for the platform check in src/lib/onboard/docker-driver-platform.ts instead of src/lib/onboard.ts. |
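
The agent selection described in the table could be sketched roughly as follows. Only NEMOCLAW_CHANNELS_STOP_START_AGENT, REQUESTED_AGENT, and SELECTED_AGENT_SCENARIOS are names from this PR; the function name `select_agent_scenarios`, the space-separated list format, and the error message are assumptions for illustration and may not match the real script.

```shell
# Hypothetical sketch: resolve which agent scenarios to run.
# select_agent_scenarios is an illustrative name, not taken from the real script.
select_agent_scenarios() {
  # Fall back to the combined mode when the variable is unset or empty.
  REQUESTED_AGENT="${NEMOCLAW_CHANNELS_STOP_START_AGENT:-all}"
  case "$REQUESTED_AGENT" in
    all)
      SELECTED_AGENT_SCENARIOS="openclaw hermes"
      ;;
    openclaw | hermes)
      SELECTED_AGENT_SCENARIOS="$REQUESTED_AGENT"
      ;;
    *)
      # Reject unknown values before any expensive onboarding work starts.
      echo "Unsupported NEMOCLAW_CHANNELS_STOP_START_AGENT: $REQUESTED_AGENT" >&2
      return 1
      ;;
  esac
}
```

A caller would run `select_agent_scenarios || exit 1` once and then loop over `$SELECTED_AGENT_SCENARIOS` for both teardown registration and the main scenarios, so an invalid value fails before onboarding.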

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Suggested labels

area: ci, area: e2e, nightly-e2e, area: onboarding

Suggested reviewers

  • cv
  • prekshivyas

Poem

🐰 Two agents now run side by side,
Where one job once lived with pride—
The script now chooses who takes the stage,
While workflows orchestrate each engagement. ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: sharding the channel stop-start nightly E2E test by agent. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@sandl99 sandl99 added nightly-e2e Nightly E2E test failures area: onboarding Onboarding FSM, provider setup, sandbox launch, or first-run flow labels Jun 11, 2026
@github-actions

Contributor

E2E Advisor Recommendation

Required E2E: None
Optional E2E: channels-stop-start-openclaw-e2e, channels-stop-start-hermes-e2e, overlayfs-autofix-e2e

Dispatch hint: channels-stop-start-openclaw-e2e,channels-stop-start-hermes-e2e,overlayfs-autofix-e2e

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • None. No merge-blocking product E2E is required because the changed files are E2E workflow/test harness files only and do not modify installer, onboarding, credentials, sandbox lifecycle implementation, inference routing, deployment assets, or assistant runtime behavior. The listed E2Es are optional confidence checks for the modified CI jobs/scripts themselves.

Optional E2E

  • channels-stop-start-openclaw-e2e (high): Validates the newly introduced OpenClaw shard for the modified channels stop/start script and confirms the renamed workflow job is dispatchable and reports correctly.
  • channels-stop-start-hermes-e2e (high): Validates the newly introduced Hermes shard for the modified channels stop/start script and catches agent-specific regressions in the sharding environment variable or workflow wiring.
  • overlayfs-autofix-e2e (medium): Exercises the modified overlayfs autofix E2E skip/applicability logic and ensures the script still reaches the intended onboarding regression path or cleanly skips on Docker-driver Linux environments.

New E2E recommendations

  • ci-e2e-workflow-selective-dispatch (medium): This PR renames/shards an E2E job and updates multiple workflow needs lists. A lightweight non-runtime CI contract check should verify that every workflow_dispatch jobs input entry has a matching dispatch predicate and that report/scorecard/notify needs include all dispatchable jobs.
    • Suggested test: Add or extend a workflow contract test for nightly-e2e job IDs, selective-dispatch predicates, and aggregate needs lists.

Dispatch hint

  • Workflow: E2E / Nightly
  • jobs input: channels-stop-start-openclaw-e2e,channels-stop-start-hermes-e2e,overlayfs-autofix-e2e
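
As a rough illustration of the dispatch hint, the helper below only prints a `gh workflow run` command instead of running it. The helper name `build_dispatch_command` is hypothetical, and the input names `jobs` and `target_ref` are inferred from this thread; they should be checked against the `workflow_dispatch` inputs in `.github/workflows/nightly-e2e.yaml` before use.

```shell
# Hypothetical helper: print the gh command for a selective dispatch.
# The -f input names (jobs, target_ref) are assumptions inferred from this
# thread, not confirmed against the workflow definition.
build_dispatch_command() {
  jobs="$1"
  target_ref="$2"
  printf 'gh workflow run nightly-e2e.yaml -f jobs=%s -f target_ref=%s\n' \
    "$jobs" "$target_ref"
}

# Dry run: show the command for the three jobs named in the dispatch hint.
build_dispatch_command \
  "channels-stop-start-openclaw-e2e,channels-stop-start-hermes-e2e,overlayfs-autofix-e2e" \
  "fix/e2e-nightly-shards"
```

Printing the command first makes it easy to review the job list before triggering real sandbox runs.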

@github-actions

Contributor

Vitest E2E Scenario Recommendation

Required Vitest E2E scenarios: None
Optional Vitest E2E scenarios: None

Workflow run

Full Vitest E2E advisor summary

Vitest E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required Vitest E2E scenarios

  • None. Changes are limited to the legacy nightly E2E workflow and direct test/e2e shell scripts. They do not modify the Vitest scenario workflow, registry, runtime support, live Vitest entry point, or test/e2e-scenario fixtures/scenarios, so no Vitest-backed E2E scenario dispatch is recommended.

Optional Vitest E2E scenarios

  • None.

Relevant changed files

  • None.

@github-actions

Contributor

PR Review Advisor

Findings: 0 needs attention, 3 worth checking, 1 nice idea
Top item: Remove the unused GitHub token from the new channel stop/start shards

Review findings

🛠️ Needs attention

  • None.

🔎 Worth checking

  • Source-of-truth review needed: Overlayfs autofix applicability skip: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: `test/e2e/test-overlayfs-autofix.sh:134` greps for `platform === "linux"` in `src/lib/onboard/docker-driver-platform.ts`.
  • New stop/start shards expose an unused GitHub token to target-ref code (.github/workflows/nightly-e2e.yaml:773): Both new reusable workflow jobs set `github_token: true`, but `test/e2e/test-channels-stop-start.sh` does not reference `GITHUB_TOKEN`. That preserves the old combined job's exposure while duplicating the dispatchable target-ref surface. For selective dispatches, the reusable runner checks out `inputs.target_ref || github.ref` and runs the target script, so unnecessary credentials should be withheld from this sandbox/infrastructure path.
    • Recommendation: Set `github_token: false` or remove the `github_token` input from `channels-stop-start-openclaw-e2e` and `channels-stop-start-hermes-e2e` unless the script has a documented, verified need for GitHub API access.
    • Evidence: The new shard at `.github/workflows/nightly-e2e.yaml:773` sets `github_token: true`; the Hermes shard does the same at line 792. Grep of `test/e2e/test-channels-stop-start.sh` found only `NVIDIA_API_KEY` usage and no `GITHUB_TOKEN` reference.
  • Repo-maintained E2E guidance still references the removed job id (.coderabbit.yaml:200): This PR removes the dispatchable `channels-stop-start-e2e` job and replaces it with `channels-stop-start-openclaw-e2e` and `channels-stop-start-hermes-e2e`, but `.coderabbit.yaml` still recommends and shows manual-dispatch commands using the old job id. Those instructions now select a non-existent job and undermine the new sharding contract.
    • Recommendation: Update the `.coderabbit.yaml` recommendations and example `gh workflow run` commands to use the two new shard ids, or document when both should be requested together.
    • Evidence: `.github/workflows/nightly-e2e.yaml` contains only `channels-stop-start-openclaw-e2e` and `channels-stop-start-hermes-e2e`; `.coderabbit.yaml:200`, `:216`, `:244`, and `:249` still mention `channels-stop-start-e2e`.
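
One lightweight way to catch stale references like the `.coderabbit.yaml` one is a fixed-string search for the removed job id. The sketch below is an assumption about how such a check might look; `check_no_stale_job_id` is a hypothetical name and nothing in this PR adds it.

```shell
# Hypothetical check: fail when any given file still mentions a removed job id.
# Usage: check_no_stale_job_id <removed-job-id> <file>...
check_no_stale_job_id() {
  removed_id="$1"
  shift
  # -F: match the id literally, -H/-n: print file name and line number.
  if grep -FHn -- "$removed_id" "$@"; then
    echo "Stale reference to removed job id: $removed_id" >&2
    return 1
  fi
  return 0
}
```

For example, `check_no_stale_job_id channels-stop-start-e2e .coderabbit.yaml` would list the matching lines and return nonzero while the old id is still present. The literal match does not flag the new shard ids, because `channels-stop-start-e2e` is not a substring of either of them.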

🌱 Nice ideas

  • Overlayfs applicability still depends on a brittle source-text grep (test/e2e/test-overlayfs-autofix.sh:134): The PR correctly moves the sentinel to the current Docker-driver platform file, but the skip decision still infers runtime behavior by grepping source text for `platform === "linux"`. A future refactor that preserves behavior but changes formatting could silently re-enable the legacy k3s overlayfs path in this E2E.
    • Recommendation: Prefer calling a small source-of-truth helper or add a regression check that proves the overlayfs E2E skips when `isLinuxDockerDriverGatewayEnabled('linux')` is true.
    • Evidence: `test/e2e/test-overlayfs-autofix.sh:134` checks `grep -q 'platform === "linux"' "$REPO_ROOT/src/lib/onboard/docker-driver-platform.ts"`; the source predicate currently lives in `src/lib/onboard/docker-driver-platform.ts`.
Consider writing more tests for
  • **Runtime validation:** Selective dispatch with only `channels-stop-start-openclaw-e2e` runs `test/e2e/test-channels-stop-start.sh` with `NEMOCLAW_CHANNELS_STOP_START_AGENT=openclaw`. The PR changes nightly workflow wiring and real sandbox E2E sharding, so behavioral validation should exercise the dispatch/runtime boundary rather than relying only on static review.
  • **Runtime validation:** Selective dispatch with only `channels-stop-start-hermes-e2e` runs `test/e2e/test-channels-stop-start.sh` with `NEMOCLAW_CHANNELS_STOP_START_AGENT=hermes`. The PR changes nightly workflow wiring and real sandbox E2E sharding, so behavioral validation should exercise the dispatch/runtime boundary rather than relying only on static review.
  • **Runtime validation:** `test/e2e/test-channels-stop-start.sh` with default `NEMOCLAW_CHANNELS_STOP_START_AGENT=all` still schedules both OpenClaw and Hermes scenarios. The PR changes nightly workflow wiring and real sandbox E2E sharding, so behavioral validation should exercise the dispatch/runtime boundary rather than relying only on static review.
  • **Runtime validation:** `test/e2e/test-channels-stop-start.sh` exits nonzero before onboarding for an invalid `NEMOCLAW_CHANNELS_STOP_START_AGENT` value. The PR changes nightly workflow wiring and real sandbox E2E sharding, so behavioral validation should exercise the dispatch/runtime boundary rather than relying only on static review.
  • **Runtime validation:** Repo-maintained E2E recommendation docs/config do not reference removed nightly job ids. The PR changes nightly workflow wiring and real sandbox E2E sharding, so behavioral validation should exercise the dispatch/runtime boundary rather than relying only on static review.
  • **Overlayfs applicability still depends on a brittle source-text grep:** Prefer calling a small source-of-truth helper or add a regression check that proves the overlayfs E2E skips when `isLinuxDockerDriverGatewayEnabled('linux')` is true.
  • **Acceptance clause:** Split the nightly `channels-stop-start-e2e` workflow job into `channels-stop-start-openclaw-e2e` and `channels-stop-start-hermes-e2e` with agent-specific env and artifacts. Add test evidence or identify existing coverage. The two jobs, env values, artifact names, and downstream `needs` lists are updated, but repo-maintained dispatch guidance in `.coderabbit.yaml` still references the removed `channels-stop-start-e2e` job.
  • **Overlayfs autofix applicability skip:** Add a check that the overlayfs E2E skip condition tracks `isLinuxDockerDriverGatewayEnabled('linux')` rather than a specific source substring. `test/e2e/test-overlayfs-autofix.sh:134` greps for `platform === "linux"` in `src/lib/onboard/docker-driver-platform.ts`.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

@coderabbitai coderabbitai Bot left a comment

Contributor

🧹 Nitpick comments (1)
test/e2e/test-overlayfs-autofix.sh (1)

134-139: Applicability gate: grep pattern matches; add optional defensive guard

  • test/e2e/test-overlayfs-autofix.sh greps 'platform === "linux"' in src/lib/onboard/docker-driver-platform.ts; that file exists and contains return platform === "linux" || ..., so the Linux skip will trigger as intended.
  • Optional: guard the grep with an -f check to avoid noisy grep: ... No such file or directory if the file is renamed/removed in a future refactor.
🛡️ Suggested fix
-if [ "$(uname -s)" = "Linux" ] && grep -q 'platform === "linux"' "$REPO_ROOT/src/lib/onboard/docker-driver-platform.ts"; then
+if [ "$(uname -s)" = "Linux" ] && [ -f "$REPO_ROOT/src/lib/onboard/docker-driver-platform.ts" ] && grep -q 'platform === "linux"' "$REPO_ROOT/src/lib/onboard/docker-driver-platform.ts"; then
   section "Applicability"
   skip "OpenShell Docker-driver onboarding is active on Linux; k3s overlayfs auto-fix is not in the runtime path"
   print_summary
   exit 0
 fi
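
The guarded gate from the suggested fix can also be written as a small function, which makes it easier to exercise against fixture files. This is only a sketch: `docker_driver_gate_present` is a hypothetical name, and the real script inlines the condition rather than calling a helper.

```shell
# Hypothetical wrapper around the applicability gate from the suggested fix.
# Returns 0 only when the file exists and contains the Linux platform check;
# a missing file returns nonzero without printing a grep error.
docker_driver_gate_present() {
  gate_file="$1"
  [ -f "$gate_file" ] && grep -q 'platform === "linux"' "$gate_file"
}
```

With such a helper, the gate would read `[ "$(uname -s)" = "Linux" ] && docker_driver_gate_present "$REPO_ROOT/src/lib/onboard/docker-driver-platform.ts"`. It still depends on source text, so it does not address the brittleness concern raised elsewhere in this thread.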
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e/test-overlayfs-autofix.sh` around lines 134 - 139, The grep in
test/e2e/test-overlayfs-autofix.sh currently assumes
src/lib/onboard/docker-driver-platform.ts exists and can print "No such file" if
it's missing; update the applicability gate to first check that the file exists
(use a -f test on src/lib/onboard/docker-driver-platform.ts) before running grep
for the pattern 'platform === "linux"'. If the file check fails, skip the grep
and continue normal flow (i.e., do not treat missing file as a grep failure),
keeping the existing skip behavior when the file exists and the pattern matches.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 036d8a60-c6e1-4e77-8d12-7bbb642ed500

📥 Commits

Reviewing files that changed from the base of the PR and between 3d5dab6 and 381ada4.

📒 Files selected for processing (3)
  • .github/workflows/nightly-e2e.yaml
  • test/e2e/test-channels-stop-start.sh
  • test/e2e/test-overlayfs-autofix.sh

@sandl99 sandl99 requested a review from cv June 11, 2026 10:37
@sandl99 sandl99 added the v0.0.64 Release target label Jun 11, 2026
@github-actions

Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 27340814580
Target ref: fix/e2e-nightly-shards
Requested jobs: overlayfs-autofix-e2e,channels-stop-start-openclaw-e2e,channels-stop-start-hermes-e2e
Summary: 3 passed, 0 failed, 0 skipped

| Job | Result |
| --- | --- |
| channels-stop-start-hermes-e2e | ✅ success |
| channels-stop-start-openclaw-e2e | ✅ success |
| overlayfs-autofix-e2e | ✅ success |
