
fix(cli): surface docker/sandbox container/dashboard port failure layers in sandbox status #4388

Merged
cv merged 14 commits into main from fix/4313-status-docker-unreachable-header on Jun 2, 2026

Conversation

@laitingsheng

@laitingsheng laitingsheng commented May 28, 2026

Contributor

Summary

sandbox status (text and --json) on docker-driver sandboxes now classifies a stopped local stack before the inference probe runs, so the host-side Inference line never falsely reports healthy when the daemon is down, the per-sandbox container is stopped, or the dashboard port is held by a foreign listener. The classification surfaces as a verbatim failure-layer header in human output and a new failureLayer field in the JSON report, with exit code 1 in both paths.

Related Issue

Fixes #4313
Fixes #4515

Changes

  • Add sandbox_container_stopped and sandbox_dashboard_port_conflict layers in gateway-failure-classifier.ts, plus classifySandboxContainerFailure (the dashboard-port probe is gated on the port being an integer in [1, 65535]).
  • Extract resolveSandboxContainerOwner into src/lib/actions/sandbox/sandbox-container-owner.ts; reuse from both gateway-failure-classifier.ts and docker-health.ts so the longest-owner rule is the single source of truth.
  • Add classifySandboxStatusPreflightFailure in status.ts; both showSandboxStatus and getSandboxStatusReport run it first, then pass suppressInferenceProbe into collectSandboxStatusSnapshot via the new maybeGetSandboxStatusInferenceHealth gate so the remote provider probe is never issued when the classifier already reported a failure.
  • Expose failureLayer on SandboxStatusReport; src/commands/sandbox/status.ts sets process.exitCode = 1 when it is non-null.
  • Update docs/reference/commands.mdx to describe the new failure layers.
  • Cover the new behaviour in src/lib/actions/sandbox/status.test.ts, test/gateway-failure-classifier.test.ts, test/sandbox-container-owner.test.ts, and four new --json integration cases in test/cli.test.ts (including a real net.createServer foreign listener for the dashboard-port-conflict path).
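
As a rough illustration of the gating described in the list above, the sketch below shows how a non-null failure layer could suppress the inference probe and drive the exit code. The type shapes and the names `maybeGetInferenceHealth`, `buildReport`, and `exitCodeFor` are simplified stand-ins invented for this example, not the PR's actual signatures.

```typescript
// Hypothetical, simplified shapes; the real SandboxStatusReport has more fields.
type FailureLayer =
  | "docker_unreachable"
  | "sandbox_container_stopped"
  | "sandbox_dashboard_port_conflict";

interface InferenceHealth {
  ok: boolean;
  detail: string;
}

interface StatusReport {
  failureLayer: FailureLayer | null;
  inferenceHealth: InferenceHealth | null;
}

// Gate: when preflight already classified a failure, never call the probe.
function maybeGetInferenceHealth(
  suppressInferenceProbe: boolean,
  probe: () => InferenceHealth,
): InferenceHealth | null {
  return suppressInferenceProbe ? null : probe();
}

// Compose a report: the failure layer decides whether the probe runs at all.
function buildReport(
  failureLayer: FailureLayer | null,
  probe: () => InferenceHealth,
): StatusReport {
  const inferenceHealth = maybeGetInferenceHealth(failureLayer !== null, probe);
  return { failureLayer, inferenceHealth };
}

// Exit code follows the PR description: 1 whenever a failure layer is set.
function exitCodeFor(report: StatusReport): number {
  return report.failureLayer === null ? 0 : 1;
}
```

Passing the probe as a callback keeps the gate testable with a counter, which is one way to assert that the remote provider is never contacted on the failure path.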

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

Summary by CodeRabbit

  • New Features

    • Adds sandbox preflight failure layers (docker_unreachable, sandbox_container_stopped, sandbox_dashboard_port_conflict). Exposed in status JSON as failureLayer; when set, inferenceHealth is reported as null and a single "Failure layer:" header appears in text output. Improved container ownership resolution and dashboard-port conflict detection.
  • Bug Fixes

    • CLI exit code now reflects preflight failures and suppresses host inference output when a preflight failure is present.
  • Documentation

    • Status docs updated to describe failureLayer values, inference suppression, and non-zero exit behavior.
  • Tests

    • Expanded CLI and unit tests covering failure layers, port probing, ownership resolution, and JSON semantics.

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
@coderabbitai

coderabbitai Bot commented May 28, 2026

Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds upfront preflight classification for Docker-driver sandboxes (daemon unreachable, container stopped, dashboard-port conflict), threads inference-probe suppression into snapshot collection, exposes failureLayer in JSON and CLI output, refactors header printing to avoid duplicates, and adds unit tests, CLI tests, and docs.

Changes

Docker Unreachable Detection

| Layer / File(s) | Summary |
| --- | --- |
| Documentation update for status command: docs/reference/commands.mdx | Documents docker_unreachable, sandbox_container_stopped, and sandbox_dashboard_port_conflict behaviors, first-line header semantics, inference suppression, and non-zero exit behavior. |
| Gateway failure classifier: types & implementation: src/lib/actions/sandbox/gateway-failure-classifier.ts | Extends GatewayFailureLayer with sandbox container failure layers, adds isDockerDaemonReachable(), new sandbox container failure types/runners, default runners, dashboard-port validation, and classifySandboxContainerFailure(). |
| Sandbox container owner resolver: src/lib/actions/sandbox/sandbox-container-owner.ts | Adds resolveSandboxContainerOwner() that parses docker ps names and resolves ownership using exact and longest-owner prefix rules. |
| Sandbox owner unit tests: test/sandbox-container-owner.test.ts | Tests exact/prefixed matching, longest-owner resolution, known-owner edge cases, and input sanitation. |
| Docker-health integration: src/lib/actions/sandbox/docker-health.ts | Replaces in-function container-name resolution with resolveSandboxContainerOwner(...) wired to docker ps names and registered sandbox names. |
| Preflight classification & header printing: src/lib/actions/sandbox/status-preflight.ts | Implements preflight probe/deps, preflight classifier, JSON-compatible preflight result, terminal-phase suppression, and preflight-aware header printing to avoid duplicate Failure layer: lines. |
| Status snapshot & inference gating: src/lib/actions/sandbox/status-snapshot.ts | Adds snapshot/report contracts, getSandboxStatusInferenceHealth, gating wrapper maybeGetSandboxStatusInferenceHealth, snapshot collection, provider/model parsing, and getSandboxStatusReport composition including failureLayer. |
| Status flow: inference gating and failure output: src/lib/actions/sandbox/status.ts | Re-exports preflight/snapshot helpers, removes local snapshot implementations, runs preflight before snapshot collection, threads suppressInferenceProbe into collectSandboxStatusSnapshot, refactors header printing, and sets exit code on preflight failures. |
| Gateway-failure classifier tests: test/gateway-failure-classifier.test.ts | Adds tests for new sandbox container failure classification, dashboard-port probing, container-name matching, longest-owner collision behavior, and invalid dashboardPort handling. |
| Status helper tests: src/lib/actions/sandbox/status.test.ts | Adds unit tests for isDockerDaemonUnreachableForStatus(), classifySandboxContainerFailureForStatus(), maybeGetSandboxStatusInferenceHealth(), and classifySandboxStatusPreflightFailure(). |
| CLI tests for sandbox status JSON/text: test/cli.test.ts | Adds non-JSON and --json tests asserting failure-layer headers, inference suppression, ordering relative to Sandbox: <name>, and exit-code behavior; includes TCP listener use for port-conflict simulation. |
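
The "longest-owner" rule mentioned for resolveSandboxContainerOwner() is not spelled out in this conversation, so the following is only one plausible reading: when several registered sandbox names are prefixes of the same container name, the longest name wins. The `openshell-` prefix is taken from the advisor's mention of `openshell-<sandbox>` containers; the `-` separator and the function name `resolveOwner` are assumptions made for illustration.

```typescript
// Hypothetical sketch; the prefix and the separator rule are assumptions.
const CONTAINER_PREFIX = "openshell-";

function resolveOwner(
  containerName: string,
  sandboxNames: readonly string[],
): string | null {
  if (!containerName.startsWith(CONTAINER_PREFIX)) return null;
  const suffix = containerName.slice(CONTAINER_PREFIX.length);

  let owner: string | null = null;
  for (const name of sandboxNames) {
    if (name.length === 0) continue; // ignore empty registry entries
    // Exact match, or the sandbox name followed by a "-" separator.
    const matches = suffix === name || suffix.startsWith(`${name}-`);
    // Among all matching names, keep the longest one.
    if (matches && (owner === null || name.length > owner.length)) {
      owner = name;
    }
  }
  return owner;
}
```

With sandboxes `alpha` and `alpha-beta` registered, a container named `openshell-alpha-beta-1` matches both names, and the longer `alpha-beta` is chosen so that `alpha` does not claim a container it does not own.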

Sequence Diagram(s)

sequenceDiagram
  participant CLI as showSandboxStatus
  participant Preflight as classifySandboxStatusPreflightFailure
  participant Docker as isDockerDaemonReachable
  participant ContainerClassifier as classifySandboxContainerFailure
  participant PS as dockerPsNames
  participant Port as portProbe

  CLI->>Preflight: ask preflight(sandbox)
  Preflight->>Docker: check docker daemon reachability
  alt docker unreachable
    Docker-->>Preflight: unreachable -> docker_unreachable
  else docker reachable
    Preflight->>ContainerClassifier: classify sandbox container
    ContainerClassifier->>PS: list container names
    ContainerClassifier->>Port: probe dashboardPort (if valid)
    Port-->>ContainerClassifier: port held / free
  end
  Preflight-->>CLI: preflight result (failureLayer or null)
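
The ordering in the diagram can be sketched as a small pure function over injected probes. This is an approximation under stated assumptions: the probe interface and the names `classifyPreflight` and `isValidDashboardPort` are hypothetical, and treating the port probe as relevant only when the container is not running is inferred from the E2E advisor's description rather than from the PR's code.

```typescript
type FailureLayer =
  | "docker_unreachable"
  | "sandbox_container_stopped"
  | "sandbox_dashboard_port_conflict";

// Hypothetical injected probes so the ordering can be tested without Docker.
interface PreflightProbes {
  dockerReachable: () => boolean;
  containerRunning: () => boolean;
  portHeld: (port: number) => boolean;
}

// The port probe is only attempted for integers in [1, 65535].
function isValidDashboardPort(port: unknown): port is number {
  return (
    typeof port === "number" &&
    Number.isInteger(port) &&
    port >= 1 &&
    port <= 65535
  );
}

function classifyPreflight(
  probes: PreflightProbes,
  dashboardPort: unknown,
): FailureLayer | null {
  // 1. A down daemon short-circuits everything else.
  if (!probes.dockerReachable()) return "docker_unreachable";
  // 2. A running container means no preflight failure.
  if (probes.containerRunning()) return null;
  // 3. Stopped container: escalate if a foreign listener holds the port.
  if (isValidDashboardPort(dashboardPort) && probes.portHeld(dashboardPort)) {
    return "sandbox_dashboard_port_conflict";
  }
  return "sandbox_container_stopped";
}
```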

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

Docker, NemoClaw CLI

Suggested reviewers

  • cv
  • cjagwani

"I hopped the socket, nosed the pipe,
Found the daemon sleeping through the night.
I print the header, bold and clear,
Suppress stale probes, set exit gear.
Now hop to it — the bug won't fix itself!"

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 32.26% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main objective: surfacing docker/sandbox container/dashboard port failure layers in the sandbox status command. |
| Linked Issues check | ✅ Passed | The PR fully implements the requirements from both linked issues: adds docker_unreachable, sandbox_container_stopped, and sandbox_dashboard_port_conflict layers; suppresses inference probe when failures detected; exits non-zero on failures; accurately resolves container ownership via longest-owner rule; and updates docs and tests. |
| Out of Scope Changes check | ✅ Passed | All changes are within scope. New modules (status-preflight.ts, status-snapshot.ts, sandbox-container-owner.ts) and modifications logically support the failure-layer classification objective without introducing unrelated features. |


@github-actions

github-actions Bot commented May 28, 2026

Contributor

E2E Advisor Recommendation

Required E2E: sandbox-operations-e2e
Optional E2E: diagnostics-e2e, gateway-drift-preflight-e2e

Dispatch hint: sandbox-operations-e2e

Auto-dispatched E2E: sandbox-operations-e2e via nightly-e2e.yaml at 21a0c8f2619f389818ff05a7b635599526ffb5e2 (nightly run)

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • sandbox-operations-e2e (high (~60 min; installs/onboards live sandboxes)): Required because the PR changes live sandbox status, Docker-backed sandbox lifecycle diagnostics, and gateway recovery/status paths. This existing live E2E exercises nemoclaw <sandbox> status, status fields, process recovery, and gateway recovery against real OpenShell/Docker sandbox state.

Optional E2E

  • diagnostics-e2e (medium/high (live install + onboarding)): Useful adjacent confidence because it runs a live sandbox and checks nemoclaw status output as part of diagnostics validation, but it is less targeted than sandbox-operations for the changed preflight/container failure behavior.
  • gateway-drift-preflight-e2e (low/medium (~15 min; hermetic fakes)): Optional hermetic regression coverage for fail-closed gateway/preflight diagnostics. It is adjacent to gateway-state health classification, but it does not directly exercise the new per-sandbox stopped-container or dashboard-port-conflict status layers.

New E2E recommendations

  • sandbox status preflight failure layers (high): No existing E2E appears to directly stop a Docker-driver per-sandbox container while keeping the registry entry and then assert failureLayer=sandbox_container_stopped, non-zero exit, first-line text header, and suppressed inferenceHealth. Current coverage is primarily unit/CLI-fake tests plus broader live sandbox status checks.
    • Suggested test: Add a focused regression E2E job sandbox-status-preflight-failure-e2e that creates or fakes a registered docker-driver sandbox, simulates a stopped openshell-<sandbox> container, runs text and JSON status, and verifies failureLayer/exit-code/inference suppression.
  • sandbox dashboard port conflict classification (high): No existing E2E directly binds a foreign listener to the recorded dashboard port while the sandbox container is stopped and asserts escalation to sandbox_dashboard_port_conflict. This is a distinct operator-facing recovery path from generic gateway health.
    • Suggested test: Extend the proposed sandbox-status-preflight-failure-e2e or add a second case that reserves the dashboard port with a local listener and verifies JSON/text status reports sandbox_dashboard_port_conflict exactly once.
  • Docker daemon outage after sandbox registration (medium): Existing no-Docker scenario covers onboarding preflight without Docker, not the post-onboard status case where a docker-driver sandbox is registered and docker info becomes unreachable. The PR changes this specific post-registration status contract.
    • Suggested test: Add a typed negative scenario or hermetic regression case that seeds a registered docker-driver sandbox, makes Docker unreachable, and verifies nemoclaw <name> status and status --json return non-zero with failureLayer=docker_unreachable and no provider health probe.
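
For the dashboard-port-conflict case suggested above, a test needs a foreign listener on the recorded port and some way to tell that the port is taken. The sketch below uses Node's `net` module for both. The helper names are hypothetical, and probing by attempting to bind is an assumption for illustration; the PR's own port probe may work differently (for example by connecting instead of binding).

```typescript
import * as net from "node:net";

// Resolves true when another listener already holds `port` on `host`.
function isPortHeld(port: number, host = "127.0.0.1"): Promise<boolean> {
  return new Promise((resolve, reject) => {
    const tester = net.createServer();
    tester.once("error", (err: Error) => {
      // EADDRINUSE means someone else is bound; other errors are unexpected.
      const code = (err as NodeJS.ErrnoException).code;
      if (code === "EADDRINUSE") resolve(true);
      else reject(err);
    });
    tester.listen(port, host, () => {
      // We managed to bind, so the port was free; release it again.
      tester.close(() => resolve(false));
    });
  });
}

// Starts a throwaway "foreign" listener on an OS-assigned port.
function startForeignListener(
  host = "127.0.0.1",
): Promise<{ port: number; close: () => Promise<void> }> {
  return new Promise((resolve, reject) => {
    const server = net.createServer();
    server.once("error", reject);
    server.listen(0, host, () => {
      const address = server.address();
      if (address === null || typeof address === "string") {
        server.close();
        reject(new Error("unexpected address shape"));
        return;
      }
      resolve({
        port: address.port,
        close: () => new Promise<void>((done) => server.close(() => done())),
      });
    });
  });
}
```

A test would start the listener, write its port into the seeded sandbox registry entry as the dashboard port, run the status command, and close the listener in teardown. Letting the OS assign the port avoids collisions with other jobs on a shared runner.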

Dispatch hint

  • Workflow: nightly-e2e.yaml
  • jobs input: sandbox-operations-e2e

@github-actions

github-actions Bot commented May 28, 2026

Contributor

E2E Scenario Advisor Recommendation

Required scenario E2E: ubuntu-repo-cloud-openclaw
Optional scenario E2E: ubuntu-repo-cloud-hermes, ubuntu-no-docker-preflight-negative

Dispatch required scenario E2E:

  • gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw

Workflow run

Full scenario advisor summary

E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: medium

Required scenario E2E

  • ubuntu-repo-cloud-openclaw: The PR changes the sandbox status command and Docker/gateway failure classification paths. The ubuntu repo cloud OpenClaw scenario is the primary Docker-backed scenario whose baseline-onboarding suite calls nemoclaw <sandbox> status, exercising the status rendering/report path on the standard Ubuntu Docker runner.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw

Optional scenario E2E

  • ubuntu-repo-cloud-hermes: Adjacent coverage for the same baseline status check with the Hermes agent/onboarding profile. Useful if you want to confirm the refactored status path remains agent-agnostic, but the OpenClaw Ubuntu Docker scenario is the primary target.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-hermes
  • ubuntu-no-docker-preflight-negative: Adjacent negative Docker-runtime coverage. This scenario validates the no-Docker preflight family, which is related to the docker_unreachable classification changed here, though it does not directly exercise the post-onboarding sandbox status failure-layer path.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-no-docker-preflight-negative

Relevant changed files

  • src/commands/sandbox/status.ts
  • src/lib/actions/sandbox/docker-health.ts
  • src/lib/actions/sandbox/gateway-failure-classifier.ts
  • src/lib/actions/sandbox/sandbox-container-owner.ts
  • src/lib/actions/sandbox/status-preflight.ts
  • src/lib/actions/sandbox/status-snapshot.ts
  • src/lib/actions/sandbox/status.ts

@github-actions

github-actions Bot commented May 28, 2026

Contributor

PR Review Advisor

Findings: 0 needs attention, 2 worth checking, 0 nice ideas
Since last review: 1 prior item resolved, 1 still applies, 0 new items found

Review findings

🛠️ Needs attention

  • None.

🔎 Worth checking

  • Source-of-truth review needed: src/lib/actions/sandbox/status-preflight.ts driver gate: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: `isDockerDaemonUnreachableForStatus()` and `classifySandboxContainerFailureForStatus()` require `sb.openshellDriver === "docker"`; prior tests for `isDockerRuntimeDown()` preserve legacy omitted/null behavior.
  • Status preflight still skips legacy Docker-backed registry entries (src/lib/actions/sandbox/status-preflight.ts:70): The new upfront status preflight only runs when `sb.openshellDriver === "docker"`. Nearby existing runtime-down logic deliberately treats omitted or null driver metadata as Docker-backed legacy/recovered entries, so older Docker sandboxes can still miss the new `docker_unreachable` header and inference-probe suppression path.
    • Recommendation: Reuse the same Docker-backed driver predicate as `isDockerRuntimeDown()` for status preflight, or explicitly document and test why omitted/null legacy entries should not receive the new preflight behavior. Add text/JSON or unit coverage for omitted `openshellDriver` and `openshellDriver: null`.
    • Evidence: `isDockerDaemonUnreachableForStatus()` and `classifySandboxContainerFailureForStatus()` gate on `sb.openshellDriver !== "docker"`, while `gateway-failure-classifier.ts:isDockerRuntimeDown()` treats only explicit `vm` as non-Docker and includes tests for legacy/recovered entries without driver metadata.
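
One way to read the recommendation above is a single shared predicate that both `isDockerRuntimeDown()` and the status preflight could call. The sketch below is hypothetical: `SandboxEntry` is a cut-down stand-in for the registry entry, and `isDockerBackedSandbox` and `shouldRunStatusPreflight` are invented names. It encodes only what the evidence line states, namely that an explicit `vm` driver is the sole non-Docker case and that omitted or null metadata counts as Docker-backed.

```typescript
// Hypothetical, minimal view of a sandbox registry entry.
interface SandboxEntry {
  name: string;
  openshellDriver?: string | null;
}

// Legacy/recovered entries without driver metadata count as Docker-backed;
// only an explicit "vm" driver opts out.
function isDockerBackedSandbox(sb: SandboxEntry): boolean {
  return sb.openshellDriver !== "vm";
}

// The status preflight would then gate on the shared predicate instead of
// a strict `openshellDriver === "docker"` comparison.
function shouldRunStatusPreflight(sb: SandboxEntry): boolean {
  return isDockerBackedSandbox(sb);
}
```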

🌱 Nice ideas

  • None.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

@coderabbitai coderabbitai Bot left a comment

Contributor

🧹 Nitpick comments (1)
test/cli.test.ts (1)

1003-1052: ⚡ Quick win

Strengthen the non-docker-driver assertion coverage.

This test currently proves only header absence, not that inference probing remains enabled. Please also assert normal (non-error) status behavior for this path.

Suggested diff
   const r = runWithEnv("alpha status", {
     HOME: home,
     PATH: `${localBin}:${process.env.PATH || ""}`,
   });

+  expect(r.code).toBe(0);
+  expect(r.out).toContain("Inference:");
   expect(r.out).not.toContain(
     "Failure layer: docker_unreachable",
   );
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/cli.test.ts` around lines 1003 - 1052, The test "sandbox <name> status
preserves Inference probe when openshellDriver is not docker" currently only
asserts the absence of "Failure layer: docker_unreachable"; update the test
(around runWithEnv("alpha status") / writeSandboxRegistry) to also assert that
inference probing and normal status output are present by checking r.out
contains the inference probe lines (e.g., "Gateway inference:" and "Provider:
openai-api" or "Model: gpt-4o-mini") and that normal gateway status appears
(e.g., "Status: Connected"), using the same runWithEnv result and existing
expect APIs so you verify probing remains enabled in non-docker openshellDriver
cases.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 8b9c4ead-b564-4518-bbce-7e09c2c07300

📥 Commits

Reviewing files that changed from the base of the PR and between 1daf081 and 5811f8d.

📒 Files selected for processing (3)
  • src/lib/actions/sandbox/status.test.ts
  • src/lib/actions/sandbox/status.ts
  • test/cli.test.ts

@github-actions

Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26551292928
Target ref: 5811f8d5a3327e07e126109875c8ba9256504454
Workflow ref: main
Requested jobs: sandbox-operations-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job Result
sandbox-operations-e2e ✅ success

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
@github-actions

Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26552024714
Target ref: 2da0d96682cf4f98ce9e44d2a5c7dca4c73bc331
Workflow ref: main
Requested jobs: sandbox-operations-e2e,diagnostics-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job Result
diagnostics-e2e ✅ success
sandbox-operations-e2e ✅ success

@jyaunches jyaunches added R2 v0.0.56 Release target and removed v0.0.55 labels May 29, 2026
@laitingsheng laitingsheng removed the v0.0.56 Release target label May 30, 2026
Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
…ct in status

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
@laitingsheng laitingsheng changed the title from "fix(cli): emit docker_unreachable header upfront in sandbox status" to "fix(cli): surface docker/sandbox container/dashboard port failure layers in sandbox status" May 30, 2026
@github-actions

Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26676934408
Target ref: 5a98bdf9b0c7b04e1a5c7430d10212014a210b71
Workflow ref: main
Requested jobs: sandbox-operations-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job Result
sandbox-operations-e2e ✅ success

…hboardPort

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
@github-actions

Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26679093934
Target ref: b0b0fd448fe497af40c13a99f219d2a08cd74f37
Workflow ref: main
Requested jobs: sandbox-operations-e2e,cloud-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job Result
cloud-e2e ✅ success
sandbox-operations-e2e ✅ success

…layers and expose failureLayer to status --json

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
@github-actions

Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26680162950
Target ref: 39ee9e45b509fef4e6cead1b280c0a96a12305f3
Workflow ref: main
Requested jobs: sandbox-operations-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job Result
sandbox-operations-e2e ✅ success

…itions

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
@github-actions

Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26680613479
Target ref: 9a8a3661a042371148132d84ebef9dfa7134bd50
Workflow ref: main
Requested jobs: sandbox-operations-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job Result
sandbox-operations-e2e ✅ success

…inted layer

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
@laitingsheng laitingsheng added the v0.0.56 Release target label May 30, 2026
@laitingsheng laitingsheng added the v0.0.56 Release target label May 31, 2026
@github-actions

Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26704812164
Target ref: 7b2e44ca98be4df2cf0ee0a86e091858eecd09bc
Workflow ref: main
Requested jobs: sandbox-operations-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job Result
sandbox-operations-e2e ✅ success

@cv cv added v0.0.57 Release target and removed v0.0.56 Release target labels Jun 1, 2026
@github-actions

github-actions Bot commented Jun 2, 2026

Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26790614231
Target ref: db2221c421df1e15b468cccae8f5b4fb068b3f39
Workflow ref: main
Requested jobs: sandbox-operations-e2e
Summary: 0 passed, 0 failed, 0 skipped

Job Result
sandbox-operations-e2e ⚠️ cancelled

@coderabbitai coderabbitai Bot left a comment

Contributor

🧹 Nitpick comments (1)
src/lib/actions/sandbox/status-preflight.ts (1)

129-154: ⚡ Quick win

Misplaced JSDoc: the "Print the exact first-line preflight header" comment is attached to withoutTerminalPhasePreflight.

The doc block at Lines 129-133 describes header printing, but it sits above withoutTerminalPhasePreflight (which clears terminal-phase preflight state and prints nothing). The function it actually describes, printSandboxStatusPreflightHeader at Line 147, is left undocumented.

📝 Proposed fix to move the doc to the correct function
-/**
- * Print the exact first-line preflight header. Unlike gateway-level fallback
- * headers this intentionally has no leading indentation because users and
- * tests rely on `docker_unreachable` being the first bytes of status output.
- */
 export function withoutTerminalPhasePreflight(
   preflight: SandboxStatusPreflightResult,
   phase: string | null,
 ): SandboxStatusPreflightResult {
   if (!phase || !isTerminalSandboxPhase(phase)) return preflight;
   return {
     failure: null,
     failureLayer: null,
     suppressInferenceProbe: preflight.suppressInferenceProbe,
     exitCode: 0,
   };
 }

+/**
+ * Print the exact first-line preflight header. Unlike gateway-level fallback
+ * headers this intentionally has no leading indentation because users and
+ * tests rely on `docker_unreachable` being the first bytes of status output.
+ */
 export function printSandboxStatusPreflightHeader(
   preflight: SandboxStatusPreflightResult,
   writer: (message: string) => void = console.log,
 ): void {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/actions/sandbox/status-preflight.ts` around lines 129 - 154, The
JSDoc describing "Print the exact first-line preflight header" is incorrectly
placed above withoutTerminalPhasePreflight; move that doc block to directly
above printSandboxStatusPreflightHeader and update its wording to describe
header printing (mention getLayerHeader and writer callback). Leave
withoutTerminalPhasePreflight either undocumented or add a short doc describing
that it clears terminal-phase preflight state (mention isTerminalSandboxPhase
and returned fields failure/failureLayer/suppressInferenceProbe/exitCode) so
intent remains clear.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 77de6f99-3f42-4d19-8b08-e66c11e8be42

📥 Commits

Reviewing files that changed from the base of the PR and between 7b2e44c and db2221c.

📒 Files selected for processing (8)
  • docs/reference/commands.mdx
  • src/lib/actions/sandbox/docker-health.ts
  • src/lib/actions/sandbox/gateway-failure-classifier.ts
  • src/lib/actions/sandbox/status-preflight.ts
  • src/lib/actions/sandbox/status-snapshot.ts
  • src/lib/actions/sandbox/status.ts
  • test/cli.test.ts
  • test/gateway-failure-classifier.test.ts
💤 Files with no reviewable changes (2)
  • test/gateway-failure-classifier.test.ts
  • test/cli.test.ts
✅ Files skipped from review due to trivial changes (1)
  • docs/reference/commands.mdx
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/lib/actions/sandbox/docker-health.ts
  • src/lib/actions/sandbox/gateway-failure-classifier.ts
  • src/lib/actions/sandbox/status.ts

@cv cv enabled auto-merge (squash) June 2, 2026 00:42
@github-actions

github-actions Bot commented Jun 2, 2026

Contributor

Selective E2E Results — ❌ Some jobs failed

Run: 26790735708
Target ref: db2221c421df1e15b468cccae8f5b4fb068b3f39
Workflow ref: main
Requested jobs: sandbox-operations-e2e
Summary: 0 passed, 1 failed, 0 skipped

Job Result
sandbox-operations-e2e ❌ failure

Failed jobs: sandbox-operations-e2e. Check run artifacts for logs.

@cv

cv commented Jun 2, 2026

Collaborator

Addressed the updated PR Review Advisor finding about Phase: Error masking the per-sandbox preflight layers.

What changed:

Validated locally:

  • npm run build:cli
  • npm run typecheck:cli
  • npm run source-shape:check
  • npx vitest run test/cli.test.ts src/lib/actions/sandbox/status.test.ts

Pushed head: 46b16be2fc2bc432b030525ab4d41a34c70b3d08.

@github-actions

github-actions Bot commented Jun 2, 2026

Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26791270599
Target ref: db35fa39d3f2ce7dca284cb9f9da29d6b9af23a2
Workflow ref: main
Requested jobs: sandbox-operations-e2e
Summary: 0 passed, 0 failed, 0 skipped

Job Result
sandbox-operations-e2e ⚠️ cancelled

@github-actions

github-actions Bot commented Jun 2, 2026

Contributor

Selective E2E Results — ❌ Some jobs failed

Run: 26791599457
Target ref: 46b16be2fc2bc432b030525ab4d41a34c70b3d08
Workflow ref: main
Requested jobs: sandbox-operations-e2e
Summary: 0 passed, 1 failed, 0 skipped

Job Result
sandbox-operations-e2e ❌ failure

Failed jobs: sandbox-operations-e2e. Check run artifacts for logs.

cv pushed a commit that referenced this pull request Jun 2, 2026
…rk (#4640)

## Summary

`TC-SBX-09: Tmux Session Flow` has been failing on every scheduled
nightly E2E run since #4606 merged. The first assertion (tmux binary
present) still passes; the second assertion — drive a full detached
`new-session` → `list-sessions` → `kill-session` cycle inside the
sandbox — consistently fails with `create window failed: fork failed:
Permission denied`.

Root cause: OpenShell sandbox hardening (seccomp, `no-new-privileges`,
`nproc=512` ulimit) blocks tmux's fork-to-spawn child window when
invoked under the e2e SSH session account. The binary-presence assertion
already covers the surface of issue #4513; the lifecycle drive depends
on sandbox runtime capabilities that are environment-dependent and out
of scope for this case. Degrade that branch to `skip` with the observed
`fork failed` output so the suite reports the limitation without failing
the nightly.

Latest failing scheduled nightly: [run
26790528855](https://github.com/NVIDIA/NemoClaw/actions/runs/26790528855).
Same failure also blocks PR review on
[#4388](#4388) via inherited
advisor reruns [run
26790735708](https://github.com/NVIDIA/NemoClaw/actions/runs/26790735708)
and [run
26791599457](https://github.com/NVIDIA/NemoClaw/actions/runs/26791599457).

## Related Issue

Follow-up to #4606 (which added TC-SBX-09 alongside the sandbox-image
tmux pin). The PR body of #4606 noted *"A full image-build +
live-sandbox E2E was not run in this environment"* — the lifecycle drive
added by that PR turned out to be incompatible with the live OpenShell
sandbox seccomp + capability profile, so every scheduled `E2E / Nightly`
run since the merge has reported `sandbox-operations-e2e` as failing on
this single assertion. This PR keeps the binary-presence guard from
#4606 intact (the actual surface of #4513) while making the lifecycle
drive a soft skip when the sandbox refuses to fork, so the nightly
pipeline can go green again without masking real regressions (a
non-`fork failed` error still hits the `fail` branch).

## Changes

- `test/e2e/test-sandbox-operations.sh`: in
`test_sbx_09_tmux_session_flow`, add a `skip` branch matching `fork
failed: (Permission denied|Resource temporarily unavailable)` between
the existing `pass`/`fail` branches; keeps best-effort `kill-session`
cleanup before recording the skip.
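
The three-way outcome described above could be sketched roughly as follows. This is a minimal illustration, not the actual body of `test_sbx_09_tmux_session_flow`: the function name `classify_tmux_lifecycle` and the `tmux-lifecycle-ok` success marker are hypothetical, and only the `fork failed` pattern is taken from the change description.

```shell
# Hypothetical helper (not from the real script): map the captured output of
# the tmux lifecycle drive to pass / skip / fail.
classify_tmux_lifecycle() {
  # $1: combined stdout/stderr captured from the lifecycle drive.
  # 'tmux-lifecycle-ok' is an assumed success marker for this sketch.
  if printf '%s\n' "$1" | grep -q 'tmux-lifecycle-ok'; then
    echo pass
  # Sandbox hardening refused the fork: report a skip instead of a failure.
  elif printf '%s\n' "$1" | grep -Eq 'fork failed: (Permission denied|Resource temporarily unavailable)'; then
    echo skip
  # Any other error still counts as a real failure.
  else
    echo fail
  fi
}

classify_tmux_lifecycle 'create window failed: fork failed: Permission denied'  # prints "skip"
classify_tmux_lifecycle 'no server running on /tmp/tmux-1000/default'           # prints "fail"
```

Checking the success case first and the `fork failed` pattern second keeps the skip narrow: output that matches neither falls through to `fail`, which is how a non-`fork failed` error continues to fail the suite.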

## Type of Change

- [x] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates
- [ ] Doc only (prose changes, no code sample modifications)
- [ ] Doc only (includes code sample changes)

## Verification

- [x] `npx prek run --all-files` passes
- [x] `npm test` passes
- [x] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed
- [ ] Docs updated for user-facing behavior changes
- [ ] `npm run docs` builds without warnings (doc changes only)
- [ ] Doc pages follow the [style
guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md)
(doc changes only)
- [ ] New doc pages include SPDX header and frontmatter (new pages only)

---
Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Tests**
  * Improved tmux sandbox operations test to better detect and handle fork
    failures with enhanced error recovery and clearer skip messages.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
@cv

cv commented Jun 2, 2026

Collaborator

Merged latest main after #4640 landed, so PR 4388 now includes the TC-SBX-09 tmux fork-policy skip.

New head: 21a0c8f2619f389818ff05a7b635599526ffb5e2.

Push-time hooks passed, including TypeScript CLI and source-shape budget. CI/E2E advisor checks are pending on the new head.

@cv cv merged commit bb979ee into main Jun 2, 2026
30 checks passed
@cv cv deleted the fix/4313-status-docker-unreachable-header branch June 2, 2026 03:08
@github-actions

github-actions Bot commented Jun 2, 2026

Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26795813868
Target ref: 21a0c8f2619f389818ff05a7b635599526ffb5e2
Workflow ref: main
Requested jobs: sandbox-operations-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job Result
sandbox-operations-e2e ✅ success

@wscurran wscurran added the area: sandbox (OpenShell sandbox lifecycle, runtime, config, or recovery) and bug-fix (PR fixes a bug or regression) labels Jun 3, 2026
@wscurran wscurran removed fix labels Jun 3, 2026
cv pushed a commit that referenced this pull request Jun 3, 2026
## Summary
- Add the missing `v0.0.57` release-notes section with links to the
detailed docs pages for command, inference, onboarding, messaging,
status, installer, and policy changes.
- Remove public references to docs-skip terms from source docs and
regenerate the NemoClaw user skills from the current Fern MDX docs.
- Carry forward generated references for the per-agent documentation
split, including Hermes-specific reference files.

## Source summary
- #4615 and #4653 -> `docs/about/release-notes.mdx`,
`docs/reference/commands.mdx`: Release notes now cover host-side
`sessions` and `agents` commands plus `NEMOCLAW_EXTRA_AGENTS_JSON`
secondary-agent baking.
- #4163, #4204, #4611, #4619, and #4676 ->
`docs/about/release-notes.mdx`,
`docs/inference/use-local-inference.mdx`: Release notes now cover
managed vLLM progress/readiness, DGX Spark model default changes, local
Ollama streaming usage, and inference route divergence warnings.
- #4267, #4601, #4609, #4642, #4645, and #4661 ->
`docs/about/release-notes.mdx`, `docs/reference/commands.mdx`: Release
notes now cover UFW auto-remediation, local-inference reachability
gates, gateway reuse/binding, cancel rollback, and policy selection
persistence.
- #4577, #4582, #4607, and #4660 -> `docs/about/release-notes.mdx`,
`docs/manage-sandboxes/messaging-channels.mdx`: Release notes now cover
Slack validation, atomic `channels add`, WhatsApp QR diagnostics, and
Slack placeholder normalization.
- #4388, #4600, #4646, and #4647 -> `docs/about/release-notes.mdx`,
`docs/reference/commands.mdx`: Release notes now cover status failure
layers, paused-container hints, Docker-driver doctor behavior, and
non-destructive stale-registry recovery.
- #4569, #4579, and #4678 -> `docs/about/release-notes.mdx`,
`docs/manage-sandboxes/lifecycle.mdx`,
`docs/network-policy/integration-policy-examples.mdx`: Release notes now
cover installer tag pinning, PyPI `uv` policy access, and observable
Jira validation.
- #4632 -> `.agents/skills/`: Regenerated user skills from the current
per-agent docs source, including newly generated Hermes reference files.

## Verification
- `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix
nemoclaw-user --doc-platform fern-mdx`
- `rg "permissive mode|shields down|shields up|shields status|config
rotate-token|rotate-token" docs --glob "*.mdx"`
- `rg "permissive mode|shields down|shields up|shields status|config
rotate-token|rotate-token" .agents/skills --glob "*.md"`
- `npm run docs`
- `npm run build:cli`
- Commit hooks: markdownlint, docs-to-skills verification, gitleaks,
skills YAML, commitlint

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Documentation**
  * Restructured documentation to clearly distinguish OpenClaw and Hermes
    agent variants throughout user guides.
  * Enhanced security, credential storage, and deployment guidance with
    clearer setup flows.
  * Added Hermes plugin installation and ecosystem documentation.
  * Improved workspace, messaging, and policy management references with
    variant-specific command examples.
  * Refined troubleshooting and CLI reference sections for clarity.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Labels

  • area: sandbox (OpenShell sandbox lifecycle, runtime, config, or recovery)
  • bug-fix (PR fixes a bug or regression)
  • v0.0.57 (Release target)


4 participants