Skip to content

test(cli): trim logs subprocess coverage#4919

Merged
cv merged 2 commits into
mainfrom
codex/logs-runtime-target
Jun 7, 2026
Merged

test(cli): trim logs subprocess coverage#4919
cv merged 2 commits into
mainfrom
codex/logs-runtime-target

Conversation

@cv

@cv cv commented Jun 7, 2026

Copy link
Copy Markdown
Collaborator

Summary

This PR targets the next slow CLI runtime bucket from #4892 by moving non-follow sandbox logs coverage out of subprocess-heavy CLI tests and into fast action/command unit tests. The public logs CLI smoke and follow/signal tests stay in test/cli/logs.test.ts, while source selection, audit setup, degraded-source behavior, merged output, and Docker outage handling are covered through injected dependencies.

Related Issue

Part of #4892

Changes

  • Add showSandboxLogsWithDeps so sandbox logs behavior can be tested with fake OpenShell, Docker, stdout, and exit dependencies.
  • Add action-level logs tests for audit setup, source selection, degraded source warnings, merged output, and Docker outage short-circuiting.
  • Move malformed --tail and --since parser coverage into src/commands/sandbox/logs.test.ts.
  • Trim test/cli/logs.test.ts from 20 subprocess tests to 10 runtime smokes and shorten the remaining timeout/order waits.
  • Local profile: test/cli/logs.test.ts moved from 17.95s / 20 tests to 5.66s / 10 tests; the focused 10-file hot set moved from 14.95s to 12.85s wall-clock.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Carlos Villela cvillela@nvidia.com

Summary by CodeRabbit

  • Tests

    • Enhanced test coverage for sandbox logs, including input validation and log orchestration scenarios
    • Optimized test execution performance
  • Chores

    • Refactored sandbox logs implementation to improve testability through dependency injection

@cv cv self-assigned this Jun 7, 2026
@coderabbitai

coderabbitai Bot commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Warning

Review limit reached

@cv, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 31 minutes and 54 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 6f25efba-03e4-46c2-878d-b1953821bd38

📥 Commits

Reviewing files that changed from the base of the PR and between cf47c2c and 88f83a1.

📒 Files selected for processing (3)
  • src/commands/sandbox/logs.test.ts
  • src/lib/actions/sandbox/logs.test.ts
  • test/cli/logs.test.ts
📝 Walkthrough

Walkthrough

This PR refactors the sandbox logs action to accept injectable runtime dependencies instead of hard-coding process/spawn calls. A new showSandboxLogsWithDeps API enables Docker outage handling with configurable exit and output paths. Comprehensive unit tests validate orchestration, resilience, and CLI input validation. CLI test suite timing and stubs are adjusted for faster execution.

Changes

Sandbox Logs Dependency Injection & Testability

Layer / File(s) Summary
Runtime dependency contract and helper refactoring
src/lib/actions/sandbox/logs.ts
Introduces SandboxLogsRuntimeDeps type capturing optional injections for exit, spawn, env, runOpenshell, and binary resolution. Updates exitWithSpawnResult, runOpenclawGatewayLogs, streamSandboxFollowLogs, and child-process spawning to accept deps and use injected behavior with fallbacks to original imports.
Public API with dependency injection
src/lib/actions/sandbox/logs.ts
Adds exported showSandboxLogsWithDeps(sandboxName, options, deps?) entrypoint; refactors existing showSandboxLogs to delegate. Wires Docker runtime outage preflight with injected classifier/guidance/exit hooks, merged output via injected writeStdout, and terminal exit via exitWithSpawnResult(..., deps).
Unit test harness and integration tests
src/lib/actions/sandbox/logs.test.ts
Adds test helpers (ExitError, CapturedLogsRun, captureLogsRun) mocking console.error and providing callbacks for runOpenshell, writeStdout, and exit. Test suite validates success (audit enabled, logs merged in order), --since targeting (gateway tail skipped), resilience under failures, and Docker outage (guidance printed, exit 1, no probes).
CLI input validation tests
src/commands/sandbox/logs.test.ts
Adds two test cases asserting --tail 0 and --since "someday" are rejected with appropriate error messages and that showSandboxLogs is not invoked.
CLI test suite adjustments
test/cli/logs.test.ts
Removes unused readRecordedArgs import and "plain logs" passthrough test. Updates gateway probe timeout test to use infinite-loop stub and reduces timeout constant to 50ms. Speeds up audit-enable test by reducing stub delay from 1 to 0.05 and elapsed-time assertion from 900ms to 40ms.

Sequence Diagram

sequenceDiagram
  participant Caller
  participant showSandboxLogsWithDeps
  participant DockerPreflight
  participant AuditLogs
  participant GatewayLogs
  participant OpenShellLogs
  participant Output
  
  Caller->>showSandboxLogsWithDeps: (sandboxName, options, deps?)
  showSandboxLogsWithDeps->>DockerPreflight: check if runtime is down
  alt Runtime Down
    DockerPreflight->>Output: print guidance
    DockerPreflight->>Caller: exit(1)
  else Runtime Healthy
    showSandboxLogsWithDeps->>AuditLogs: enable audit logging
    showSandboxLogsWithDeps->>GatewayLogs: probe gateway tail
    showSandboxLogsWithDeps->>OpenShellLogs: probe OpenShell logs
    GatewayLogs->>Output: merge stdout
    OpenShellLogs->>Output: merge stdout
    Output->>Caller: writeStdout result
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#4605: Modifies the same Docker-daemon outage preflight/exit behavior in src/lib/actions/sandbox/logs.ts (main PR refactors to use injected guidance/exit deps, related PR adds Docker-outage detection and messaging).
  • NVIDIA/NemoClaw#4911: Updates SandboxLogsCommand CLI adapter tests validating --tail/--since dispatch shape, overlapping with this PR's new CLI input validation tests.
  • NVIDIA/NemoClaw#4898: Modifies test/cli/logs.test.ts test cases and removes/adjusts the same "plain logs" and timing/timeout scenarios that this PR adjusts.

Suggested labels

area: cli, area: sandbox

Suggested reviewers

  • prekshivyas

Poem

🐰 Dependencies now dance where hardcodes stood still,
Testable logs flow through injected will,
Docker outages caught with graceful grace,
Unit tests bloom in their testing space,
A CLI validated, with checks in place!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'test(cli): trim logs subprocess coverage' accurately describes the main change: reducing subprocess-heavy test coverage by moving non-follow logs tests from the CLI test suite to unit tests with injected dependencies.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch codex/logs-runtime-target

Comment @coderabbitai help to get the list of available commands and usage tips.

@cv cv mentioned this pull request Jun 7, 2026
2 tasks
@github-actions

github-actions Bot commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

E2E Advisor Recommendation

Required E2E: sandbox-operations-e2e
Optional E2E: ubuntu-repo-cloud-openclaw, full

Dispatch hint: sandbox-operations-e2e

Auto-dispatched E2E: sandbox-operations-e2e via nightly-e2e.yaml at 88f83a1981ecfc04adf35b9a92abe1eb418c9915nightly run

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • sandbox-operations-e2e (high; creates live sandboxes and has a 60 minute workflow timeout): Directly covers the affected real sandbox operation: test/e2e/test-sandbox-operations.sh includes TC-SBX-04, which runs nemoclaw <sandbox> logs, verifies non-empty output, starts nemoclaw <sandbox> logs --follow, and checks the follower remains streaming until killed. This is the best existing E2E guard for changes to log source probing, merged log output, follow-mode child process handling, and exit behavior.

Optional E2E

  • ubuntu-repo-cloud-openclaw (high; full scenario runner with onboarding and runtime checks): Scenario-runner coverage for a clean install/onboarded OpenClaw sandbox includes sandbox operations validation (lifecycle.sandbox.logs-and-exec / validation.sandbox_operations.logs_available). Useful as an end-to-end install-to-runtime confidence check if the PR is high risk, but less targeted than sandbox-operations-e2e.
  • full (high; provisions an ephemeral Brev instance and runs the full user journey): Branch validation full E2E runs test/e2e/test-full-e2e.sh, which includes a nemoclaw logs output check during the complete install/onboard/inference user journey. It is useful broad coverage but does not specifically exercise follow-mode or logs flag variants.

New E2E recommendations

  • sandbox logs flag behavior (medium): Existing E2E coverage exercises default logs and logs --follow, but the PR specifically changes/test-refactors behavior around --tail, -n, --since, malformed values, audit enablement degradation, and skipping the gateway tail when --since is used. Those currently appear covered mainly by unit/CLI stub tests, not live sandbox E2E.
    • Suggested test: Add a targeted sandbox logs E2E step that runs nemoclaw <sandbox> logs --tail 50, nemoclaw <sandbox> logs -n 25, and nemoclaw <sandbox> logs --since 5m, then verifies the command exits successfully and produces output without starting the unfiltered gateway tail for --since.

Dispatch hint

  • Workflow: .github/workflows/nightly-e2e.yaml
  • jobs input: sandbox-operations-e2e

@github-actions

github-actions Bot commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

E2E Scenario Advisor Recommendation

Required scenario E2E: ubuntu-repo-cloud-openclaw
Optional scenario E2E: ubuntu-repo-cloud-hermes

Dispatch required scenario E2E:

  • gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw

Workflow run

Full scenario advisor summary

E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required scenario E2E

  • ubuntu-repo-cloud-openclaw: The PR changes the sandbox logs action used by nemoclaw <sandbox> logs. The ubuntu repo/cloud OpenClaw scenario includes the baseline-onboarding suite, whose sandbox-state step validates that logs produce output against a real Docker-backed sandbox.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw

Optional scenario E2E

  • ubuntu-repo-cloud-hermes: Optional adjacent coverage for the same CLI logs surface on the Hermes agent path; useful because log availability can differ by agent, but the OpenClaw Ubuntu scenario is the primary targeted check.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-hermes

Relevant changed files

  • src/lib/actions/sandbox/logs.ts

@github-actions

github-actions Bot commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

PR Review Advisor

Findings: 0 needs attention, 0 worth checking, 0 nice ideas
Since last review: 1 prior item resolved, 0 still apply, 0 new items found

Consider writing more tests for
  • **Acceptance clause:** Part of ci: safely split slow CLI coverage suites #4892 — add test evidence or identify existing coverage. The PR body references ci: safely split slow CLI coverage suites #4892, but no linked issue body or comments for ci: safely split slow CLI coverage suites #4892 were included in the deterministic context, so issue-level acceptance clauses could not be independently verified.
  • **Acceptance clause:** Local profile: `test/cli/logs.test.ts` moved from 17.95s / 20 tests to 5.66s / 10 tests; the focused 10-file hot set moved from 14.95s to 12.85s wall-clock. — add test evidence or identify existing coverage. The timing claims are PR-provided context only; this read-only review did not execute local profiling.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
test/cli/logs.test.ts (1)

64-65: 💤 Low value

Unreachable exit statement in timeout test stub.

The exit 0 on line 65 is unreachable after the infinite loop on line 64. While this doesn't affect test behavior, removing it would make the stub's timeout-forcing intent clearer.

♻️ Proposed cleanup
   const setup = createLogsTestSetup("nemoclaw-cli-logs-openclaw-timeout-", [
     'if [ "$1" = "sandbox" ]; then',
     "  while true; do :; done",
-    "  exit 0",
     "fi",
   ]);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/cli/logs.test.ts` around lines 64 - 65, The stub in the timeout test in
test/cli/logs.test.ts contains an infinite loop string "  while true; do :;
done" followed by an unreachable "  exit 0"; remove the unreachable "  exit 0"
entry (or replace the stub with a single infinite-loop command) so the intent of
forcing a timeout is clear—look for the timeout test that builds the command
array containing those two strings and update it to only include the infinite
loop.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@test/cli/logs.test.ts`:
- Around line 64-65: The stub in the timeout test in test/cli/logs.test.ts
contains an infinite loop string "  while true; do :; done" followed by an
unreachable "  exit 0"; remove the unreachable "  exit 0" entry (or replace the
stub with a single infinite-loop command) so the intent of forcing a timeout is
clear—look for the timeout test that builds the command array containing those
two strings and update it to only include the infinite loop.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 610c8e93-4baf-4fa3-9d97-efb29a53b79e

📥 Commits

Reviewing files that changed from the base of the PR and between 13405a5 and cf47c2c.

📒 Files selected for processing (4)
  • src/commands/sandbox/logs.test.ts
  • src/lib/actions/sandbox/logs.test.ts
  • src/lib/actions/sandbox/logs.ts
  • test/cli/logs.test.ts

@github-actions

github-actions Bot commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 27105642903
Target ref: cf47c2c4b8102621a287c2ca1af6e2dfd04738fb
Workflow ref: main
Requested jobs: sandbox-operations-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job Result
sandbox-operations-e2e ✅ success

@cv

cv commented Jun 7, 2026

Copy link
Copy Markdown
Collaborator Author

Addressed the feedback comments in 88f83a1:

  • Removed the unreachable timeout-stub exit 0 noted by CodeRabbit.
  • Added action-level coverage for logs --follow --tail 50 and logs --follow --since 5m, including the no-unfiltered-gateway-tail behavior for --since.
  • Added command parser coverage for missing --tail, nonnumeric -n foo, and missing --since before the logs action is invoked.

Verification:

  • npx vitest run --project cli src/lib/actions/sandbox/logs.test.ts src/commands/sandbox/logs.test.ts test/cli/logs.test.ts --reporter=verbose
  • npm run typecheck:cli
  • npx prek run --all-files

The push hook also passed for the updated branch.

@cv cv merged commit 0cb2c84 into main Jun 7, 2026
34 checks passed
@cv cv deleted the codex/logs-runtime-target branch June 7, 2026 23:22
@github-actions

github-actions Bot commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 27107687337
Target ref: 88f83a1981ecfc04adf35b9a92abe1eb418c9915
Workflow ref: main
Requested jobs: sandbox-operations-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job Result
sandbox-operations-e2e ✅ success

cv added a commit that referenced this pull request Jun 8, 2026
## Summary
This PR targets the next slow policy runtime bucket identified after
#4919 by moving built-in `policy-add` / `policy-remove` confirmation
coverage out of subprocess-heavy CLI harnesses and into same-process
action tests. Custom preset `--from-file` / `--from-dir` subprocess
coverage stays in `test/policies.test.ts` for a smaller, mechanical
first step.

## Related Issue
Part of #4892

## Changes
- Add `src/lib/actions/sandbox/policy-channel-policy.test.ts` with
mocked same-process coverage for `addSandboxPolicy` and
`removeSandboxPolicy` prompts, declined confirmation, dry-run,
non-interactive mode, agent-specific preset filtering, and messaging
validation guidance.
- Remove the equivalent built-in policy-add/remove subprocess blocks
from `test/policies.test.ts`, reducing the policy suite from 166 tests
in the #4919 artifact to 144 tests in this full local run.
- Ratchet `test/policies.test.ts` in `ci/test-file-size-budget.json`
from 3143 lines to 2763 lines so it cannot silently grow back.
- Leave custom preset `--from-file` / `--from-dir` subprocess coverage
intact for a follow-up target.

## Type of Change

- [x] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates
- [ ] Doc only (prose changes, no code sample modifications)
- [ ] Doc only (includes code sample changes)

## Verification
- [x] `npx prek run --all-files` passes
- [x] `npm test` passes
- [x] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed
- [ ] Docs updated for user-facing behavior changes
- [ ] `npm run docs` builds without warnings (doc changes only)
- [ ] Doc pages follow the [style
guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md)
(doc changes only)
- [ ] New doc pages include SPDX header and frontmatter (new pages only)

---
Signed-off-by: Carlos Villela <cvillela@nvidia.com>
@cv cv added the v0.0.61 Release target label Jun 8, 2026
@wscurran wscurran added the chore Build, CI, dependency, or tooling maintenance label Jun 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

chore Build, CI, dependency, or tooling maintenance v0.0.61 Release target

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants