Surface explicit task status and attention on background tasks by chumyin · Pull Request #1178 · eastreams/loong

chumyin · 2026-04-10T04:36:19Z

Summary

Problem:
- loong tasks already exposed truthful child-session, approval, and tool-policy facts, but operators still had to reconstruct effective task state from raw state, phase, approval counts, and tool policy details.
Why it matters:
- Background task inspection was still substrate-shaped instead of operator-shaped, which made it easy to miss approval blockers, overdue recoverability, or task-specific next steps.
What changed:
- derive a stable task_status summary from existing task/session facts
- surface task_status.status, task_status.needs_attention, task_status.next_action, and task_status.signals in the machine-readable payload
- render the derived summary directly in tasks create, tasks list, tasks status, and tasks wait
- lock the new behavior with daemon unit/integration coverage
- harden one onboarding web-search test by clearing credential env vars so local all-features verification stays hermetic when operator API keys are present
What did not change (scope boundary):
- no new task scheduler or parallel state machine
- no background execution semantic changes
- no web UI task dashboard work
- no changes to persisted session/task records

Linked Issues

Change Type

Touched Areas

Risk Track

Track A (routine / low-risk)
Track B (higher-risk / policy-impacting)

Validation

cargo fmt --all -- --check
cargo clippy --workspace --all-targets --all-features -- -D warnings
cargo test --workspace --locked
cargo test --workspace --all-features --locked
Relevant architecture / dep-graph / docs checks for touched areas
Additional scenario, benchmark, or manual checks when behavior changed
If this changes config/env fallback, limits, or defaults: include before/after behavior and regression coverage for explicit path, fallback path, and boundary values
If tests mutate process-global env: document how state is restored or serialized

Commands and evidence:

cargo fmt --all -- --check
CARGO_TARGET_DIR=<redacted-target-dir> cargo clippy --workspace --all-targets --all-features -- -D warnings
CARGO_TARGET_DIR=<redacted-target-dir> cargo test --workspace --locked -q
CARGO_TARGET_DIR=<redacted-target-dir> cargo test --workspace --all-features --locked -q
cargo test -p loongclaw tasks_cli::tests:: --lib -- --test-threads=1
cargo test -p loongclaw --test integration execute_tasks_command_list_returns_visible_background_tasks -- --test-threads=1
cargo test -p loongclaw --test integration execute_tasks_command_status_surfaces_approval_and_tool_policy -- --test-threads=1
cargo test -p loongclaw --test integration execute_tasks_command_create_queues_background_task_and_surfaces_follow_up_recipes -- --test-threads=1
cargo test -p loongclaw --test integration execute_tasks_command_events_and_wait_surface_incremental_payloads -- --test-threads=1
cargo test -p loongclaw resolve_web_search_provider_recommendation_keeps_explicitly_configured_default_provider --lib -- --test-threads=1
cargo test -p loongclaw-app delegate_announce_queue_batches_children_completed_within_debounce_window --lib -- --test-threads=1
cargo test -p loongclaw-app reopening_current_schema_db_skips_metadata_repairs --lib -- --test-threads=1
./scripts/check-docs.sh
./scripts/check_architecture_boundaries.sh
./scripts/check_dep_graph.sh

Result: all commands passed on the rebased branch. `check-docs.sh` reported the existing non-blocking release-artifact warnings but still passed governance checks.

Env restoration note: the onboarding web-search regression test now uses `ScopedEnv` + `clear_web_search_credential_envs(...)` so any process-global credential env state is cleared for the test body and restored on scope exit.

User-visible / Operator-visible Changes

loong tasks create|list|status|wait now surface an explicit derived task summary instead of leaving operators to infer the task state from raw session fields.
Task payloads now include machine-readable task_status.status, task_status.needs_attention, task_status.next_action, and task_status.signals.
Text rendering now calls out approval-pending, overdue/recoverable, cancellation, recovery, and tool-narrowing cues directly.

Failure Recovery

Fast rollback or disable path:
- revert this PR; the change is limited to daemon rendering/summary derivation plus one test hardening path
Observable failure symptoms reviewers should watch for:
- task text rendering shows a derived status that disagrees with raw state / phase
- task payload misses task_status on create/list/status/wait surfaces
- onboarding web-search test behavior changes when credential env vars are present

Reviewer Focus

crates/daemon/src/tasks_cli.rs
- verify the derived status precedence (completed/failed/timed_out vs overdue vs approval-pending vs queued/running)
- verify needs_attention, next_action, and signals stay truthful to existing session/approval/tool-policy facts
crates/daemon/tests/integration/tasks_cli.rs
- confirm the create/list/status/wait surfaces are all locked by integration coverage
crates/daemon/src/onboard_web_search.rs
- confirm the env cleanup keeps the configured-provider test hermetic without changing product behavior

Summary by CodeRabbit

New Features
- Tasks now surface a comprehensive derived task_status (kind, status, display, attention flags, next action, signals, recovery hints) in create/list/status/wait and CLI brief/detail output.
Tests
- Added/extended unit and integration tests validating derived task_status fields and CLI rendering.
- Improved test utilities to fully clear web-search credential env vars to avoid accidental credential detection.
Documentation
- Updated task specs to require derived status display in task commands.

coderabbitai · 2026-04-10T04:36:34Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 896bb9bc-5307-4d97-b5ad-9aa5220a1ce2

📥 Commits

Reviewing files that changed from the base of the PR and between e51088b and a16eee1.

📒 Files selected for processing (1)

crates/daemon/src/onboard_web_search.rs

🚧 Files skipped from review as they are similar to previous changes (1)

crates/daemon/src/onboard_web_search.rs

📝 Walkthrough

Walkthrough

Adds a derived task_status summary to task payloads and CLI output; updates rendering and tests to surface status, attention flags, signals, and next actions. Also clears web-search credential env vars in an onboarding test and updates product-spec acceptance criteria.

Changes

Cohort / File(s)	Summary
Web Search Test Cleanup `crates/daemon/src/onboard_web_search.rs`	Test utility now clears each provider's `default_api_key_env` from the scoped environment; a test creates a `ScopedEnv` and calls the cleaner before invoking provider resolution.
Task Status Derivation & Rendering `crates/daemon/src/tasks_cli.rs`	Introduces `build_task_status_payload` and helpers to derive `task_status` (kind, status, display, flags, next_action, approval_primary_action, signals). Surfaces `task_status` in brief/detail renderers and adds unit tests for derivation and rendering.
Integration Test Coverage `crates/daemon/tests/integration/tasks_cli.rs`	Extends integration tests for `tasks create
Product Specification `docs/product-specs/background-tasks.md`	Adds acceptance criterion requiring CLI surfaces to present derived `task_status` summary (`task_status.status`, `task_status.needs_attention`, `task_status.next_action`).

Sequence Diagram(s)

sequenceDiagram
  participant CLI as CLI (loong tasks)
  participant Daemon as Daemon / tasks_cli
  participant Builder as TaskStatusBuilder
  participant Session as Session Runtime
  participant Delegate as Delegate
  participant Approvals as Approval Service

  CLI->>Daemon: request (list/status/create/wait)
  Daemon->>Builder: build_task_status_payload(session, delegate, approvals, events, policy)
  Builder->>Session: read session state
  Builder->>Delegate: read delegate phase
  Builder->>Approvals: read approval/attention summary
  Builder-->>Daemon: task_status payload (kind,status,display,flags,next_action,signals)
  Daemon-->>CLI: JSON + rendered text (includes task_status)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

feat: add background tasks CLI over session runtime #661 — earlier work on the tasks CLI module that this PR extends with derived task_status and updated rendering/tests.
fix(runtime): recover shell repair workflow in safe lane #1080 — related changes to test helpers that clear web-search credential env vars used in onboarding tests.
refactor(onboard): extract web search provider policy #521 — touches the same onboard web-search test file and provider credential handling.

Suggested labels

tools, docs, size: M

Suggested reviewers

gh-xj

Poem

🐰 I hopped through sessions, read events and cues,
I stitched a tiny status to save you time and news.
Signals bright and next-steps clear, with attention flags that sing,
A rabbit’s tidy summary — hop to it, do the thing!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 32.26% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main change: deriving and surfacing explicit task status and operator attention signals in background task outputs.
Linked Issues check	✅ Passed	All acceptance criteria from `#1177` are met: derived task_status is surfaced across create/list/status/wait, approval-pending/overdue/recovery/tool-narrowing signals are rendered, JSON and text surfaces include the status summary, and regression tests validate the derivation and rendering behavior.
Out of Scope Changes check	✅ Passed	All changes are in-scope: task status derivation and CLI rendering, integration/unit test coverage, docs update, and test hardening via credential cleanup. No scheduler changes, execution semantic changes, or web UI work introduced.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

crates/daemon/src/tasks_cli.rs (1)

1212-1220: Nit: Simplify intermediate variable.

The function is correct but could be slightly more concise.

♻️ Optional simplification

 fn render_task_status_display(kind: &str, recovered: bool) -> String {
     let base = kind.to_owned();
     if !recovered {
         return base;
     }
 
-    let display = format!("{base} (recovered)");
-    display
+    format!("{base} (recovered)")
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@crates/daemon/src/tasks_cli.rs` around lines 1212 - 1220, The function
render_task_status_display uses an unnecessary intermediate variable and
explicit return; simplify by returning kind.to_owned() when not recovered and
otherwise returning format!("{kind} (recovered)"), removing the `base` and
`display` temporaries; update references inside render_task_status_display to
use kind and recovered directly.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/daemon/src/onboard_web_search.rs`:
- Around line 610-611: The test's hermetic env cleanup is incomplete because
clear_web_search_credential_envs currently only unsets api_key_env_names but not
providers' default_api_key_env, allowing host credentials to leak; update
clear_web_search_credential_envs (or add a helper it calls) to also unset each
provider's default_api_key_env in addition to api_key_env_names, ensuring
ScopedEnv::new() + clear_web_search_credential_envs(&mut env) removes both
api_key_env_names and default_api_key_env entries for every WebSearch provider.

---

Nitpick comments:
In `@crates/daemon/src/tasks_cli.rs`:
- Around line 1212-1220: The function render_task_status_display uses an
unnecessary intermediate variable and explicit return; simplify by returning
kind.to_owned() when not recovered and otherwise returning format!("{kind}
(recovered)"), removing the `base` and `display` temporaries; update references
inside render_task_status_display to use kind and recovered directly.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b19ebd29-822a-4b9a-b0b8-239f9fb1e7bc

📥 Commits

Reviewing files that changed from the base of the PR and between 04c8a2b and 5af0cad.

📒 Files selected for processing (4)

crates/daemon/src/onboard_web_search.rs
crates/daemon/src/tasks_cli.rs
crates/daemon/tests/integration/tasks_cli.rs
docs/product-specs/background-tasks.md

chumyin · 2026-04-10T08:57:15Z

Follow-up pushed in a16eee127.

What changed in this pass:

kept the onboarding web-search recommendation test hermetic by clearing each provider's canonical default_api_key_env in addition to alias env names
kept the already-pushed detached-task create race fix (e51088b3e) intact; the branch now covers both the CI race and the env-leak test gap

Fresh validation on the updated branch:

cargo fmt --all -- --check
cargo clippy --workspace --all-targets --all-features -- -D warnings
cargo test --workspace --locked -q
cargo test --workspace --all-features --locked -q
cargo test -p loongclaw resolve_web_search_provider_recommendation_keeps_explicitly_configured_default_provider --lib -- --test-threads=1
cargo test -p loongclaw --test integration execute_tasks_command_create_queues_background_task_and_surfaces_follow_up_recipes -- --nocapture
./scripts/check_architecture_boundaries.sh
./scripts/check_dep_graph.sh

Notes:

the first full --all-features rerun hit the existing flaky delegate_announce_queue_batches_children_completed_within_debounce_window test; the targeted rerun passed, and the second full --all-features rerun passed cleanly.

…on fields The task surface already had truthful child-session, approval, and tool-policy facts, but still forced operators to reconstruct effective status by hand. This change derives a stable summary from the existing runtime facts and renders it directly in the CLI surface. While validating the full local gate, an onboarding web-search test proved to be environment-sensitive under real operator credential env vars. The test now clears credential envs explicitly so the configured-provider path stays hermetic during all-features verification. Constraint: Must stay truthful to the existing child-session runtime and avoid inventing a second task state model Rejected: Add a new durable task-state machine | would drift from session truth and broaden the change Rejected: Document raw fields more clearly | would leave the operator inference gap in place Confidence: high Scope-risk: narrow Directive: Keep task_status derived from existing session, approval, and tool-policy facts unless the runtime model itself changes Tested: cargo fmt --all -- --check; cargo clippy --workspace --all-targets --all-features -- -D warnings; cargo test --workspace --locked -q; cargo test --workspace --all-features --locked -q; cargo test -p loongclaw tasks_cli::tests:: --lib -- --test-threads=1; cargo test -p loongclaw --test integration execute_tasks_command_list_returns_visible_background_tasks -- --test-threads=1; cargo test -p loongclaw --test integration execute_tasks_command_status_surfaces_approval_and_tool_policy -- --test-threads=1; cargo test -p loongclaw --test integration execute_tasks_command_create_queues_background_task_and_surfaces_follow_up_recipes -- --test-threads=1; cargo test -p loongclaw --test integration execute_tasks_command_events_and_wait_surface_incremental_payloads -- --test-threads=1; cargo test -p loongclaw resolve_web_search_provider_recommendation_keeps_explicitly_configured_default_provider --lib -- --test-threads=1; cargo test -p loongclaw-app delegate_announce_queue_batches_children_completed_within_debounce_window --lib -- --test-threads=1; cargo test -p loongclaw-app reopening_current_schema_db_skips_metadata_repairs --lib -- --test-threads=1; ./scripts/check-docs.sh; ./scripts/check_architecture_boundaries.sh; ./scripts/check_dep_graph.sh Not-tested: Manual operator walkthrough against a live long-running background task

…ion window The create-path task surface intentionally reports the latest child-session truth. On CI, detached workers can fail quickly enough that the freshly hydrated task is already `failed` instead of still `queued`, which made the integration test assert a scheduler fiction instead of the real runtime state. This narrows the test to the product contract: create must surface a derived status and next action that match the current payload, whether the child is still queued or has already failed. Constraint: The task surface must stay truthful to current child-session state instead of freezing a synthetic queued snapshot for tests Rejected: Force tasks create to always report queued | would regress runtime truth and hide fast child failures Rejected: Add sleeps or polling to wait for a specific state | would keep the test race-prone and slower across CI Confidence: high Scope-risk: narrow Directive: When detached child execution can race inspection, assert the task payload/rendering contract instead of a single transient state Tested: cargo fmt --all -- --check; cargo test -p loongclaw --test integration execute_tasks_command_create_queues_background_task_and_surfaces_follow_up_recipes -- --test-threads=1; CARGO_TARGET_DIR=/Users/chum/loongclaw/.cargo-target-shared cargo test -p loongclaw --test integration execute_tasks_command_create_queues_background_task_and_surfaces_follow_up_recipes -- --test-threads=1; CARGO_TARGET_DIR=/Users/chum/loongclaw/.cargo-target-shared cargo test --workspace --locked -q; CARGO_TARGET_DIR=/Users/chum/loongclaw/.cargo-target-shared cargo test --workspace --all-features --locked -q Not-tested: Manual Windows detached-worker timing reproduction outside GitHub Actions

The task-status branch was already corrected for detached task create races, but CodeRabbit surfaced one remaining test-only gap: the onboarding web-search helper cleared provider alias env names without clearing each provider's canonical default credential env. That meant local host credentials could still leak into the explicit-default-provider test and make CI or maintainer reruns less trustworthy. The fix extends the existing scoped cleanup helper instead of introducing new test-only branching. Constraint: The onboarding recommendation test must stay hermetic even when maintainers have provider credentials in their shell Rejected: Add provider-specific hardcoded removals inside each test | duplicates the descriptor contract and drifts as providers evolve Confidence: high Scope-risk: narrow Directive: Keep test env cleanup descriptor-driven; if web-search provider metadata changes, update the shared cleanup helper rather than individual tests Tested: cargo fmt --all -- --check Tested: cargo clippy --workspace --all-targets --all-features -- -D warnings Tested: cargo test --workspace --locked -q Tested: cargo test --workspace --all-features --locked -q Tested: cargo test -p loongclaw resolve_web_search_provider_recommendation_keeps_explicitly_configured_default_provider --lib -- --test-threads=1 Tested: cargo test -p loongclaw --test integration execute_tasks_command_create_queues_background_task_and_surfaces_follow_up_recipes -- --nocapture Tested: ./scripts/check_architecture_boundaries.sh Tested: ./scripts/check_dep_graph.sh Not-tested: GitHub Actions rerun after push (local CI-parity passed)

github-actions Bot added documentation Improvements or additions to documentation. spec Architecture boundaries, product specs, and design docs. daemon Daemon binary, CLI entrypoints, and install flow. size: L Large pull request: 501-1000 changed lines. labels Apr 10, 2026

coderabbitai Bot reviewed Apr 10, 2026

View reviewed changes

Comment thread crates/daemon/src/onboard_web_search.rs

chumyin mentioned this pull request Apr 10, 2026

[Feature]: Surface explicit status and attention on background tasks #1177

Closed

chumyin self-assigned this Apr 16, 2026

chumyin added 4 commits April 16, 2026 20:47

derive default for control-plane workflow summary

282e975

chumyin force-pushed the feat/task-status-surface-20260410 branch from a16eee1 to 282e975 Compare April 16, 2026 13:39

chumyin requested review from a team as code owners April 16, 2026 13:39

github-actions Bot added the protocol Protocol crate and wire-level behavior. label Apr 16, 2026

chumyin merged commit e0516c9 into eastreams:dev Apr 16, 2026
18 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Surface explicit task status and attention on background tasks#1178

Surface explicit task status and attention on background tasks#1178
chumyin merged 4 commits intoeastreams:devfrom
chumyin:feat/task-status-surface-20260410

chumyin commented Apr 10, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot commented Apr 10, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

chumyin commented Apr 10, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

chumyin commented Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Linked Issues

Change Type

Touched Areas

Risk Track

Validation

User-visible / Operator-visible Changes

Failure Recovery

Reviewer Focus

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

chumyin commented Apr 10, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

chumyin commented Apr 10, 2026 •

edited

Loading

coderabbitai Bot commented Apr 10, 2026 •

edited

Loading