Skip to content

Surface explicit task status and attention on background tasks#1178

Merged
chumyin merged 4 commits intoeastreams:devfrom
chumyin:feat/task-status-surface-20260410
Apr 16, 2026
Merged

Surface explicit task status and attention on background tasks#1178
chumyin merged 4 commits intoeastreams:devfrom
chumyin:feat/task-status-surface-20260410

Conversation

@chumyin
Copy link
Copy Markdown
Collaborator

@chumyin chumyin commented Apr 10, 2026

Summary

  • Problem:
    • loong tasks already exposed truthful child-session, approval, and tool-policy facts, but operators still had to reconstruct effective task state from raw state, phase, approval counts, and tool policy details.
  • Why it matters:
    • Background task inspection was still substrate-shaped instead of operator-shaped, which made it easy to miss approval blockers, overdue recoverability, or task-specific next steps.
  • What changed:
    • derive a stable task_status summary from existing task/session facts
    • surface task_status.status, task_status.needs_attention, task_status.next_action, and task_status.signals in the machine-readable payload
    • render the derived summary directly in tasks create, tasks list, tasks status, and tasks wait
    • lock the new behavior with daemon unit/integration coverage
    • harden one onboarding web-search test by clearing credential env vars so local all-features verification stays hermetic when operator API keys are present
  • What did not change (scope boundary):
    • no new task scheduler or parallel state machine
    • no background execution semantic changes
    • no web UI task dashboard work
    • no changes to persisted session/task records

Linked Issues

Change Type

  • Bug fix
  • Feature
  • Refactor
  • Documentation
  • Security hardening
  • CI / workflow / release

Touched Areas

  • Kernel / policy / approvals
  • Contracts / protocol / spec
  • Daemon / CLI / install
  • Providers / routing
  • Tools
  • Browser automation
  • Channels / integrations
  • ACP / conversation / session runtime
  • Memory / context assembly
  • Config / migration / onboarding
  • Docs / contributor workflow
  • CI / release / workflows

Risk Track

  • Track A (routine / low-risk)
  • Track B (higher-risk / policy-impacting)

Validation

  • cargo fmt --all -- --check
  • cargo clippy --workspace --all-targets --all-features -- -D warnings
  • cargo test --workspace --locked
  • cargo test --workspace --all-features --locked
  • Relevant architecture / dep-graph / docs checks for touched areas
  • Additional scenario, benchmark, or manual checks when behavior changed
  • If this changes config/env fallback, limits, or defaults: include before/after behavior and regression coverage for explicit path, fallback path, and boundary values
  • If tests mutate process-global env: document how state is restored or serialized

Commands and evidence:

cargo fmt --all -- --check
CARGO_TARGET_DIR=<redacted-target-dir> cargo clippy --workspace --all-targets --all-features -- -D warnings
CARGO_TARGET_DIR=<redacted-target-dir> cargo test --workspace --locked -q
CARGO_TARGET_DIR=<redacted-target-dir> cargo test --workspace --all-features --locked -q
cargo test -p loongclaw tasks_cli::tests:: --lib -- --test-threads=1
cargo test -p loongclaw --test integration execute_tasks_command_list_returns_visible_background_tasks -- --test-threads=1
cargo test -p loongclaw --test integration execute_tasks_command_status_surfaces_approval_and_tool_policy -- --test-threads=1
cargo test -p loongclaw --test integration execute_tasks_command_create_queues_background_task_and_surfaces_follow_up_recipes -- --test-threads=1
cargo test -p loongclaw --test integration execute_tasks_command_events_and_wait_surface_incremental_payloads -- --test-threads=1
cargo test -p loongclaw resolve_web_search_provider_recommendation_keeps_explicitly_configured_default_provider --lib -- --test-threads=1
cargo test -p loongclaw-app delegate_announce_queue_batches_children_completed_within_debounce_window --lib -- --test-threads=1
cargo test -p loongclaw-app reopening_current_schema_db_skips_metadata_repairs --lib -- --test-threads=1
./scripts/check-docs.sh
./scripts/check_architecture_boundaries.sh
./scripts/check_dep_graph.sh

Result: all commands passed on the rebased branch. `check-docs.sh` reported the existing non-blocking release-artifact warnings but still passed governance checks.

Env restoration note: the onboarding web-search regression test now uses `ScopedEnv` + `clear_web_search_credential_envs(...)` so any process-global credential env state is cleared for the test body and restored on scope exit.

User-visible / Operator-visible Changes

  • loong tasks create|list|status|wait now surface an explicit derived task summary instead of leaving operators to infer the task state from raw session fields.
  • Task payloads now include machine-readable task_status.status, task_status.needs_attention, task_status.next_action, and task_status.signals.
  • Text rendering now calls out approval-pending, overdue/recoverable, cancellation, recovery, and tool-narrowing cues directly.

Failure Recovery

  • Fast rollback or disable path:
    • revert this PR; the change is limited to daemon rendering/summary derivation plus one test hardening path
  • Observable failure symptoms reviewers should watch for:
    • task text rendering shows a derived status that disagrees with raw state / phase
    • task payload misses task_status on create/list/status/wait surfaces
    • onboarding web-search test behavior changes when credential env vars are present

Reviewer Focus

  • crates/daemon/src/tasks_cli.rs
    • verify the derived status precedence (completed/failed/timed_out vs overdue vs approval-pending vs queued/running)
    • verify needs_attention, next_action, and signals stay truthful to existing session/approval/tool-policy facts
  • crates/daemon/tests/integration/tasks_cli.rs
    • confirm the create/list/status/wait surfaces are all locked by integration coverage
  • crates/daemon/src/onboard_web_search.rs
    • confirm the env cleanup keeps the configured-provider test hermetic without changing product behavior

Summary by CodeRabbit

  • New Features

    • Tasks now surface a comprehensive derived task_status (kind, status, display, attention flags, next action, signals, recovery hints) in create/list/status/wait and CLI brief/detail output.
  • Tests

    • Added/extended unit and integration tests validating derived task_status fields and CLI rendering.
    • Improved test utilities to fully clear web-search credential env vars to avoid accidental credential detection.
  • Documentation

    • Updated task specs to require derived status display in task commands.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 10, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 896bb9bc-5307-4d97-b5ad-9aa5220a1ce2

📥 Commits

Reviewing files that changed from the base of the PR and between e51088b and a16eee1.

📒 Files selected for processing (1)
  • crates/daemon/src/onboard_web_search.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/daemon/src/onboard_web_search.rs

📝 Walkthrough

Walkthrough

Adds a derived task_status summary to task payloads and CLI output; updates rendering and tests to surface status, attention flags, signals, and next actions. Also clears web-search credential env vars in an onboarding test and updates product-spec acceptance criteria.

Changes

Cohort / File(s) Summary
Web Search Test Cleanup
crates/daemon/src/onboard_web_search.rs
Test utility now clears each provider's default_api_key_env from the scoped environment; a test creates a ScopedEnv and calls the cleaner before invoking provider resolution.
Task Status Derivation & Rendering
crates/daemon/src/tasks_cli.rs
Introduces build_task_status_payload and helpers to derive task_status (kind, status, display, flags, next_action, approval_primary_action, signals). Surfaces task_status in brief/detail renderers and adds unit tests for derivation and rendering.
Integration Test Coverage
crates/daemon/tests/integration/tasks_cli.rs
Extends integration tests for `tasks create
Product Specification
docs/product-specs/background-tasks.md
Adds acceptance criterion requiring CLI surfaces to present derived task_status summary (task_status.status, task_status.needs_attention, task_status.next_action).

Sequence Diagram(s)

sequenceDiagram
  participant CLI as CLI (loong tasks)
  participant Daemon as Daemon / tasks_cli
  participant Builder as TaskStatusBuilder
  participant Session as Session Runtime
  participant Delegate as Delegate
  participant Approvals as Approval Service

  CLI->>Daemon: request (list/status/create/wait)
  Daemon->>Builder: build_task_status_payload(session, delegate, approvals, events, policy)
  Builder->>Session: read session state
  Builder->>Delegate: read delegate phase
  Builder->>Approvals: read approval/attention summary
  Builder-->>Daemon: task_status payload (kind,status,display,flags,next_action,signals)
  Daemon-->>CLI: JSON + rendered text (includes task_status)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

tools, docs, size: M

Suggested reviewers

  • gh-xj

Poem

🐰 I hopped through sessions, read events and cues,
I stitched a tiny status to save you time and news.
Signals bright and next-steps clear, with attention flags that sing,
A rabbit’s tidy summary — hop to it, do the thing!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 32.26% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: deriving and surfacing explicit task status and operator attention signals in background task outputs.
Linked Issues check ✅ Passed All acceptance criteria from #1177 are met: derived task_status is surfaced across create/list/status/wait, approval-pending/overdue/recovery/tool-narrowing signals are rendered, JSON and text surfaces include the status summary, and regression tests validate the derivation and rendering behavior.
Out of Scope Changes check ✅ Passed All changes are in-scope: task status derivation and CLI rendering, integration/unit test coverage, docs update, and test hardening via credential cleanup. No scheduler changes, execution semantic changes, or web UI work introduced.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions Bot added documentation Improvements or additions to documentation. spec Architecture boundaries, product specs, and design docs. daemon Daemon binary, CLI entrypoints, and install flow. size: L Large pull request: 501-1000 changed lines. labels Apr 10, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
crates/daemon/src/tasks_cli.rs (1)

1212-1220: Nit: Simplify intermediate variable.

The function is correct but could be slightly more concise.

♻️ Optional simplification
 fn render_task_status_display(kind: &str, recovered: bool) -> String {
     let base = kind.to_owned();
     if !recovered {
         return base;
     }
 
-    let display = format!("{base} (recovered)");
-    display
+    format!("{base} (recovered)")
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/daemon/src/tasks_cli.rs` around lines 1212 - 1220, The function
render_task_status_display uses an unnecessary intermediate variable and
explicit return; simplify by returning kind.to_owned() when not recovered and
otherwise returning format!("{kind} (recovered)"), removing the `base` and
`display` temporaries; update references inside render_task_status_display to
use kind and recovered directly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/daemon/src/onboard_web_search.rs`:
- Around line 610-611: The test's hermetic env cleanup is incomplete because
clear_web_search_credential_envs currently only unsets api_key_env_names but not
providers' default_api_key_env, allowing host credentials to leak; update
clear_web_search_credential_envs (or add a helper it calls) to also unset each
provider's default_api_key_env in addition to api_key_env_names, ensuring
ScopedEnv::new() + clear_web_search_credential_envs(&mut env) removes both
api_key_env_names and default_api_key_env entries for every WebSearch provider.

---

Nitpick comments:
In `@crates/daemon/src/tasks_cli.rs`:
- Around line 1212-1220: The function render_task_status_display uses an
unnecessary intermediate variable and explicit return; simplify by returning
kind.to_owned() when not recovered and otherwise returning format!("{kind}
(recovered)"), removing the `base` and `display` temporaries; update references
inside render_task_status_display to use kind and recovered directly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b19ebd29-822a-4b9a-b0b8-239f9fb1e7bc

📥 Commits

Reviewing files that changed from the base of the PR and between 04c8a2b and 5af0cad.

📒 Files selected for processing (4)
  • crates/daemon/src/onboard_web_search.rs
  • crates/daemon/src/tasks_cli.rs
  • crates/daemon/tests/integration/tasks_cli.rs
  • docs/product-specs/background-tasks.md

Comment thread crates/daemon/src/onboard_web_search.rs
@chumyin
Copy link
Copy Markdown
Collaborator Author

chumyin commented Apr 10, 2026

Follow-up pushed in a16eee127.

What changed in this pass:

  • kept the onboarding web-search recommendation test hermetic by clearing each provider's canonical default_api_key_env in addition to alias env names
  • kept the already-pushed detached-task create race fix (e51088b3e) intact; the branch now covers both the CI race and the env-leak test gap

Fresh validation on the updated branch:

  • cargo fmt --all -- --check
  • cargo clippy --workspace --all-targets --all-features -- -D warnings
  • cargo test --workspace --locked -q
  • cargo test --workspace --all-features --locked -q
  • cargo test -p loongclaw resolve_web_search_provider_recommendation_keeps_explicitly_configured_default_provider --lib -- --test-threads=1
  • cargo test -p loongclaw --test integration execute_tasks_command_create_queues_background_task_and_surfaces_follow_up_recipes -- --nocapture
  • ./scripts/check_architecture_boundaries.sh
  • ./scripts/check_dep_graph.sh

Notes:

  • the first full --all-features rerun hit the existing flaky delegate_announce_queue_batches_children_completed_within_debounce_window test; the targeted rerun passed, and the second full --all-features rerun passed cleanly.

chumyin added 4 commits April 16, 2026 20:47
…on fields

The task surface already had truthful child-session, approval, and tool-policy facts,
but  still forced operators to reconstruct
effective status by hand. This change derives a stable
summary from the existing runtime facts and renders it directly in the CLI
surface.

While validating the full local gate, an onboarding web-search test proved to
be environment-sensitive under real operator credential env vars. The test now
clears credential envs explicitly so the configured-provider path stays
hermetic during all-features verification.

Constraint: Must stay truthful to the existing child-session runtime and avoid inventing a second task state model
Rejected: Add a new durable task-state machine | would drift from session truth and broaden the change
Rejected: Document raw fields more clearly | would leave the operator inference gap in place
Confidence: high
Scope-risk: narrow
Directive: Keep task_status derived from existing session, approval, and tool-policy facts unless the runtime model itself changes
Tested: cargo fmt --all -- --check; cargo clippy --workspace --all-targets --all-features -- -D warnings; cargo test --workspace --locked -q; cargo test --workspace --all-features --locked -q; cargo test -p loongclaw tasks_cli::tests:: --lib -- --test-threads=1; cargo test -p loongclaw --test integration execute_tasks_command_list_returns_visible_background_tasks -- --test-threads=1; cargo test -p loongclaw --test integration execute_tasks_command_status_surfaces_approval_and_tool_policy -- --test-threads=1; cargo test -p loongclaw --test integration execute_tasks_command_create_queues_background_task_and_surfaces_follow_up_recipes -- --test-threads=1; cargo test -p loongclaw --test integration execute_tasks_command_events_and_wait_surface_incremental_payloads -- --test-threads=1; cargo test -p loongclaw resolve_web_search_provider_recommendation_keeps_explicitly_configured_default_provider --lib -- --test-threads=1; cargo test -p loongclaw-app delegate_announce_queue_batches_children_completed_within_debounce_window --lib -- --test-threads=1; cargo test -p loongclaw-app reopening_current_schema_db_skips_metadata_repairs --lib -- --test-threads=1; ./scripts/check-docs.sh; ./scripts/check_architecture_boundaries.sh; ./scripts/check_dep_graph.sh
Not-tested: Manual operator walkthrough against a live long-running background task
…ion window

The create-path task surface intentionally reports the latest child-session
truth. On CI, detached workers can fail quickly enough that the freshly
hydrated task is already `failed` instead of still `queued`, which made the
integration test assert a scheduler fiction instead of the real runtime state.

This narrows the test to the product contract: create must surface a derived
status and next action that match the current payload, whether the child is
still queued or has already failed.

Constraint: The task surface must stay truthful to current child-session state instead of freezing a synthetic queued snapshot for tests
Rejected: Force tasks create to always report queued | would regress runtime truth and hide fast child failures
Rejected: Add sleeps or polling to wait for a specific state | would keep the test race-prone and slower across CI
Confidence: high
Scope-risk: narrow
Directive: When detached child execution can race inspection, assert the task payload/rendering contract instead of a single transient state
Tested: cargo fmt --all -- --check; cargo test -p loongclaw --test integration execute_tasks_command_create_queues_background_task_and_surfaces_follow_up_recipes -- --test-threads=1; CARGO_TARGET_DIR=/Users/chum/loongclaw/.cargo-target-shared cargo test -p loongclaw --test integration execute_tasks_command_create_queues_background_task_and_surfaces_follow_up_recipes -- --test-threads=1; CARGO_TARGET_DIR=/Users/chum/loongclaw/.cargo-target-shared cargo test --workspace --locked -q; CARGO_TARGET_DIR=/Users/chum/loongclaw/.cargo-target-shared cargo test --workspace --all-features --locked -q
Not-tested: Manual Windows detached-worker timing reproduction outside GitHub Actions
The task-status branch was already corrected for detached task create races, but CodeRabbit surfaced one remaining test-only gap: the onboarding web-search helper cleared provider alias env names without clearing each provider's canonical default credential env. That meant local host credentials could still leak into the explicit-default-provider test and make CI or maintainer reruns less trustworthy. The fix extends the existing scoped cleanup helper instead of introducing new test-only branching.

Constraint: The onboarding recommendation test must stay hermetic even when maintainers have provider credentials in their shell
Rejected: Add provider-specific hardcoded removals inside each test | duplicates the descriptor contract and drifts as providers evolve
Confidence: high
Scope-risk: narrow
Directive: Keep test env cleanup descriptor-driven; if web-search provider metadata changes, update the shared cleanup helper rather than individual tests
Tested: cargo fmt --all -- --check
Tested: cargo clippy --workspace --all-targets --all-features -- -D warnings
Tested: cargo test --workspace --locked -q
Tested: cargo test --workspace --all-features --locked -q
Tested: cargo test -p loongclaw resolve_web_search_provider_recommendation_keeps_explicitly_configured_default_provider --lib -- --test-threads=1
Tested: cargo test -p loongclaw --test integration execute_tasks_command_create_queues_background_task_and_surfaces_follow_up_recipes -- --nocapture
Tested: ./scripts/check_architecture_boundaries.sh
Tested: ./scripts/check_dep_graph.sh
Not-tested: GitHub Actions rerun after push (local CI-parity passed)
@chumyin chumyin force-pushed the feat/task-status-surface-20260410 branch from a16eee1 to 282e975 Compare April 16, 2026 13:39
@chumyin chumyin requested review from a team as code owners April 16, 2026 13:39
@github-actions github-actions Bot added the protocol Protocol crate and wire-level behavior. label Apr 16, 2026
@chumyin chumyin merged commit e0516c9 into eastreams:dev Apr 16, 2026
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

daemon Daemon binary, CLI entrypoints, and install flow. documentation Improvements or additions to documentation. protocol Protocol crate and wire-level behavior. size: L Large pull request: 501-1000 changed lines. spec Architecture boundaries, product specs, and design docs.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature]: Surface explicit status and attention on background tasks

1 participant