feat: partial failure visibility with completed_with_failures status by Muizzkolapo · Pull Request #92 · Muizzkolapo/agent-actions

Muizzkolapo · 2026-04-02T10:17:19Z

Summary

When an action processes N items and some fail while others succeed, the action is now marked completed_with_failures instead of silently completing as completed
Failed items have their input snapshot persisted in the disposition table (JSON-serialized, 10KB cap) — enables future retry functionality
The tally now shows N OK | M PARTIAL | S SKIP | K ERROR and level lines show yellow for partial failures
The circuit breaker correctly ignores completed_with_failures — downstream actions run on the successful records

Changes

Storage layer: set_disposition() gains input_snapshot param on abstract interface; get_failed_items() convenience method added; SQLite schema migration for input_snapshot column
Result collector: passes input_snapshot (from source_snapshot or input_record) when writing FAILED dispositions
Executor: _handle_run_success() queries storage for item-level failures after the runner returns; sets completed_with_failures when partial failures detected
State manager: completed_with_failures added to terminal sets (get_pending_actions, is_workflow_done, is_workflow_complete)
Display: WorkflowCompleteEvent gains actions_partial; formatter includes PARTIAL count; level lines yellow for partial failures
Status checks: 6 locations updated to accept completed_with_failures alongside completed

Test plan

ruff check . — clean (1 pre-existing issue in test_limits.py)
ruff format --check — clean
pytest — 4279 passed, 2 skipped, 0 failures
Manual: run workflow where 1 item fails → action shows completed_with_failures → tally shows PARTIAL → downstream actions still run on good records

When an action processes N items and some fail while others succeed, the action is now marked "completed_with_failures" instead of silently completing as "completed". This gives operators visibility into partial failures without halting downstream actions. Changes: - Storage backend: add input_snapshot param to set_disposition() and get_failed_items() method on abstract interface - SQLite: add input_snapshot column with schema migration for existing DBs - Result collector: persists JSON-serialized input snapshot for failed items - Executor: queries storage after success to detect partial failures - State manager: completed_with_failures as terminal status, circuit breaker ignores it (descendants run on successful records) - Tally: shows "N OK | M PARTIAL | S SKIP | K ERROR" - Level lines: yellow for levels with partial failures

- Extract _resolve_completion_status() helper and use it in both online and batch completion paths (fixes batch actions ignoring partial failures) - dependency.py: accept completed_with_failures for upstream checks (fixes downstream blocking on partial failures) - manifest.py: is_action_completed() and get_completed_actions() accept completed_with_failures (fixes unnecessary re-runs on resume)

Muizzkolapo

Review: Phase 2 — Partial Failure Visibility

What's addressed

All three issues from the audit are fixed:

Issue	Status	How
ARCH-007 (#89) — batch path blind to partial failures	Fixed	Extracted `_resolve_completion_status()` shared helper, called from `_handle_run_success()`, `_handle_batch_check()`, and `_handle_batch_check_async()`
ARCH-008 (#90) — `dependency.py:167` blocks on `completed_with_failures`	Fixed	Set check: `in {"completed", "completed_with_failures"}`
ARCH-009 (#91) — `manifest.py` excludes `completed_with_failures`	Fixed	Both `is_action_completed()` and `get_completed_actions()` updated

The _resolve_completion_status() helper is the right call — single definition, used by all three completion paths. No more divergence surface for this check.

The six == "completed" → in {"completed", "completed_with_failures"} updates across executor.py, coordinator.py, execution_events.py, action_executor.py all look correct and consistent.

Storage layer, result collector snapshot persistence, state manager terminal sets, tally display — all clean.

Issues to fix

1. _verify_completion_status() hardcodes status="completed" — loses partial failure signal on resume

executor.py (current L212-216 in the PR):

return (
    True,
    ActionExecutionResult(
        success=True, status="completed", metrics=ExecutionMetrics(duration=0.0)
    ),
)

When a completed_with_failures action is verified on re-run (L555-562 / L617-624), the gate correctly enters — but the returned result hardcodes "completed". The caller returns this directly. So on resume, every completed_with_failures action silently upgrades to "completed", erasing the partial failure signal from downstream consumers (tally, tracker, coordinator).

Fix: Use the actual status from the state manager:

actual_status = self.deps.state_manager.get_status(action_name)
return (
    True,
    ActionExecutionResult(
        success=True, status=actual_status, metrics=ExecutionMetrics(duration=0.0)
    ),
)

Same fix needed at L227-232 (the fallback path when no storage backend exists).

2. BatchCompleteEvent hardcodes completed=1, failed=0 even on partial failure

executor.py L680-685 (and the async twin at L754-759):

fire_event(
    BatchCompleteEvent(
        ...
        completed=1,
        failed=0,
        ...
    )
)

When final_status is completed_with_failures, this event still reports 0 failures. Any dashboard or log consumer watching BatchCompleteEvent will see clean completions when items actually failed.

Fix: Reflect the actual status:

fire_event(
    BatchCompleteEvent(
        ...
        completed=1 if final_status == "completed" else 0,
        failed=0 if final_status == "completed" else 1,
        ...
    )
)

Or better, add a partial field to BatchCompleteEvent if it exists.

3. No tests for the actual completed_with_failures path

The test changes (test_executor_lifecycle.py, test_executor_events.py) mock get_failed_items to return [] — this prevents test breakage but doesn't test the new behavior. There's no test that verifies:

get_failed_items() returns items → action status is completed_with_failures (not completed)
get_failed_items() raises → graceful fallback to completed
Batch completion with item failures → completed_with_failures (the ARCH-007 fix)
Tally shows correct PARTIAL count when partial failures exist
Level line is yellow (not green) for partial failures

Recommend at minimum adding to test_executor_lifecycle.py:

def test_handle_run_success_with_item_failures(mock_deps):
    mock_deps.action_runner.storage_backend.get_failed_items.return_value = [
        {"record_id": "rec-1", "disposition": "failed", "reason": "timeout"}
    ]
    executor = ActionExecutor(mock_deps)
    result = executor._handle_run_success(params, "/out", 1.0, None)
    assert result.status == "completed_with_failures"
    mock_deps.state_manager.update_status.assert_called_with(
        "test_action", "completed_with_failures", record_limit=None, file_limit=None
    )

def test_handle_batch_check_with_item_failures(mock_deps):
    mock_deps.batch_manager.handle_batch_agent.return_value = ("/out", "completed")
    mock_deps.action_runner.storage_backend.get_failed_items.return_value = [
        {"record_id": "rec-1", "disposition": "failed", "reason": "api_error"}
    ]
    executor = ActionExecutor(mock_deps)
    result = executor._handle_batch_check("action_a", 0, config, start_time)
    assert result.status == "completed_with_failures"

Minor

tasks/todo.md still describes Phase 2 with the CollectionSummary approach (items 2.2, 2.3). The actual implementation is simpler — querying get_failed_items() directly. Worth updating so the plan matches what shipped.

Verdict

The core architecture is right — shared helper, symmetric batch/online paths, all downstream status checks updated. Fix items 1-3 above before merge. Item 1 is a correctness bug (data loss on resume), item 2 is observability, item 3 is coverage.

WARNINGs fixed: - is_completed() now includes completed_with_failures (stale-completion verification works on resume) - Added PARTIAL to COLORS dict (tally coloring now works) - _resolve_completion_status exception path logs at WARNING not DEBUG - JSON truncation produces valid JSON with __truncated__ sentinel instead of mid-character slice - TOCTOU acknowledged (low impact, documented) MINORs fixed: - mark_action_completed() accepts status parameter - Migration ALTER TABLE logs at DEBUG on both branches - json.dumps failure logs at DEBUG - All test coverage gaps filled (13 new tests): _resolve_completion_status (4 paths), get_failed_items sentinel filtering, circuit breaker ignores partial, is_completed_with_failures, terminal sets, is_workflow_complete with partial, level coloring for partial failures

The set {"completed", "completed_with_failures"} was repeated as inline literals in 9 locations across 6 files. Extracted as COMPLETED_STATUSES and TERMINAL_STATUSES frozensets in state.py — single source of truth when new statuses are added.

…manifest

Muizzkolapo added 2 commits April 2, 2026 11:16

Muizzkolapo commented Apr 2, 2026

View reviewed changes

Muizzkolapo and others added 3 commits April 2, 2026 11:46

Merge branch 'main' into feat/partial-failure-visibility-and-sidecar-…

8182548

…manifest

Muizzkolapo merged commit f4a1b32 into main Apr 2, 2026
4 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: partial failure visibility with completed_with_failures status#92

feat: partial failure visibility with completed_with_failures status#92
Muizzkolapo merged 5 commits intomainfrom
feat/partial-failure-visibility-and-sidecar-manifest

Muizzkolapo commented Apr 2, 2026

Uh oh!

Muizzkolapo left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Muizzkolapo commented Apr 2, 2026

Summary

Changes

Test plan

Uh oh!

Muizzkolapo left a comment

Choose a reason for hiding this comment

Review: Phase 2 — Partial Failure Visibility

What's addressed

Issues to fix

Minor

Verdict

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant