Skip to content

feat: pre-flight validation and dependency-aware execution guards#67

Merged
Muizzkolapo merged 10 commits intomainfrom
feat/preflight-validation-and-execution-guards
Apr 1, 2026
Merged

feat: pre-flight validation and dependency-aware execution guards#67
Muizzkolapo merged 10 commits intomainfrom
feat/preflight-validation-and-execution-guards

Conversation

@Muizzkolapo
Copy link
Copy Markdown
Owner

Summary

  • Failed Result Isolation: Write DISPOSITION_FAILED on action failure, reject tainted results before output verification, add --fresh CLI flag and delete_target() storage method
  • Circuit Breaker Execution: Check upstream health before get_previous_outputs (not after), skip downstream actions when upstream fails while independent branches continue, return ("completed_with_failures", {...}) for partial-success workflows
  • Pre-Flight Resolution Service: New WorkflowResolutionService validates API key env vars (resolved dynamically from vendor config model_fields), seed file $file: references (via shared resolve_seed_path() utility), and vendor batch-mode compatibility — all before any action executes
  • Drop Directive Validation: _check_drop_directives() in static analyzer catches drops targeting passthrough or non-existent fields with actionable hints
  • Lineage Reachability: passthrough_wildcard_sources on OutputSchema makes wildcard passthroughs visible to analysis; _check_lineage_reachability() traces observe refs through passthrough chains (warnings, not errors)

Addresses all 8 issues from ISSUES_FOR_DEVELOPERS.md (3 architectural gaps: late resolution, no dependency-aware execution, no result provenance).

Test plan

  • pytest tests/ — 4145 tests pass, 0 regressions
  • ruff check . — clean (1 pre-existing import sort issue in test_limits.py)
  • Run review_analyzer with missing API key → fails at pre-flight, not at action execution
  • Run workflow where early action fails → downstream shows "Skipped: upstream X failed", independent branches complete
  • Run --fresh after a failed run → clean re-execution, no stale cache

Muizzkolapo and others added 10 commits April 1, 2026 20:32
Address 8 issues discovered at runtime across failed review_analyzer runs.
All were statically detectable or preventable with proper execution guards.

Three architectural gaps fixed:

1. Failed Result Isolation (Issue #8)
   - Write DISPOSITION_FAILED on action failure (online + batch paths)
   - Check disposition FIRST in _verify_completion_status before output check
   - Add delete_target() to StorageBackend interface
   - Add --fresh CLI flag to clear stale results before execution

2. Circuit Breaker — Dependency-Aware Execution (Issue #7)
   - Add _check_upstream_health() before get_previous_outputs (not after)
   - Skip downstream actions when upstream fails, independent branches continue
   - Level orchestrator logs failures instead of raising WorkflowError
   - Return ("completed_with_failures", {...}) for partial success workflows
   - get_pending_actions now excludes failed actions; add is_workflow_done()

3. Pre-Flight Resolution Service (Issues #1, #2, #6)
   - WorkflowResolutionService checks API keys (from vendor config model_fields),
     seed file $file: references, and vendor batch-mode compatibility
   - Shared resolve_seed_path() utility in utils/path_security.py
   - AA_SKIP_ENV_VALIDATION=1 escape hatch for CI environments

4. Drop Directive Validation (Issue #4)
   - _check_drop_directives() in WorkflowStaticAnalyzer validates drop targets
     against schema/observe/passthrough fields with distinct error messages

5. Lineage Reachability Validation (Issue #5)
   - Add passthrough_wildcard_sources to OutputSchema (fixes invisible wildcards)
   - _check_lineage_reachability() traces observe refs through passthrough chains
   - Emits warnings (not errors) — strict mode can promote them
Review fixes:
1. Wire resolve_seed_path() into StaticDataLoader._resolve_path() — shared
   utility now used by both pre-flight and runtime (no more duplicates)
2. Add 60 new tests covering all new production code:
   - resolve_seed_path (5 tests)
   - WorkflowResolutionService API keys/seed files/vendor compat (16 tests)
   - Circuit breaker _check_upstream_health/_handle_dependency_skip (13 tests)
   - delete_target SQLite (3 tests)
   - is_workflow_done + get_pending_actions (8 tests)
   - Drop directive + lineage reachability validation (15 tests)
3. Restore test for unexpected-crash exception path (handle_workflow_error)
4. Fix queue.pop(0) → deque.popleft() in BFS lineage tracer
5. Fix logger placement in parallel/action_executor.py (after all imports)
6. Add public reset() to ActionStateManager (replace private method calls)
7. Fix finalize_workflow ordering in async path — set state.failed before
   finalize so handlers see correct state
When every input item fails during LLM processing (e.g. 401 auth error),
the pipeline now raises instead of silently writing empty output. This
lets the executor mark the action as failed and the circuit breaker skip
downstream dependents.

Previously, per-item errors were caught inside process_batch(), logged,
and the action was marked "completed" with 0 records — causing cascading
empty-data execution through the entire downstream chain.
1. Extract storage_backend before dependency loop (#2)
2. Align status: _handle_dependency_skip returns status="failed" to match
   state_manager, eliminating failed/skipped inconsistency (#3)
3. Add depth limit (10 levels) to _resolve_seed_data_dir (#4)
4. Replace global _VENDOR_CONFIG_MAP with @lru_cache (#5)
5. Add BFS field-name limitation comment in lineage tracer (#6)
6. Extract MAX_DISPOSITION_REASON_LENGTH constant (#7)
7. Move ProcessingStatus import to module level in pipeline.py (#10)
@Muizzkolapo Muizzkolapo merged commit 8c023f9 into main Apr 1, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant