feat: DEFERRED records audit trail (ARCH-003)#105

Merged
Muizzkolapo merged 3 commits into main from arch-003/deferred-audit-trail on Apr 2, 2026

@Muizzkolapo
Owner

Summary

  • Writes a DISPOSITION_DEFERRED disposition to the storage backend when a record is queued for batch processing, closing the observability gap where DEFERRED was the only status without a durable trail
  • Clears DEFERRED dispositions when batch results land, ensuring records don't accumulate in a stale intermediate state
  • Warns on orphaned DEFERRED records (queued but never returned) after batch completion
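
The lifecycle described above (write DEFERRED on queue, clear it when results land, warn on leftovers) can be sketched roughly as follows. This is an in-memory illustration only, not the code from this PR: `InMemoryBackend`, its method signatures, `queue_for_batch`, `apply_batch_results`, and the string value of `DISPOSITION_DEFERRED` are all assumptions made for the example.

```python
import logging

logger = logging.getLogger("deferred_sketch")

# The PR names this constant; the string value here is an assumption.
DISPOSITION_DEFERRED = "deferred"


class InMemoryBackend:
    """Hypothetical stand-in for the storage backend."""

    def __init__(self):
        # Rows keyed by (record_id, disposition).
        self._rows = {}

    def set_disposition(self, record_id, disposition, task_id=None):
        self._rows[(record_id, disposition)] = {"task_id": task_id}

    def clear_disposition(self, record_id, disposition):
        # Clearing a disposition that is not present is a no-op.
        self._rows.pop((record_id, disposition), None)

    def list_record_ids(self, disposition):
        return sorted(rid for (rid, disp) in self._rows if disp == disposition)


def queue_for_batch(backend, source_guid, task_id):
    """Record a durable DEFERRED trail when a record is queued."""
    backend.set_disposition(source_guid, DISPOSITION_DEFERRED, task_id=task_id)


def apply_batch_results(backend, returned_guids):
    """Clear DEFERRED for every record whose batch result arrived."""
    for guid in returned_guids:
        backend.clear_disposition(guid, DISPOSITION_DEFERRED)


def warn_orphaned_deferred(backend):
    """Log (never raise) if any record is still DEFERRED after the batch."""
    orphans = backend.list_record_ids(DISPOSITION_DEFERRED)
    if orphans:
        logger.warning("%d DEFERRED record(s) never returned: %s", len(orphans), orphans)
    return orphans


backend = InMemoryBackend()
for guid in ("guid-a", "guid-b", "guid-c"):
    queue_for_batch(backend, guid, task_id="batch-1")

# Only two of the three queued records come back from the batch.
apply_batch_results(backend, ["guid-a", "guid-b"])
orphans = warn_orphaned_deferred(backend)
```

In this sketch `guid-c` is the only record left in the DEFERRED state, so it is the one reported by the warning.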

Changes (7 files)

| File | What |
|---|---|
| storage/backend.py | Add DISPOSITION_DEFERRED constant + Disposition.DEFERRED enum member |
| processing/result_collector.py | Write DEFERRED disposition via _safe_set_disposition(), add deferred to CollectionStats, fire ResultCollectedEvent |
| logging/events/data_pipeline_events.py | Add total_deferred: int = 0 to ResultCollectionCompleteEvent (defaulted for backward compat) |
| llm/batch/services/processing_recovery.py | Clear DEFERRED disposition for every record when batch results arrive |
| workflow/managers/batch.py | Add _warn_orphaned_deferred(): diagnostic warning after batch processing |
| tests/unit/core/test_result_collector.py | 5 new tests: disposition write, no-guid skip, stats counting, mixed statuses |
| tests/unit/wave3/test_enrichment_complete_event.py | Updated deferred test to verify disposition write instead of just status existence |
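
To illustrate why defaulting `total_deferred` keeps existing callers working, here is a minimal sketch using a dataclass. The class name `ResultCollectionCompleteEventSketch` and the `total_success` / `total_failed` fields are hypothetical; only the `total_deferred: int = 0` field comes from this PR, and the real event may not be a dataclass at all.

```python
from dataclasses import dataclass


@dataclass
class ResultCollectionCompleteEventSketch:
    # Hypothetical pre-existing fields, invented for this example.
    total_success: int
    total_failed: int
    # The new field from this PR; the default means old call sites
    # that never pass it still construct a valid event.
    total_deferred: int = 0


# A call site written before the field existed keeps working.
legacy = ResultCollectionCompleteEventSketch(total_success=10, total_failed=1)

# A call site updated for this PR reports the deferred count.
updated = ResultCollectionCompleteEventSketch(
    total_success=10, total_failed=1, total_deferred=3
)
```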

Design decisions

  • source_guid as record_id: matches every other disposition in the codebase and enables correlation when batch results come back (those are also keyed by source_guid)
  • clear_disposition() before writing final status: avoids UNIQUE constraint issues and ensures clean state regardless of the final outcome (successful records end up with no disposition at all; only their DEFERRED entry is cleared)
  • Orphan detection is diagnostic only: it logs a warning and never raises, because telemetry must not crash the pipeline.
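
The clear-before-write decision can be illustrated with a small SQLite sketch. The PR does not describe the backend schema, so the `dispositions` table, its UNIQUE constraint on `record_id`, and both helper functions below are assumptions chosen only to show the kind of collision that clearing first avoids.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: at most one disposition row per record.
conn.execute(
    "CREATE TABLE dispositions ("
    "record_id TEXT NOT NULL UNIQUE, "
    "disposition TEXT NOT NULL)"
)


def set_disposition(record_id, disposition):
    conn.execute(
        "INSERT INTO dispositions (record_id, disposition) VALUES (?, ?)",
        (record_id, disposition),
    )


def clear_disposition(record_id):
    conn.execute("DELETE FROM dispositions WHERE record_id = ?", (record_id,))


# The record is queued for batch processing.
set_disposition("guid-a", "deferred")

# Writing the final status without clearing first hits the constraint.
try:
    set_disposition("guid-a", "failed")
    collided = False
except sqlite3.IntegrityError:
    collided = True

# Clearing DEFERRED first lets the final status be written cleanly.
clear_disposition("guid-a")
set_disposition("guid-a", "failed")
final_rows = conn.execute(
    "SELECT record_id, disposition FROM dispositions"
).fetchall()
```

For a record that succeeds, the same flow would stop after `clear_disposition()`, leaving no row behind.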

Test plan

  • ruff check . — all clean
  • pytest — 4305 passed, 2 skipped, 0 failures
  • New tests verify: DEFERRED disposition write, stats counting, no-guid safety, mixed-status correctness

Closes #84

🤖 Generated with Claude Code

Muizzkolapo and others added 3 commits April 2, 2026 16:23
Close the audit trail gap where records entering batch processing
had no durable disposition in the storage backend.

- Add DISPOSITION_DEFERRED constant and Disposition.DEFERRED enum member
- Write DEFERRED disposition in ResultCollector with source_guid + task_id
- Add deferred count to CollectionStats and ResultCollectionCompleteEvent
- Clear DEFERRED dispositions when batch results arrive (processing_recovery)
- Add orphan detection: warn if DEFERRED records remain after batch completion
- Add 5 new tests covering disposition write, stats, and mixed-status scenarios

Closes #84
Muizzkolapo merged commit 6d8975f into main on Apr 2, 2026
4 checks passed

Development

Successfully merging this pull request may close these issues.

ARCH-003: DEFERRED records leave no audit trail in result collection
