feat: dry-run shadow mode for pipeline evaluation #66

Merged
dwsmith1983 merged 5 commits into main from feat/dry-run-shadow-mode
Mar 12, 2026

@dwsmith1983
Owner

Summary

  • Adds observation-only dryRun: true mode that evaluates trigger conditions, validation rules, and SLA projections against real sensor data without executing jobs or starting Step Functions
  • Publishes 4 new EventBridge events: DRY_RUN_WOULD_TRIGGER, DRY_RUN_LATE_DATA, DRY_RUN_SLA_PROJECTION, DRY_RUN_DRIFT
  • DRY_RUN# markers with 7-day TTL for dedup and late-data detection; SLA projection reuses production handleSLACalculate for consistent deadline resolution
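The marker-based dedup described above can be sketched roughly as follows. This is a minimal in-memory illustration, not the code from this PR: `markerStore`, `writeMarker`, `classify`, and the key layout after the `DRY_RUN#` prefix are hypothetical, and the rule that "a live marker means late data" is an assumed reading of the summary. The real implementation uses a conditional write and a consistent read against the control table.

```go
package main

import (
	"fmt"
	"time"
)

// dryRunTTL mirrors the 7-day marker lifetime described in the summary.
const dryRunTTL = 7 * 24 * time.Hour

// exampleNow is a fixed clock value so the demo is deterministic.
var exampleNow = time.Date(2026, 3, 12, 6, 0, 0, 0, time.UTC)

// markerStore is an in-memory stand-in for the control table. It maps a
// DRY_RUN# sort key to the time at which that marker expires.
type markerStore struct {
	expiresAt map[string]time.Time
}

func newMarkerStore() *markerStore {
	return &markerStore{expiresAt: make(map[string]time.Time)}
}

// dryRunSK builds a marker sort key. Only the DRY_RUN# prefix comes from the
// PR; the schedule/date layout after it is assumed for illustration.
func dryRunSK(scheduleID, date string) string {
	return "DRY_RUN#" + scheduleID + "#" + date
}

// writeMarker emulates a conditional write: it succeeds only when no live
// marker exists for the key, and reports whether it wrote.
func (s *markerStore) writeMarker(sk string, now time.Time) bool {
	if exp, ok := s.expiresAt[sk]; ok && now.Before(exp) {
		return false
	}
	s.expiresAt[sk] = now.Add(dryRunTTL)
	return true
}

// classify names the event a sensor update would produce under the assumed
// rule: a first write means "would trigger", a live marker means "late data".
func (s *markerStore) classify(sk string, now time.Time) string {
	if s.writeMarker(sk, now) {
		return "DRY_RUN_WOULD_TRIGGER"
	}
	return "DRY_RUN_LATE_DATA"
}

func main() {
	store := newMarkerStore()
	sk := dryRunSK("daily", "2026-03-12")

	fmt.Println(store.classify(sk, exampleNow))                     // first evaluation
	fmt.Println(store.classify(sk, exampleNow.Add(time.Hour)))      // marker still live
	fmt.Println(store.classify(sk, exampleNow.Add(8*24*time.Hour))) // marker expired
}
```

Because the write itself reports whether a marker already existed, a repeated sensor update cannot emit a second DRY_RUN_WOULD_TRIGGER for the same key while the marker is live.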

Changes

  • pkg/types/pipeline.go: DryRun bool on PipelineConfig
  • pkg/types/events.go: 4 new event types
  • pkg/types/dynamo.go: DryRunSK() helper
  • internal/store/control.go: WriteDryRunMarker (conditional write) / GetDryRunMarker (consistent read)
  • internal/lambda/dryrun.go: core dry-run logic (handleDryRunTrigger, handleDryRunPostRunSensor, publishDryRunSLAProjection)
  • internal/lambda/stream_router.go: dry-run branch after calendar exclusion check
  • internal/lambda/postrun.go: dry-run branch for post-run drift
  • internal/validation/config.go: dryRun requires job.type and schedule.trigger
  • README, pipelines docs, alerting docs, CHANGELOG updated
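The validation rule listed for internal/validation/config.go (dryRun requires job.type and schedule.trigger) could look something like the sketch below. The struct shapes and the `validateDryRun` name are hypothetical and heavily simplified; only the `DryRun bool` field on `PipelineConfig` and the two required settings come from this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// The types below are simplified stand-ins for the real config types.
type JobConfig struct {
	Type string
}

type TriggerConfig struct {
	Key string // hypothetical field
}

type ScheduleConfig struct {
	Trigger *TriggerConfig
}

type PipelineConfig struct {
	DryRun   bool
	Job      JobConfig
	Schedule ScheduleConfig
}

// validateDryRun returns nil for non-dry-run configs, and otherwise reports
// every missing prerequisite in a single joined error.
func validateDryRun(cfg PipelineConfig) error {
	if !cfg.DryRun {
		return nil
	}
	var errs []error
	if cfg.Job.Type == "" {
		errs = append(errs, errors.New("dryRun requires job.type"))
	}
	if cfg.Schedule.Trigger == nil {
		errs = append(errs, errors.New("dryRun requires schedule.trigger"))
	}
	// errors.Join returns nil when errs is empty.
	return errors.Join(errs...)
}

func main() {
	incomplete := PipelineConfig{DryRun: true}
	fmt.Println(validateDryRun(incomplete))

	complete := PipelineConfig{
		DryRun:   true,
		Job:      JobConfig{Type: "glue"},
		Schedule: ScheduleConfig{Trigger: &TriggerConfig{Key: "upstream-complete"}},
	}
	fmt.Println(validateDryRun(complete) == nil)
}
```

Reporting both problems at once, rather than failing on the first, saves a round trip when a config is missing both settings.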

Observation-only mode that evaluates trigger conditions, validation
rules, and SLA projections against real sensor data without executing
jobs or starting Step Functions. Publishes DRY_RUN_WOULD_TRIGGER,
DRY_RUN_LATE_DATA, DRY_RUN_SLA_PROJECTION, and DRY_RUN_DRIFT events
to EventBridge for comparison against the existing orchestrator.
@github-actions bot added the tests (Test changes), lambda (Lambda handlers), docs (Documentation), and types (Public types, pkg/types) labels Mar 12, 2026
@dwsmith1983 self-assigned this Mar 12, 2026
Remove unused sensorData parameter from handleDryRunTrigger (unparam),
invert nested if-blocks to use continue (gocritic nestingReduce), and
assign v0.9.0 to the dry-run changelog entry.
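
For readers unfamiliar with the nestingReduce fix mentioned above, the general shape of the refactor is sketched here. The `check` type and both functions are hypothetical and are not the code touched in this commit; they only show nested if-blocks inside a loop being inverted into early `continue` statements without changing behavior.

```go
package main

import "fmt"

// check is a hypothetical validation rule result.
type check struct {
	Key      string
	Required bool
	Passed   bool
}

// failedNested is the "before" shape: the work sits two levels deep.
func failedNested(checks []check) []string {
	var failed []string
	for _, c := range checks {
		if c.Required {
			if !c.Passed {
				failed = append(failed, c.Key)
			}
		}
	}
	return failed
}

// failedFlat is the "after" shape: each condition is inverted and skips the
// iteration early, so the work stays at the loop's top level.
func failedFlat(checks []check) []string {
	var failed []string
	for _, c := range checks {
		if !c.Required {
			continue
		}
		if c.Passed {
			continue
		}
		failed = append(failed, c.Key)
	}
	return failed
}

func main() {
	checks := []check{
		{Key: "row_count", Required: true, Passed: false},
		{Key: "freshness", Required: true, Passed: true},
		{Key: "optional_hint", Required: false, Passed: false},
	}
	fmt.Println(failedNested(checks))
	fmt.Println(failedFlat(checks))
}
```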

Addresses common misidentification of Interlock as an orchestrator.
New "What Interlock Is (and Isn't)" section clarifies that Interlock
is a safety controller that gates the trigger path, not a scheduler
or orchestrator replacement.

Cancel previously published only SLA_MET and swallowed BREACH, leaving
reruns with no SLA outcome notification. It now always publishes a binary
MET or BREACH verdict and propagates publish errors so the Step Function
can retry on transient failures.

Also adds RERUN_ACCEPTED and JOB_COMPLETED to the EventBridge alert
filter so rerun lifecycle events reach Slack.
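
The cancel-path behavior described above (always publish a binary verdict, and return publish errors instead of swallowing them) might be sketched as below. The `publisher` interface, `fakePublisher`, `publishSLAVerdict`, and the `SLA_BREACH` event name are assumptions for illustration; the commit message only names SLA_MET explicitly.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// publisher is a hypothetical abstraction over the EventBridge client.
type publisher interface {
	Publish(eventType string) error
}

// fakePublisher records events, or fails every call when err is set.
type fakePublisher struct {
	events []string
	err    error
}

func (f *fakePublisher) Publish(eventType string) error {
	if f.err != nil {
		return f.err
	}
	f.events = append(f.events, eventType)
	return nil
}

var (
	// exampleDeadline is a fixed SLA deadline used by the demo.
	exampleDeadline = time.Date(2026, 3, 12, 8, 0, 0, 0, time.UTC)
	// errThrottled simulates a transient publish failure.
	errThrottled = errors.New("throttled")
)

// publishSLAVerdict always publishes exactly one of two outcomes and returns
// any publish error to the caller so the invoking workflow can retry.
func publishSLAVerdict(p publisher, finishedAt, deadline time.Time) (string, error) {
	verdict := "SLA_MET"
	if finishedAt.After(deadline) {
		verdict = "SLA_BREACH" // assumed event name
	}
	if err := p.Publish(verdict); err != nil {
		return "", fmt.Errorf("publish %s: %w", verdict, err)
	}
	return verdict, nil
}

func main() {
	ok := &fakePublisher{}
	early, _ := publishSLAVerdict(ok, exampleDeadline.Add(-time.Minute), exampleDeadline)
	late, _ := publishSLAVerdict(ok, exampleDeadline.Add(time.Minute), exampleDeadline)
	fmt.Println(early, late, ok.events)

	failing := &fakePublisher{err: errThrottled}
	_, err := publishSLAVerdict(failing, exampleDeadline, exampleDeadline)
	fmt.Println(err)
}
```

Wrapping the error with `%w` keeps the underlying cause available to `errors.Is`, while a non-nil return from the handler is what lets a Step Functions retry policy apply.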
@github-actions bot added the deploy (Deployment and ASL) label Mar 12, 2026
Add dedicated guide page walking through the end-to-end dry-run
workflow: setup, evaluation flow, monitoring events, SLA projection,
drift detection, going live, and troubleshooting. Add missing _index.md
for the guides section with card navigation.
@dwsmith1983 merged commit 83ed6cc into main Mar 12, 2026
6 checks passed
@dwsmith1983 deleted the feat/dry-run-shadow-mode branch March 12, 2026 15:42
Labels

deploy (Deployment and ASL), docs (Documentation), lambda (Lambda handlers), tests (Test changes), types (Public types, pkg/types)

1 participant