fix(scheduler): consolidate missed-fire catchup into one row #49

Merged
finedesignz merged 1 commit into main from fix/scheduler-catchup-consolidate on May 26, 2026

@finedesignz
Owner

Problem

A hub restart with N missed cron slots for a single task produced N identical skipped/catchup rows in scheduled_task_runs. Real-world hit: 20 duplicates for a 4h-cadence task across a long offline window, which flooded the runs UI.

Fix

One consolidated row per task on boot:

  • status='skipped'
  • error='hub_restart:N_missed' (machine-readable)
  • output_snippet='Skipped N missed fires from {first} to {last} during hub downtime'
  • started_at = first missed timestamp (preserves original cron-fire time)
  • finished_at = now()

run_once policy still dispatches the latest missed slot live; older slots collapse into one consolidated row instead of N.

MAX_MISSED cap raised to 1000; the result is still a single row regardless of count.
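
For readers without the diff open, the consolidated-row shape described above can be sketched roughly as follows. This is a minimal illustration, not the code in catchup.ts: the `ConsolidatedRun` interface, the function signature, and the ISO timestamp formatting are assumptions, and neither the MAX_MISSED cap nor the run_once live dispatch is modeled here.

```typescript
// Hypothetical row shape; field names follow the PR description,
// the interface itself is an assumption made for this sketch.
interface ConsolidatedRun {
  status: "skipped";
  error: string;
  output_snippet: string;
  started_at: Date;
  finished_at: Date;
}

// Collapse all missed fire times for one task into a single row.
// Returns null when nothing was missed (0 missed -> 0 rows).
function consolidateMissed(
  missed: Date[],
  now: Date = new Date(),
): ConsolidatedRun | null {
  if (missed.length === 0) return null;

  // Find the earliest and latest missed slots without assuming input order.
  const first = missed.reduce((a, b) => (b.getTime() < a.getTime() ? b : a));
  const last = missed.reduce((a, b) => (b.getTime() > a.getTime() ? b : a));
  const n = missed.length;

  return {
    status: "skipped",
    // Machine-readable marker, e.g. "hub_restart:20_missed".
    error: `hub_restart:${n}_missed`,
    // Human-readable summary; ISO formatting is an assumption.
    output_snippet:
      `Skipped ${n} missed fires from ${first.toISOString()} ` +
      `to ${last.toISOString()} during hub downtime`,
    started_at: first, // preserves the original cron-fire time
    finished_at: now,
  };
}

// Usage: 20 missed slots on a 4h cadence yield one row.
const missedSlots = Array.from(
  { length: 20 },
  (_, i) => new Date(Date.UTC(2026, 4, 20, i * 4)),
);
const row = consolidateMissed(missedSlots);
console.log(row?.error); // "hub_restart:20_missed"
```

Keeping the count inside `error` lets tooling parse it without touching the free-text snippet.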

Test

hub/test/catchup-consolidate.test.ts — 20 / 1 / 1000 / 0 missed → 1 / 1 / 1 / 0 rows. 4 pass.
Existing scheduler tests: 52 pass, no regression.

Hub restart with N missed cron slots for a task was producing N
identical 'skipped/catchup' rows (real-world: 20 dupes for a 4h task
across a long offline window). Now produces ONE row with
error='hub_restart:N_missed' and a human-readable output_snippet
summarising first→last missed timestamps.

- catchup.ts: new consolidateMissed() helper; runOnce() emits one
  consolidated row per task for both 'skip' and 'run_once' policies
  (run_once still dispatches the latest slot live).
- insertRunV2: accept optional output_snippet, started_at, finished_at
  so the consolidated row carries the original first-missed timestamp.
- MAX_MISSED bumped to 1000 — single row regardless of count.
- Test: 20 / 1 / 1000 / 0 missed slots → 1 / 1 / 1 / 0 rows.
@finedesignz merged commit 45db470 into main on May 26, 2026
1 check passed
finedesignz added a commit that referenced this pull request May 26, 2026
PR #49 made insertRunV2's started_at param optional and set it to null
when status='pending'. The scheduled_task_runs.started_at column is
NOT NULL, so every cron fire was failing with a constraint violation.

- insertRunV2: always default started_at to new Date() when omitted.
- schema.sql: idempotent ALTER to ensure column DEFAULT now() is set in
  prod (belt-and-suspenders against drifted instances).
- regression test asserting started_at is always non-null in the INSERT.
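
The defaulting rule from this follow-up commit could look roughly like the sketch below. It is illustrative only: `RunInsertInput`, `buildRunInsertParams`, the `task_id` field, and the set of status values are hypothetical names, not the actual insertRunV2 signature. The point it shows is that an omitted started_at falls back to the current time for every status, so the NOT NULL column never receives null.

```typescript
// Hypothetical input shape; only started_at, finished_at and
// output_snippet are named in the PR, the rest is assumed.
interface RunInsertInput {
  task_id: string;
  status: "pending" | "running" | "skipped";
  error?: string;
  output_snippet?: string;
  started_at?: Date;
  finished_at?: Date;
}

// Build the INSERT parameters. started_at always resolves to a Date,
// including for status='pending', which was the case that broke.
function buildRunInsertParams(input: RunInsertInput, now: Date = new Date()) {
  return {
    task_id: input.task_id,
    status: input.status,
    error: input.error ?? null,
    output_snippet: input.output_snippet ?? null,
    started_at: input.started_at ?? now, // never null
    finished_at: input.finished_at ?? null,
  };
}

// Usage: a plain cron fire omits started_at and still gets a timestamp.
const params = buildRunInsertParams({ task_id: "task-1", status: "pending" });
console.log(params.started_at instanceof Date); // true
```

A caller that passes started_at explicitly, as the consolidated catchup row does, keeps its own value.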
@finedesignz deleted the fix/scheduler-catchup-consolidate branch on May 26, 2026 14:35