Enhanced run_immediately functionality#64162
Conversation
|
Can you fill in this table so the intent is clear?
|
cc: @collinmcnulty |
SameerMesiah97
left a comment
There was a problem hiding this comment.
I think this PR is mixing a legitimate bug fix i.e. ensuring that setting the start_date does not subvert run_immediately in the timetable logic, with a behavioral change that feels a bit opinionated. Always routing the first run through _calc_first_run() makes sense, since previously run_immediately could effectively be ignored when start_date was set which is clearly inconsistent.
The changes around run_immediately=False and removing the implicit buffer feel more like a semantic change than a bug fix. The buffer logic looks quite intentional and provides a small grace window for slightly late scheduling, and now False strictly skips the past run. Also changing the default from False to True is prettty big shift. Now, I will caveat this by stating that I may be missing context, so I am happy to be corrected but a semantic shift like this in scheduler-adjacent code needs stronger justification than what you have provided.
Thanks for the review @SameerMesiah97 Why the auto-buffer was removed from In the old code, Why the default changed from Because of the bug,
This is my actual pain point. I have updated the description as stated |
Summary
Fix
CronTriggerTimetableignoringrun_immediatelywhenstart_dateis set, simplify itssemantics, and document the parameter properly.
Bug fix
run_immediatelywas only respected on the very first run whenstart_date=None.When
start_datewas set (the common case),_calc_first_run()was skipped and thetimetable always fell back to
_align_to_prev(now), behaving asrun_immediately=Trueregardless of the value passed. The fix ensures
_calc_first_run()is always called onthe first run, whether or not
start_dateis set.Semantic simplification (justified by the bug)
Because
run_immediatelywas silently ignored wheneverstart_datewas set, theFalsepath in
_calc_first_run()was effectively dead code for the common case. Auditing thatdead code revealed two inconsistencies:
run_immediately=Falseapplied an undocumented auto-buffer window (10% of the croninterval, minimum 5 min), identical to
None, makingFalsenot mean what users expected.Since
timedeltaalready covers the "run past if within a window" use case, the bufferis removed.
Falsenow cleanly means "always skip the past cron point and wait for thenext future one".
run_immediatelywas not documented at all, so no user could have knowingly depended onthe buffer behaviour.
Backward compatibility
The default is changed from
FalsetoTrueso that existing DAGs without an explicitrun_immediatelycontinue to run the most recent past cron point immediately. This preservesthe de-facto behaviour that all users with
start_dateset were already seeing (since thebug made
Falseact likeTruein that case). Only DAGs that explicitly passrun_immediately=Falsewithout astart_dateare affected — and for those,Falsenowmeans exactly what the name implies.
Changes
_TriggerTimetable.next_dagrun_info: on first run (no prior runs), always call_calc_first_run()regardless ofstart_date; after a pause/resume keep the existing"pick most recent past boundary" logic unchanged
CronTriggerTimetable._calc_first_run: simplified to three clean cases —True(run past),False(wait for next),timedelta(run past if within window)run_immediatelychangedFalse → TrueinCronTriggerTimetable,MultipleCronTriggerTimetable, andCronPartitionTimetablerun_immediatelydocumentation with examples andversionadded:: 3.0.0immediately trigger Feb 2 — the claim that
CronTriggerTimetableskips Feb 2 as well was wrong)test_run_immediately_false_with_start_date,test_run_immediately_true_with_start_date,test_run_immediately_false_after_unpauseWas generative AI tooling used to co-author this PR?