Warn when a continuous job does not set task_retry_mode #5691

Open: pranshupand-db wants to merge 3 commits into databricks:main from pranshupand-db:add-job-task-retry-warning

pranshupand-db commented Jun 23, 2026

Summary

  • Adds a fast validation (validate:continuous_task_retry_mode) that emits a warning when a DABs job defines a continuous block but does not set continuous.task_retry_mode.
  • task_retry_mode (enum NEVER | ON_FAILURE) defaults to NEVER, so without it a continuous job does not retry its tasks on failure. The warning makes this implicit behavior visible.
  • An explicit task_retry_mode (NEVER or ON_FAILURE) is treated as a deliberate choice and does not warn — detected on the dynamic config value.
  • Wired into FastValidate, so it runs on bundle validate and bundle deploy.
  • Adds a NEXT_CHANGELOG.md entry under Bundles.

Best-practices checklist

  • Warning severity, matching every other diagnostic in bundle/config/validate and the requested behavior.
  • Deterministic output: iterates the ordered dyn config (dyn.MapByPattern), not a Go map.
  • Unit tests in continuous_task_retry_mode_test.go (warns when unset, no warn when set, no warn without a continuous block).
  • Acceptance test inputs under acceptance/bundle/validate/continuous_task_retry_mode/.

Test plan

  • go test ./bundle/config/validate/ -run TestContinuousTaskRetryMode
  • ./task test-update to generate output.txt + out.test.toml for the new acceptance test (and refresh any other bundles that now surface this warning)
  • ./task fmt && ./task lint

Note: the author could not compile/run locally — this environment had only Go 1.25.5 while the repo requires the Go 1.26.4 toolchain, with no network to fetch it. The acceptance test currently ships inputs only (databricks.yml, script); its generated output.txt/out.test.toml must be produced with ./task test-update before merge.

Adds a fast validation that emits a warning when a DABs job task does not
configure max_retries. Such tasks are never retried on failure, which is a
common source of avoidable job failures. An explicit max_retries (including 0)
is treated as a deliberate choice and does not warn.

github-actions Bot commented Jun 23, 2026

Approval status: pending

/acceptance/bundle/ - needs approval

Files: acceptance/bundle/validate/continuous_task_retry_mode/databricks.yml, acceptance/bundle/validate/continuous_task_retry_mode/script
Suggested: @denik
Also eligible: @pietern, @janniklasrose, @shreyas-goenka, @andrewnester, @anton-107, @lennartkats-db

/bundle/ - needs approval

Files: bundle/config/validate/continuous_task_retry_mode.go, bundle/config/validate/continuous_task_retry_mode_test.go, bundle/config/validate/fast_validate.go
Suggested: @denik
Also eligible: @pietern, @janniklasrose, @shreyas-goenka, @andrewnester, @anton-107, @lennartkats-db

General files (require maintainer)

Files: NEXT_CHANGELOG.md
Based on git history:

  • @denik -- recent work in bundle/config/validate/, ./

Any maintainer (@andrewnester, @anton-107, @denik, @pietern, @shreyas-goenka, @simonfaltum, @renaudhartert-db) can approve all areas.
See OWNERS for ownership rules.

task_retry_mode is a property of a job's continuous block (enum NEVER |
ON_FAILURE, default NEVER) that controls whether a continuous job retries its
tasks. The warning now fires only when a job defines a continuous block without
task_retry_mode, rather than on every task missing max_retries.
pranshupand-db changed the title from "Add warning when a job task has no retry policy set" to "Warn when a continuous job does not set task_retry_mode" Jun 23, 2026
…rning

NEXT_CHANGELOG.md documents the new bundle validate warning. The acceptance
test under acceptance/bundle/validate/continuous_task_retry_mode covers a
continuous job missing task_retry_mode (warns) alongside one that sets it (no
warning). Its output.txt and out.test.toml must be generated with
`./task test-update`, which requires the Go 1.26.4 toolchain unavailable here.
@github-actions


An authorized user can trigger integration tests manually by following the instructions below:

Trigger:
go/deco-tests-run/cli

Inputs:

  • PR number: 5691
  • Commit SHA: 759cd19d5ac3d31269d8e2d8b3875d62dfe63e22

Checks will be approved automatically on success.
