feat: add ActiveDeadlineSeconds option for TrainJob configuration#415
feat: add ActiveDeadlineSeconds option for TrainJob configuration#415XploY04 wants to merge 2 commits intokubeflow:mainfrom
Conversation
Signed-off-by: XploY04 <2004agarwalyash@gmail.com>
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Pull request overview
Adds a new Kubernetes-only option to configure .spec.activeDeadlineSeconds on TrainJob resources, and threads that value through the Kubernetes backend spec generation.
Changes:
- Introduces
ActiveDeadlineSecondsoption with validation and application into the TrainJob spec. - Propagates
activeDeadlineSecondsthroughKubernetesBackend.train()into the generated TrainJob spec. - Adds unit tests covering option application, backend compatibility, and invalid (non-positive) values.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| kubeflow/trainer/options/kubernetes.py | Adds ActiveDeadlineSeconds option and applies it to .spec.activeDeadlineSeconds with backend validation. |
| kubeflow/trainer/options/init.py | Exports ActiveDeadlineSeconds from the options package. |
| kubeflow/trainer/options/kubernetes_test.py | Extends option validation/application tests and adds validation tests for invalid seconds. |
| kubeflow/trainer/backends/kubernetes/backend.py | Plumbs activeDeadlineSeconds from option-produced spec into TrainJobSpec creation. |
| kubeflow/trainer/backends/kubernetes/backend_test.py | Adds backend tests asserting activeDeadlineSeconds is present on the produced TrainJob spec. |
…er values Signed-off-by: XploY04 <2004agarwalyash@gmail.com>
|
/retest |
|
@XploY04: Cannot trigger testing until a trusted user reviews the PR and leaves an DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
@andreyvelich can you review this ? the failing tests are because of the know issue, which will be solved after merging of kubeflow/trainer#3365 this pr. |
What this PR does / why we need it:
Which issue(s) this PR fixes (optional, in
Fixes #<issue number>, #<issue number>, ...format, will close the issue(s) when PR gets merged):Fixes #403
Checklist: