Releases: kubeflow/sdk
0.3.1
🐛 Bug Fixes
- fix(deps): Bump Kubeflow Trainer API to 2.1 version (#452 by @andreyvelich)
0.4.0
🚀 Features
- feat: Run dataset and model initializers in parallel (#313 by @priyank766)
- feat(trainer): replace PodTemplateOverrides with RuntimePatches API (#381 by @Fiona-Waters)
- feat(docs): Update README with Spark Support (#349 by @andreyvelich)
- feat(spark): Refactor unit tests to sdk coding standards (#293 by @digvijay-y)
- feat: add TrainerClient examples for local PyTorch distributed training (#312 by @MansiSingh17)
- feat: Add validate lockfile workflow to complement CVE scanning (#306 by @Fiona-Waters)
- feat(trainer): Support namespaced TrainingRuntime in the SDK (#130 by @shaikmoeed)
- feat: Adds a GitHub Actions workflow to check kubeflow/hub/OWNERS. (#280 by @muhammadjunaid8047)
- feat: Added examples to the documentation demonstrating different ways to handle ports (#243 by @osamaahmed17)
- feat: add SparkClient API for SparkConnect session management (#225 by @Shekharrajak)
- feat(trainer): add dataset and model initializer support to container backend (#188 by @HKanoje)
- feat: Add Kubeflow SDK docs website (#237 by @kramaranya)
- feat: add model registry client (#186 by @jonburdo)
🐛 Bug Fixes
- fix: support RC version format in Makefile release target (#398 by @Fiona-Waters)
- fix(trainer): add missing wildcard to .pt and .pth ignore patterns (#372 by @ghazariann)
- fix(trainer): Fix packages installation with extra notation (#385 by @andreyvelich)
- fix(trainer): ignore PEP 668 system python check (#384 by @robert-bell)
- fix: Make validate-lockfile action non-blocking (#361 by @Fiona-Waters)
- fix(trainer): adapt SDK to removal of numProcPerNode from TorchMLPolicySource (#360 by @tariq-hasan)
- fix(trainer): return TRAINJOB_COMPLETE when all steps are done (#340 by @priyank766)
- fix(optimizer): add missing get_job_events() to RuntimeBackend base c… (#325 by @ruskaruma)
- fix(optimizer): prevent input mutation in optimize() (#322 by @ruskaruma)
- fix(trainer): handle falsy values in get_args_from_peft_config (#328 by @krishdef7)
- fix: improve logging around packages_to_install (#269 by @briangallagher)
- fix: Fix runtime lookup fallback and test local SDK in E2E (#307 by @XploY04)
- fix: nightly security dependency updates (#296 by @github-actions[bot])
- fix: Improve CVE workflow (#267 by @Fiona-Waters)
- fix: preserve case for extended resource keys (#264 by @danish9039)
- fix: upgrade pyasn1 to v0.6.2 (#257 by @Fiona-Waters)
- fix(ci): Bump Kubernetes version for E2E tests (#253 by @andreyvelich)
- fix: Remove uv from tools in readthedocs (#242 by @kramaranya)
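The wildcard fix in #372 is easier to see with a small example. The sketch below assumes glob-style matching as implemented by Python's standard `fnmatch` module; the SDK's own matching code is not shown in these notes, and the file names and the `ignored` helper are hypothetical.

```python
from fnmatch import fnmatch

# Hypothetical file listing for a downloaded model.
files = ["model.pt", "weights.pth", "config.json"]


def ignored(name, patterns):
    """Return True if the file name matches any glob-style pattern."""
    return any(fnmatch(name, pattern) for pattern in patterns)


# Without a wildcard, the pattern only matches a file literally named ".pt".
without_wildcard = [".pt", ".pth"]
# With a leading wildcard, any file ending in ".pt" or ".pth" matches.
with_wildcard = ["*.pt", "*.pth"]

kept_before = [f for f in files if not ignored(f, without_wildcard)]
kept_after = [f for f in files if not ignored(f, with_wildcard)]

print(kept_before)
print(kept_after)
```

Under these assumptions, the patterns without a wildcard filter out nothing, while the wildcard patterns leave only `config.json`.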
⚙️ Miscellaneous Tasks
- chore(deps): Bump Kubeflow Trainer API to 2.2.0 (#406 by @andreyvelich)
- chore(ci): bump aquasecurity/trivy-action from 0.34.0 to 0.34.1 in the actions group (#319 by @dependabot[bot])
- chore(trainer): Add API reference docs for kubeflow.trainer.options classes (#396 by @Fiona-Waters)
- chore(trainer): fix typos in TrainerClient docstrings (#394 by @andres75125)
- chore(spark): add Spark documentation and API reference (#364 by @Amir380-A)
- chore(spark): remove SDK-side validation from SparkClient (#345 by @YassinNouh21)
- chore(spark): change pyspark[connect] dependency (#357 by @alimaredia)
- chore(spark): migrate SDK to kubeflow_spark_api Pydantic models (#295 by @tariq-hasan)
- chore: fix docstrings in TrainerClient (#333 by @priyansh-saxena1)
- chore(deps): bump the python-minor group with 2 updates (#299 by @dependabot[bot])
- chore(ci): bump actions/setup-python from 5 to 6 (#298 by @dependabot[bot])
- chore(deps): bump pytest from 8.4.2 to 9.0.2 (#301 by @dependabot[bot])
- chore(ci): bump aquasecurity/trivy-action from 0.33.1 to 0.34.0 in the actions group (#297 by @dependabot[bot])
- chore(ci): bump actions/checkout from 4 to 6 (#278 by @dependabot[bot])
- chore(ci): bump peter-evans/create-pull-request from 6 to 8 (#277 by @dependabot[bot])
- chore(deps): bump the python-minor group across 1 directory with 4 updates (#291 by @dependabot[bot])
- chore(ci): bump astral-sh/setup-uv from 5 to 7 (#276 by @dependabot[bot])
- chore: upgrade code style for python3.10 (#288 by @jonburdo)
- chore: bump minimum model-registry version to 0.3.6 (#289 by @jonburdo)
- chore: added sdk docs website to readme (#284 by @jaiakash)
- chore: Confirm that a public ConfigMap exists to check version (#250 by @sameerdattav)
- chore(docs): Create symlink for CLAUDE.md (#270 by @andreyvelich)
- chore(deps): bump pytest from 8.4.1 to 8.4.2 (#255 by @dependabot[bot])
- chore(deps): bump ty from 0.0.13 to 0.0.14 in the python-minor group (#254 by @dependabot[bot])
- chore: add trivy cve scan and fix workflow (#266 by @Fiona-Waters)
- chore(docs): Update Copilot Instructions and AGENTS.md (#248 by @andreyvelich)
- chore(deps): bump kubernetes from 33.1.0 to 35.0.0 (#256 by @dependabot[bot])
- chore(hub): add kubeflow hub approver jonburdo (#252 by @jonburdo)
- chore(deps): bump the python-minor group with 8 updates (#247 by @dependabot[bot])
- chore(ci): bump astral-sh/setup-uv from 6 to 7 (#245 by @dependabot[bot])
- chore(ci): bump actions/upload-artifact from 4 to 6 (#246 by @dependabot[bot])
- chore(hub): add OWNERS file to kubeflow.hub (#244 by @jonburdo)
New Contributors
- @andres75125 made their first contribution in #394
- @ghazariann made their first contribution in #372
- @priyank766 made their first contribution in #313
- @robert-bell made their first contribution in #384
- @Amir380-A made their first contribution in #364
- @YassinNouh21 made their first contribution in #345
- @alimaredia made their first contribution in #357
- @ruskaruma made their first contribution in #325
- @digvijay-y made their first contribution in #293
- @priyansh-saxena1 made their first contribution in #333
- @MansiSingh17 made their first contribution in #312
- @krishdef7 made their first contribution in #328
- @XploY04 made their first contribution in #307
- @shaikmoeed made their first contribution in #130
- @github-actions[bot] made their first contribution in #296
- @muhammadjunaid8047 made their first contribution in #280
- @jonburdo made their first contribution in #288
- @HKanoje made their first contribution in #188
- @sameerdattav made their first contribution in #250
- @danish9039 made their first contribution in #264
0.4.0rc0
Kubeflow SDK Official Release 0.4.0rc0 (#397) Signed-off-by: Fiona-Waters <fiwaters6@gmail.com>
0.3.0
🚀 Features
- feat(ci): Switch to UV for Dependabot (#231 by @andreyvelich)
- feat: added git cliff for generating changelogs (#226 by @jaiakash)
- feat(docs): Add Kubeflow SDK YouTube demos (#229 by @andreyvelich)
- feat(docs): KEP- Spark Client for Kubeflow SDK (#163 by @Shekharrajak)
- feat(trainer): add get_job_events API to retrieve TrainJob events (#220 by @sksingh2005)
- feat(trainer): support NVIDIA MIG device resources in TrainJob device… (#204 by @LabsJS)
- feat: Add custom instructions for GitHub Copilot (#212 by @osamaahmed17)
- feat: Add callbacks to the wait_for_job_status() API (#205 by @osamaahmed17)
- feat(trainer): Allow to reference runtime by name (#214 by @andreyvelich)
- feat(trainer): Support optional image for CustomTrainer (#216 by @andreyvelich)
- feat: Add dependabot to Kubeflow SDK (#194 by @kramaranya)
- feat(docs): Update README with announcement blog post (#157 by @andreyvelich)
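As a rough illustration of what status callbacks on a wait loop (#205) can look like, here is a minimal, self-contained sketch. It does not use the SDK and is not the SDK's signature: `wait_for_status`, `get_status`, and every parameter name below are hypothetical stand-ins chosen for this example.

```python
import time
from typing import Callable, Iterable


def wait_for_status(
    get_status: Callable[[], str],
    target: Iterable[str] = ("Complete",),
    callbacks: Iterable[Callable[[str], None]] = (),
    polling_interval: float = 0.0,
    timeout: float = 5.0,
) -> str:
    """Poll get_status() until it returns a target status.

    Every callback is invoked with each observed status, so callers can
    log progress or update a UI without writing their own polling loop.
    """
    targets = set(target)
    callback_list = list(callbacks)
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        for callback in callback_list:
            callback(status)
        if status in targets:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"status {status!r} did not reach {sorted(targets)}")
        time.sleep(polling_interval)


# Usage with a fake status source standing in for a real job.
states = iter(["Created", "Running", "Complete"])
seen = []
final = wait_for_status(lambda: next(states), callbacks=[seen.append])
print(final, seen)
```

The callbacks receive every intermediate status, not only the final one, which is the main reason to pass them in rather than inspect the return value.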
🐛 Bug Fixes
- fix: include full pr name for change log (#236 by @jaiakash)
- fix: Upgrade urllib3 to v2.6.3 (#230 by @Fiona-Waters)
- fix(trainer): Fix parsing for TrainJob events (#228 by @andreyvelich)
- fix: Upgrade urllib3 to v2.6.1 (#193 by @Fiona-Waters)
- fix(trainer): expose CustomTrainerContainer for import (#185 by @AndEsterson)
- fix: update permissions for welcome workflow to avoid 403 error (#181 by @aniketpati1121)
- fix(ci): Move permissions to the workflow root (#177 by @kramaranya)
- fix: pip install with --user argument fails with image running in python virtual environment (#162 by @briangallagher)
- fix(trainer): Remove namespace from ClusterTrainingRuntime exception messages (#166 by @astefanutti)
- fix(trainer): Use PyTorch static rendezvous in container backend (#168 by @astefanutti)
- fix(trainer): Fix listing containers with Podman backend (#154 by @astefanutti)
⚙️ Miscellaneous Tasks
- chore(deps): bump kubernetes from 33.1.0 to 35.0.0 (#235 by @dependabot[bot])
- chore(deps): bump pytest from 8.4.1 to 8.4.2 (#234 by @dependabot[bot])
- chore(deps): bump the python-minor group with 6 updates (#233 by @dependabot[bot])
- chore: Nominate @kramaranya as Kubeflow SDK approver (#206 by @andreyvelich)
- chore(ci): bump softprops/action-gh-release from 1 to 2 (#209 by @dependabot[bot])
- chore(ci): bump actions/upload-artifact from 4 to 6 (#208 by @dependabot[bot])
- chore(ci): bump actions/download-artifact from 6 to 7 (#207 by @dependabot[bot])
- chore(ci): bump amannn/action-semantic-pull-request from 5.5.3 to 6.1.1 (#210 by @dependabot[bot])
- chore(ci): bump actions/github-script from 7 to 8 (#201 by @dependabot[bot])
- chore(ci): bump actions/download-artifact from 4 to 6 (#200 by @dependabot[bot])
- chore(ci): bump actions/stale from 5 to 10 (#199 by @dependabot[bot])
- chore(ci): bump actions/setup-python from 5 to 6 (#198 by @dependabot[bot])
- chore(ci): bump actions/checkout from 4 to 6 (#202 by @dependabot[bot])
- chore(docs): Add new items to the roadmap (#187 by @kramaranya)
New Contributors
- @Shekharrajak made their first contribution in #163
- @sksingh2005 made their first contribution in #220
- @LabsJS made their first contribution in #204
- @osamaahmed17 made their first contribution in #212
- @AndEsterson made their first contribution in #185
0.2.1
New Features
- feat(docs): Update README with announcement blog post (#157) by @andreyvelich
Bug Fixes
- fix(ci): Move permissions to the workflow root (#177) by @kramaranya
- fix: pip install with --user argument fails with image running in python virtual environment (#162) by @briangallagher
- fix(trainer): Remove namespace from ClusterTrainingRuntime exception messages (#166) by @astefanutti
- fix(trainer): Use PyTorch static rendezvous in container backend (#168) by @astefanutti
- fix(trainer): Fix listing containers with Podman backend (#154) by @astefanutti
0.2.0
New Features
- feat(optimizer): Add get_best_results API to OptimizerClient (#152) by @kramaranya
- feat(trainer): Add local notebook examples to E2E (#149) by @Fiona-Waters
- feat(optimizer): Add get_job_logs API to OptimizerClient (#148) by @kramaranya
- feat(optimizer): Add wait_for_job_status and get_best_trial APIs to OptimizerClient (#145) by @kramaranya
- feat: Implement Training Options pattern for flexible TrainJob customization (#91) by @abhijeet-dhumal
- feat: Add ContainerBackend with Docker and Podman (#119) by @Fiona-Waters
- feat(trainer): add s3 initializers, add `ignore_patterns` to hf initializers (#131) by @rudeigerc
- feat(ci): add workflow to approve ok-to-test label (#138) by @aniketpati1121
- feat(trainer): Add CustomTrainerContainer to create TrainJobs from image (#127) by @andreyvelich
- feat: Hyperparameter Optimization APIs in Kubeflow SDK (#124) by @andreyvelich
- feat(trainer): KEP-2655: Support provisioning of cache with Kubeflow SDK (#112) by @akshaychitneni
- feat: Support LoraConfig in TorchTune BuiltinTrainer (#102) by @Electronic-Waste
- feat(docs): KEP-46-Hyperparameter Optimization in Kubeflow SDK (#123) by @kramaranya
Bug Fixes
- fix(ci): Update url for installing docker for use with local notebooks (#151) by @Fiona-Waters
- fix(trainer): Remove --user flag from packages install in local subprocess (#147) by @andreyvelich
- fix(trainer): Fix empty image for Runtime trainer (#143) by @andreyvelich
- fix: Update Kubeflow SDK diagram (#146) by @kramaranya
- fix(trainer): Fix S3 initializer implementation (#144) by @andreyvelich
- fix: Support custom images in ClusterTrainingRuntime for container backend (#140) by @Fiona-Waters
- fix: add --user when install python packages (#136) by @briangallagher
- fix(ci): Fix first-time PR welcome workflow (#117) by @kramaranya
- fix(ci): Skip release workflow on forks (#113) by @kramaranya
- fix(scripts): Use previous stable tag for changelog (#103) by @kramaranya
Maintenance
- chore: Add HPO support to readme and SDK diagram (#141) by @kramaranya
- chore(ci): Add pre-commit configuration and CI workflow (#134) by @aniketpati1121
- chore(docs): added AGENTS.MD (#106) by @hawkaii
- chore(docs): Add Spark Operator to the future supported projects (#109) by @andreyvelich
0.1.0
New Features
- feat(ci): Add automated release CI job (#65) by @kramaranya
- feat: Implement TrainerClient Backends & Local Process (#33) by @szaher
- feat: KEP-2 Local Execution Mode Proposal (#34) by @szaher
- feat(trainer): Add support for param unpacking in the training function call (#62) by @briangallagher
- feat: Support multiple pip index URLs in CustomTrainer (#79) by @wassimbensalem
- feat(trainer): Refactor get_job_logs() API with Iterator (#83) by @andreyvelich
- feat: Implement Kubernetes Backend (#68) by @szaher
- feat(docs): add ROADMAP of Kubeflow SDK (#44) by @kramaranya
- feat(trainer): Add `get_runtime_packages()` API (#57) by @andreyvelich
- feat(trainer): Support Framework Labels in Runtimes (#56) by @andreyvelich
- feat(trainer): Add environment variables argument to CustomTrainer (#54) by @astefanutti
- feat(trainer): Add `wait_for_job_status()` API (#52) by @andreyvelich
- feat(ci): Add GitHub action to verify PR titles (#42) by @andreyvelich
Bug Fixes
- fix(scripts): Use previous stable tag for changelog (#103) by @kramaranya
- fix(docs): Update links before SDK release (#98) by @kramaranya
- fix: trainer client backend public (#78) by @jaiakash
- fix(trainer): Keep the original runtime command in get_runtime_packages() API (#64) by @andreyvelich
- fix(trainer): fix all import. (#43) by @Electronic-Waste
- fix: Expose BuiltinTrainer API to users (#28) by @Electronic-Waste
Maintenance
- chore: Ignore PRs titles with area/release labels in CI (#101) by @kramaranya
- chore: Add proper ruff configuration (#69) by @szaher
- chore: Update CONTRIBUTING.md to use uv (#41) by @szaher
- chore: Add welcome new contributors CI (#82) by @kramaranya
- chore(trainer): Use explicit exception chaining (#80) by @andreyvelich
- chore: Nominate @kramaranya and @szaher as Kubeflow SDK reviewers (#76) by @andreyvelich
- chore: Enable parallel builds for coveralls (#81) by @kramaranya
- chore: Remove tool.hatch.build.targets from pyproject (#73) by @kramaranya
- chore: Move dev extras to dependency-groups (#71) by @kramaranya
- chore: Update README.md (#67) by @kramaranya
- chore: move pyproject.toml to root (#61) by @kramaranya
- chore(ci): Align Kubernetes versions from Trainer for e2e tests (#58) by @astefanutti
- chore(ci): Add dev tests with master dependencies (#55) by @kramaranya
- chore(docs): Add Coveralls Badge to the README (#53) by @andreyvelich
- chore(trainer): Remove accelerator label from the runtimes (#51) by @andreyvelich
Other Changes
- Kubeflow SDK Official Release 0.1.0rc1 (#100) by @kramaranya
- add unit test for trainer sdk (#17) by @briangallagher
- add e2e notebook tests (#27) by @briangallagher
- Update pyproject.toml project links (#40) by @szaher
- Add support for UV & Ruff (#38) by @szaher
- Step down from sdk ownership role (#37) by @tenzen-y
- Add CONTRIBUTING.md (#30) by @abhijeet-dhumal
- Reflect owners updates from KF Trainer (#32) by @tenzen-y
- Consume Trainer models from external package kubeflow_trainer_api (#15) by @kramaranya
- Add pre-commit and flake8 configs (#6) by @eoinfennessy
- Add Stale GitHub action (#7) by @kramaranya
- Add GitHub issue and PR templates (#5) by @eoinfennessy
0.1.0rc1
Kubeflow SDK Official Release 0.1.0rc1 (#100) Signed-off-by: kramaranya <kramaranya15@gmail.com>