0.19.15
Services
Rolling deployments
This update introduces rolling deployments, which help avoid downtime when deploying new versions of your services.
When you apply an updated service configuration, dstack gradually replaces old service replicas with new ones. You can track the progress in the dstack apply output: the deployment number is lower for old replicas and higher for new ones.
> dstack apply -f my-service.dstack.yml
Active run my-service already exists. Detected configuration changes that can be updated in-place: ['image', 'env', 'commands']
Update the run? [y/n]: y
⠋ Launching my-service...
NAME                            BACKEND          RESOURCES                        PRICE    STATUS       SUBMITTED
my-service deployment=1                                                                    running      11 mins ago
  replica=0 job=0 deployment=0  aws (us-west-2)  cpu=2 mem=1GB disk=100GB (spot)  $0.0026  terminating  11 mins ago
  replica=1 job=0 deployment=1  aws (us-west-2)  cpu=2 mem=1GB disk=100GB (spot)  $0.0026  running      1 min ago
Currently, the following service configuration properties can be updated using rolling deployments: resources, volumes, image, user, privileged, entrypoint, python, nvcc, single_branch, env, shell, and commands.
Future releases will allow updating more properties and deploying new git repo commits.
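If you want to check rollout progress from a script, the deployment numbers in the dstack apply output can be compared per replica. The following is a minimal sketch, not part of dstack: the helper name `outdated_replicas` is hypothetical, and the row format is assumed from the sample output in these notes, which is not a documented or stable interface.

```python
import re

# Matches replica rows as they appear in the sample output in these notes.
# Assumption: each row has "replica=N job=N deployment=N", then a price
# such as "$0.0026", followed by the status.
REPLICA_ROW = re.compile(
    r"replica=(\d+)\s+job=\d+\s+deployment=(\d+).*?\$[\d.]+\s+(\S+)"
)

SAMPLE = """\
my-service deployment=1 running 11 mins ago
replica=0 job=0 deployment=0 aws (us-west-2) cpu=2 mem=1GB disk=100GB (spot) $0.0026 terminating 11 mins ago
replica=1 job=0 deployment=1 aws (us-west-2) cpu=2 mem=1GB disk=100GB (spot) $0.0026 running 1 min ago
"""


def outdated_replicas(output: str) -> list[int]:
    """Return replica numbers whose deployment number is below the highest one seen."""
    rows = [(int(r), int(d)) for r, d, _status in REPLICA_ROW.findall(output)]
    if not rows:
        return []
    latest = max(d for _, d in rows)
    return [r for r, d in rows if d < latest]


if __name__ == "__main__":
    # Replicas still on an older deployment; an empty list means the rollout is done.
    print(outdated_replicas(SAMPLE))
```

An empty result means every listed replica is on the newest deployment. Because the sketch depends on the human-readable table layout, treat it as illustrative only.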
Clusters
Updated default Docker images
If you don't specify a custom image in the run configuration, dstack uses its default images. These images have been improved for cluster environments and now include mpirun and NCCL tests. Additionally, if you are running on AWS EFA-capable instances, dstack will now automatically select an image with the appropriate EFA drivers. See our new AWS EFA guide for more details.
Server
Health metrics
The dstack server now exports operational Prometheus metrics that let you monitor its health. If you are running your own production-grade dstack server installation, refer to the metrics docs for details.
What's changed
- Set logsWaitDuration to 5m by @r4victor in #2794
- Add health metrics (Part 1) by @Nadine-H in #2760
- Add public projects by @haydnli-shopify in #2759
- Fix is_public allowing null by @r4victor in #2798
- Retry on VOLUME_ERROR and INSTANCE_UNREACHABLE by @jvstme in #2805
- Rework default Docker images by @peterschmidt85 in #2799
- Fix volume error status message by @jvstme in #2806
- [Docs] Added EFA example by @peterschmidt85 in #2820
- [Bug]: Empty spaces on User Details page by @olgenn in #2815
- Rolling deployment for services by @jvstme in #2821
- Fix building dstack package by @jvstme in #2823
New Contributors
- @haydnli-shopify made their first contribution in #2759
Full Changelog: 0.19.13...0.19.15