0.19.19
Fleets
SSH fleets in-place updates
You can now add and remove instances in SSH fleets without recreating the entire fleet.
type: fleet
name: ssh-fleet
ssh_config:
  user: dstack
  identity_file: ~/.ssh/dstack
  hosts:
    - 10.0.0.1
    - 10.0.0.2

$ dstack apply -f fleet.dstack.yml
...
Fleet ssh-fleet does not exist yet.
Create the fleet? [y/n]: y
...
FLEET      INSTANCE  BACKEND       RESOURCES                PRICE  STATUS  CREATED
ssh-fleet  0         ssh (remote)  cpu=4 mem=4GB disk=30GB  $0     idle    09:08
           1         ssh (remote)  cpu=2 mem=4GB disk=30GB  $0     idle    09:08
Then, if you update the hosts configuration property to

  hosts:
    #- 10.0.0.1 # removed
    - 10.0.0.2
    - 10.0.0.3 # added

and apply the same configuration again, the fleet is updated in-place. You don't need to stop runs on fleet instances that are not affected by the changes (in this example, it's fine if instance 1 is currently busy; you can still apply the configuration).
$ dstack apply -f fleet.dstack.yml
...
Found fleet ssh-fleet. Configuration changes detected.
Update the fleet in-place? [y/n]: y
...
FLEET      INSTANCE  BACKEND       RESOURCES                PRICE  STATUS  CREATED
ssh-fleet  1         ssh (remote)  cpu=2 mem=4GB disk=30GB  $0     idle    09:08
           2         ssh (remote)  cpu=8 mem=4GB disk=30GB  $0     idle    09:12
Note
In-place updates only allow adding and/or removing instances. The root configuration and the configurations of hosts that are not added or removed must stay the same; otherwise, a full fleet recreation is triggered, as before. This restriction may be lifted in the future.
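The add/remove rule can be illustrated with a small sketch. This is not dstack's implementation: the function names `plan_host_changes` and `can_update_in_place` are hypothetical, and the sketch assumes hosts are plain address strings as in the example configuration.

```python
def plan_host_changes(current_hosts, desired_hosts):
    """Split a hosts change into added, removed, and unchanged hosts."""
    return {
        "added": [h for h in desired_hosts if h not in current_hosts],
        "removed": [h for h in current_hosts if h not in desired_hosts],
        "unchanged": [h for h in current_hosts if h in desired_hosts],
    }


def can_update_in_place(old_config, new_config):
    """Return True if the two configs differ only in ssh_config.hosts.

    Assumption: "root configuration" means every key outside the hosts list.
    """
    def without_hosts(config):
        ssh = {k: v for k, v in config.get("ssh_config", {}).items() if k != "hosts"}
        rest = {k: v for k, v in config.items() if k != "ssh_config"}
        return rest, ssh

    return without_hosts(old_config) == without_hosts(new_config)


old = {
    "type": "fleet",
    "name": "ssh-fleet",
    "ssh_config": {
        "user": "dstack",
        "identity_file": "~/.ssh/dstack",
        "hosts": ["10.0.0.1", "10.0.0.2"],
    },
}
new = {**old, "ssh_config": {**old["ssh_config"], "hosts": ["10.0.0.2", "10.0.0.3"]}}

plan = plan_host_changes(old["ssh_config"]["hosts"], new["ssh_config"]["hosts"])
in_place = can_update_in_place(old, new)
```

Here `10.0.0.2` lands in `unchanged`, which matches the example above: the instance on that host keeps running while the other two hosts are removed and added.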
Volumes
Automatic cleanup of unused volumes
The volume configuration gets a new auto_cleanup_duration property:
type: volume
name: my-volume
backend: aws
region: eu-west-1
availability_zone: eu-west-1a
auto_cleanup_duration: 1h

The volume is automatically deleted once it has gone unused for the specified duration.
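As a rough illustration of the idle-duration check, the sketch below parses a duration such as `1h` and compares it with the time since a volume was last used. The helper names, the set of accepted unit suffixes, and the notion of a `last_used_at` timestamp are assumptions made for this example, not dstack's actual logic.

```python
from datetime import datetime, timedelta, timezone

# Assumed unit suffixes; the formats dstack accepts may differ.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def parse_duration(value):
    """Parse strings like '90s', '15m', '1h', or '2d' into a timedelta."""
    number, unit = value[:-1], value[-1:]
    if unit not in _UNIT_SECONDS or not number.isdigit():
        raise ValueError(f"unsupported duration: {value!r}")
    return timedelta(seconds=int(number) * _UNIT_SECONDS[unit])


def is_due_for_cleanup(last_used_at, auto_cleanup_duration, now):
    """Return True if the volume has been idle for at least the configured duration."""
    return now - last_used_at >= parse_duration(auto_cleanup_duration)


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
idle_for_two_hours = is_due_for_cleanup(now - timedelta(hours=2), "1h", now)
idle_for_ten_minutes = is_due_for_cleanup(now - timedelta(minutes=10), "1h", now)
```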
Logs
Browsable, queryable, and searchable logs
dstack now stores run logs as plain text; previously, they were base64-encoded. This lets you use the configured log storage, whether AWS CloudWatch or GCP Logging, to browse and query dstack run logs.
Note
Logs generated before this release will be shown as base64-encoded in the UI and CLI after the update.
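If you need to read a log line that was stored before this release, it can be decoded with the Python standard library. This is a minimal sketch: the function name `decode_legacy_log_message` is hypothetical, and it assumes you already know the line is base64-encoded (it does not try to detect that).

```python
import base64


def decode_legacy_log_message(message):
    """Decode a base64-encoded log line into text.

    Undecodable bytes are replaced rather than raising, since log output
    is not guaranteed to be valid UTF-8.
    """
    raw = base64.b64decode(message)
    return raw.decode("utf-8", errors="replace")


# Simulate a line stored in the old format, then decode it.
legacy_line = base64.b64encode(b"Epoch 1/10\n").decode("ascii")
decoded = decode_legacy_log_message(legacy_line)
```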
Server
Faster API response times
The dstack server API now serializes JSON responses faster. API endpoints are up to 2x faster than before.
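The changelog below attributes the speedup to orjson and custom FastAPI responses. The sketch here shows only the serialization step, under stated assumptions: `serialize_response` is a hypothetical helper, orjson is a third-party package that must be installed separately, and the fallback to the standard `json` module exists only so the example runs without it.

```python
import json

try:
    import orjson  # third-party; optional for this sketch
except ImportError:
    orjson = None


def serialize_response(payload):
    """Serialize a JSON-compatible payload to UTF-8 bytes."""
    if orjson is not None:
        # orjson.dumps returns bytes directly, skipping an intermediate str.
        return orjson.dumps(payload)
    # Fallback: compact output from the standard library.
    return json.dumps(payload, separators=(",", ":"), ensure_ascii=False).encode("utf-8")


payload = {"runs": [{"name": "train", "status": "running"}]}
body = serialize_response(payload)
```

Because the result is already bytes, it can be passed to an HTTP response as-is without a second encoding pass.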
Benchmarks
Benchmarking AMD GPUs: bare-metal, containers, partitions
Our new benchmark explores two important areas for optimizing AI workloads on AMD GPUs: First, do containers introduce a performance penalty for network-intensive tasks compared to a bare-metal setup? Second, how does partitioning a powerful GPU like the MI300X affect its real-world performance for different types of AI workloads?
What's Changed
- [Internal] Some runner tests fail on macOS by @peterschmidt85 in #2879
- Introduce job_submissions_limit for /api/runs/list by @r4victor in #2883
- Speed up json serialization with orjson and custom FastAPI responses by @r4victor in #2880
- [Docs]: Service rolling deployments by @jvstme in #2870
- Do not lose provisioning gateways on restart by @jvstme in #2887
- Add/remove SSH instances via in-place update by @un-def in #2884
- [Docs]: Add example of setting a PostgreSQL URL by @jvstme in #2888
- [Blog] Added new changelog by @peterschmidt85 in #2891
- Fix job_submissions_limit backward compatibility by @r4victor in #2894
- Fix run and job status_message calculation by @r4victor in #2889
- Fix 500 errors when requesting file logs by @r4victor in #2896
- Rolling deployments for port by @jvstme in #2893
- [Feature] Strip ANSI codes from run logs and store them as plain text instead of bytes by @peterschmidt85 in #2876
- [Feature]: Add ability to disable background processing and only run Web UI and API server #2901 by @james-boydell in #2902
- [shim] Don't check image downloaded size by @un-def in #2903
- Fix rolling deployment migration locking by @r4victor in #2904
- feat: add volume idle duration cleanup feature (#2497) by @haydnli-shopify in #2842
- [Blog] Benchmarking AMD GPUs: bare-metal, containers, partitions by @peterschmidt85 in #2905
- Fix /users/list by @r4victor in #2908
- Return logs in base64 for backward compatibility by @r4victor in #2910
Full Changelog: 0.19.18...0.19.19