Releases: dstackai/dstack
0.18.33
This update fixes TPU v6e support and a potential gateway upgrade issue.
What's Changed
- Fix runtime version for TPU v6e by @r4victor in #2149
- Update state.json migration on gateways by @jvstme in #2152
- Optimize gateway startup and service update time by @jvstme in #2153
Full Changelog: 0.18.32...0.18.33
0.18.32
TPU
Trillium (v6e)
dstack adds support for the latest Trillium TPU (v6e), which became generally available in GCP on December 12th. The new TPU generation doubles the TPU memory and boosts performance, supporting larger workloads.
Resources
dstack now includes CPU, RAM, and TPU memory in Google Cloud TPU offers:
$ dstack apply --gpu tpu
#  BACKEND  REGION        INSTANCE     RESOURCES                                           SPOT  PRICE
1  gcp      europe-west4  v5litepod-1  24xCPU, 48GB, 1xv5litepod-1 (16GB), 100.0GB (disk)  no    $1.56
2  gcp      europe-west4  v6e-1        44xCPU, 176GB, 1xv6e-1 (32GB), 100.0GB (disk)       no    $2.97
3  gcp      europe-west4  v2-8         96xCPU, 334GB, 1xv2-8 (64GB), 100.0GB (disk)        no    $4.95
Volumes
By default, TPU VMs contain a 100GB boot disk, and its size cannot be changed. Now, you can add more storage using Volumes.
Gateways
In this update, we've greatly refactored Gateways, improving their reliability and fixing several bugs.
Note
If you are running multiple replicas of the dstack server, ensure all replicas are upgraded promptly. Leaving some replicas on an older version may prevent them from creating or deleting services and could result in minor errors in their logs.
Warning
Ensure you update to 0.18.33, which includes critical hot-fixes for important issues.
What's changed
- [dstack-shim] Rework resource management by @un-def in #2093
- [Gateways] Restore dstack-proxy state on gateway restarts by @jvstme in #2119
- [TPU] Support TPU v6e by @r4victor in #2124
- [UI] Updated Backend config Info section by @peterschmidt85 in #2125
- [UI] It's not possible to manage fleets by @olgenn in #2126
- [UI] Improvements by @olgenn in #2127
- [Gateways] Add migration from state.json on gateways by @jvstme in #2128
- [Volumes] Forbid deleting backends with active instances or volumes by @r4victor in #2131
- [TPU] Fix backward compatibility with new TPUs by @r4victor in #2138
- Update gpuhunt to 0.0.17 by @r4victor in #2139
- [Docs] Improve docs by @r4victor in #2135
- [Gateways] Fix certbot process getting stuck in dstack-proxy by @jvstme in #2143
- [Gateways] Run dstack-proxy on gateways by @jvstme in #2136
- [Volumes] Support volumes for TPUs by @r4victor in #2144
- [Gateways] Optimize dstack-gateway installation time by @jvstme in #2146
- [Gateways] Fix OpenAI endpoint on Kubernetes gateways by @jvstme in #2147
Full changelog: 0.18.31...0.18.32
0.18.31
GCP
Running VMs on behalf of a service account
Like all major clouds, GCP supports running a VM on behalf of a managed identity using a service account. Now you can assign a service account to a GCP VM with dstack by specifying the vm_service_account property in the GCP config:
type: gcp
project_id: myproject
vm_service_account: sa@myproject.iam.gserviceaccount.com
creds:
  type: default
Assigning a service account to a VM can be used to access GCP resources from within runs. Another use case is using firewall rules that rely on the service account as the target. Such rules are typical for Shared VPC setups when admins of a host project can create firewall rules for service projects based on their service accounts.
Volumes
Creating user home directory automatically
Following support for non-root users in Docker images, dstack improves handling of users' home directories. Most importantly, the HOME environment variable is set according to /etc/passwd, and the home directory is created automatically if it does not exist.
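The lookup described above can be illustrated with a small Python sketch. It is not the dstack-shim implementation: the helper names home_from_passwd and ensure_home are hypothetical, and the root parameter exists only so the example can run against a scratch directory instead of a real filesystem root.

```python
import os
from typing import Optional


def home_from_passwd(passwd_text: str, username: str) -> Optional[str]:
    """Return the home directory recorded for `username`, or None if absent."""
    for line in passwd_text.splitlines():
        if not line or line.startswith("#"):
            continue
        # passwd format: name:password:uid:gid:gecos:home:shell
        fields = line.split(":")
        if len(fields) >= 7 and fields[0] == username:
            return fields[5]
    return None


def ensure_home(passwd_text: str, username: str, root: str = "/") -> str:
    """Create the user's home directory under `root` if it does not exist."""
    home = home_from_passwd(passwd_text, username)
    if home is None:
        raise KeyError(f"user {username!r} not found in passwd data")
    path = os.path.join(root, home.lstrip("/"))
    os.makedirs(path, exist_ok=True)  # no-op when the directory already exists
    return path
```

The value returned by ensure_home is what a launcher would export as HOME before starting the workload.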
The update opens up new possibilities including the use of an empty volume for /home:
type: dev-environment
ide: vscode
image: ubuntu
user: ubuntu
volumes:
  - volume-aws:/home
AWS volumes with non-Nitro instances
dstack users previously reported AWS Volumes not working with some instance types. This is now fixed and tested for all instance types supported by dstack including older Xen-based instances like the P3 family.
Deprecations
- The home_dir and setup parameters in run configurations have been deprecated. If you're using setup, move setup commands to the top of init.
What's changed
- [dstack-shim] Implement multi-task state by @un-def in #2078
- [AWS] Support AWS volumes for Xen-based instances by @r4victor in #2088
- Handle empty user when processing image manifest by @un-def in #2090
- [Docs] Move Reference to a separate page for more space and better st… by @peterschmidt85 in #2092
- Init VirtualRepo when --no-repo specified by @r4victor in #2098
- [Docs] Add missing backends docs reference by @r4victor in #2099
- [gateways] Support gateway features in dstack-proxy by @jvstme in #2087
- [Docs] Add Repos page inside Concepts to explain how repos work #2096 by @peterschmidt85 in #2097
- [GCP] Allow specifying vm_service_account in GCP config by @r4victor in #2110
- [dstack-shim] Create user home directory if it doesn't exist by @un-def in #2109
- [Tests] Disallow remote network connections in tests by @un-def in #2111
- [Docs] Add Developers page featuring community links, ambassador program, contributing links, etc #2103 by @peterschmidt85 in #2104
- [Docs] Refactor the reference guide #2112 by @peterschmidt85 in #2113
- [Tests] Support tests that access db from a new thread by @r4victor in #2116
- [Deprecation] Deprecate home_dir and setup by @un-def in #2115
Full changelog: 0.18.30...0.18.31
0.18.30
AWS Capacity Reservations and Capacity Blocks
dstack now allows provisioning AWS instances using Capacity Reservations and Capacity Blocks. Given a CapacityReservationId, you can specify it in a fleet or a run configuration:
type: fleet
nodes: 1
name: my-cr-fleet
reservation: cr-0f45ab39cd64a1cee
The instance will use the reserved capacity, so as long as you have enough, the provisioning is guaranteed to succeed.
Non-root users in Docker images
Previously, dstack always executed the workload as root, ignoring the user property set in the image. Now, dstack executes the workload with the default image user, and you can override it with a new user property:
type: task
image: nvcr.io/nim/meta/llama-3.1-8b-instruct
user: nim
The format of the user property is the same as Docker uses: username[:groupname], uid[:gid], and so on.
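To make the accepted shapes concrete, here is a minimal Python sketch that splits a user value into its user and group parts. The UserSpec type and parse_user function are hypothetical names used for illustration only. They do not come from dstack or Docker, and the sketch does not check that the user or group actually exists in the image.

```python
from typing import NamedTuple, Optional


class UserSpec(NamedTuple):
    user: str             # user name or numeric uid, kept as a string
    group: Optional[str]  # group name or numeric gid, None when omitted


def parse_user(spec: str) -> UserSpec:
    """Split a Docker-style user string such as 'nim' or '1000:1000'."""
    user, sep, group = spec.partition(":")
    if not user:
        raise ValueError("user part must not be empty")
    if sep and not group:
        raise ValueError("group part must not be empty when ':' is present")
    return UserSpec(user, group or None)


def is_numeric_id(value: str) -> bool:
    """True when the value looks like a uid/gid rather than a name."""
    return value.isdigit()
```

For example, parse_user("nim") yields a spec with no group, while parse_user("1000:1000") yields numeric ids for both parts.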
Improved dstack apply and repos UX
Previously, dstack apply used the current directory as the repo that's made available within the run at /workflow. The directory had to be initialized with dstack init before running dstack apply.
Now you can pass --repo to dstack apply. It can be a path to a local directory or a remote Git repo URL. The specified repo will be available within the run at /workflow. You can also specify --no-repo if the run doesn't need any repo. With --repo or --no-repo specified, you don't need to run dstack init:
$ dstack apply -f task.dstack.yaml --repo .
$ dstack apply -f task.dstack.yaml --repo ../parent_dir
$ dstack apply -f task.dstack.yaml --repo https://github.com/dstackai/dstack.git
$ dstack apply -f task.dstack.yaml --no-repo
Specifying --repo explicitly can be useful when running dstack apply from scripts, pipelines, or CI. dstack init stays relevant for use cases when you work with dstack apply interactively and want to set up the repo to work with once.
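For scripted use, the command line can be assembled programmatically. The following Python sketch is one possible approach, not part of dstack: build_apply_command is a hypothetical helper, and it uses only the -f, --repo, and --no-repo flags shown above.

```python
from typing import List, Optional


def build_apply_command(
    config: str, repo: Optional[str] = None, no_repo: bool = False
) -> List[str]:
    """Build an argv list for `dstack apply` with an explicit repo choice."""
    if repo is not None and no_repo:
        raise ValueError("pass either repo or no_repo, not both")
    cmd = ["dstack", "apply", "-f", config]
    if repo is not None:
        # A local directory path or a remote Git repo URL.
        cmd += ["--repo", repo]
    elif no_repo:
        cmd.append("--no-repo")
    return cmd


# In a pipeline, the list could be handed to subprocess.run(cmd, check=True).
```

Passing the arguments as a list avoids shell quoting issues when the repo path or URL contains special characters.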
Lightweight pip install dstack
pip install dstack used to install all the dstack server dependencies. Now pip install dstack installs only the CLI and Python API, which is optimal for use cases when a remote dstack server is used. Run pip install "dstack[server]" to install the server, or pip install "dstack[all]" to install the server with all supported backends.
Breaking changes
- pip install dstack no longer installs the server dependencies. If you relied on it to install the server, ensure you use pip install "dstack[server]" or pip install "dstack[all]".
What's Changed
- [chore]: Move run_async to _internal/utils by @jvstme in #2057
- Move server deps to dstack[server] extra by @r4victor in #2058
- Add user property to run configurations by @un-def in #2055
- [Blog] Exploring inference memory saturation effect: H100 vs MI300x by @peterschmidt85 in #2061
- [Internal]: Fix building docs in CI by @jvstme in #2063
- [chore]: Drop unused gateway-related runner code by @jvstme in #2062
- [shim] Clean up and document API by @un-def in #2060
- Improve RESP API docs by @r4victor in #2064
- Allow underscores in custom GCP tags by @r4victor in #2065
- Make repo optional when submitting runs via HTTP API by @r4victor in #2066
- Fix changing configuration type with dstack apply by @r4victor in #2070
- Fix instances stuck in busy status by @r4victor in #2071
- [Minor] If errors should be passed silently, then in pythonic way by @dimitriillarionov in #2075
- AWS Capacity Reservation support by @solovyevt in #1977
- [Blog] Beyond Kubernetes: 2024 recap and what's next for AI infra by @peterschmidt85 in #2074
- Fix reservation property backward compatibility by @un-def in #2077
- Fix ~/.ssh write permissions check by @r4victor in #2079
- Fix errors exit codes in dstack apply by @r4victor in #2081
- Fix RESERVATIONS display in fleets table by @r4victor in #2082
- Support --repo, --no-repo, and autoinit in dstack apply by @r4victor in #2080
- Support AWS partitioned volumes by @r4victor in #2084
- [shim] Update OpenAPI doc by @un-def in #2085
New Contributors
- @dimitriillarionov made their first contribution in #2075
- @solovyevt made their first contribution in #1977
Full Changelog: 0.18.29...0.18.30
0.18.29
Support internal_ip for SSH fleet clusters
It's now possible to specify instance IP addresses used for communication inside SSH fleet clusters using the internal_ip property:
type: fleet
name: my-ssh-fleet
placement: cluster
ssh_config:
  user: ubuntu
  identity_file: ~/.ssh/dstack/key.pem
  hosts:
    - hostname: "3.79.203.200"
      internal_ip: "172.17.0.1"
    - hostname: "18.184.67.100"
      internal_ip: "172.18.0.2"
If internal_ip is not specified, dstack automatically detects internal IPs by inspecting network interfaces. This works when all instances have IPs belonging to the same subnet and are accessible on those IPs. The explicitly specified internal_ip enables networking configurations when the instances are accessible on IPs that do not belong to the same subnet.
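A quick way to tell whether a set of hosts falls into the "same subnet" case is to compare their network addresses. The Python sketch below only illustrates that check. It is not how dstack detects internal IPs, and the share_subnet helper and the prefix length passed to it are assumptions made for the example.

```python
import ipaddress
from typing import Iterable


def share_subnet(ips: Iterable[str], prefix_len: int) -> bool:
    """True if every address maps to the same network for the given prefix."""
    networks = {
        # strict=False lets a host address stand in for its containing network.
        ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        for ip in ips
    }
    return len(networks) == 1


# The two internal IPs from the fleet above differ in the second octet,
# so they are in different /16 networks.
print(share_subnet(["172.17.0.1", "172.18.0.2"], 16))
```

When such a check fails for your hosts, setting internal_ip explicitly is the relevant option.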
UX enhancements for dstack apply
The dstack apply command gets many improvements including more concise and consistent output and better error reporting. When applying run configurations, dstack apply now prints a table similar to the dstack ps output:
$ dstack apply
Project main
User admin
...
Submit a new run? [y/n]: y
NAME           BACKEND        RESOURCES       PRICE     STATUS   SUBMITTED
spicy-tiger-1  gcp            2xCPU, 8GB,     $0.06701  running  14:52
               (us-central1)  100.0GB (disk)
spicy-tiger-1 provisioning completed (running)
What's Changed
- [UX]: live table when provisioning dstack configuration runs #1978 by @Tob-iee in #2036
- Fix returning metrics from deleted runs by @jvstme in #2038
- [UI] Migrate the chat components to the new CloudScape chat componets by @olgenn in #2044
- Recover unreachable instances by @un-def in #2043
- UX enhancements for dstack apply by @jvstme in #2045
- Implement /api/fleets/list endpoint by @r4victor in #2050
- Remove padding in dstack apply live tables by @jvstme in #2048
- Fix typo in dstack attach --help by @jvstme in #2054
- Support specifying internal_ip for SSH fleet hosts by @r4victor in #2056
New Contributors
Full Changelog: 0.18.28...0.18.29
0.18.28
CLI improvements
- Added alias -R for --reuse with dstack apply
- Shortened model URL output
- dstack apply and dstack attach no longer rely on external tools such as ps and grep on Unix-like systems and powershell on Windows. With this change, it's now possible to use the dstack CLI client in minimal environments such as Docker containers, including the official dstackai/dstack image.
What's Changed
- Add DSTACK_{RUNNER,SHIM}_DOWNLOAD_URL env vars by @un-def in #2023
- [Feature] Add alias -R for --reuse with dstack apply by @peterschmidt85 in #2032
- Replace ps | grep with psutil in SSHAttach by @un-def in #2029
- Shorten model URL output in CLI by @jvstme in #2035
Full Changelog: 0.18.27...0.18.28
0.18.27
UI/UX improvements
This release fixes a login issue in the control plane UI and introduces other UI/UX improvements.
What's Changed
- Another batch of many minor improvements to the docs by @peterschmidt85 in #2016
- Show OpenAI-compatible endpoint URL in CLI by @jvstme in #2022
- [Bug]: Cannot open UI login screen by @olgenn in #2025
- [UI] Model page code snippets fixes and improvements by @olgenn in #2026
- [UI]: Fix curl sample in model code button by @jvstme in #2027
Full Changelog: 0.18.26...0.18.27
0.18.26
Git
Previously, when you called dstack init, Git credentials were reused between users of the same project and repository.
Starting with this release, to improve security, dstack no longer shares Git credentials across users.
Warning
If you submitted credentials earlier with dstack init, they will continue to work. However, it is recommended that each user call dstack init again to ensure they do not reuse credentials from other users.
Deleting legacy credentials
To ensure no credentials submitted earlier are shared across users, you can run the following SQL statement:
UPDATE repos SET creds = NULL;
UI
This update brings a few UI improvements:
- Added Delete button to the Volumes page
- Added Refresh button to all pages with lists: Runs, Models, Fleets, Volumes, Projects
- Improved Code button on the model page
What's changed
- Implement per-user repo creds storage by @un-def in #2004
- [UI] Add Refresh button to all pages with lists by @olgenn in #2007
- [UI] Include base URL and authentication token in the code snippets by @olgenn in #2006
- [UI] The Code button improvements on the Model page by @olgenn in #2001
- [UI] It's not possible to select and delete volumes by @olgenn in #2000
- [UI] [Bug]: Services without model mapping are displayed in Models UI by @olgenn in #1993
- Ensure sshd privsep dir in container is properly set up by @un-def in #2008
- [Docs] Many minor improvements to docs and examples by @peterschmidt85 in #2013
- [Docs] Services without a gateway by @jvstme in #2011
- [Docs] Add deployment section with vLLM, TGI and NIM. Remove alignment handbook by @Bihan in #1990
- [Docs] Updated Installation and Server deployment guides to include CloudFormation by @peterschmidt85
- [Docs] Update services docs to reflect that gateway is now optional by @peterschmidt85 in #2005
- [Examples] Add a CloudFormation template showing how to deploy dstack server to AWS by @peterschmidt85 in #1944
- [Examples] Add Airflow example by @r4victor in #1991
Full changelog: 0.18.25...0.18.26
0.18.25
Multiple volumes per mount point
It's now possible to specify a list of volumes for a mount point in run configurations:
...
volumes:
  - name: [my-aws-eu-west-1-volume, my-aws-us-east-1-volume]
    path: /volume_data
dstack will choose and mount one volume from the list. This can be used to increase GPU availability by specifying different volumes for different regions, which is desirable for use cases like caching. Previously, it was possible to specify only one volume per mount point, so if there was no compute capacity in the volume's region, provisioning would fail.
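The effect of listing several volumes can be pictured with a toy selection routine. The release notes do not describe how dstack actually chooses a volume, so the Python sketch below is purely illustrative: pick_volume, the volume-to-region mapping, and the set of regions with capacity are all hypothetical.

```python
from typing import Dict, Iterable, Optional, Set


def pick_volume(
    candidates: Iterable[str],
    volume_regions: Dict[str, str],
    regions_with_capacity: Set[str],
) -> Optional[str]:
    """Return the first listed volume whose region currently has capacity."""
    for name in candidates:
        if volume_regions.get(name) in regions_with_capacity:
            return name
    return None  # no listed volume is usable, so provisioning would fail


# Hypothetical data mirroring the volume names in the configuration above.
regions = {
    "my-aws-eu-west-1-volume": "eu-west-1",
    "my-aws-us-east-1-volume": "us-east-1",
}
names = ["my-aws-eu-west-1-volume", "my-aws-us-east-1-volume"]
print(pick_volume(names, regions, {"us-east-1"}))
```

With a single volume in the list, the same routine returns nothing as soon as that volume's region runs out of capacity, which corresponds to the earlier behaviour.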
DSTACK_NODES_IPS environment variable
A new DSTACK_NODES_IPS environment variable is now available for multi-node tasks. It contains a list of internal IP addresses of all nodes in the cluster, e.g. DSTACK_NODES_IPS="10.128.0.47\n10.128.0.48\n10.128.0.49". This feature enables cluster workloads that require configuring IP addresses of all the nodes.
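Inside a multi-node task, the variable can be turned into a list with a few lines of Python. This is a minimal sketch that assumes the value is newline-separated, as in the example above. The nodes_ips helper is a hypothetical name, and it accepts a mapping argument only so it can be exercised without a real cluster.

```python
import os
from typing import List, Mapping


def nodes_ips(environ: Mapping[str, str] = os.environ) -> List[str]:
    """Return the internal IPs listed in DSTACK_NODES_IPS, one per line."""
    raw = environ.get("DSTACK_NODES_IPS", "")
    # Skip blank lines and surrounding whitespace defensively.
    return [line.strip() for line in raw.splitlines() if line.strip()]


example = {"DSTACK_NODES_IPS": "10.128.0.47\n10.128.0.48\n10.128.0.49"}
print(nodes_ips(example))
```

The resulting list can then be written to a hostfile or passed to whatever launcher the workload uses.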
What's Changed
- Adding an example of NIM by @deep-diver in #1853
- Support specifying multiple volumes per mount point by @r4victor in #1983
- Expose DSTACK_NODES_IPS env var by @r4victor in #1985
- Set minimum paramiko version to 3.2.0 by @un-def in #1984
- Limit azure-mgmt-network>=23.0.0,<28.0.0 by @r4victor in #1988
Full Changelog: 0.18.24...0.18.25
0.18.24
Backward compatibility
This update includes a hotfix for a backward compatibility issue that prevented CLI v0.18.23 from working with older versions of the dstack server.
What's changed
Full changelog: 0.18.23...0.18.24