Releases: dstackai/dstack

0.18.43

26 Feb 11:27
a78dab6


CLI autocompletion

The dstack CLI now supports shell autocompletion for bash and zsh. It suggests completions for subcommands:

✗ dstack s
server  -- Start a server
stats   -- Show run stats
stop    -- Stop a run

and dynamic completions for resource names:

✗ dstack logs m
mighty-chicken-1  mighty-crab-1  my-dev  --

To set up the CLI autocompletion for your shell, follow the Installation guide.

max_duration set to off by default

The max_duration parameter, which controls how long a run may run before it is stopped automatically, is now set to off by default for all run configuration types. This means dstack won't stop runs automatically unless max_duration is specified explicitly.

Previously, the max_duration defaults were 72h for tasks, 6h for dev environments, and off for services. This led to unintended run terminations and confused users unaware of max_duration. The new default makes max_duration opt-in and, thus, predictable.

If you relied on the previous max_duration defaults, ensure you've added max_duration to your run configurations.
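For example, to restore the previous default behavior for a task, set it explicitly (the task name and command here are illustrative):

```yaml
type: task
name: train

commands:
  - python train.py

# Restore the previous default for tasks
max_duration: 72h
```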

GCP Logging for run logs

The dstack server requires external run log storage for multi-replica server deployments. Previously, the only supported external storage was AWS CloudWatch, which limited production server deployments to AWS. The dstack server now also supports GCP Logging for storing run logs. Follow the Server deployment guide for more information.

Custom IAM instance profile for AWS

The AWS backend config now supports the iam_instance_profile parameter, which specifies an IAM instance profile to associate with provisioned EC2 instances. You can also specify the name of an IAM role created via the AWS console, since AWS automatically creates an instance profile with the same name as the role:

projects:
- name: main
  backends:
  - type: aws
    iam_instance_profile: dstack-test-role
    creds:
      type: default

This can be used to access AWS resources from runs without passing credentials explicitly.

Oracle Cloud spot instances

The oci backend can now provision interruptible spot instances, providing more cost-effective GPUs for workloads that can recover from interruptions.

> dstack apply --gpu 1.. --spot -b oci
 #  BACKEND  REGION          INSTANCE   RESOURCES                                    SPOT  PRICE     
 1  oci      eu-frankfurt-1  VM.GPU2.1  24xCPU, 72GB, 1xP100 (16GB), 50.0GB (disk)   yes   $0.6375   
 2  oci      eu-frankfurt-1  VM.GPU3.1  12xCPU, 90GB, 1xV100 (16GB), 50.0GB (disk)   yes   $1.475    
 3  oci      eu-frankfurt-1  VM.GPU3.2  24xCPU, 180GB, 2xV100 (16GB), 50.0GB (disk)  yes   $2.95
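Spot instances can also be requested in a run configuration via spot_policy, the configuration counterpart of the --spot flag shown above (the task name and command are illustrative):

```yaml
type: task
name: spot-train

commands:
  - python train.py

resources:
  gpu: 16GB

backends: [oci]
spot_policy: spot  # only provision spot instances
```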

Breaking changes

  • Dropped support for python: 3.8 in run configurations.
  • Set max_duration to off by default for all run configuration types.

What's Changed

  • Replace pagination with lazy loading in dstack UI by @olgenn in #2309
  • Dynamic CLI completion by @solovyevt in #2285
  • Remove excessive project_id check for GCP by @r4victor in #2312
  • [Docs] GPU blocks and proxy jump blog post (WIP) by @peterschmidt85 in #2307
  • [Docs] Add blocks description to Concepts/Fleets by @un-def in #2308
  • Replace pagination with lazy loading on Fleet list by @olgenn in #2320
  • Improve GCP creds validation by @r4victor in #2322
  • [UI]: Fix job details for multi-job runs by @jvstme in #2321
  • Fix instance filtering by backend to use base backend by @r4victor in #2324
  • [Docs]: Fix inactivity duration blog post by @jvstme in #2327
  • Fix CLI instance status for instances with blocks by @jvstme in #2332
  • Partially fixes openapi spec by @haringsrob in #2330
  • [Bug]: UI does not show logs of distributed tasks and replicated services by @olgenn in #2334
  • [Feature]: Replace pagination with lazy loading in Instances list by @olgenn in #2335
  • [Feature]: Replace pagination with lazy loading in volume list by @olgenn in #2336
  • [Bug]: Finished jobs included in run price by @olgenn in #2338
  • Fix DSTACK_GPUS_PER_NODE|DSTACK_GPUS_NUM when blocks are used by @un-def in #2333
  • Support storing run logs using GCP Logging by @r4victor in #2340
  • Support OCI spot instances by @jvstme in #2337
  • [Feature]: Replace pagination with lazy loading in models list by @olgenn in #2351
  • [UI] Remember filter settings in local storage by @olgenn in #2352
  • [Internal]: Minor tweaks in packer docs and CI by @jvstme in #2356
  • Use unique names for backend resources by @r4victor in #2350
  • Set max_duration to off by default for all run configurations by @r4victor in #2357
  • Print message on dstack attach exit by @r4victor in #2358
  • Forbid python: 3.8 in run configurations by @jvstme in #2354
  • Fix Fabric Manager in AWS/GCP/Azure/OCI OS images by @jvstme in #2355
  • Install DCGM Exporter on dstack-built OS images by @un-def in #2360
  • Fix volume detachment for runs started before 0.18.41 by @r4victor in #2362
  • Increase Lambda provisioning timeout and refactor by @jvstme in #2353
  • Bump default OS image version by @jvstme in #2363
  • Support iam_instance_profile for AWS by @r4victor in #2365

New Contributors

Full Changelog: 0.18.42...0.18.43

0.18.42

17 Feb 10:15
92f342e


Volume attachments

It's now possible to see volume attachments when listing volumes. The dstack volume -v command shows which fleets the volumes are attached to in the ATTACHED column:

✗ dstack volume -v
 NAME             BACKEND  REGION                       STATUS  ATTACHED  CREATED      ERROR 
 my-gcp-volume-1  gcp      europe-west4                 active  my-dev    1 weeks ago        
                           (europe-west4-c)                                                  
 my-aws-volume-1  aws      eu-west-1 (eu-west-1a)       active  -         3 days ago         

This can help you decide if you should use an existing volume for a run or create a new volume if all volumes are occupied.

You can also check which volumes are currently attached and which are not via the API:

import os
import requests

url = os.environ["DSTACK_URL"]
token = os.environ["DSTACK_TOKEN"]
project = os.environ["DSTACK_PROJECT"]

print("Getting volumes...")
resp = requests.post(
    url=f"{url}/api/project/{project}/volumes/list",
    headers={"Authorization": f"Bearer {token}"},
)
resp.raise_for_status()  # fail early on auth or URL errors
volumes = resp.json()

print("Checking volume attachments...")
for volume in volumes:
    is_attached = len(volume["attachments"]) > 0
    print(f"Volume {volume['name']} attached: {is_attached}")

✗ python check_attachments.py
Getting volumes...
Checking volume attachments...
Volume my-gcp-volume-1 attached: True
Volume my-aws-volume-1 attached: False

Bugfixes

This release contains several important bugfixes including a bugfix for fleets with placement: cluster (#2302).

What's Changed

Full Changelog: 0.18.41...0.18.42

0.18.41

13 Feb 11:05
ae0835e


GPU blocks

Previously, dstack could process only one workload per instance at a time, even if the instance had enough resources to handle multiple workloads. With the new blocks fleet property, you can split an instance into blocks (virtual sub-instances), allowing multiple workloads to run on it simultaneously, each utilizing a fraction of the GPU, CPU, and memory resources.

Cloud fleet

type: fleet

name: my-fleet
nodes: 1

resources:
  gpu: 8:24GB

blocks: 4  # split into 4 blocks, 2 GPU per block

SSH fleet

type: fleet

name: my-fleet

ssh_config:
  user: ubuntu
  identity_file: ~/.ssh/id_rsa
  hosts:
    - hostname: 3.255.177.51
      blocks: auto   # as many as possible, e.g., 8 GPUs -> 8 blocks

You can see how many instance blocks are currently busy in the dstack fleet output:

$ dstack fleet
 FLEET         INSTANCE  BACKEND       RESOURCES                                         PRICE  STATUS    CREATED
 fleet-gaudi2  0         ssh (remote)  152xCPU, 1007GB, 8xGaudi2 (96GB), 387.0GB (disk)  $0.0   3/8 busy  56 sec ago

The remaining blocks can be used for new runs.
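The split itself is simple arithmetic: each block gets an equal share of the instance's resources. A minimal sketch, assuming the GPU count divides evenly by the block count (the CPU and memory figures below are illustrative, not from a real offer):

```python
def block_resources(gpus: int, cpus: int, memory_gb: int, blocks: int) -> dict:
    # Each block is an equal fraction of the instance's resources;
    # assumed here: the GPU count must divide evenly into blocks.
    assert gpus % blocks == 0, "GPU count must be divisible by blocks"
    return {
        "gpu": gpus // blocks,
        "cpu": cpus // blocks,
        "memory_gb": memory_gb // blocks,
    }

# E.g., an 8-GPU instance split into 4 blocks -> 2 GPUs per block
print(block_resources(gpus=8, cpus=32, memory_gb=192, blocks=4))
# -> {'gpu': 2, 'cpu': 8, 'memory_gb': 48}
```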

SSH fleets with head node

With a new proxy_jump fleet property, dstack now supports network configurations where worker nodes are located behind a head node and are not reachable directly:

type: fleet
name: my-fleet

ssh_config:
  user: ubuntu
  identity_file: ~/.ssh/worker_node_key
  hosts:
    # worker nodes
    - 3.255.177.51
    - 3.255.177.52
  # head node proxy; can also be configured per worker node
  proxy_jump: 
    hostname: 3.255.177.50
    user: ubuntu
    identity_file: ~/.ssh/head_node_key

Check the documentation for details.

Inactivity duration

You can now configure dev environments to automatically stop after a period of inactivity by specifying inactivity_duration:

type: dev-environment
ide: vscode
# Stop if inactive for 2 hours
inactivity_duration: 2h

A dev environment is considered inactive if there are no SSH connections to it, including VS Code connections, ssh <run name> shells, and attached dstack apply or dstack attach commands. For more details on using inactivity_duration, see the docs.

Multiple EFA interfaces

dstack now attaches the maximum possible number of EFA interfaces when provisioning AWS instances with EFA support. For example, when provisioning a p5.48xlarge instance, dstack configures an optimal setup with 32 interfaces, providing a total network bandwidth capacity of 3,200 Gbps, of which up to 800 Gbps can be utilized for IP network traffic.

Note: Multiple EFA interfaces are enabled only if the aws backend config has public_ips: false set. If instances have public IPs, only one EFA interface is enabled per instance due to AWS limitations.

Volumes for distributed tasks

You can now use single-attach volumes such as AWS EBS with distributed tasks by attaching different volumes to different nodes. This is done using dstack variable interpolation:

type: task
nodes: 8
commands:
  - ...
volumes:
  - name: data-volume-${{ dstack.node_rank }}
    path: /volume_data

Tip: To create volumes for all nodes using one volume configuration, specify the volume name with -n:

$ for i in {0..7}; do dstack apply -f vol.dstack.yml -n data-volume-$i -y; done
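For reference, the vol.dstack.yml used in the loop above might look like this sketch (backend, region, and size are illustrative; the name is supplied via -n):

```yaml
type: volume
backend: aws
region: eu-west-1
size: 100GB
```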

Availability zones

It's now possible to specify availability_zone in volume configurations:

type: volume
name: my-volume
backend: aws
region: eu-west-1
availability_zone: eu-west-1c
size: 100GB

and availability_zones in fleet and run configurations:

type: fleet
name: my-fleet
nodes: 2
availability_zones: [eu-west-1c]

This has multiple use cases:

  • Specify the same availability zone when provisioning volumes and fleets to ensure they can be used together.
  • Specify a volume availability zone that has instance types that you work with.
  • Create volumes for all availability zones to be able to use any zone and improve GPU availability.

The dstack fleet -v and dstack volumes -v commands now display availability zones along with regions.

Deployment considerations

  • If you deploy the dstack server using rolling deployments (old and new replicas co-exist), it's advised to stop runs and fleets before deploying 0.18.41. Otherwise, you may see error logs from the old replica. It should not have major implications.

What's Changed

  • Implement GPU blocks property by @un-def in #2253
  • Show deleted runs in the UI by @olgenn in #2272
  • [Bug]: The UI issues many API requests when stopping multiple runs by @olgenn in #2273
  • Ensure frontend displays errors when getting 400 from the server by @olgenn in #2275
  • Support --name for all configurations by @r4victor in #2269
  • Support per-job volumes by @r4victor in #2276
  • Full EFA attachment for non-public IPs by @solovyevt in #2271
  • Return deleted runs in /api/runs/list by @r4victor in #2158
  • Fix process_submitted_jobs instance lock by @un-def in #2279
  • Change dstack fleet STATUS for block instances by @un-def in #2280
  • [Docs] Restructure concept pages to ensure dstack apply is not lost at the end of the page by @peterschmidt85 in #2283
  • Allow specifying Azure resource_group by @r4victor in #2288
  • Allow configuring availability zones by @r4victor in #2266
  • Track SSH connections in dstack-runner by @jvstme in #2287
  • Add the inactivity_timeout configuration option by @jvstme in #2289
  • Show dev environment inactivity in dstack ps -v by @jvstme in #2290
  • Support non-root Docker images in RunPod by @jvstme in #2286
  • Fix terminating runs when job is terminated by @jvstme in #2295
  • [Docs]: Dev environment inactivity duration by @jvstme in #2296
  • [Docs]: Add availability_zones to offer filters by @jvstme in #2297
  • Add head node support for SSH fleets by @un-def in #2292
  • Support services with head node setup by @un-def in #2299

Full Changelog: 0.18.40...0.18.41

0.18.40

05 Feb 14:21
d03ac25


Volumes

Optional instance volumes

Instance volumes can now be made optional. When a volume is marked as optional, it will be mounted only if the backend supports instance volumes; otherwise, it will not be mounted.

type: dev-environment

ide: vscode

volumes:
  - instance_path: /dstack-cache
    path: /root/.cache/
    optional: true

Optional instance volumes are useful for caching: runs can still work with backends that don't support instance volumes, such as runpod, vastai, and kubernetes.

Services

Path prefix

Previously, if you were running services without a gateway, it was not possible to deploy certain web apps, such as Dash. This was due to the path prefix /proxy/services/<project name>/<run name>/ in the endpoint URL.

With this new update, it’s now possible to configure a service so that such web apps work without a gateway. To do this, set the strip_prefix property to false and pass the prefix to the web app. Here’s an example with a Dash app:

type: service
name: my-dash-app

gateway: false

# Disable authorization
auth: false

# Do not strip the path prefix
strip_prefix: false

env:
  # Configure Dash to work with a path prefix
  - DASH_ROUTES_PATHNAME_PREFIX=/proxy/services/main/my-dash-app/

commands:
  - pip install dash
  - python app.py

port: 8050
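The prefix follows a fixed pattern built from the project and run names. A small helper sketches it (a hypothetical function for illustration, not part of the dstack API):

```python
def service_prefix(project_name: str, run_name: str) -> str:
    # Path prefix used when a service runs without a gateway
    return f"/proxy/services/{project_name}/{run_name}/"

# Matches the DASH_ROUTES_PATHNAME_PREFIX value in the example above
print(service_prefix("main", "my-dash-app"))
# -> /proxy/services/main/my-dash-app/
```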

Git

Branches

When you run dstack apply, before dstack starts a container, it fetches the code from the repository where dstack apply was invoked. If the repository is a remote Git repo, dstack clones it using the user’s Git credentials.

Previously, dstack always cloned only a single branch in this scenario (to ensure faster startup).

With this update, for development environments, dstack now clones all branches by default. You can override this behavior using the new single_branch property.

SSH

If you override the user property in your run configuration, dstack runs the container as that user. Previously, when accessing the dev environment via VS Code or connecting to the run with the ssh <run name> command, you were still logged in as the root user and had to switch manually. Now, you are automatically logged in as the configured user.

What's changed

Full changelog: 0.18.39...0.18.40

0.18.39

30 Jan 12:28
b929fe0


This release fixes a backward compatibility bug introduced in 0.18.38. The bug caused CLI version 0.18.38 to fail with older servers when applying fleet configurations.

What's Changed

  • Handle stop_duration backward compatibility for fleet spec by @r4victor in #2243

Full Changelog: 0.18.38...0.18.39

0.18.38

30 Jan 11:22
2c3d83a


Intel Gaudi

dstack now supports Intel Gaudi accelerators with SSH fleets.

To use Intel Gaudi with dstack, create an SSH fleet, and once it's up, specify gaudi, gaudi2, or gaudi3 as the GPU name (or intel as the vendor name) in your run configuration:

type: dev-environment

python: "3.12"
ide: vscode

resources:
  gpu: gaudi2:8  # 8 × Gaudi 2 

Note

To use SSH fleets with Intel Gaudi, ensure that the Gaudi software and drivers are installed on each host. This should include the drivers, hl-smi, and Habana Container Runtime.

Volumes

Stop duration and force detachment

In some cases, a volume may get stuck in the detaching state. When this happens, the run is marked as stopped, but the instance remains in an inconsistent state, preventing its deletion or reuse. Additionally, the volume cannot be used with other runs.

To address this, dstack now keeps the run in the terminating state until the volume is fully detached. By default, dstack waits 5m before forcing a detach. You can override this with stop_duration, either setting a different duration or disabling it (off) to wait indefinitely.
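For example, a fleet configuration could extend the wait before a force detach (a sketch; the 10m value is illustrative):

```yaml
type: fleet
name: my-fleet
nodes: 1

# Wait up to 10 minutes for volumes to detach before forcing
stop_duration: 10m
```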

Note

Force detaching a volume may corrupt the file system and should only be used as a last resort. If volumes frequently require force detachment, contact your cloud provider’s support to identify the root cause.

Bug-fixes

This update also resolves an issue where dstack mistakenly marked a volume as attached even though it was actually detached.

UI

Fleets

The UI has been updated to simplify fleet and instance management. The Fleets page now allows users to terminate fleets and displays both active and terminated fleets. The new Instances page shows active and terminated instances across all fleets.

What's changed

Full changelog: 0.18.37...0.18.38

0.18.37

24 Jan 12:25
48c0e96


This release fixes a bug introduced in 0.18.36.

What's Changed

  • [Docs]: Update Distributed Tasks docs by @jvstme in #2212
  • Fix Internal Server Error in services w/o gateways by @jvstme in #2225

Full Changelog: 0.18.36...0.18.37

0.18.36

24 Jan 12:25
372ae41


Vultr

Cluster placement

The vultr backend can now provision fleets with cluster placement.

type: fleet

nodes: 4
placement: cluster

resources:
  gpu: MI300X:8

backends: [vultr]

Nodes in such a cluster will be interconnected and can be used to run distributed tasks.

Performance

The update optimizes the performance of the dstack server, allowing a single server replica to handle up to 150 active runs, jobs, and instances. Capacity can be increased further by using PostgreSQL and running multiple server replicas.

Finally, fetching instance offers from backends when you run dstack apply has also been optimized and now takes less time.

What's changed

Full changelog: 0.18.35...0.18.36

0.18.35

16 Jan 09:29
7be3581


Vultr

This update introduces initial integration with Vultr. This cloud provider offers a diverse range of NVIDIA and AMD accelerators, from cost-effective fractional GPUs to multi-GPU bare-metal hosts.

$ dstack apply -f examples/.dstack.yml

 #   BACKEND  REGION  RESOURCES                                      PRICE
 1   vultr    ewr     2xCPU, 8GB, 1xA16 (2GB), 50.0GB (disk)         $0.059
 2   vultr    ewr     1xCPU, 5GB, 1xA40 (2GB), 90.0GB (disk)         $0.075
 3   vultr    ewr     1xCPU, 6GB, 1xA100 (4GB), 70.0GB (disk)        $0.123
 ...
 18  vultr    ewr     32xCPU, 375GB, 2xL40S (48GB), 2200.0GB (disk)  $3.342
 19  vultr    ewr     24xCPU, 240GB, 2xA100 (80GB), 1400.0GB (disk)  $4.795
 20  vultr    ewr     96xCPU, 960GB, 16xA16 (16GB), 1700.0GB (disk)  $7.534
 21  vultr    ewr     96xCPU, 1024GB, 4xA100 (80GB), 450.0GB (disk)  $9.589

See the docs for detailed instructions on configuring the vultr backend.

Note

This release includes all dstack features except support for volumes and clusters. These features will be added in an upcoming update.

Vast.ai

Previously, the vastai backend only allowed using Docker images where root is the default user. This limitation has been removed, so you can now run NVIDIA NIM or any other image regardless of the user.

Backward compatibility

If you are going to configure the vultr backend, make sure you update all your dstack CLI and API clients to the latest version. Clients prior to 0.18.35 will not work when Vultr is configured.

What's changed

Full changelog: 0.18.34...0.18.35

0.18.34

09 Jan 11:28
2135175


Idle duration

If provisioned fleet instances aren’t used, they are marked as idle for reuse within the configured idle duration. After this period, instances are automatically deleted. This behavior was previously configured using the termination_policy and termination_idle_time properties in run or fleet configurations.

With this update, we replace these two properties with idle_duration, a simpler way to configure this behavior. This property can be set to a specific duration or to off for unlimited time.

type: dev-environment
name: vscode

python: "3.11"
ide: vscode

# Terminate instances idle for more than 1 hour
idle_duration: 1h

resources:
  gpu: 24GB

Docker

Previously, dstack had limitations on Docker images for dev environments, tasks, and services. These have now been lifted, allowing images based on various Linux distributions like Alpine, Rocky Linux, and Fedora.

dstack now also supports Docker images with built-in OpenSSH servers, which previously caused issues.

Documentation

The documentation has been significantly improved:

  • Backend configuration has been moved from the Reference page to Concepts→Backends.
  • Major examples related to dev environments, tasks, and services have been relocated from the Reference page to their respective Concepts pages.

Deprecations

  • The termination_idle_time and termination_policy parameters in run configurations have been deprecated in favor of idle_duration.

What's changed

  • [dstack-shim] Implement Future API by @un-def in #2141
  • [API] Add API support to get runs by id by @r4victor in #2157
  • [TPU] Update TPU v5e runtime and update vllm-tpu example by @Bihan in #2155
  • [Internal] Skip docs-build on PRs from forks by @r4victor in #2159
  • [dstack-shim] Add API v2 compat support to ShimClient by @un-def in #2156
  • [Run configurations] Support Alpine and more RPM-based images by @un-def in #2151
  • [Internal] Omit id field in (API) Client.runs.get() method by @un-def in #2174
  • [dstack-shim] Remove API v1 by @un-def in #2160
  • [Volumes] Fix volume attachment with dstack backend by @un-def in #2175
  • Replace termination_policy and termination_idle_time with idle_duration: int|str|off by @peterschmidt85 in #2167
  • Allow running sshd in dstack runs by @jvstme in #2178
  • [Docs] Many docs improvements by @peterschmidt85 in #2171

Full changelog: 0.18.33...0.18.34