---
title: "Rolling deployment, Secrets, Files, Tenstorrent, and more"
date: 2025-07-10
description: "TBA"
slug: changelog-07-25
image: https://dstack.ai/static-assets/static-assets/images/changelog-07-25.png
categories:
  - Changelog
---

# Rolling deployment, Secrets, Files, Tenstorrent, and more

Thanks to feedback from the community, `dstack` continues to evolve. Here’s a look at what’s new.

#### Rolling deployments

Previously, updating a running service could cause downtime. The latest release fixes this with [rolling deployments](../../docs/concepts/services.md/#rolling-deployment): replicas are now updated one by one, so traffic continues uninterrupted during redeployments.

<div class="termy">

```shell
$ dstack apply -f my-service.dstack.yml

Active run my-service already exists. Detected changes that can be updated in-place:
- Repo state (branch, commit, or other)
- File archives
- Configuration properties:
  - env
  - files

Update the run? [y/n]:
```

</div>
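
Rolling deployment applies to any service that defines multiple replicas. As a minimal sketch of such a configuration (the name, image, model, and port below are illustrative, not from the release):

<div editor-title="my-service.dstack.yml">

```yaml
type: service
name: my-service

# With two replicas, a redeploy updates one replica at a time
replicas: 2

image: vllm/vllm-openai:latest
port: 8000

commands:
  - vllm serve Qwen/Qwen2.5-0.5B-Instruct --port 8000

resources:
  gpu: H100:1
```

</div>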

#### Secrets

Secrets let you centrally manage sensitive data like API keys and credentials. They’re scoped to a project, managed by project admins, and can be [securely referenced](../../docs/concepts/secrets.md) in run configurations.

<div editor-title=".dstack.yml">

```yaml hl_lines="7"
type: task
name: train

image: nvcr.io/nvidia/pytorch:25.05-py3
registry_auth:
  username: $oauthtoken
  password: ${{ secrets.ngc_api_key }}

commands:
  - git clone https://github.com/pytorch/examples.git pytorch-examples
  - cd pytorch-examples/distributed/ddp-tutorial-series
  - pip install -r requirements.txt
  - |
    torchrun \
      --nproc-per-node=$DSTACK_GPUS_PER_NODE \
      --nnodes=$DSTACK_NODES_NUM \
      multinode.py 50 10

resources:
  gpu: H100:1..2
  shm_size: 24GB
```

</div>
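
The `ngc_api_key` secret referenced above has to exist in the project first. A sketch of creating it from the CLI, assuming the `dstack secret set` subcommand described in the secrets docs (the key value is a placeholder):

<div class="termy">

```shell
$ dstack secret set ngc_api_key <your NGC API key>
```

</div>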

#### Files

By default, `dstack` mounts the repo directory (where you ran `dstack init`) to all runs.

If the directory is large or you need files outside of it, use the new [files](../../docs/concepts/dev-environments/#files) property to map specific local paths into the container.

<div editor-title=".dstack.yml">

```yaml
type: task
name: trl-sft

files:
  - .:examples # Maps the directory containing `.dstack.yml` to `/workflow/examples`
  - ~/.ssh/id_rsa:/root/.ssh/id_rsa # Maps `~/.ssh/id_rsa` to `/root/.ssh/id_rsa`

python: 3.12

env:
  - HF_TOKEN
  - HF_HUB_ENABLE_HF_TRANSFER=1
  - MODEL=Qwen/Qwen2.5-0.5B
  - DATASET=stanfordnlp/imdb

commands:
  - uv pip install trl
  - |
    trl sft \
      --model_name_or_path $MODEL --dataset_name $DATASET \
      --num_processes $DSTACK_GPUS_PER_NODE

resources:
  gpu: H100:1
```

</div>

#### Tenstorrent

`dstack` remains committed to supporting multiple hardware vendors, including NVIDIA, AMD, Google TPUs and, more recently, [Tenstorrent :material-arrow-top-right-thin:{ .external }](https://tenstorrent.com/){:target="_blank"}. The latest release improves Tenstorrent support by handling hosts with multiple N300 cards and adds Docker-in-Docker support.

<img src="https://dstack.ai/static-assets/static-assets/images/dstack-tenstorrent-n300.png" width="630"/>

Huge thanks to the Tenstorrent community for testing these improvements!
| 112 | + |
| 113 | +#### Docker in Docker |
| 114 | + |
| 115 | +Using Docker inside `dstack` run configurations is now even simpler. Just set `docker` to `true` to [enable the use of Docker CLI](../../docs/concepts/tasks.md#docker-in-docker) in your runs—allowing you to build images, run containers, use Docker Compose, and more. |
| 116 | + |
| 117 | +<div editor-title=".dstack.yml"> |
| 118 | + |
| 119 | +```yaml |
| 120 | +type: task |
| 121 | +name: docker-nvidia-smi |
| 122 | +
|
| 123 | +docker: true |
| 124 | +
|
| 125 | +commands: |
| 126 | + - | |
| 127 | + docker run --gpus all \ |
| 128 | + nvidia/cuda:12.3.0-base-ubuntu22.04 \ |
| 129 | + nvidia-smi |
| 130 | +
|
| 131 | +resources: |
| 132 | + gpu: H100:1 |
| 133 | +``` |
| 134 | + |
| 135 | +</div> |

#### AWS EFA

EFA is a network interface for EC2 that enables low-latency, high-bandwidth communication between nodes, which is crucial for scaling distributed deep learning. With `dstack`, EFA is automatically enabled when using supported instance types in fleets. Check out our [example](../../examples/clusters/efa/index.md).

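As a sketch, a fleet configuration that provisions an EFA-capable cluster might look like the following (the fleet name and GPU spec are illustrative; see the example linked above for exact settings):

<div editor-title="efa-fleet.dstack.yml">

```yaml
type: fleet
name: efa-cluster

nodes: 2
# Co-locate nodes so the EFA interconnect can be used between them
placement: cluster

backends: [aws]

resources:
  gpu: H100:8
```

</div>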

#### Default Docker images

If no `image` is specified, `dstack` uses a base Docker image that now comes pre-configured with `uv`, `python`, `pip`, essential CUDA libraries, InfiniBand support, and NCCL tests (located at `/opt/nccl-tests/build`).

<div editor-title="examples/clusters/nccl-tests/.dstack.yml">

```yaml
type: task
name: nccl-tests

nodes: 2

startup_order: workers-first
stop_criteria: master-done

env:
  - NCCL_DEBUG=INFO

commands:
  - |
    if [ $DSTACK_NODE_RANK -eq 0 ]; then
      mpirun \
        --allow-run-as-root \
        --hostfile $DSTACK_MPI_HOSTFILE \
        -n $DSTACK_GPUS_NUM \
        -N $DSTACK_GPUS_PER_NODE \
        --bind-to none \
        /opt/nccl-tests/build/all_reduce_perf -b 8 -e 8G -f 2 -g 1
    else
      sleep infinity
    fi

resources:
  gpu: nvidia:1..8
  shm_size: 16GB
```

</div>

These images are optimized for common use cases and kept lightweight, making them ideal for everyday development, training, and inference.

#### Server performance

Server-side performance has been improved: with optimized request handling and background processing, each server replica can now handle more concurrent runs.

#### Google SSO

Alongside the open-source version, `dstack` also offers [dstack Enterprise](https://github.com/dstackai/dstack-enterprise), which adds dedicated support and extra integrations like Single Sign-On (SSO). The latest release introduces support for configuring your company’s Google account for authentication.

<img src="https://dstack.ai/static-assets/static-assets/images/dstack-enterprise-google-sso.png" width="630"/>

If you’d like to learn more about `dstack` Enterprise, [let us know](https://calendly.com/dstackai/discovery-call).

That’s all for now.

!!! info "What's next?"
    Give dstack a try, and share your feedback, whether it’s [GitHub :material-arrow-top-right-thin:{ .external }](https://github.com/dstackai/dstack){:target="_blank"} issues, PRs, or questions on [Discord :material-arrow-top-right-thin:{ .external }](https://discord.gg/u8SmfwPpMd){:target="_blank"}. We’re eager to hear from you!