docs/blog/posts/amd-mi300x-inference-benchmark.md (+3 −3)
@@ -10,7 +10,7 @@ categories:
 
 # Benchmarking Llama 3.1 405B on 8x AMD MI300X GPUs
 
-At `dstack`, we've been adding support for AMD GPUs with [SSH fleets](../../docs/concepts/fleets.md#ssh),
+At `dstack`, we've been adding support for AMD GPUs with [SSH fleets](../../docs/concepts/fleets.md#ssh-fleets),
 so we saw this as a great chance to test our integration by benchmarking AMD GPUs. Our friends at
 [Hot Aisle :material-arrow-top-right-thin:{ .external }](https://hotaisle.xyz/){:target="_blank"}, who build top-tier
 bare metal compute for AMD GPUs, kindly provided the hardware for the benchmark.
@@ -34,7 +34,7 @@ Here is the spec of the bare metal machine we got:
 
 ??? info "Set up an SSH fleet"
 
     Hot Aisle provided us with SSH access to the machine. To make it accessible via `dstack`,
-    we created an [SSH fleet](../../docs/concepts/fleets.md#ssh) using the following configuration:
+    we created an [SSH fleet](../../docs/concepts/fleets.md#ssh-fleets) using the following configuration:
 
     <div editor-title="hotaisle.dstack.yml">
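The contents of `hotaisle.dstack.yml` are not shown in this hunk. For orientation, an SSH fleet configuration of this kind generally follows the shape sketched below; the fleet name, login user, key path, and host address are placeholders, not the benchmark's actual values:

```yaml
type: fleet
# Illustrative name, not the one used in the benchmark
name: hotaisle-fleet

# SSH fleets point dstack at existing hosts reachable over SSH
ssh_config:
  user: ubuntu                # placeholder login user
  identity_file: ~/.ssh/id_rsa
  hosts:
    - 203.0.113.10            # placeholder host address
```

Once applied with `dstack apply`, the host appears as an instance that runs can be scheduled onto.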
@@ -215,7 +215,7 @@ If you have questions, feedback, or want to help improve the benchmark, please r
 is the primary sponsor of this benchmark, and we are sincerely grateful for their hardware and support.
 
 If you'd like to use top-tier bare metal compute with AMD GPUs, we recommend going
-with Hot Aisle. Once you gain access to a cluster, it can be easily accessed via `dstack`'s [SSH fleet](../../docs/concepts/fleets.md#ssh) easily.
+with Hot Aisle. Once you gain access to a cluster, it can be easily accessed via `dstack`'s [SSH fleet](../../docs/concepts/fleets.md#ssh-fleets).
 
 ### RunPod
 
 If you’d like to use on-demand compute with AMD GPUs at affordable prices, you can configure `dstack` to
docs/docs/concepts/backends.md (+2 −3)
@@ -9,9 +9,8 @@ They can be configured via `~/.dstack/server/config.yml` or through the [project
 * [Container-based](#container-based) – use either `dstack`'s native integration with cloud providers or Kubernetes to orchestrate container-based runs; provisioning in this case is delegated to the cloud provider or Kubernetes.
 * [On-prem](#on-prem) – use `dstack`'s native support for on-prem servers without needing Kubernetes.
 
-??? info "dstack Sky"
-    If you're using [dstack Sky :material-arrow-top-right-thin:{ .external }](https://sky.dstack.ai){:target="_blank"},
-    you can either configure your own backends or use the pre-configured backend that gives you access to compute from the GPU marketplace.
+!!! info "dstack Sky"
+    If you're using [dstack Sky :material-arrow-top-right-thin:{ .external }](https://sky.dstack.ai){:target="_blank"}, backend configuration is optional. dstack Sky lets you use pre-configured backends to access the GPU marketplace.
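As the hunk header above notes, backends are configured via `~/.dstack/server/config.yml`. A minimal sketch of such a file might look as follows; the project name and credentials mode are illustrative assumptions, not values taken from this PR:

```yaml
projects:
  - name: main
    backends:
      # One entry per configured cloud provider
      - type: aws
        creds:
          type: default   # use the default AWS credential chain
```

Additional backends are added as further list items under `backends`.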
docs/docs/concepts/fleets.md (+44 −53)
@@ -4,36 +4,41 @@ Fleets act both as pools of instances and as templates for how those instances a
 
 `dstack` supports two kinds of fleets:
 
-* [Standard fleets](#standard) – dynamically provisioned through configured backends; they are supported with any type of backends: [VM-based](backends.md#vm-based), [container-based](backends.md#container-based), and [Kubernetes](backends.md#kubernetes)
+* [Backend fleets](#backend) – dynamically provisioned through configured backends; they are supported with any type of backend: [VM-based](backends.md#vm-based) and [container-based](backends.md#container-based) (incl. [`kubernetes`](backends.md#kubernetes))
 * [SSH fleets](#ssh) – created using on-prem servers; do not require backends
 
-## Standard fleets { #standard }
+When you run `dstack apply` to start a dev environment, task, or service, `dstack` will reuse idle instances from an existing fleet whenever available.
 
-When you run `dstack apply` to start a dev environment, task, or service, `dstack` will reuse idle instances
-from an existing fleet whenever available.
+## Backend fleets { #backend-fleets }
 
-If no fleet meets the requirements or has idle capacity, `dstack` can create a new fleet on the fly.
-However, it’s generally better to define fleets explicitly in configuration files for greater control.
+If you configured [backends](backends.md), `dstack` can provision fleets on the fly.
+However, it’s recommended to define fleets explicitly.
 
 ### Apply a configuration
 
-Define a fleet configuration as a YAML file in your project directory. The file must have a
+To create a backend fleet, define a configuration as a YAML file in your project directory. The file must have a
 `.dstack.yml` extension (e.g. `.dstack.yml` or `fleet.dstack.yml`).
@@ … @@
-my-fleet  0  gcp (europe-west-1)  L4:24GB (spot)  $0.1624  idle  3 mins ago
-          1  gcp (europe-west-1)  L4:24GB (spot)  $0.1624  idle  3 mins ago
+FLEET     INSTANCE  BACKEND  GPU  PRICE  STATUS  CREATED
+my-fleet  -         -        -    -      -       -
 ```
 
 </div>
 
-Once the status of instances changes to `idle`, they can be used by dev environments, tasks, and services.
+`dstack` always keeps the minimum number of nodes provisioned. Additional instances, up to the maximum limit, are provisioned on demand.
 
-??? info "Container-based backends"
-    [Container-based](backends.md#container-based) backends don’t support pre-provisioning,
-    so `nodes` can only be set to a range starting with `0`.
-
-    This means instances are created only when a run starts, and once it finishes, they’re terminated and released back to the provider (either a cloud service or Kubernetes).
+!!! info "Container-based backends"
+    For [container-based](backends.md#container-based) backends (such as `kubernetes`, `runpod`, etc.), `nodes` must be defined as a range starting with `0`. In these cases, instances are provisioned on demand as needed.
 
-<div editor-title=".dstack.yml">
+<!-- TODO: Ensure the user sees the error or warning otherwise -->
 
-```yaml
-type: fleet
-# The name is optional, if not specified, generated randomly
-name: my-fleet
-
-# Specify the number of instances
-nodes: 0..2
-# Uncomment to ensure instances are inter-connected
-#placement: cluster
-
-resources:
-  gpu: 24GB
-```
+??? info "Target number of nodes"
 
-</div>
+    If `nodes` is defined as a range, you can start with more than the minimum number of instances by using the `target` parameter when creating the fleet.
 
-### Configuration options
+    <div editor-title=".dstack.yml">
 
-#### Nodes { #nodes }
+    ```yaml
+    type: fleet
 
-The `nodes` property controls how many instances to provision and maintain in the fleet:
+    name: my-fleet
 
-<div editor-title=".dstack.yml">
+    nodes:
+      min: 0
+      max: 2
 
-```yaml
-type: fleet
+      # Provision 2 instances initially
+      target: 2
 
-name: my-fleet
+    # Deprovision instances above the minimum if they remain idle
+    idle_duration: 1h
+    ```
 
-nodes:
-  min: 1  # Always maintain at least 1 idle instance. Can be 0.
-  max: 3  # (Optional) Do not allow more than 3 instances
-```
+    </div>
 
-</div>
+By default, when you submit a [dev environment](dev-environments.md), [task](tasks.md), or [service](services.md), `dstack` tries all available fleets. However, you can explicitly specify the [`fleets`](../reference/dstack.yml/dev-environment.md#fleets) in your run configuration
+or via [`--fleet`](../reference/cli/dstack/apply.md#fleet) with `dstack apply`.
 
-`dstack` ensures the fleet always has at least `nodes.min` instances, creating new instances in the background if necessary. If you don't need to keep instances in the fleet forever, you can set `nodes.min` to `0`. By default, `dstack apply` also provisions `nodes.min` instances. The `nodes.target` property allows provisioning more instances initially than needs to be maintained.
+### Configuration options
 
-#### Placement { #standard-placement }
+#### Placement { #backend-placement }
 
 To ensure instances are interconnected (e.g., for
 [distributed tasks](tasks.md#distributed-tasks)), set `placement` to `cluster`.
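Combining the `nodes` range shown earlier in this file with the `placement` option described here, a backend fleet of interconnected instances might be sketched as follows; the fleet name, node counts, and GPU size are illustrative, not values from the PR:

```yaml
type: fleet
name: my-cluster-fleet

nodes:
  min: 0
  max: 4

# Ensure instances are interconnected, e.g. for distributed tasks
placement: cluster

resources:
  gpu: 24GB
```

With `placement: cluster`, all instances are provisioned in the same backend region so they can reach each other over the interconnect.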
@@ -190,9 +181,9 @@ and their quantity. Examples: `nvidia` (one NVIDIA GPU), `A100` (one A100), `A10
 > If you’re unsure which offers (hardware configurations) are available from the configured backends, use the
 > [`dstack offer`](../reference/cli/dstack/offer.md#list-gpu-offers) command to list them.
 
-#### Blocks { #standard-blocks }
+#### Blocks { #backend-blocks }
 
-For standard fleets, `blocks` function the same way as in SSH fleets.
+For backend fleets, `blocks` function the same way as in SSH fleets.
 See the [`Blocks`](#ssh-blocks) section under SSH fleets for details on the blocks concept.
 
 <div editor-title=".dstack.yml">
@@ -272,13 +263,13 @@ retry:
 </div>
 
 !!! info "Reference"
-    Standard fleets support many more configuration options,
+    Backend fleets support many more configuration options,
docs/docs/quickstart.md (+15 −5)
@@ -16,7 +16,7 @@ $ mkdir quickstart && cd quickstart
 
 ## Create a fleet
 
-Before submitting runs, you need to create a fleet where new instances will be provisioned.
+If [backends](concepts/backends.md) are configured, `dstack` can create a new [backend fleet](concepts/fleets.md#backend-fleets) on the fly. However, it’s recommended to create fleets explicitly.
 
 <h3>Define a configuration</h3>
@@ -28,7 +28,15 @@ Create the following fleet configuration inside your project folder:
 type: fleet
 name: default
 
-nodes: 0..
+# Allow provisioning of up to 2 instances
+nodes: 0..2
+
+# Deprovision instances above the minimum if they remain idle
+idle_duration: 1h
+
+resources:
+  # Allow provisioning of up to 8 GPUs
+  gpu: 0..8
 ```
 
 </div>
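With the additions in this hunk applied, the resulting quickstart fleet configuration reads in full:

```yaml
type: fleet
name: default

# Allow provisioning of up to 2 instances
nodes: 0..2

# Deprovision instances above the minimum if they remain idle
idle_duration: 1h

resources:
  # Allow provisioning of up to 8 GPUs
  gpu: 0..8
```

Both `nodes` and `gpu` accept ranges, so provisioning scales between the stated minimum and maximum on demand.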
@@ -55,13 +63,15 @@ Create the fleet? [y/n]: y
 
 </div>
 
+Alternatively, you can create an [SSH fleet](concepts/fleets.md#ssh-fleets).
+
 ## Submit your first run
 
 `dstack` supports three types of run configurations.
 
 === "Dev environment"
 
-    A dev environment lets you provision an instance and access it with your desktop IDE.
+    A [dev environment](concepts/dev-environments.md) lets you provision an instance and access it with your desktop IDE.
 
     <h3>Define a configuration</h3>
@@ -117,7 +127,7 @@ Create the fleet? [y/n]: y
 
 === "Task"
 
-    A task allows you to schedule a job or run a web app. Tasks can be distributed and can forward ports.
+    A [task](concepts/tasks.md) allows you to schedule a job or run a web app. Tasks can be distributed and can forward ports.
 
     <h3>Define a configuration</h3>
@@ -181,7 +191,7 @@ Create the fleet? [y/n]: y
 
 === "Service"
 
-    A service allows you to deploy a model or any web app as an endpoint.
+    A [service](concepts/services.md) allows you to deploy a model or any web app as an endpoint.
examples/clusters/nccl-tests/README.md (+5 −23)
@@ -1,27 +1,9 @@
 # NCCL tests
 
-This example shows how to run distributed [NCCL tests :material-arrow-top-right-thin:{ .external }](https://github.com/NVIDIA/nccl-tests){:target="_blank"} with MPI using `dstack`.
+This example shows how to run [NCCL tests :material-arrow-top-right-thin:{ .external }](https://github.com/NVIDIA/nccl-tests){:target="_blank"} on a cluster using [distributed tasks](https://dstack.ai/docs/concepts/tasks#distributed-tasks).
 
-??? info "Fleet"
-    Before running NCCL tests, make sure to create a fleet with `placement: cluster`. Here's a fleet configuration suitable for this example:
-    …
-    > For more details on how to use clusters with `dstack`, check the [Clusters](https://dstack.ai/docs/guides/clusters) guide.
+!!! info "Prerequisites"
+    Before running a distributed task, make sure to create a fleet with `placement` set to `cluster` (can be a [managed fleet](https://dstack.ai/docs/concepts/fleets#backend-placement) or an [SSH fleet](https://dstack.ai/docs/concepts/fleets#ssh-placement)).
 
 ## Running as a task
 
@@ -97,5 +79,5 @@ The source-code of this example can be found in
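The fleet configuration removed from this README is elided in the hunk above. A minimal cluster fleet of the kind the new prerequisite describes might be sketched as follows; the fleet name, node count, and GPU spec are illustrative assumptions, not the example's actual configuration:

```yaml
type: fleet
name: nccl-test-fleet

# Two interconnected nodes for the distributed task
nodes: 2
placement: cluster

resources:
  gpu: nvidia:8   # illustrative; any multi-GPU NVIDIA offer works
```

An SSH fleet over on-prem hosts with `placement: cluster` would satisfy the same prerequisite.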
examples/clusters/rccl-tests/README.md (+3 −20)
@@ -1,26 +1,9 @@
 # RCCL tests
 
-This example shows how to run distributed [RCCL tests :material-arrow-top-right-thin:{ .external }](https://github.com/ROCm/rccl-tests){:target="_blank"} with MPI using `dstack`.
+This example shows how to run distributed [RCCL tests :material-arrow-top-right-thin:{ .external }](https://github.com/ROCm/rccl-tests){:target="_blank"} using [distributed tasks](https://dstack.ai/docs/concepts/tasks#distributed-tasks).
 
-??? info "Fleet"
-    Before running RCCL tests, make sure to create a fleet with `placement: cluster`. Here's a fleet configuration suitable for this example:
-    …
-    > For more details on how to use clusters with `dstack`, check the [Clusters](https://dstack.ai/docs/guides/clusters) guide.
+!!! info "Prerequisites"
+    Before running a distributed task, make sure to create a fleet with `placement` set to `cluster` (can be a [managed fleet](https://dstack.ai/docs/concepts/fleets#backend-placement) or an [SSH fleet](https://dstack.ai/docs/concepts/fleets#ssh-placement)).
examples/distributed-training/axolotl/README.md (+3 −32)
@@ -1,38 +1,9 @@
 # Axolotl
 
-This example walks you through how to run distributed fine-tune using [Axolotl :material-arrow-top-right-thin:{ .external }](https://github.com/axolotl-ai-cloud/axolotl){:target="_blank"} with `dstack`.
+This example walks you through how to run distributed fine-tuning using [Axolotl :material-arrow-top-right-thin:{ .external }](https://github.com/axolotl-ai-cloud/axolotl){:target="_blank"} and [distributed tasks](https://dstack.ai/docs/concepts/tasks#distributed-tasks).
 
-??? info "Prerequisites"
-    Once `dstack` is [installed](https://dstack.ai/docs/installation), clone the repo with examples.
-
-    <div class="termy">
-
-    ```shell
-    $ git clone https://github.com/dstackai/dstack
-    $ cd dstack
-    ```
-
-    </div>
-
-??? info "Fleet"
-    Before submitting distributed training runs, make sure to create a fleet with `placement: cluster`. Here's a fleet configuration suitable for this example:
-    …
-    > For more details on how to use clusters with `dstack`, check the [Clusters](https://dstack.ai/docs/guides/clusters) guide.
+!!! info "Prerequisites"
+    Before running a distributed task, make sure to create a fleet with `placement` set to `cluster` (can be a [managed fleet](https://dstack.ai/docs/concepts/fleets#backend-placement) or an [SSH fleet](https://dstack.ai/docs/concepts/fleets#ssh-placement)).