Commit 3e25867

Resolved conflict
1 parent 5f81c69 commit 3e25867

1 file changed

Lines changed: 3 additions & 12 deletions

docs/docs/concepts/services.md

@@ -233,16 +233,6 @@ Setting the minimum number of replicas to `0` allows the service to scale down t
 ??? info "Disaggregated serving"
 Native support for disaggregated prefill and decode, allowing both worker types to run within a single service, is coming soon.

-### Model
-
-If the service is running a chat model with an OpenAI-compatible interface (i.e., `/v1/chat/completions`),
-set the [`model`](../reference/dstack.yml/service.md#model) property to make the model accessible via `dstack`'s
-global OpenAI-compatible endpoint, and also accessible via `dstack`'s UI.
-
-When `model` is set, `dstack` automatically configures [`probes`](#probes) to verify model health.
-To customize or disable this, set `probes` explicitly.
-
-
 ### Authorization

 By default, the service enables authorization, meaning the service endpoint requires a `dstack` user token.
@@ -341,8 +331,6 @@ Probes are executed for each service replica while the replica is `running`. A p
 ??? info "Model"
 If you set the [`model`](#model) property but don't explicitly configure `probes`,
 `dstack` automatically configures a default probe that tests the model using the `/v1/chat/completions` API.
-This default probe sends a minimal chat completion request to verify the model is responding correctly.
-
 To disable probes entirely when `model` is set, explicitly set `probes` to an empty list.

 See the [reference](../reference/dstack.yml/service.md#probes) for more probe configuration options.
@@ -442,6 +430,9 @@ Limits apply to the whole service (all replicas) and per client (by IP). Clients
 If the service runs a model with an OpenAI-compatible interface, you can set the [`model`](#model) property to make the model accessible through `dstack`'s chat UI on the `Models` page.
 In this case, `dstack` will use the service's `/v1/chat/completions` service.

+When `model` is set, `dstack` automatically configures [`probes`](#probes) to verify model health.
+To customize or disable this, set `probes` explicitly.
+
 ### Resources

 If you specify memory size, you can either specify an explicit size (e.g. `24GB`) or a
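
For reviewers: the text kept by this commit says that setting `model` makes `dstack` configure a default probe, and that an explicit empty `probes` list disables probes entirely. A minimal sketch of a service configuration exercising both statements is shown here. The service name, the serving command, the port, and the model name are hypothetical placeholders, not taken from the commit, and the command assumes an environment where vLLM is already installed.

```yaml
# Hypothetical service configuration; name, command, port, and model name
# are illustrative placeholders, not values from this commit.
type: service
name: chat-model-demo

commands:
  # Assumes vLLM is available in the environment.
  - vllm serve example-org/example-chat-model --port 8000
port: 8000

# Makes the model accessible through dstack's chat UI on the Models page.
# Per the docs text in this diff, dstack then configures a default probe
# against /v1/chat/completions unless `probes` is set explicitly.
model: example-org/example-chat-model

# Explicit empty list: disables probes entirely, including the default one.
probes: []
```

Removing the `probes` line from this sketch should leave the default probe in place, which is the behavior the relocated paragraph describes.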
