diff --git a/docs/docs/concepts/dev-environments.md b/docs/docs/concepts/dev-environments.md
index 8c1e66c87e..eeb8138c70 100644
--- a/docs/docs/concepts/dev-environments.md
+++ b/docs/docs/concepts/dev-environments.md
@@ -295,6 +295,70 @@ If you don't assign a value to an environment variable (see `HF_TOKEN` above),
 | `DSTACK_REPO_ID` | The ID of the repo |
 | `DSTACK_GPUS_NUM` | The total number of GPUs in the run |
 
+### Files
+
+By default, `dstack` automatically mounts the [repo](repos.md) directory where you ran `dstack init` to any run configuration.
+
+However, in some cases, you may not want to mount the entire directory (e.g., if it’s too large),
+or you might want to mount files outside of it. In such cases, you can use the [`files`](../reference/dstack.yml/dev-environment.md#files) property.
+
+```yaml
+type: dev-environment
+name: vscode
+
+files:
+  - .:examples # Maps the directory where `.dstack.yml` is located to `/workflow/examples`
+  - ~/.ssh/id_rsa:/root/.ssh/id_rsa # Maps `~/.ssh/id_rsa` to `/root/.ssh/id_rsa`
+
+ide: vscode
+```
+
+Each entry maps a local directory or file to a path inside the container. Both local and container paths can be relative or absolute.
+
+- If the local path is relative, it’s resolved relative to the configuration file.
+- If the container path is relative, it’s resolved relative to `/workflow`.
+
+The container path is optional. If not specified, it’s calculated automatically.
+
+```yaml
+type: dev-environment
+name: vscode
+
+files:
+  - ../examples # Maps the `examples` directory (the one containing `.dstack.yml`) to `/workflow/examples`
+  - ~/.ssh/id_rsa # Maps `~/.ssh/id_rsa` to `/root/.ssh/id_rsa`
+
+ide: vscode
+```
+
+Note: If you want to use `files` without mounting the entire repo directory,
+make sure to pass `--no-repo` when running `dstack apply`:
+
+```shell
+$ dstack apply -f examples/.dstack.yml --no-repo
+```
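+To keep the upload small when mounting a directory, large or generated artifacts can be excluded via a `.dstackignore` file. An illustrative sketch, assuming it follows the same pattern syntax as `.gitignore` (the exact patterns depend on your project):
+
+```
+.venv/
+__pycache__/
+*.ckpt
+```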
+
+??? info ".gitignore and .dstackignore"
+    `dstack` automatically excludes files and folders listed in `.gitignore` and `.dstackignore`.
+
+    Uploads are limited to 2MB. To avoid exceeding this limit, make sure to exclude unnecessary files.
+    You can increase the default server limit by setting the `DSTACK_SERVER_CODE_UPLOAD_LIMIT` environment variable.
+
+!!! warning "Experimental"
+    The `files` feature is experimental. Feedback is highly appreciated.
+
 ### Retry policy
 
 By default, if `dstack` can't find capacity or the instance is interrupted, the run will fail.
diff --git a/docs/docs/concepts/services.md b/docs/docs/concepts/services.md
index 6c7e6b6e36..5dd92f19c3 100644
--- a/docs/docs/concepts/services.md
+++ b/docs/docs/concepts/services.md
@@ -513,6 +513,98 @@ resources:
+
+### Files
+
+By default, `dstack` automatically mounts the [repo](repos.md) directory where you ran `dstack init` to any run configuration.
+
+However, in some cases, you may not want to mount the entire directory (e.g., if it’s too large),
+or you might want to mount files outside of it. In such cases, you can use the [`files`](../reference/dstack.yml/service.md#files) property.
+
+```yaml
+type: service
+name: llama-2-7b-service
+
+files:
+  - .:examples # Maps the directory where `.dstack.yml` is located to `/workflow/examples`
+  - ~/.ssh/id_rsa:/root/.ssh/id_rsa # Maps `~/.ssh/id_rsa` to `/root/.ssh/id_rsa`
+
+python: 3.12
+
+env:
+  - HF_TOKEN
+  - MODEL=NousResearch/Llama-2-7b-chat-hf
+commands:
+  - uv pip install vllm
+  - python -m vllm.entrypoints.openai.api_server --model $MODEL --port 8000
+port: 8000
+
+resources:
+  gpu: 24GB
+```
+
+Each entry maps a local directory or file to a path inside the container. Both local and container paths can be relative or absolute.
+
+- If the local path is relative, it’s resolved relative to the configuration file.
+- If the container path is relative, it’s resolved relative to `/workflow`.
+
+The container path is optional. If not specified, it’s calculated automatically.
+
+```yaml
+type: service
+name: llama-2-7b-service
+
+files:
+  - ../examples # Maps the `examples` directory (the one containing `.dstack.yml`) to `/workflow/examples`
+  - ~/.ssh/id_rsa # Maps `~/.ssh/id_rsa` to `/root/.ssh/id_rsa`
+
+python: 3.12
+
+env:
+  - HF_TOKEN
+  - MODEL=NousResearch/Llama-2-7b-chat-hf
+commands:
+  - uv pip install vllm
+  - python -m vllm.entrypoints.openai.api_server --model $MODEL --port 8000
+port: 8000
+
+resources:
+  gpu: 24GB
+```
+
+Note: If you want to use `files` without mounting the entire repo directory,
+make sure to pass `--no-repo` when running `dstack apply`:
+
+```shell
+$ dstack apply -f examples/.dstack.yml --no-repo
+```
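+When a service needs only a handful of files, mounting them individually together with `--no-repo` keeps the upload small. A sketch with hypothetical paths (`serve.py` and `configs/vllm.yaml` are illustrative; with no container path given, they are mapped automatically, presumably under `/workflow`):
+
+```yaml
+type: service
+name: llama-2-7b-service
+
+files:
+  - serve.py          # Hypothetical entrypoint
+  - configs/vllm.yaml # Hypothetical config file
+
+python: 3.12
+commands:
+  - python /workflow/serve.py
+port: 8000
+```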
+
+??? info ".gitignore and .dstackignore"
+    `dstack` automatically excludes files and folders listed in `.gitignore` and `.dstackignore`.
+
+    Uploads are limited to 2MB. To avoid exceeding this limit, make sure to exclude unnecessary files.
+    You can increase the default server limit by setting the `DSTACK_SERVER_CODE_UPLOAD_LIMIT` environment variable.
+
+!!! warning "Experimental"
+    The `files` feature is experimental. Feedback is highly appreciated.
+
 ### Retry policy
 
 By default, if `dstack` can't find capacity, or the service exits with an error, or the instance is interrupted, the run will fail.
diff --git a/docs/docs/concepts/tasks.md b/docs/docs/concepts/tasks.md
index d35dca31c9..3ab61ab63f 100644
--- a/docs/docs/concepts/tasks.md
+++ b/docs/docs/concepts/tasks.md
@@ -461,6 +461,104 @@ If you don't assign a value to an environment variable (see `HF_TOKEN` above),
 | `DSTACK_NODES_IPS` | The list of internal IP addresses of all nodes delimited by "\n" |
 | `DSTACK_MPI_HOSTFILE` | The path to a pre-populated MPI hostfile |
 
+### Files
+
+By default, `dstack` automatically mounts the [repo](repos.md) directory where you ran `dstack init` to any run configuration.
+
+However, in some cases, you may not want to mount the entire directory (e.g., if it’s too large),
+or you might want to mount files outside of it. In such cases, you can use the [`files`](../reference/dstack.yml/task.md#files) property.
+
+```yaml
+type: task
+name: trl-sft
+
+files:
+  - .:examples # Maps the directory where `.dstack.yml` is located to `/workflow/examples`
+  - ~/.ssh/id_rsa:/root/.ssh/id_rsa # Maps `~/.ssh/id_rsa` to `/root/.ssh/id_rsa`
+
+python: 3.12
+
+env:
+  - HF_TOKEN
+  - HF_HUB_ENABLE_HF_TRANSFER=1
+  - MODEL=Qwen/Qwen2.5-0.5B
+  - DATASET=stanfordnlp/imdb
+
+commands:
+  - uv pip install trl
+  - |
+    trl sft \
+      --model_name_or_path $MODEL --dataset_name $DATASET \
+      --num_processes $DSTACK_GPUS_PER_NODE
+
+resources:
+  gpu: H100:1
+```
+
+Each entry maps a local directory or file to a path inside the container. Both local and container paths can be relative or absolute.
+
+- If the local path is relative, it’s resolved relative to the configuration file.
+- If the container path is relative, it’s resolved relative to `/workflow`.
+
+The container path is optional. If not specified, it’s calculated automatically.
+
+```yaml
+type: task
+name: trl-sft
+
+files:
+  - ../examples # Maps the `examples` directory (the one containing `.dstack.yml`) to `/workflow/examples`
+  - ~/.cache/huggingface/token # Maps `~/.cache/huggingface/token` to `/root/.cache/huggingface/token`
+
+python: 3.12
+
+env:
+  - HF_TOKEN
+  - HF_HUB_ENABLE_HF_TRANSFER=1
+  - MODEL=Qwen/Qwen2.5-0.5B
+  - DATASET=stanfordnlp/imdb
+
+commands:
+  - uv pip install trl
+  - |
+    trl sft \
+      --model_name_or_path $MODEL --dataset_name $DATASET \
+      --num_processes $DSTACK_GPUS_PER_NODE
+
+resources:
+  gpu: H100:1
+```
+
+Note: If you want to use `files` without mounting the entire repo directory,
+make sure to pass `--no-repo` when running `dstack apply`:
+
+```shell
+$ dstack apply -f examples/.dstack.yml --no-repo
+```
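+Since both local and container paths can be absolute, data can also be mounted outside of `/workflow`. An illustrative sketch with hypothetical paths:
+
+```yaml
+type: task
+name: trl-sft
+
+files:
+  - /mnt/datasets/imdb:/data/imdb # Maps an absolute local path to `/data/imdb` in the container
+
+commands:
+  - ls /data/imdb
+```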
+
+??? info ".gitignore and .dstackignore"
+    `dstack` automatically excludes files and folders listed in `.gitignore` and `.dstackignore`.
+
+    Uploads are limited to 2MB. To avoid exceeding this limit, make sure to exclude unnecessary files.
+    You can increase the default server limit by setting the `DSTACK_SERVER_CODE_UPLOAD_LIMIT` environment variable.
+
+!!! warning "Experimental"
+    The `files` feature is experimental. Feedback is highly appreciated.
+
 ### Retry policy
 
 By default, if `dstack` can't find capacity, or the task exits with an error, or the instance is interrupted,
diff --git a/examples/.dstack.yml b/examples/.dstack.yml
index 1e47c9a732..daded20730 100644
--- a/examples/.dstack.yml
+++ b/examples/.dstack.yml
@@ -1,16 +1,12 @@
 type: dev-environment
-# The name is optional, if not specified, generated randomly
-name: vscode
+name: cursor
 
-#python: "3.11"
+python: 3.12
+ide: cursor
 
-image: un1def/dstack-base:py3.12-dev-cuda-12.1
-
-ide: vscode
-
-# Use either spot or on-demand instances
-#spot_policy: auto
+files:
+  - .:examples
+  - ~/.ssh/id_rsa:/root/.ssh/id_rsa
 
 resources:
-  cpu: x86:8..32
-  gpu: 24GB..:1
+  gpu: 1
diff --git a/mkdocs.yml b/mkdocs.yml
index c8df9a4ebe..0a19ad4787 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -124,13 +124,13 @@ plugins:
       'examples/fine-tuning/axolotl/index.md': 'examples/single-node-training/axolotl/index.md'
       'blog/efa.md': 'examples/clusters/efa/index.md'
   - typeset
-  - gen-files:
-      scripts: # always relative to mkdocs.yml
-        - scripts/docs/gen_examples.py
-        - scripts/docs/gen_cli_reference.py
-        - scripts/docs/gen_openapi_reference.py
-        - scripts/docs/gen_schema_reference.py
-        - scripts/docs/gen_rest_plugin_spec_reference.py
+  # - gen-files:
+  #     scripts: # always relative to mkdocs.yml
+  #       - scripts/docs/gen_examples.py
+  #       - scripts/docs/gen_cli_reference.py
+  #       - scripts/docs/gen_openapi_reference.py
+  #       - scripts/docs/gen_schema_reference.py
+  #       - scripts/docs/gen_rest_plugin_spec_reference.py
   - mkdocstrings:
       handlers:
         python: