Document the no-guardrails workaround for self-hosted Cosmos 3 generation

When self-hosting Cosmos 3 generation with the `vllm/vllm-omni:cosmos3` image, the open `nvidia/Cosmos3-Nano` weights download under OpenMDW, but the server then exits at startup while loading a gated dependency. This is the cookbook quickstart command:

```bash
vllm serve nvidia/Cosmos3-Nano --omni \
  --model-class-name Cosmos3OmniDiffusersPipeline \
  --allowed-local-media-path / --port 8000 --init-timeout 1800
```

The server exits with this error:

```
Cannot access gated repo for url https://huggingface.co/nvidia/Cosmos-1.0-Guardrail/...
Access to model nvidia/Cosmos-1.0-Guardrail is restricted and you are not in the authorized list.
```

The open generation model loads `nvidia/Cosmos-1.0-Guardrail`, which is a gated Hugging Face repo, so an HF token with access to the open model is not enough to start the server. The failure happens at startup rather than at request time, which makes it look like the whole deployment is broken instead of one optional component failing to load.
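Because the failure is easy to misread as a broken deployment, a small log check can help triage. The sketch below scans captured startup output for the "restricted" message quoted above and reports which repo was blocked. The function name `find_gated_repos` is hypothetical, and the pattern is an assumption based only on the two error lines in this issue; the wording may differ between `huggingface_hub` versions.

```python
import re

# Assumption: the pattern mirrors the error text quoted in this issue.
# Other versions of the Hugging Face client may word it differently.
GATED_RE = re.compile(r"Access to model (\S+) is restricted")


def find_gated_repos(log_text: str) -> list:
    """Return repo ids named in 'restricted' errors, in first-seen order."""
    seen = []
    for match in GATED_RE.finditer(log_text):
        repo = match.group(1)
        if repo not in seen:  # report each blocked repo once
            seen.append(repo)
    return seen


if __name__ == "__main__":
    import sys

    # Read captured server output from stdin and name each blocked repo.
    for repo in find_gated_repos(sys.stdin.read()):
        print(f"gated dependency blocked startup: {repo}")
```

Piping the captured server output into this script should be enough to tell a gated-dependency failure apart from a GPU or configuration problem.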

The model card references guardrails as an optional toggle (`"guardrails": true`) and lists `cosmos_guardrail` as a dependency, but it does not say that the guardrail model is gated or that the default generation path fails without access to it. The `--deploy-config` workaround that disables guardrails is in the main README, but not in the generator cookbook quickstart where a first-time user is working.

### Proposed change

Add the no-guardrails setup to the generator quickstart (`cookbooks/cosmos3/generator/audiovisual/README.md`) so a user who only wants to run generation can start the server without requesting access to the gated guardrail repo:

```yaml
# no_guardrails.yaml
async_chunk: false
stages:
  - stage_id: 0
    max_num_seqs: 1
    enforce_eager: true
    trust_remote_code: true
    model_class_name: Cosmos3OmniDiffusersPipeline
    model_config:
      guardrails: false
      offload_guardrail_models: false
```

```bash
vllm serve nvidia/Cosmos3-Nano --omni \
  --model-class-name Cosmos3OmniDiffusersPipeline \
  --deploy-config no_guardrails.yaml --port 8000 --init-timeout 1800
```

### Environment

Single NVIDIA H200 (141 GB) rented through NVIDIA Brev (DigitalOcean instance), driver 590.48, Docker 29.1.3 with the NVIDIA runtime, prebuilt `vllm/vllm-omni:cosmos3` image.

Happy to send a small docs PR for this if helpful.
