Document the no-guardrails workaround for self-hosted Cosmos 3 generation #196

Description

@jabelk

When self-hosting Cosmos 3 generation with the vllm/vllm-omni:cosmos3 image, the open nvidia/Cosmos3-Nano weights (under OpenMDW) download without a problem, but the server then exits at startup while loading a gated dependency. This is the cookbook quickstart command:

vllm serve nvidia/Cosmos3-Nano --omni \
  --model-class-name Cosmos3OmniDiffusersPipeline \
  --allowed-local-media-path / --port 8000 --init-timeout 1800

and this is the error:

Cannot access gated repo for url https://huggingface.co/nvidia/Cosmos-1.0-Guardrail/...
Access to model nvidia/Cosmos-1.0-Guardrail is restricted and you are not in the authorized list.

The open generation model loads nvidia/Cosmos-1.0-Guardrail, which is a gated Hugging Face repo, so an HF token with access to the open model is not enough to start the server. The failure happens at startup rather than at request time, which makes it look like the whole deployment is broken instead of one optional component failing to load.

The model card references guardrails as an optional toggle ("guardrails": true) and lists cosmos_guardrail as a dependency, but it does not say that the guardrail model is gated or that the default generation path fails without access to it. The --deploy-config workaround that disables guardrails is in the main README, but not in the generator cookbook quickstart where a first-time user is working.

Proposed change

Add the no-guardrails setup to the generator quickstart (cookbooks/cosmos3/generator/audiovisual/README.md) so a user who only wants to run generation can start the server without requesting access to the gated guardrail repo:

# no_guardrails.yaml
async_chunk: false
stages:
  - stage_id: 0
    max_num_seqs: 1
    enforce_eager: true
    trust_remote_code: true
    model_class_name: Cosmos3OmniDiffusersPipeline
    model_config:
      guardrails: false
      offload_guardrail_models: false

vllm serve nvidia/Cosmos3-Nano --omni \
  --model-class-name Cosmos3OmniDiffusersPipeline \
  --deploy-config no_guardrails.yaml --port 8000 --init-timeout 1800
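
For the docs PR, the two steps above could also be offered as one copy-pasteable script. The following is only a sketch: the `CONFIG` and `LAUNCH` environment variables are hypothetical names introduced here, not options of vLLM or the cookbook, and the `grep` sanity check is an extra precaution rather than something the server needs. The config body and the `vllm serve` flags are copied unchanged from the proposal.

```shell
#!/usr/bin/env bash
# Sketch: write the no-guardrails deploy config, sanity-check it, and
# optionally start the server. CONFIG and LAUNCH are hypothetical
# variables used only by this script.
set -euo pipefail

CONFIG="${CONFIG:-no_guardrails.yaml}"

# Write the deploy config exactly as proposed in this issue.
cat > "$CONFIG" <<'EOF'
# no_guardrails.yaml
async_chunk: false
stages:
  - stage_id: 0
    max_num_seqs: 1
    enforce_eager: true
    trust_remote_code: true
    model_class_name: Cosmos3OmniDiffusersPipeline
    model_config:
      guardrails: false
      offload_guardrail_models: false
EOF

# Fail early if the guardrails toggle is missing or not set to false.
if ! grep -Eq '^[[:space:]]+guardrails:[[:space:]]*false[[:space:]]*$' "$CONFIG"; then
  echo "guardrails is not disabled in $CONFIG" >&2
  exit 1
fi

# Start the server only when explicitly requested (LAUNCH=1), so the
# script can also be used just to generate the config file.
if [[ "${LAUNCH:-0}" == "1" ]]; then
  vllm serve nvidia/Cosmos3-Nano --omni \
    --model-class-name Cosmos3OmniDiffusersPipeline \
    --deploy-config "$CONFIG" --port 8000 --init-timeout 1800
else
  echo "Wrote $CONFIG; rerun with LAUNCH=1 to start the server."
fi
```

Running it with no variables set only writes the file; `LAUNCH=1` additionally starts the server with the same command as above.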

Environment

Single NVIDIA H200 (141 GB) rented through NVIDIA Brev (DigitalOcean instance), driver 590.48, Docker 29.1.3 with the NVIDIA runtime, prebuilt vllm/vllm-omni:cosmos3 image.

Happy to send a small docs PR for this if helpful.