9 changes: 7 additions & 2 deletions .agents/skills/nemoclaw-user-configure-inference/SKILL.md
Original file line number Diff line number Diff line change
@@ -53,6 +53,8 @@ When NemoClaw runs inside WSL, the provider menu can include Windows-host Ollama
- **Install Ollama on Windows host** when Windows does not have Ollama installed.

The install and restart paths set `OLLAMA_HOST=0.0.0.0:11434` on the Windows side so Docker and WSL can reach the daemon through `host.docker.internal`.
+After an install or restart action, NemoClaw relaunches Ollama from the detected Windows tray app or verified `ollama.exe` path and waits until `host.docker.internal:11434` responds.
+If the daemon does not become reachable, onboarding prints PowerShell commands you can run to inspect the Windows-side process and port state.
Use one Ollama instance on port `11434` at a time.
If both WSL and Windows-host Ollama are running, pick the intended menu entry during onboarding so NemoClaw validates and pulls models against the right daemon.
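
One way to check by hand that the Windows-host daemon is reachable is to request Ollama's model-list endpoint (`/api/tags`) from inside WSL. This is a minimal sketch, not what NemoClaw runs internally: the function names are made up for illustration, and it assumes `curl` is installed and that `host.docker.internal` resolves from your shell.

```shell
# Hypothetical helpers for a manual reachability check (not part of NemoClaw).

# Build the Ollama model-list URL for a host ($1) and optional port ($2).
ollama_tags_url() {
  printf 'http://%s:%s/api/tags\n' "$1" "${2:-11434}"
}

# Return 0 when the daemon answers with a success status within 5 seconds.
ollama_reachable() {
  curl --silent --fail --max-time 5 --output /dev/null \
    "$(ollama_tags_url "$@")"
}

# Example:
#   ollama_reachable host.docker.internal 11434 && echo "Ollama is reachable"
```

A successful response only shows that some Ollama daemon answers on that address. It does not tell you whether the WSL or the Windows-host instance is the one responding, which is why running a single instance on port `11434` matters.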

@@ -273,11 +275,14 @@ $ NEMOCLAW_EXPERIMENTAL=1 nemoclaw onboard

Select **Local NVIDIA NIM [experimental]** from the provider list.
NemoClaw filters available models by GPU VRAM, pulls the NIM container image, starts it, and waits for it to become healthy before continuing.
+On hosts with mixed NVIDIA GPU models, the preflight summary shows each detected GPU model and the total VRAM so you can confirm which device class the model selection used.

NIM container images are hosted on `nvcr.io` and require NGC registry authentication before `docker pull` succeeds.
If Docker is not already logged in to `nvcr.io`, onboard prompts for an [NGC API key](https://org.ngc.nvidia.com/setup/api-key) and runs `docker login nvcr.io` over `--password-stdin` so the key is never written to disk or shell history.
The prompt masks the key during input and retries once on a bad key before failing.
In non-interactive mode, onboard exits with login instructions if Docker is not already authenticated; run `docker login nvcr.io` yourself, then re-run `nemoclaw onboard --non-interactive`.
+If `NGC_API_KEY` or `NVIDIA_API_KEY` is already exported, NemoClaw passes it into the managed NIM container through the process environment instead of command-line arguments.
+If the NIM container exits before the health endpoint becomes ready, onboarding stops early and prints the last container log lines.
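
For the non-interactive case, the manual `docker login nvcr.io` step can also keep the key off the command line. The sketch below is an assumption about how you might do this by hand, not NemoClaw's own code: `ngc_login` is a hypothetical helper, and the literal username `$oauthtoken` is the one NGC documents for API-key logins.

```shell
# Hypothetical helper: log Docker in to nvcr.io with the key read from stdin,
# so the key never appears in the argument list or in shell history.
ngc_login() {
  if [ -z "${NGC_API_KEY:-}" ]; then
    echo "NGC_API_KEY is not set" >&2
    return 1
  fi
  # The single quotes keep $oauthtoken literal; it is not a shell variable.
  printf '%s' "$NGC_API_KEY" |
    docker login nvcr.io --username '$oauthtoken' --password-stdin
}

# Example: export NGC_API_KEY first, then run `ngc_login`.
```

After a successful login, re-run `nemoclaw onboard --non-interactive` as described above.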

> **Note:** NIM uses vLLM internally.
> The same `chat/completions` API path restriction applies.
@@ -326,10 +331,10 @@ Refer to Switch Inference Models (use the `nemoclaw-user-configure-inference` sk
For compatible endpoints, the command is:

```console
-$ openshell inference set --provider compatible-endpoint --model <model-name>
+$ nemoclaw inference set --provider compatible-endpoint --model <model-name>
```

-If the provider itself needs to change (for example, switching from vLLM to a cloud API), rerun `nemoclaw onboard`.
+If the provider itself needs to change (for example, switching from vLLM to a cloud API), pass the new provider to `nemoclaw inference set`.

## References

@@ -28,6 +28,7 @@ NemoClaw uses provider-specific local tokens for those routes, and rebuilds of l
| Anthropic | Tested | Native Anthropic | Uses anthropic-messages |
| Other Anthropic-compatible endpoint | Tested | Custom Anthropic-compatible | For Claude proxies and compatible gateways |
| Google Gemini | Tested | OpenAI-compatible | Uses Google's OpenAI-compatible endpoint |
+| Hermes Provider | Hermes only | OpenAI-compatible route | Available when onboarding Hermes Agent through `nemohermes` |
| Local Ollama | Caveated | Local Ollama API | Available when Ollama is installed or running on the host |
| Local NVIDIA NIM | Experimental | Local OpenAI-compatible | Requires `NEMOCLAW_EXPERIMENTAL=1` and a NIM-capable GPU |
| Local vLLM | Experimental | Local OpenAI-compatible | Requires `NEMOCLAW_EXPERIMENTAL=1` and a server already running on `localhost:8000` |
@@ -48,6 +49,7 @@ Experimental local vLLM appears when you opt in and NemoClaw detects either a ru
| Anthropic | Routes to the Anthropic Messages API. Set `ANTHROPIC_API_KEY`. | `claude-sonnet-4-6`, `claude-haiku-4-5`, `claude-opus-4-6` |
| Other Anthropic-compatible endpoint | Routes to any server that implements the Anthropic Messages API (`/v1/messages`). The wizard prompts for a base URL and model name. Set `COMPATIBLE_ANTHROPIC_API_KEY`. | You provide the model name. |
| Google Gemini | Routes to Google's OpenAI-compatible endpoint. NemoClaw prefers `/responses` only when the endpoint proves it can handle tool calling in a way OpenClaw uses; otherwise it falls back to `/chat/completions`. Set `GEMINI_API_KEY`. | `gemini-3.1-pro-preview`, `gemini-3.1-flash-lite-preview`, `gemini-3-flash-preview`, `gemini-2.5-pro`, `gemini-2.5-flash`, `gemini-2.5-flash-lite` |
+| Hermes Provider | Routes Hermes Agent through the host OpenShell provider registered by NemoClaw when onboarding Hermes Agent. | Curated Hermes Provider models such as `moonshotai/kimi-k2.6`, `openai/gpt-5.4-mini`, and `z-ai/glm-5.1`. |
| Local Ollama | Routes to a local Ollama instance on `localhost:11434`. NemoClaw detects installed models, offers starter models if none are present, pulls and warms the selected model, and validates it. | Selected during onboarding. For more information, refer to Use a Local Inference Server (use the `nemoclaw-user-configure-inference` skill). |
| Model Router | Starts a host-side router on port `4000`, registers it as an OpenAI-compatible provider, and keeps the sandbox pointed at `inference.local`. Set `NEMOCLAW_PROVIDER=routed` for non-interactive setup. | The router pool defines the model names. |

@@ -12,47 +12,57 @@ No restart is required.

## Switch to a Different Model

-Switching happens through the OpenShell inference route.
-Use the provider and model that match the upstream you want to use.
-This is one of the cases where a NemoClaw workflow intentionally uses `openshell`; see CLI Selection Guide (use the `nemoclaw-user-reference` skill) for the general boundary.
+Use `nemoclaw inference set` with the provider and model that match the upstream you want to use.
+The command updates the OpenShell inference route and synchronizes the running agent config.
+For OpenClaw, it updates `agents.defaults.model.primary` and the matching provider namespace.
+For Hermes, it updates `/sandbox/.hermes/config.yaml` (`model.default`, `model.base_url`, and `model.provider: custom`) without rebuilding or restarting Hermes.
+
+Pass `--sandbox <name>` when you do not want to use the default registered sandbox.
+Under `nemohermes`, pass `--sandbox <name>` when more than one Hermes sandbox is registered.

### NVIDIA Endpoints

```console
-$ openshell inference set --provider nvidia-prod --model nvidia/nemotron-3-super-120b-a12b
+$ nemoclaw inference set --provider nvidia-prod --model nvidia/nemotron-3-super-120b-a12b
```

### OpenAI

```console
-$ openshell inference set --provider openai-api --model gpt-5.4
+$ nemoclaw inference set --provider openai-api --model gpt-5.4
```

### Anthropic

```console
-$ openshell inference set --provider anthropic-prod --model claude-sonnet-4-6
+$ nemoclaw inference set --provider anthropic-prod --model claude-sonnet-4-6
```

### Google Gemini

```console
-$ openshell inference set --provider gemini-api --model gemini-2.5-flash
+$ nemoclaw inference set --provider gemini-api --model gemini-2.5-flash
```

### Compatible Endpoints

If you onboarded a custom compatible endpoint, switch models with the provider created for that endpoint:

```console
-$ openshell inference set --provider compatible-endpoint --model <model-name>
+$ nemoclaw inference set --provider compatible-endpoint --model <model-name>
```

```console
-$ openshell inference set --provider compatible-anthropic-endpoint --model <model-name>
+$ nemoclaw inference set --provider compatible-anthropic-endpoint --model <model-name>
```

-If the provider itself needs to change, rerun `nemoclaw onboard`.
+### Hermes Provider
+
+For a NemoClaw-managed Hermes sandbox, use the Hermes alias with the registered Hermes Provider route:
+
+```console
+$ nemohermes inference set --provider hermes-provider --model openai/gpt-5.4-mini
+```

#### Switching from Responses API to Chat Completions

@@ -87,28 +97,14 @@ $ NEMOCLAW_PREFERRED_API=openai-responses nemoclaw onboard

## Cross-Provider Switching

-Switching to a different provider family (for example, from NVIDIA Endpoints to Anthropic) requires updating both the gateway route and the sandbox config.
-
-Set the gateway route on the host:
+Switching to a different provider family (for example, from NVIDIA Endpoints to Anthropic) also uses `nemoclaw inference set`.
+The command updates both the gateway route and the OpenClaw provider namespace in the running sandbox config.

```console
-$ openshell inference set --provider anthropic-prod --model claude-sonnet-4-6 --no-verify
+$ nemoclaw inference set --provider anthropic-prod --model claude-sonnet-4-6 --no-verify
```

-Then set the override env vars and recreate the sandbox so they take effect at startup:
-
-```console
-$ export NEMOCLAW_MODEL_OVERRIDE="anthropic/claude-sonnet-4-6"
-$ export NEMOCLAW_INFERENCE_API_OVERRIDE="anthropic-messages"
-$ nemoclaw onboard --resume --recreate-sandbox
-```
-
-The entrypoint patches `openclaw.json` at container startup with the override values.
-You do not need to rebuild the image.
-Remove the env vars and recreate the sandbox to revert to the original model.
-
-`NEMOCLAW_INFERENCE_API_OVERRIDE` accepts `openai-completions` (for NVIDIA, OpenAI, Gemini, compatible endpoints) or `anthropic-messages` (for Anthropic and Anthropic-compatible endpoints).
-This variable is only needed when switching between provider families.
+Use `--no-verify` only when OpenShell cannot verify the provider at switch time but you have already confirmed the provider and credential.

## Tune Model Metadata

@@ -179,9 +175,8 @@ The output includes the active provider, model, and endpoint.

- The host keeps provider credentials.
- The sandbox continues to use `inference.local`.
-- Same-provider model switches take effect immediately via the gateway route alone.
-- Cross-provider switches also require `NEMOCLAW_MODEL_OVERRIDE` (and `NEMOCLAW_INFERENCE_API_OVERRIDE`) plus a sandbox recreate so the entrypoint patches the config at startup.
-- Overrides are applied at container startup. Changing or removing env vars requires a sandbox recreate to take effect.
+- `nemoclaw inference set` patches the selected running OpenClaw or Hermes sandbox config and recomputes its config hash.
+- Use `nemoclaw onboard --resume --recreate-sandbox` for build-time settings such as context window, max tokens, reasoning mode, heartbeat cadence, or image contents.
- Local Ollama and local vLLM routes use local provider tokens rather than `OPENAI_API_KEY`. Rebuilds of older local-inference sandboxes clear the stale OpenAI credential requirement automatically.

## Related Topics
@@ -99,7 +99,7 @@ flowchart TB
* - Inference
- Credential exposure, unauthorized model access, cost overruns.
- OpenShell gateway
-- Yes. Use `openshell inference set`.
+- Yes. Use `nemoclaw inference set`.

:::

@@ -263,19 +263,21 @@ See the [Process Controls](https://docs.nvidia.com/openshell/latest/security/bes

The entrypoint drops dangerous Linux capabilities from the bounding set at startup using `capsh`.
This limits what capabilities any child process (gateway, sandbox, agent) can ever acquire.
+When the entrypoint switches from root to the `sandbox` and `gateway` users, it uses `setpriv` when available to remove the remaining privilege-separation capabilities from the child process at the same time as the user change.

-The entrypoint drops these capabilities: `cap_net_raw`, `cap_dac_override`, `cap_sys_chroot`, `cap_fsetid`, `cap_setfcap`, `cap_mknod`, `cap_audit_write`, `cap_net_bind_service`.
-The entrypoint keeps these because it needs them for privilege separation using gosu: `cap_chown`, `cap_setuid`, `cap_setgid`, `cap_fowner`, `cap_kill`.
+The initial entrypoint drop removes `cap_sys_admin`, `cap_sys_ptrace`, `cap_net_raw`, `cap_dac_override`, `cap_sys_chroot`, `cap_fsetid`, `cap_setfcap`, `cap_mknod`, `cap_audit_write`, and `cap_net_bind_service`.
+During `setpriv` step-down, the child process also loses `cap_setuid`, `cap_setgid`, `cap_fowner`, `cap_chown`, and `cap_kill`.

This is best-effort: if `capsh` is not available or `CAP_SETPCAP` is not in the bounding set, the entrypoint logs a warning and continues with the default capability set.
+If `setpriv` is unavailable, the entrypoint falls back to `gosu` and logs a warning that the remaining bounding-set capabilities were retained for the child process.
For additional protection, pass `--cap-drop=ALL` with `docker run` or Compose (see Sandbox Hardening (use the `nemoclaw-user-deploy-remote` skill)).
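
To see which of the capabilities named above a process still holds in its bounding set, you can decode the `CapBnd` mask from `/proc/<pid>/status`. The sketch below is illustrative only: `cap_in_mask` and `report_caps` are hypothetical helper names, and the bit numbers come from the Linux `capability.h` header rather than from NemoClaw.

```shell
# Hypothetical helpers for inspecting a capability mask (not part of NemoClaw).

# Return 0 when bit number $2 is set in the hexadecimal capability mask $1.
cap_in_mask() {
  [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Print each capability from the initial drop list that is still in mask $1.
# Each entry is name:bit, with bit numbers taken from linux/capability.h.
report_caps() {
  for entry in sys_admin:21 sys_ptrace:19 net_raw:13 dac_override:1 \
               sys_chroot:18 fsetid:4 setfcap:31 mknod:27 \
               audit_write:29 net_bind_service:10; do
    name=${entry%%:*}
    bit=${entry##*:}
    if cap_in_mask "$1" "$bit"; then
      echo "cap_$name present"
    fi
  done
}

# Inspect the current process when /proc is available (Linux only).
if [ -r /proc/self/status ]; then
  mask=$(awk '/^CapBnd:/ { print $2 }' /proc/self/status)
  if [ -n "$mask" ]; then
    report_caps "$mask"
  fi
fi
```

Run inside the sandbox container, empty output would suggest that every capability on the drop list is gone from the bounding set. Any `present` line points at a capability that was retained, for example because `capsh` was unavailable.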

| Aspect | Detail |
|---|---|
-| Default | The entrypoint drops dangerous capabilities at startup using `capsh`. Best-effort. |
+| Default | The entrypoint drops dangerous capabilities at startup using `capsh`, then uses `setpriv` during user step-down when possible. Best-effort. |
| What you can change | When launching with `docker run` directly, pass `--cap-drop=ALL --cap-add=NET_BIND_SERVICE` for stricter enforcement. In the standard NemoClaw flow (with `nemoclaw onboard`), the entrypoint handles capability dropping automatically. |
-| Risk if relaxed | `CAP_NET_RAW` allows raw socket access for network sniffing. `CAP_DAC_OVERRIDE` bypasses filesystem permission checks. Attackers can use `CAP_SYS_CHROOT` in container escape chains. If `capsh` is unavailable, the container runs with the default Docker capability set. |
-| Recommendation | Run on an image that includes `capsh` (the NemoClaw image includes it through `libcap2-bin`). For defense-in-depth, also pass `--cap-drop=ALL` at the container runtime level. |
+| Risk if relaxed | `CAP_SYS_ADMIN` and `CAP_SYS_PTRACE` expand kernel and process attack surface. `CAP_NET_RAW` allows raw socket access for network sniffing. `CAP_DAC_OVERRIDE` bypasses filesystem permission checks. If `capsh` or `setpriv` cannot run, the container retains more of the runtime-provided capability set. |
+| Recommendation | Run on an image that includes `capsh` and `setpriv` (the NemoClaw image includes them). For defense-in-depth, also pass `--cap-drop=ALL` at the container runtime level. |

### Gateway Process Isolation

@@ -31,8 +31,17 @@ Adjust the value via the `--ulimit nproc=512:512` flag if launching with

## Dropping Linux Capabilities

-When running the sandbox container, drop all Linux capabilities and re-add only
-what is strictly required:
+The NemoClaw entrypoint drops dangerous capabilities from the process bounding
+set before it starts agent services.
+It removes `CAP_SYS_ADMIN`, `CAP_SYS_PTRACE`, `CAP_NET_RAW`,
+`CAP_DAC_OVERRIDE`, `CAP_SYS_CHROOT`, `CAP_FSETID`, `CAP_SETFCAP`,
+`CAP_MKNOD`, `CAP_AUDIT_WRITE`, and `CAP_NET_BIND_SERVICE`.
+When `setpriv` is available, the entrypoint also removes the remaining
+privilege-separation capabilities during the switch from root to the
+`sandbox` and `gateway` users.
+
+For defense-in-depth, also drop all Linux capabilities at the container runtime
+when you launch the image directly:

```console
$ docker run --rm \
```
17 changes: 17 additions & 0 deletions .agents/skills/nemoclaw-user-get-started/SKILL.md
@@ -33,6 +33,23 @@ $ curl -fsSL https://www.nvidia.com/nemoclaw.sh | NEMOCLAW_NON_INTERACTIVE=1 NEM
If you use nvm or fnm to manage Node.js, the installer might not update your current shell's PATH.
If `nemoclaw` is not found after install, run `source ~/.bashrc` (or `source ~/.zshrc` for zsh) or open a new terminal.

+On Linux, the installer checks Docker before it installs NemoClaw.
+If Docker is missing, the installer downloads the official Docker convenience script, asks for `sudo`, installs Docker, and starts the Docker service when systemd is available.
+If Docker is installed but your current shell cannot use the Docker socket yet, the installer adds your user to the `docker` group when needed and exits with a recovery command.
+
+```console
+$ newgrp docker
+$ curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash
+```
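
Before re-running the installer, you can confirm whether the current shell has picked up the `docker` group. This is a minimal sketch with hypothetical helper names, not the installer's own check. It compares the groups of the running shell (`id -nG`) with the groups recorded for your user in the group database (`id -nG <user>`).

```shell
# Hypothetical helpers (not part of the NemoClaw installer).

# Return 0 when group $1 appears in the space-separated group list $2.
group_listed() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: docker_group_state "<groups of this shell>" "<groups in the database>"
# Prints one of: active, needs-newgrp, not-a-member.
docker_group_state() {
  if group_listed docker "$1"; then
    echo active
  elif group_listed docker "$2"; then
    echo needs-newgrp
  else
    echo not-a-member
  fi
}

# Classify the current shell.
docker_group_state "$(id -nG)" "$(id -nG "$(id -un)")"
```

A result of `needs-newgrp` corresponds to the case described above: the user is already in the group, but the running shell has not picked up the change, so `newgrp docker` or a fresh login is still needed.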

+On DGX Spark and DGX Station, an interactive installer can offer express install after you accept the third-party software notice.
+Express install switches onboarding to non-interactive mode, applies the suggested security policy, and selects the managed local inference path for that platform.
+Set `NEMOCLAW_NO_EXPRESS=1` to skip the express prompt, or set `NEMOCLAW_PROVIDER` before launching the installer when you want to choose a provider yourself.
+
+The installer auto-launches `nemoclaw onboard` when it can locate the freshly-installed binary.
+If it cannot locate the binary, or if blocking host preflight checks fail, it does not launch the wizard automatically.
+In that case, the installer prints the relevant diagnostics and a `To finish setup, run:` block with the explicit `nemoclaw onboard` command.

> **Note:** The onboard flow builds the sandbox image with `NEMOCLAW_DISABLE_DEVICE_AUTH=1` so the dashboard is immediately usable during setup.
> This is a build-time setting baked into the sandbox image, not a runtime knob.
> If you export `NEMOCLAW_DISABLE_DEVICE_AUTH` after onboarding finishes, it has no effect on an existing sandbox.
@@ -20,8 +20,12 @@ The sandbox image is approximately 2.4 GB compressed. During image push, the Doc
|------------|----------------------------------|
| Node.js | 22.16 or later |
| npm | 10 or later |
+| Docker | Docker Engine, Docker Desktop, or Colima on a tested platform |
| Platform | See [Platforms](#platforms) below |
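
The Node.js and npm minimums in the table can be checked by hand before running the installer. A minimal sketch: `version_ge` is a hypothetical helper, not part of NemoClaw, and it compares numeric dotted versions only, so pre-release suffixes are not handled.

```shell
# Hypothetical helper: return 0 when dotted version $1 >= $2.
# A leading "v" (as printed by `node --version`) is ignored.
version_ge() {
  have=${1#v}
  want=${2#v}
  while [ -n "$have" ] || [ -n "$want" ]; do
    h=${have%%.*}
    w=${want%%.*}
    # Missing components count as 0, so "10" equals "10.0.0".
    if [ "${h:-0}" -gt "${w:-0}" ]; then return 0; fi
    if [ "${h:-0}" -lt "${w:-0}" ]; then return 1; fi
    case $have in *.*) have=${have#*.} ;; *) have= ;; esac
    case $want in *.*) want=${want#*.} ;; *) want= ;; esac
  done
  return 0
}

# Compare the installed tools against the minimums from the table.
for tool in node:22.16 npm:10; do
  name=${tool%%:*}
  min=${tool##*:}
  if command -v "$name" >/dev/null 2>&1; then
    if version_ge "$("$name" --version)" "$min"; then
      echo "$name: ok"
    else
      echo "$name: older than $min"
    fi
  else
    echo "$name: not found"
  fi
done
```

The comparison is numeric per component, so `22.9` is correctly treated as older than `22.16`.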

+On Linux, the installer can install Docker, start the Docker service, and add your user to the `docker` group.
+If the group change is not active in the current shell, the installer exits with `newgrp docker` guidance before it starts onboarding.

:::{warning} OpenShell Lifecycle
For NemoClaw-managed environments, use `nemoclaw onboard` when you need to create or recreate the OpenShell gateway or sandbox.
Avoid `openshell self-update`, `npm update -g openshell`, `openshell gateway start --recreate`, or `openshell sandbox create` directly unless you intend to manage OpenShell separately and then rerun `nemoclaw onboard`.
@@ -134,10 +134,11 @@ $ nemohermes my-hermes snapshot create --name before-change
```console
$ nemohermes my-hermes rebuild
```

-To change the active model or provider without rebuilding the sandbox, use the OpenShell inference route.
+To change the active model or provider without rebuilding the sandbox, use `nemohermes inference set`.
+It updates the OpenShell inference route and patches `/sandbox/.hermes/config.yaml` without restarting Hermes.

```console
-$ openshell inference set -g nemoclaw --model <model> --provider <provider>
+$ nemohermes inference set --model <model> --provider <provider>
```

To remove the sandbox when you are done, destroy it explicitly.