diff --git a/.agents/skills/nemoclaw-user-configure-inference/SKILL.md b/.agents/skills/nemoclaw-user-configure-inference/SKILL.md index a43c37a3d4..77e24a1bf6 100644 --- a/.agents/skills/nemoclaw-user-configure-inference/SKILL.md +++ b/.agents/skills/nemoclaw-user-configure-inference/SKILL.md @@ -53,6 +53,8 @@ When NemoClaw runs inside WSL, the provider menu can include Windows-host Ollama - **Install Ollama on Windows host** when Windows does not have Ollama installed. The install and restart paths set `OLLAMA_HOST=0.0.0.0:11434` on the Windows side so Docker and WSL can reach the daemon through `host.docker.internal`. +After an install or restart action, NemoClaw relaunches Ollama from the detected Windows tray app or verified `ollama.exe` path and waits until `host.docker.internal:11434` responds. +If the daemon does not become reachable, onboarding prints PowerShell commands you can run to inspect the Windows-side process and port state. Use one Ollama instance on port `11434` at a time. If both WSL and Windows-host Ollama are running, pick the intended menu entry during onboarding so NemoClaw validates and pulls models against the right daemon. @@ -273,11 +275,14 @@ $ NEMOCLAW_EXPERIMENTAL=1 nemoclaw onboard Select **Local NVIDIA NIM [experimental]** from the provider list. NemoClaw filters available models by GPU VRAM, pulls the NIM container image, starts it, and waits for it to become healthy before continuing. +On hosts with mixed NVIDIA GPU models, the preflight summary shows each detected GPU model and the total VRAM so you can confirm which device class the model selection used. NIM container images are hosted on `nvcr.io` and require NGC registry authentication before `docker pull` succeeds. If Docker is not already logged in to `nvcr.io`, onboard prompts for an [NGC API key](https://org.ngc.nvidia.com/setup/api-key) and runs `docker login nvcr.io` over `--password-stdin` so the key is never written to disk or shell history. 
The prompt masks the key during input and retries once on a bad key before failing. In non-interactive mode, onboard exits with login instructions if Docker is not already authenticated; run `docker login nvcr.io` yourself, then re-run `nemoclaw onboard --non-interactive`. +If `NGC_API_KEY` or `NVIDIA_API_KEY` is already exported, NemoClaw passes it into the managed NIM container through the process environment instead of command-line arguments. +If the NIM container exits before the health endpoint becomes ready, onboarding stops early and prints the last container log lines. > **Note:** NIM uses vLLM internally. > The same `chat/completions` API path restriction applies. @@ -326,10 +331,10 @@ Refer to Switch Inference Models (use the `nemoclaw-user-configure-inference` sk For compatible endpoints, the command is: ```console -$ openshell inference set --provider compatible-endpoint --model +$ nemoclaw inference set --provider compatible-endpoint --model ``` -If the provider itself needs to change (for example, switching from vLLM to a cloud API), rerun `nemoclaw onboard`. +If the provider itself needs to change (for example, switching from vLLM to a cloud API), pass the new provider to `nemoclaw inference set`. 
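The environment pass-through that this hunk describes for `NGC_API_KEY` and `NVIDIA_API_KEY` can be illustrated with a minimal Python sketch. It is not NemoClaw's implementation: the helper name `nim_run_command` and the image name are hypothetical. It assumes only the documented Docker behavior that `docker run -e NAME` (with no `=value`) copies the variable from the calling process's environment, so the secret value never appears in the argument list.

```python
import os


def nim_run_command(image, environ=None, secret_names=("NGC_API_KEY", "NVIDIA_API_KEY")):
    """Build a `docker run` argv plus the environment to launch it with.

    Hypothetical helper: secrets are referenced by name only, so their
    values travel through the environment rather than the command line.
    """
    environ = dict(os.environ if environ is None else environ)
    argv = ["docker", "run", "--rm"]
    for name in secret_names:
        if environ.get(name):
            # Name only, no "=value": Docker reads the value from its own environment.
            argv += ["-e", name]
    argv.append(image)
    return argv, environ


if __name__ == "__main__":
    argv, env = nim_run_command(
        "nvcr.io/example/nim:latest",  # hypothetical image name
        environ={"NGC_API_KEY": "example-key", "PATH": "/usr/bin"},
    )
    print(argv)
```

A caller would hand both values to something like `subprocess.run(argv, env=env)`. A process listing then shows only the variable name, not the key.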
## References diff --git a/.agents/skills/nemoclaw-user-configure-inference/references/inference-options.md b/.agents/skills/nemoclaw-user-configure-inference/references/inference-options.md index 4a1591824d..72b83328d4 100644 --- a/.agents/skills/nemoclaw-user-configure-inference/references/inference-options.md +++ b/.agents/skills/nemoclaw-user-configure-inference/references/inference-options.md @@ -28,6 +28,7 @@ NemoClaw uses provider-specific local tokens for those routes, and rebuilds of l | Anthropic | Tested | Native Anthropic | Uses anthropic-messages | | Other Anthropic-compatible endpoint | Tested | Custom Anthropic-compatible | For Claude proxies and compatible gateways | | Google Gemini | Tested | OpenAI-compatible | Uses Google's OpenAI-compatible endpoint | +| Hermes Provider | Hermes only | OpenAI-compatible route | Available when onboarding Hermes Agent through `nemohermes` | | Local Ollama | Caveated | Local Ollama API | Available when Ollama is installed or running on the host | | Local NVIDIA NIM | Experimental | Local OpenAI-compatible | Requires `NEMOCLAW_EXPERIMENTAL=1` and a NIM-capable GPU | | Local vLLM | Experimental | Local OpenAI-compatible | Requires `NEMOCLAW_EXPERIMENTAL=1` and a server already running on `localhost:8000` | @@ -48,6 +49,7 @@ Experimental local vLLM appears when you opt in and NemoClaw detects either a ru | Anthropic | Routes to the Anthropic Messages API. Set `ANTHROPIC_API_KEY`. | `claude-sonnet-4-6`, `claude-haiku-4-5`, `claude-opus-4-6` | | Other Anthropic-compatible endpoint | Routes to any server that implements the Anthropic Messages API (`/v1/messages`). The wizard prompts for a base URL and model name. Set `COMPATIBLE_ANTHROPIC_API_KEY`. | You provide the model name. | | Google Gemini | Routes to Google's OpenAI-compatible endpoint. NemoClaw prefers `/responses` only when the endpoint proves it can handle tool calling in a way OpenClaw uses; otherwise it falls back to `/chat/completions`. Set `GEMINI_API_KEY`. 
| `gemini-3.1-pro-preview`, `gemini-3.1-flash-lite-preview`, `gemini-3-flash-preview`, `gemini-2.5-pro`, `gemini-2.5-flash`, `gemini-2.5-flash-lite` | +| Hermes Provider | Routes Hermes Agent through the host OpenShell provider registered by NemoClaw when onboarding Hermes Agent. | Curated Hermes Provider models such as `moonshotai/kimi-k2.6`, `openai/gpt-5.4-mini`, and `z-ai/glm-5.1`. | | Local Ollama | Routes to a local Ollama instance on `localhost:11434`. NemoClaw detects installed models, offers starter models if none are present, pulls and warms the selected model, and validates it. | Selected during onboarding. For more information, refer to Use a Local Inference Server (use the `nemoclaw-user-configure-inference` skill). | | Model Router | Starts a host-side router on port `4000`, registers it as an OpenAI-compatible provider, and keeps the sandbox pointed at `inference.local`. Set `NEMOCLAW_PROVIDER=routed` for non-interactive setup. | The router pool defines the model names. | diff --git a/.agents/skills/nemoclaw-user-configure-inference/references/switch-inference-providers.md b/.agents/skills/nemoclaw-user-configure-inference/references/switch-inference-providers.md index bb79ead02f..c3070247f2 100644 --- a/.agents/skills/nemoclaw-user-configure-inference/references/switch-inference-providers.md +++ b/.agents/skills/nemoclaw-user-configure-inference/references/switch-inference-providers.md @@ -12,32 +12,36 @@ No restart is required. ## Switch to a Different Model -Switching happens through the OpenShell inference route. -Use the provider and model that match the upstream you want to use. -This is one of the cases where a NemoClaw workflow intentionally uses `openshell`; see CLI Selection Guide (use the `nemoclaw-user-reference` skill) for the general boundary. +Use `nemoclaw inference set` with the provider and model that match the upstream you want to use. +The command updates the OpenShell inference route and synchronizes the running agent config. 
+For OpenClaw, it updates `agents.defaults.model.primary` and the matching provider namespace. +For Hermes, it updates `/sandbox/.hermes/config.yaml` (`model.default`, `model.base_url`, and `model.provider: custom`) without rebuilding or restarting Hermes. + +Pass `--sandbox ` when you do not want to use the default registered sandbox. +Under `nemohermes`, pass `--sandbox ` when more than one Hermes sandbox is registered. ### NVIDIA Endpoints ```console -$ openshell inference set --provider nvidia-prod --model nvidia/nemotron-3-super-120b-a12b +$ nemoclaw inference set --provider nvidia-prod --model nvidia/nemotron-3-super-120b-a12b ``` ### OpenAI ```console -$ openshell inference set --provider openai-api --model gpt-5.4 +$ nemoclaw inference set --provider openai-api --model gpt-5.4 ``` ### Anthropic ```console -$ openshell inference set --provider anthropic-prod --model claude-sonnet-4-6 +$ nemoclaw inference set --provider anthropic-prod --model claude-sonnet-4-6 ``` ### Google Gemini ```console -$ openshell inference set --provider gemini-api --model gemini-2.5-flash +$ nemoclaw inference set --provider gemini-api --model gemini-2.5-flash ``` ### Compatible Endpoints @@ -45,14 +49,20 @@ $ openshell inference set --provider gemini-api --model gemini-2.5-flash If you onboarded a custom compatible endpoint, switch models with the provider created for that endpoint: ```console -$ openshell inference set --provider compatible-endpoint --model +$ nemoclaw inference set --provider compatible-endpoint --model ``` ```console -$ openshell inference set --provider compatible-anthropic-endpoint --model +$ nemoclaw inference set --provider compatible-anthropic-endpoint --model ``` -If the provider itself needs to change, rerun `nemoclaw onboard`. 
+### Hermes Provider + +For a NemoClaw-managed Hermes sandbox, use the Hermes alias with the registered Hermes Provider route: + +```console +$ nemohermes inference set --provider hermes-provider --model openai/gpt-5.4-mini +``` #### Switching from Responses API to Chat Completions @@ -87,28 +97,14 @@ $ NEMOCLAW_PREFERRED_API=openai-responses nemoclaw onboard ## Cross-Provider Switching -Switching to a different provider family (for example, from NVIDIA Endpoints to Anthropic) requires updating both the gateway route and the sandbox config. - -Set the gateway route on the host: +Switching to a different provider family (for example, from NVIDIA Endpoints to Anthropic) also uses `nemoclaw inference set`. +The command updates both the gateway route and the OpenClaw provider namespace in the running sandbox config. ```console -$ openshell inference set --provider anthropic-prod --model claude-sonnet-4-6 --no-verify +$ nemoclaw inference set --provider anthropic-prod --model claude-sonnet-4-6 --no-verify ``` -Then set the override env vars and recreate the sandbox so they take effect at startup: - -```console -$ export NEMOCLAW_MODEL_OVERRIDE="anthropic/claude-sonnet-4-6" -$ export NEMOCLAW_INFERENCE_API_OVERRIDE="anthropic-messages" -$ nemoclaw onboard --resume --recreate-sandbox -``` - -The entrypoint patches `openclaw.json` at container startup with the override values. -You do not need to rebuild the image. -Remove the env vars and recreate the sandbox to revert to the original model. - -`NEMOCLAW_INFERENCE_API_OVERRIDE` accepts `openai-completions` (for NVIDIA, OpenAI, Gemini, compatible endpoints) or `anthropic-messages` (for Anthropic and Anthropic-compatible endpoints). -This variable is only needed when switching between provider families. +Use `--no-verify` only when OpenShell cannot verify the provider at switch time but you have already confirmed the provider and credential. 
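For scripted switches, the `nemoclaw inference set` invocations shown in this file can be assembled programmatically. The sketch below is a hypothetical helper (`inference_set_argv` is not part of NemoClaw). It uses only the flags that appear in this diff (`--provider`, `--model`, `--sandbox`, `--no-verify`) and the two documented entry points, `nemoclaw` and `nemohermes`. It builds the argument list and does not run the CLI.

```python
def inference_set_argv(provider, model, sandbox=None, no_verify=False, cli="nemoclaw"):
    """Return the argv for an `inference set` call (hypothetical helper)."""
    if cli not in ("nemoclaw", "nemohermes"):
        raise ValueError(f"unexpected CLI name: {cli!r}")
    argv = [cli, "inference", "set", "--provider", provider, "--model", model]
    if sandbox:
        # Target a specific registered sandbox instead of the default one.
        argv += ["--sandbox", sandbox]
    if no_verify:
        # Only for providers and credentials that were already confirmed out of band.
        argv.append("--no-verify")
    return argv


if __name__ == "__main__":
    print(" ".join(inference_set_argv("anthropic-prod", "claude-sonnet-4-6", no_verify=True)))
```

Keeping the arguments as a list rather than a shell string avoids quoting problems when the list is passed to `subprocess.run`.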
## Tune Model Metadata @@ -179,9 +175,8 @@ The output includes the active provider, model, and endpoint. - The host keeps provider credentials. - The sandbox continues to use `inference.local`. -- Same-provider model switches take effect immediately via the gateway route alone. -- Cross-provider switches also require `NEMOCLAW_MODEL_OVERRIDE` (and `NEMOCLAW_INFERENCE_API_OVERRIDE`) plus a sandbox recreate so the entrypoint patches the config at startup. -- Overrides are applied at container startup. Changing or removing env vars requires a sandbox recreate to take effect. +- `nemoclaw inference set` patches the selected running OpenClaw or Hermes sandbox config and recomputes its config hash. +- Use `nemoclaw onboard --resume --recreate-sandbox` for build-time settings such as context window, max tokens, reasoning mode, heartbeat cadence, or image contents. - Local Ollama and local vLLM routes use local provider tokens rather than `OPENAI_API_KEY`. Rebuilds of older local-inference sandboxes clear the stale OpenAI credential requirement automatically. ## Related Topics diff --git a/.agents/skills/nemoclaw-user-configure-security/references/best-practices.md b/.agents/skills/nemoclaw-user-configure-security/references/best-practices.md index ae4aa6e80b..b2d7e6a4b2 100644 --- a/.agents/skills/nemoclaw-user-configure-security/references/best-practices.md +++ b/.agents/skills/nemoclaw-user-configure-security/references/best-practices.md @@ -99,7 +99,7 @@ flowchart TB * - Inference - Credential exposure, unauthorized model access, cost overruns. - OpenShell gateway - - Yes. Use `openshell inference set`. + - Yes. Use `nemoclaw inference set`. ::: @@ -263,19 +263,21 @@ See the [Process Controls](https://docs.nvidia.com/openshell/latest/security/bes The entrypoint drops dangerous Linux capabilities from the bounding set at startup using `capsh`. This limits what capabilities any child process (gateway, sandbox, agent) can ever acquire. 
+When the entrypoint switches from root to the `sandbox` and `gateway` users, it uses `setpriv` when available to remove the remaining privilege-separation capabilities from the child process at the same time as the user change. -The entrypoint drops these capabilities: `cap_net_raw`, `cap_dac_override`, `cap_sys_chroot`, `cap_fsetid`, `cap_setfcap`, `cap_mknod`, `cap_audit_write`, `cap_net_bind_service`. -The entrypoint keeps these because it needs them for privilege separation using gosu: `cap_chown`, `cap_setuid`, `cap_setgid`, `cap_fowner`, `cap_kill`. +The initial entrypoint drop removes `cap_sys_admin`, `cap_sys_ptrace`, `cap_net_raw`, `cap_dac_override`, `cap_sys_chroot`, `cap_fsetid`, `cap_setfcap`, `cap_mknod`, `cap_audit_write`, and `cap_net_bind_service`. +During `setpriv` step-down, the child process also loses `cap_setuid`, `cap_setgid`, `cap_fowner`, `cap_chown`, and `cap_kill`. This is best-effort: if `capsh` is not available or `CAP_SETPCAP` is not in the bounding set, the entrypoint logs a warning and continues with the default capability set. +If `setpriv` is unavailable, the entrypoint falls back to `gosu` and logs a warning that the remaining bounding-set capabilities were retained for the child process. For additional protection, pass `--cap-drop=ALL` with `docker run` or Compose (see Sandbox Hardening (use the `nemoclaw-user-deploy-remote` skill)). | Aspect | Detail | |---|---| -| Default | The entrypoint drops dangerous capabilities at startup using `capsh`. Best-effort. | +| Default | The entrypoint drops dangerous capabilities at startup using `capsh`, then uses `setpriv` during user step-down when possible. Best-effort. | | What you can change | When launching with `docker run` directly, pass `--cap-drop=ALL --cap-add=NET_BIND_SERVICE` for stricter enforcement. In the standard NemoClaw flow (with `nemoclaw onboard`), the entrypoint handles capability dropping automatically. 
| -| Risk if relaxed | `CAP_NET_RAW` allows raw socket access for network sniffing. `CAP_DAC_OVERRIDE` bypasses filesystem permission checks. Attackers can use `CAP_SYS_CHROOT` in container escape chains. If `capsh` is unavailable, the container runs with the default Docker capability set. | -| Recommendation | Run on an image that includes `capsh` (the NemoClaw image includes it through `libcap2-bin`). For defense-in-depth, also pass `--cap-drop=ALL` at the container runtime level. | +| Risk if relaxed | `CAP_SYS_ADMIN` and `CAP_SYS_PTRACE` expand kernel and process attack surface. `CAP_NET_RAW` allows raw socket access for network sniffing. `CAP_DAC_OVERRIDE` bypasses filesystem permission checks. If `capsh` or `setpriv` cannot run, the container retains more of the runtime-provided capability set. | +| Recommendation | Run on an image that includes `capsh` and `setpriv` (the NemoClaw image includes them). For defense-in-depth, also pass `--cap-drop=ALL` at the container runtime level. | ### Gateway Process Isolation diff --git a/.agents/skills/nemoclaw-user-deploy-remote/references/sandbox-hardening.md b/.agents/skills/nemoclaw-user-deploy-remote/references/sandbox-hardening.md index cbd4ba15e2..caf1873860 100644 --- a/.agents/skills/nemoclaw-user-deploy-remote/references/sandbox-hardening.md +++ b/.agents/skills/nemoclaw-user-deploy-remote/references/sandbox-hardening.md @@ -31,8 +31,17 @@ Adjust the value via the `--ulimit nproc=512:512` flag if launching with ## Dropping Linux Capabilities -When running the sandbox container, drop all Linux capabilities and re-add only -what is strictly required: +The NemoClaw entrypoint drops dangerous capabilities from the process bounding +set before it starts agent services. +It removes `CAP_SYS_ADMIN`, `CAP_SYS_PTRACE`, `CAP_NET_RAW`, +`CAP_DAC_OVERRIDE`, `CAP_SYS_CHROOT`, `CAP_FSETID`, `CAP_SETFCAP`, +`CAP_MKNOD`, `CAP_AUDIT_WRITE`, and `CAP_NET_BIND_SERVICE`. 
+When `setpriv` is available, the entrypoint also removes the remaining +privilege-separation capabilities during the switch from root to the +`sandbox` and `gateway` users. + +For defense-in-depth, also drop all Linux capabilities at the container runtime +when you launch the image directly: ```console $ docker run --rm \ diff --git a/.agents/skills/nemoclaw-user-get-started/SKILL.md b/.agents/skills/nemoclaw-user-get-started/SKILL.md index 1691d6d882..dd9424b7c3 100644 --- a/.agents/skills/nemoclaw-user-get-started/SKILL.md +++ b/.agents/skills/nemoclaw-user-get-started/SKILL.md @@ -33,6 +33,23 @@ $ curl -fsSL https://www.nvidia.com/nemoclaw.sh | NEMOCLAW_NON_INTERACTIVE=1 NEM If you use nvm or fnm to manage Node.js, the installer might not update your current shell's PATH. If `nemoclaw` is not found after install, run `source ~/.bashrc` (or `source ~/.zshrc` for zsh) or open a new terminal. +On Linux, the installer checks Docker before it installs NemoClaw. +If Docker is missing, the installer downloads the official Docker convenience script, asks for `sudo`, installs Docker, and starts the Docker service when systemd is available. +If Docker is installed but your current shell cannot use the Docker socket yet, the installer adds your user to the `docker` group when needed and exits with a recovery command. + +```console +$ newgrp docker +$ curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash +``` + +On DGX Spark and DGX Station, an interactive installer can offer express install after you accept the third-party software notice. +Express install switches onboarding to non-interactive mode, applies the suggested security policy, and selects the managed local inference path for that platform. +Set `NEMOCLAW_NO_EXPRESS=1` to skip the express prompt, or set `NEMOCLAW_PROVIDER` before launching the installer when you want to choose a provider yourself. + +The installer auto-launches `nemoclaw onboard` when it can locate the freshly-installed binary. 
+If it cannot locate the binary, or if blocking host preflight checks fail, it does not launch the wizard automatically. +In that case, the installer prints the relevant diagnostics and a `To finish setup, run:` block with the explicit `nemoclaw onboard` command. + > **Note:** The onboard flow builds the sandbox image with `NEMOCLAW_DISABLE_DEVICE_AUTH=1` so the dashboard is immediately usable during setup. > This is a build-time setting baked into the sandbox image, not a runtime knob. > If you export `NEMOCLAW_DISABLE_DEVICE_AUTH` after onboarding finishes, it has no effect on an existing sandbox. diff --git a/.agents/skills/nemoclaw-user-get-started/references/prerequisites.md b/.agents/skills/nemoclaw-user-get-started/references/prerequisites.md index afd09e2dd6..7083f12d9c 100644 --- a/.agents/skills/nemoclaw-user-get-started/references/prerequisites.md +++ b/.agents/skills/nemoclaw-user-get-started/references/prerequisites.md @@ -20,8 +20,12 @@ The sandbox image is approximately 2.4 GB compressed. During image push, the Doc |------------|----------------------------------| | Node.js | 22.16 or later | | npm | 10 or later | +| Docker | Docker Engine, Docker Desktop, or Colima on a tested platform | | Platform | See [Platforms](#platforms) below | +On Linux, the installer can install Docker, start the Docker service, and add your user to the `docker` group. +If the group change is not active in the current shell, the installer exits with `newgrp docker` guidance before it starts onboarding. + :::{warning} OpenShell Lifecycle For NemoClaw-managed environments, use `nemoclaw onboard` when you need to create or recreate the OpenShell gateway or sandbox. Avoid `openshell self-update`, `npm update -g openshell`, `openshell gateway start --recreate`, or `openshell sandbox create` directly unless you intend to manage OpenShell separately and then rerun `nemoclaw onboard`. 
diff --git a/.agents/skills/nemoclaw-user-get-started/references/quickstart-hermes.md b/.agents/skills/nemoclaw-user-get-started/references/quickstart-hermes.md index 6272976989..92825af90e 100644 --- a/.agents/skills/nemoclaw-user-get-started/references/quickstart-hermes.md +++ b/.agents/skills/nemoclaw-user-get-started/references/quickstart-hermes.md @@ -134,10 +134,11 @@ $ nemohermes my-hermes snapshot create --name before-change $ nemohermes my-hermes rebuild ``` -To change the active model or provider without rebuilding the sandbox, use the OpenShell inference route. +To change the active model or provider without rebuilding the sandbox, use `nemohermes inference set`. +It updates the OpenShell inference route and patches `/sandbox/.hermes/config.yaml` without restarting Hermes. ```console -$ openshell inference set -g nemoclaw --model --provider +$ nemohermes inference set --model --provider ``` To remove the sandbox when you are done, destroy it explicitly. diff --git a/.agents/skills/nemoclaw-user-manage-sandboxes/SKILL.md b/.agents/skills/nemoclaw-user-manage-sandboxes/SKILL.md index 4e34fb5248..fd397154b5 100644 --- a/.agents/skills/nemoclaw-user-manage-sandboxes/SKILL.md +++ b/.agents/skills/nemoclaw-user-manage-sandboxes/SKILL.md @@ -121,7 +121,7 @@ Recover from a misconfigured sandbox without re-running the full onboard wizard Change the active model or provider at runtime without rebuilding the sandbox: ```console -$ openshell inference set -g nemoclaw --model --provider +$ nemoclaw inference set --model --provider ``` Refer to Switch Inference Providers (use the `nemoclaw-user-configure-inference` skill) for provider-specific model IDs and API compatibility notes. 
@@ -197,11 +197,14 @@ The upgrade flow is non-destructive by default because NemoClaw preserves manife ```console $ nemoclaw snapshot create --name pre-upgrade # optional, recommended -$ curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash # updates CLI; auto-upgrades stale running sandboxes +$ nemoclaw update --yes # updates CLI through the maintained installer flow $ nemoclaw upgrade-sandboxes --check # verify or list remaining stale/unknown sandboxes $ nemoclaw upgrade-sandboxes # manually rebuild remaining stale running sandboxes ``` +`nemoclaw update` is the CLI wrapper around the same installer path as `curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash`. +Use `nemoclaw update --check` when you only want to inspect version state and see the maintained update command. + For scripted manual rebuilds, use `nemoclaw upgrade-sandboxes --auto` to skip the confirmation prompt. If the upgraded sandbox needs its workspace state reverted, restore the pre-upgrade snapshot into the running sandbox. diff --git a/.agents/skills/nemoclaw-user-overview/references/release-notes.md b/.agents/skills/nemoclaw-user-overview/references/release-notes.md index 6b6e2116f7..c3e0fc008d 100644 --- a/.agents/skills/nemoclaw-user-overview/references/release-notes.md +++ b/.agents/skills/nemoclaw-user-overview/references/release-notes.md @@ -13,7 +13,25 @@ NVIDIA NemoClaw is available in early preview starting March 16, 2026. Use the f ## Behavior Changes -### v0.0.38 Reliability updates +### v0.0.39 Release Prep Updates + +NemoClaw v0.0.39 improves several day-two workflows: + +- The installer checks Docker earlier on Linux, can install and start Docker when needed, and stops with `newgrp docker` guidance when the current shell has not picked up the `docker` group yet. +- DGX Spark and DGX Station users can accept an express install prompt that preselects the local inference path and suggested policy defaults. 
+- NemoClaw now creates GPU-capable OpenShell Docker sandboxes by default when an NVIDIA GPU is available, with explicit `--sandbox-gpu`, `--no-sandbox-gpu`, and `--sandbox-gpu-device` controls. +- `nemohermes` supports Hermes Provider onboarding and runtime model switches through `nemohermes inference set`. +- `nemoclaw hosts-add`, `hosts-list`, and `hosts-remove` manage sandbox host aliases for LAN-only services. +- `nemoclaw update` checks and runs the maintained installer flow, while `nemoclaw upgrade-sandboxes` remains responsible for rebuilding existing sandboxes. +- `nemoclaw destroy` preserves the shared gateway by default unless `--cleanup-gateway` is selected. +- `nemoclaw connect` repairs stale `inference.local` DNS proxy routes before opening the session. +- Windows-host Ollama onboarding relaunches the daemon with the reachable binding after install or restart. +- Local NVIDIA NIM onboarding passes `NGC_API_KEY` or `NVIDIA_API_KEY` into the managed container without putting the secret in process arguments, detects early container exits during health checks, and prints a per-GPU preflight breakdown on mixed-model hosts. +- The sandbox startup path strips additional Linux capabilities before and during privilege step-down. +- OpenClaw workspace template files are seeded when bootstrap is skipped and the workspace is still empty. +- Kimi K2.6 and related NVIDIA-hosted chat-completions paths include model-specific compatibility handling for reasoning output. + +### v0.0.38 Reliability Updates NemoClaw v0.0.38 improves several day-two workflows: @@ -24,7 +42,7 @@ NemoClaw v0.0.38 improves several day-two workflows: - Rebuild backups tolerate partial archive output when usable data was produced, then report only the manifest-defined paths that could not be archived. - NemoHermes uninstall output uses NemoHermes-specific help, progress, and completion text. 
-### v0.0.34 — Installer requires explicit acceptance in non-TTY environments +### v0.0.34 Installer Requires Explicit Acceptance in Non-TTY Environments Starting with NemoClaw v0.0.34, the `curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash` installer pipeline no longer auto-accepts the third-party software notice when stdin is piped and `/dev/tty` is unavailable (for example, deeply detached SSH sessions or some container shells). In environments without a TTY, accept upfront in the pipe: diff --git a/.agents/skills/nemoclaw-user-reference/references/cli-selection-guide.md b/.agents/skills/nemoclaw-user-reference/references/cli-selection-guide.md index a87615c6b0..3d17231a80 100644 --- a/.agents/skills/nemoclaw-user-reference/references/cli-selection-guide.md +++ b/.agents/skills/nemoclaw-user-reference/references/cli-selection-guide.md @@ -79,10 +79,9 @@ Use `openshell` when the docs explicitly call for a live OpenShell gateway opera $ openshell term ``` -- Change the live gateway inference route: +- Inspect the live gateway inference route: ```console - $ openshell inference set -g nemoclaw --provider --model $ openshell inference get -g nemoclaw ``` @@ -162,13 +161,19 @@ Approved endpoints are session-scoped unless you also add them to the policy thr ### Change Models or Providers -For a same-provider model switch, change the live OpenShell inference route: +Use the NemoClaw command for model or provider switches so the OpenShell route and the running agent config stay consistent: ```console -$ openshell inference set -g nemoclaw --provider nvidia-prod --model nvidia/nemotron-3-super-120b-a12b +$ nemoclaw inference set --provider nvidia-prod --model nvidia/nemotron-3-super-120b-a12b ``` -For a provider-family change or a build-time OpenClaw setting change, rerun onboarding so the sandbox configuration is recreated consistently: +For Hermes sandboxes, use the alias; it updates the route and `/sandbox/.hermes/config.yaml` without a rebuild or restart: + 
+```console +$ nemohermes inference set --provider hermes-provider --model openai/gpt-5.4-mini +``` + +For a build-time agent setting change, rerun onboarding so the sandbox configuration is recreated consistently: ```console $ nemoclaw onboard --resume --recreate-sandbox diff --git a/.agents/skills/nemoclaw-user-reference/references/commands.md b/.agents/skills/nemoclaw-user-reference/references/commands.md index 3c294729f3..a8e169a8de 100644 --- a/.agents/skills/nemoclaw-user-reference/references/commands.md +++ b/.agents/skills/nemoclaw-user-reference/references/commands.md @@ -45,7 +45,7 @@ The wizard creates an OpenShell gateway, registers inference providers, builds t Use this command for new installs and for recreating a sandbox after changes to policy or configuration. ```console -$ nemoclaw onboard [--non-interactive] [--resume | --fresh] [--recreate-sandbox] [--gpu | --no-gpu] [--from ] [--name ] [--agent ] [--control-ui-port ] [--yes-i-accept-third-party-software] +$ nemoclaw onboard [--non-interactive] [--resume | --fresh] [--recreate-sandbox] [--gpu | --no-gpu] [--from ] [--name ] [--sandbox-gpu | --no-sandbox-gpu] [--sandbox-gpu-device ] [--agent ] [--control-ui-port ] [--yes | -y] [--yes-i-accept-third-party-software] ``` > **Warning:** For NemoClaw-managed environments, use `nemoclaw onboard` when you need to create or recreate the OpenShell gateway or sandbox. @@ -229,6 +229,8 @@ $ nemoclaw onboard --from ./Dockerfile.custom When `nemoclaw onboard` detects an NVIDIA GPU on the host (`nvidia-smi` succeeds), it enables OpenShell GPU passthrough at both the gateway and sandbox level by default. Use `--no-gpu` to opt out when you want host-side inference providers only and do not need direct GPU access inside the sandbox. Use `--gpu` to require GPU passthrough and fail fast if an NVIDIA GPU is not detected. +Use `--sandbox-gpu` or `--no-sandbox-gpu` to control only direct NVIDIA GPU access inside the sandbox. 
+Use `--sandbox-gpu-device ` to pass a specific OpenShell GPU device selector to `openshell sandbox create`. Prerequisites: @@ -246,7 +248,7 @@ Sandboxes with an active SSH session are marked with a `●` indicator so you ca When a sandbox has a recorded dashboard port, the output includes its local dashboard URL. ```console -$ nemoclaw list +$ nemoclaw list [--json] $ nemoclaw list --json ``` @@ -281,9 +283,12 @@ After a host reboot, the OpenShell gateway rotates its SSH host keys. You no longer need to re-run `nemoclaw onboard` after a reboot in this case. ```console -$ nemoclaw my-assistant connect +$ nemoclaw my-assistant connect [--probe-only] ``` +The `--probe-only` flag verifies the sandbox is reachable over SSH and exits without opening a shell. +Use it for health checks and scripted readiness probes. + ### `nemoclaw recover` Restart the in-sandbox gateway and re-establish the host-side dashboard port-forward without opening an SSH session. @@ -406,9 +411,11 @@ For Ollama-backed sandboxes, `destroy` also asks Ollama to unload currently load If another terminal has an active SSH session to the sandbox, `destroy` prints an active-session warning and requires a second confirmation before it proceeds. Pass `--yes`, `-y`, or `--force` to skip the prompt in scripted workflows. +By default, `destroy` preserves the shared NemoClaw gateway. +Pass `--cleanup-gateway` to remove the shared gateway when destroying the last sandbox, or `--no-cleanup-gateway` to force preservation when environment defaults request cleanup. ```console -$ nemoclaw my-assistant destroy +$ nemoclaw my-assistant destroy [--yes|-y|--force] [--cleanup-gateway|--no-cleanup-gateway] ``` ### `nemoclaw policy-add` @@ -494,6 +501,41 @@ If the preset is unknown or not currently applied, the command exits non-zero wi Unchecking a preset in the onboard TUI checkbox also removes it from the sandbox. +### `nemoclaw hosts-add` + +Add a host alias to the sandbox pod template. 
+Use this when a sandbox needs a stable LAN-only name, such as a local SearXNG or internal model endpoint, without dropping to `docker exec` and `kubectl patch`. + +```console +$ nemoclaw my-assistant hosts-add searxng.local 192.168.1.105 +``` + +The command validates the hostname and IP address, rejects duplicate hostnames, and patches `spec.podTemplate.spec.hostAliases` on the sandbox resource. + +| Flag | Description | +|------|-------------| +| `--dry-run` | Print the JSON patch for the resulting `hostAliases` list without applying it | + +### `nemoclaw hosts-list` + +List host aliases configured on the sandbox resource. + +```console +$ nemoclaw my-assistant hosts-list +``` + +### `nemoclaw hosts-remove` + +Remove a hostname from the sandbox `hostAliases` list. + +```console +$ nemoclaw my-assistant hosts-remove searxng.local +``` + +| Flag | Description | +|------|-------------| +| `--dry-run` | Print the JSON patch for the resulting `hostAliases` list without applying it | + ### `nemoclaw channels list` List the messaging channels NemoClaw knows about (`telegram`, `discord`, `slack`) with a short description. @@ -610,6 +652,28 @@ If an archive command reports partial output while still producing usable data, If any required state path still cannot be backed up, `rebuild` exits before destroying the original sandbox. After restore, the command runs `openclaw doctor --fix` for cross-version structure repair. +### `nemoclaw update` + +Check for a NemoClaw CLI update and, when requested, run the maintained installer flow. 
+This command is a discoverable CLI wrapper around the supported installer path: + +```console +$ curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash +``` + +```console +$ nemoclaw update [--check] [--yes|-y] +``` + +| Flag | Description | +|------|-------------| +| `--check` | Show the current version, latest maintained version, install type, and maintained update command without changing anything | +| `--yes`, `-y` | Skip the confirmation prompt and run the maintained installer flow | + +`nemoclaw update` updates the host-side NemoClaw installation. +It does not replace `nemoclaw upgrade-sandboxes`; use that command to inspect or rebuild existing sandboxes after the CLI has been updated. +When the command is running from a source checkout, it reports that state and does not replace the checkout with a global package install. + ### `nemoclaw upgrade-sandboxes` Rebuild sandboxes whose base image is older than the one currently pinned by NemoClaw. @@ -806,12 +870,28 @@ $ nemoclaw status $ nemoclaw status --json ``` +### `nemoclaw inference set` + +Switch the active inference provider or model for a NemoClaw-managed OpenClaw or Hermes sandbox. +The command updates the OpenShell gateway route, patches the selected running agent config so it matches the route, recomputes the config hash, and updates the NemoClaw registry. +For Hermes, the patch updates `/sandbox/.hermes/config.yaml` (`model.default`, `model.base_url`, and `model.provider: custom`) and does not rebuild or restart the gateway. + +By default, the command syncs the default registered sandbox. +Under the `nemohermes` alias, it uses the registered Hermes sandbox when exactly one exists; otherwise pass `--sandbox ` to target one explicitly. 
+ +```console +$ nemoclaw inference set --provider --model [--sandbox ] [--no-verify] +``` + +Supported provider names are `nvidia-prod`, `nvidia-nim`, `nvidia-router`, `openai-api`, `anthropic-prod`, `compatible-anthropic-endpoint`, `gemini-api`, `compatible-endpoint`, `hermes-provider`, `ollama-local`, and `vllm-local`. +Use `--no-verify` only when OpenShell cannot verify the provider at switch time but you have already confirmed the provider and credential. + ### `nemoclaw setup` > **Warning:** The `nemoclaw setup` command is deprecated. > Use `nemoclaw onboard` instead. -This command remains as a compatibility alias to `nemoclaw onboard`. +This command remains as a compatibility alias to `nemoclaw onboard` and accepts the same flags: `--non-interactive`, `--resume`, `--fresh`, `--recreate-sandbox`, `--gpu` / `--no-gpu`, `--from`, `--name`, `--sandbox-gpu` / `--no-sandbox-gpu`, `--sandbox-gpu-device`, `--agent`, `--control-ui-port`, `--yes` / `-y`, `--yes-i-accept-third-party-software`. ```console $ nemoclaw setup @@ -822,7 +902,7 @@ $ nemoclaw setup > **Warning:** The `nemoclaw setup-spark` command is deprecated. > Use the standard installer and run `nemoclaw onboard` instead, because current OpenShell releases handle the older DGX Spark cgroup behavior. -This command remains as a compatibility alias to `nemoclaw onboard`. +This command remains as a compatibility alias to `nemoclaw onboard` and accepts the same flags: `--non-interactive`, `--resume`, `--fresh`, `--recreate-sandbox`, `--gpu` / `--no-gpu`, `--from`, `--name`, `--sandbox-gpu` / `--no-sandbox-gpu`, `--sandbox-gpu-device`, `--agent`, `--control-ui-port`, `--yes` / `-y`, `--yes-i-accept-third-party-software`. 
```console $ nemoclaw setup-spark @@ -901,9 +981,10 @@ For Local Ollama setups, uninstall also stops matching Ollama auth proxy process | `--yes` | Skip the confirmation prompt | | `--keep-openshell` | Leave the `openshell` binary installed | | `--delete-models` | Also remove NemoClaw-pulled Ollama models | +| `--gateway ` | Override the gateway name to remove (default: `nemoclaw`) | ```console -$ nemoclaw uninstall [--yes] [--keep-openshell] [--delete-models] +$ nemoclaw uninstall [--yes] [--keep-openshell] [--delete-models] [--gateway ] ``` #### `nemoclaw uninstall` vs. the hosted `uninstall.sh` @@ -922,7 +1003,7 @@ Use the hosted `curl … | bash` form only when the CLI is broken or already par ## Environment Variables -NemoClaw reads the following environment variables to configure service ports. +NemoClaw reads the following environment variables to configure service ports, onboarding behavior, and lifecycle defaults. Set them before running `nemoclaw onboard` or any command that starts services. All ports must be non-privileged integers between 1024 and 65535. @@ -948,9 +1029,67 @@ Defaults are unchanged when no variable is set. If `NEMOCLAW_DASHBOARD_PORT` or the port from `CHAT_UI_URL` is already occupied by another sandbox, onboarding scans `18789` through `18799` and uses the next free dashboard port. Pass `--control-ui-port ` to require a specific port. -### Onboard timeouts +### Onboarding Configuration -The following environment variables tune onboard-time wall-clock limits. Set them before running `nemoclaw onboard` if a slow connection or large model pull risks tripping the default. +These variables let you tune onboarding without editing the Dockerfile or passing repeated flags. +Set them before running `nemoclaw onboard`. + +| Variable | Format | Effect | +|----------|--------|--------| +| `NEMOCLAW_PROVIDER` | provider key (e.g. 
`nvidia`, `openai`, `anthropic`, `ollama`, `vllm`, `compatible`) | Selects the inference provider in non-interactive onboarding. Must match one of the keys the wizard would prompt for. | +| `NEMOCLAW_HERMES_AUTH_METHOD` | `oauth` | Selects Hermes Provider authentication in non-interactive onboarding. Valid values: `oauth`, `nous-portal-oauth`, `api-key`, `nous-api-key`. | +| `NEMOCLAW_HERMES_AUTH` | same as `NEMOCLAW_HERMES_AUTH_METHOD` | Back-compatible alias for Hermes Provider authentication selection. | +| `NEMOCLAW_NOUS_AUTH_METHOD` | same as `NEMOCLAW_HERMES_AUTH_METHOD` | Nous-specific alias for Hermes Provider authentication selection. | +| `NEMOCLAW_ENDPOINT_URL` | URL | Custom OpenAI-compatible endpoint URL. Used together with `NEMOCLAW_PROVIDER=compatible`. | +| `NEMOCLAW_PREFERRED_API` | `completions` (currently the only honored value) | Forces the validation probe to use the `/v1/chat/completions` API path instead of the newer `/v1/responses` API. | +| `NEMOCLAW_INFERENCE_INPUTS` | comma-separated list of `text` and/or `image` | Declares model input modalities for vision-capable models. Validated strictly; unknown tokens are ignored. | +| `NEMOCLAW_AGENT_TIMEOUT` | positive integer (seconds) | Overrides `agents.defaults.timeoutSeconds` in the built OpenClaw config. Raise for slow inference. | +| `NEMOCLAW_CONTEXT_WINDOW` | positive integer (tokens) | Overrides the model's context-window value in the built OpenClaw config. | +| `NEMOCLAW_MAX_TOKENS` | positive integer (tokens) | Overrides the model's `maxTokens` in the built OpenClaw config. | +| `NEMOCLAW_REASONING` | `true` or `false` | Overrides the model's reasoning-mode flag in the built OpenClaw config. | +| `NEMOCLAW_AGENT_HEARTBEAT_EVERY` | duration with `s`, `m`, or `h` suffix (for example `30m`, `1h`, or `0m`) | Overrides `agents.defaults.heartbeat.every` in the built OpenClaw config. Set `0m` to disable periodic agent turns. 
| +| `NEMOCLAW_OLLAMA_REQUIRE_TOOLS` | `0` to disable, anything else to keep the default | When set to `0`, skips the Ollama tool-calling capability check during local-inference onboarding. | +| `NEMOCLAW_PROXY_HOST` | hostname or IP | Overrides the sandbox-side outbound HTTP proxy host. Defaults to `10.200.0.1`. | +| `NEMOCLAW_PROXY_PORT` | integer port | Overrides the sandbox-side outbound HTTP proxy port. Defaults to `3128`. | +| `NEMOCLAW_OPENSHELL_BIN` | path | Overrides the `openshell` binary the CLI invokes. Defaults to `openshell` (resolved via `PATH`). | +| `NEMOCLAW_SANDBOX` | sandbox name | Alternate spelling of `NEMOCLAW_SANDBOX_NAME`; used by `services` and `debug` lookups when neither a flag nor `NEMOCLAW_SANDBOX_NAME` is set. | +| `NEMOCLAW_INSTALL_REF` | git ref | For internal installer commands: the git ref to install from. Overridden by the `--install-ref` flag. | +| `NEMOCLAW_INSTALL_TAG` | release tag | For internal installer commands: the release tag to install. Overridden by the `--install-tag` flag. | + +### Onboarding Behavior Flags + +These flags toggle optional behaviors during onboarding; set them before running `nemoclaw onboard`. + +| Variable | Format | Effect | +|----------|--------|--------| +| `NEMOCLAW_YES` | `1` to enable | Auto-accepts confirmation prompts (`--yes` equivalent) including in helpers like the Ollama proxy auth setup. | +| `NEMOCLAW_NO_EXPRESS` | `1` to enable | Installer-only. Skips the DGX Spark and DGX Station express install prompt and continues with the normal interactive onboarding flow. | +| `NEMOCLAW_EXPERIMENTAL` | `1` to enable | Surfaces experimental providers and flows in onboarding. | +| `NEMOCLAW_IGNORE_RUNTIME_RESOURCES` | `1` to enable | Suppresses the under-provisioned runtime warning during preflight. Use only when you know the sandbox host meets the minimums. | +| `NEMOCLAW_DISABLE_OVERLAY_FIX` | `1` to enable | Skips the Docker overlay-fix step during sandbox build. 
For environments where the fix is incompatible. | +| `NEMOCLAW_OVERLAY_SNAPSHOTTER` | snapshotter name | Selects the containerd overlay snapshotter for sandbox builds. Empty (default) preserves containerd's choice. | +| `NEMOCLAW_SKIP_TELEGRAM_REACHABILITY` | `1` to enable | Skips the Telegram bot reachability probe during onboard (useful in restricted networks). | +| `NEMOCLAW_CONFIG_ACCEPT_NEW_PATH` | `1` to enable | Accepts a new sandbox config path without an interactive prompt when the stored path differs from the discovered one. | +| `NEMOCLAW_SANDBOX_GPU` | `auto`, `1`, or `0` | Controls sandbox GPU passthrough during onboarding. `auto` enables GPU passthrough when an NVIDIA GPU is detected, `1` requires GPU passthrough, and `0` forces CPU-only sandbox creation. | +| `NEMOCLAW_SANDBOX_GPU_DEVICE` | OpenShell GPU device selector | Selects the GPU device passed with `openshell sandbox create --gpu-device`. Setting this value enables sandbox GPU passthrough unless `NEMOCLAW_SANDBOX_GPU=0` is also set, which is rejected. | +| `NEMOCLAW_OPENSHELL_GATEWAY_BIN` | path | Advanced override for the `openshell-gateway` binary used by the Linux Docker-driver gateway. Defaults to the binary next to `openshell`, then common install paths. | +| `NEMOCLAW_OPENSHELL_SANDBOX_BIN` | path | Advanced override for the `openshell-sandbox` binary passed to the Linux Docker-driver gateway supervisor. Defaults to the binary next to `openshell`, then common install paths. | +| `NEMOCLAW_OPENSHELL_GATEWAY_STATE_DIR` | path | Advanced override for the Linux Docker-driver gateway pid file and SQLite state directory. Defaults to `~/.local/state/nemoclaw/openshell-docker-gateway`. | + +### Probe Timeouts + +These tune how long internal probes wait before giving up. +Defaults are sized for typical hardware; override only if you see false-positive timeouts. 
+ +| Variable | Default | Effect | +|----------|---------|--------| +| `NEMOCLAW_SANDBOX_EXEC_TIMEOUT_MS` | per call site (typically `15000`) | Overrides the default timeout for `openshell sandbox exec` calls issued by recovery and lifecycle helpers. Integer milliseconds; non-positive or non-numeric values fall back to the per-call-site default. | +| `NEMOCLAW_STATUS_PROBE_TIMEOUT_MS` | built-in default | Overrides the timeout for the OpenShell status probe used by `nemoclaw status`. Integer milliseconds; non-positive or non-numeric values fall back to the default. | + +### Onboard Timeouts + +The following environment variables tune onboard-time wall-clock limits. +Set them before running `nemoclaw onboard` if a slow connection or large model pull risks tripping the default. | Variable | Default | Purpose | |----------|---------|---------| @@ -969,6 +1108,7 @@ These flags change defaults for commands that manage existing sandboxes. | Variable | Format | Effect | |----------|--------|--------| +| `NEMOCLAW_CLEANUP_GATEWAY` | `1`, `true`, or `yes` to enable; `0`, `false`, or `no` to disable | Sets the default for whether `nemoclaw destroy` removes the shared gateway when destroying the last sandbox. Command-line `--cleanup-gateway` and `--no-cleanup-gateway` still take precedence. | | `NEMOCLAW_DISABLE_INFERENCE_ROUTE_REPAIR` | `1` to enable | Skips the automatic DNS-proxy repair for stale `inference.local` routes during `nemoclaw connect` and `nemoclaw connect --probe-only`. Use only as a troubleshooting escape hatch. 
| ## NemoHermes Alias diff --git a/.agents/skills/nemoclaw-user-reference/references/troubleshooting.md b/.agents/skills/nemoclaw-user-reference/references/troubleshooting.md index e571054471..88bbf57a74 100644 --- a/.agents/skills/nemoclaw-user-reference/references/troubleshooting.md +++ b/.agents/skills/nemoclaw-user-reference/references/troubleshooting.md @@ -63,6 +63,7 @@ On macOS with Docker Desktop, open the Docker Desktop application and wait for i ### Docker permission denied on Linux On Linux, if the Docker daemon is running but you see "permission denied" errors, your user may not be in the `docker` group. +The installer can add your user to the group, but Linux does not activate that membership in the current shell automatically. Add your user and activate the group in the current shell: ```console @@ -71,6 +72,12 @@ $ newgrp docker ``` Then retry `nemoclaw onboard`. +If the installer stopped after printing `newgrp docker`, run that command and then re-run the installer: + +```console +$ newgrp docker +$ curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash +``` ### macOS first-run failures @@ -999,3 +1006,81 @@ For additional troubleshooting, see the Quickstart (use the `nemoclaw-user-get-s Podman is not a tested runtime. OpenShell officially documents Docker-based runtimes only. If you encounter issues with Podman, switch to a tested runtime (Docker Engine, Docker Desktop, or Colima) and rerun onboarding. + +## Brev + +For Brev setup instructions, refer to Brev Web UI (use the `nemoclaw-user-deploy-remote` skill). + +### Most OpenClaw skills show as blocked + +After deploying NemoClaw on Brev, the Skills page in the OpenClaw gateway dashboard shows most bundled skills with a `blocked` status. +Only three skills are available by default: `healthcheck`, `skill-creator`, and `weather`. + +Skills are blocked for one of three reasons. 
+ +- The skill requires a macOS-only binary (`memo`, `remindctl`, `grizzly`, and similar) that is not available on the Linux (GCP) instance Brev provisions. +- The skill requires a CLI binary that is not pre-installed in the sandbox image, such as `gh` for the GitHub skill. +- The skill requires API credentials that have not been configured, such as a Notion API key or Discord bot token. + +Skills that require macOS-only binaries cannot be enabled on Brev. +Skills that require additional CLI binaries require a custom sandbox image rebuild. + +For credentials, use the supported host-side setup flow. Re-run onboarding for inference or Brave Search credentials, or use `nemoclaw channels add ` for messaging channels. +To add a binary to the sandbox image, update the sandbox `Dockerfile.base` to install the required package, then rebuild: + +```console +$ nemoclaw rebuild +``` + +After the rebuild completes, return to the Skills page to confirm the skill status has changed from `blocked` to `ready`. + +### `openclaw config set` fails with a permission error on Brev + +When the sandbox config has been locked from the host, `openclaw.json` is owned by root and mounted read-only inside the sandbox. +Running `openclaw config set` inside the sandbox then returns: + +```text +EACCES: permission denied, open '/sandbox/.openclaw/openclaw.json' +``` + +In the default sandbox state, `openclaw.json` is writable by the sandbox user. +If you see this error, use the host-side config command instead: + +```console +$ nemoclaw config set --key --value '' --restart +``` + +Refer to Commands (use the `nemoclaw-user-reference` skill) for the full list of supported configuration keys. + +### OpenClaw dashboard is unreachable after extended uptime on Brev + +After leaving NemoClaw running for an extended period on Brev, the OpenClaw dashboard may return `ERR_CONNECTION_RESET` or fail to load in the browser. 
+The agent may still respond on messaging channels such as Telegram or Slack while the dashboard is unreachable. + +> **Back up your workspace first:** Take a snapshot before running onboard to protect your workspace files. +> +> ```console +> $ nemoclaw snapshot create +> ``` + +Re-run onboarding to restore dashboard connectivity: + +```console +$ nemoclaw onboard +``` + +Depending on current sandbox state, onboarding may prompt before recreating resources. + +### Skill install buttons do not work on Brev + +Clicking **Install** on a skill in the OpenClaw gateway dashboard on Brev shows no response or fails silently. + +Skill installation runs against the sandbox environment. +Installing packages on the Brev host does not make them available inside the sandbox. +To install a skill dependency, add it to the sandbox image and rebuild: + +```console +$ nemoclaw rebuild +``` + +After the rebuild completes, return to the Skills page to confirm the skill is ready. diff --git a/docs/about/release-notes.md b/docs/about/release-notes.md index 81d7e275a9..0273ce3ef9 100644 --- a/docs/about/release-notes.md +++ b/docs/about/release-notes.md @@ -22,18 +22,27 @@ status: published # Release Notes -NVIDIA NemoClaw is available in early preview starting March 16, 2026. Use the following GitHub resources to track changes. +NVIDIA NemoClaw is available in early preview starting March 16, 2026. Use this page to track changes. -| Resource | Description | -|---|---| -| [Releases](https://github.com/NVIDIA/NemoClaw/releases) | Versioned release notes and downloadable assets. | -| [Release comparison](https://github.com/NVIDIA/NemoClaw/compare) | Diff between any two tags or branches. | -| [Merged pull requests](https://github.com/NVIDIA/NemoClaw/pulls?q=is%3Apr+is%3Amerged) | Individual changes with review discussion. | -| [Commit history](https://github.com/NVIDIA/NemoClaw/commits/main) | Full commit log on `main`. 
| +## v0.0.39 -## Behavior Changes +NemoClaw v0.0.39 improves several day-two workflows: -### v0.0.38 Reliability Updates +- The installer checks Docker earlier on Linux, can install and start Docker when needed, and stops with `newgrp docker` guidance when the current shell has not picked up the `docker` group yet. +- DGX Spark and DGX Station users can accept an express install prompt that preselects the local inference path and suggested policy defaults. +- NemoClaw now creates GPU-capable OpenShell Docker sandboxes by default when an NVIDIA GPU is available, with explicit `--sandbox-gpu`, `--no-sandbox-gpu`, and `--sandbox-gpu-device` controls. +- `nemohermes` supports Hermes Provider onboarding and runtime model switches through `nemohermes inference set`. +- `nemoclaw hosts-add`, `hosts-list`, and `hosts-remove` manage sandbox host aliases for LAN-only services. +- `nemoclaw update` checks and runs the maintained installer flow, while `nemoclaw upgrade-sandboxes` remains responsible for rebuilding existing sandboxes. +- `nemoclaw destroy` preserves the shared gateway by default unless `--cleanup-gateway` is selected. +- `nemoclaw connect` repairs stale `inference.local` DNS proxy routes before opening the session. +- Windows-host Ollama onboarding relaunches the daemon with the reachable binding after install or restart. +- Local NVIDIA NIM onboarding passes `NGC_API_KEY` or `NVIDIA_API_KEY` into the managed container without putting the secret in process arguments, detects early container exits during health checks, and prints a per-GPU preflight breakdown on mixed-model hosts. +- The sandbox startup path strips additional Linux capabilities before and during privilege step-down. +- OpenClaw workspace template files are seeded when bootstrap is skipped and the workspace is still empty. +- Kimi K2.6 and related NVIDIA-hosted chat-completions paths include model-specific compatibility handling for reasoning output. 
+ +## v0.0.38 NemoClaw v0.0.38 improves several day-two workflows: @@ -44,7 +53,7 @@ NemoClaw v0.0.38 improves several day-two workflows: - Rebuild backups tolerate partial archive output when usable data was produced, then report only the manifest-defined paths that could not be archived. - NemoHermes uninstall output uses NemoHermes-specific help, progress, and completion text. -### v0.0.34 Installer Requires Explicit Acceptance in Non-TTY Environments +## v0.0.34 Starting with NemoClaw v0.0.34, the `curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash` installer pipeline no longer auto-accepts the third-party software notice when stdin is piped and `/dev/tty` is unavailable (for example, deeply detached SSH sessions or some container shells). In environments without a TTY, accept upfront in the pipe: diff --git a/docs/deployment/sandbox-hardening.md b/docs/deployment/sandbox-hardening.md index f131f7c1e1..1497c6370e 100644 --- a/docs/deployment/sandbox-hardening.md +++ b/docs/deployment/sandbox-hardening.md @@ -51,8 +51,17 @@ Adjust the value via the `--ulimit nproc=512:512` flag if launching with ## Dropping Linux Capabilities -When running the sandbox container, drop all Linux capabilities and re-add only -what is strictly required: +The NemoClaw entrypoint drops dangerous capabilities from the process bounding +set before it starts agent services. +It removes `CAP_SYS_ADMIN`, `CAP_SYS_PTRACE`, `CAP_NET_RAW`, +`CAP_DAC_OVERRIDE`, `CAP_SYS_CHROOT`, `CAP_FSETID`, `CAP_SETFCAP`, +`CAP_MKNOD`, `CAP_AUDIT_WRITE`, and `CAP_NET_BIND_SERVICE`. +When `setpriv` is available, the entrypoint also removes the remaining +privilege-separation capabilities during the switch from root to the +`sandbox` and `gateway` users. 
+ +For defense-in-depth, also drop all Linux capabilities at the container runtime +when you launch the image directly: ```console $ docker run --rm \ diff --git a/docs/get-started/prerequisites.md b/docs/get-started/prerequisites.md index be5ac89f0c..1e74e2d8e2 100644 --- a/docs/get-started/prerequisites.md +++ b/docs/get-started/prerequisites.md @@ -40,8 +40,12 @@ The sandbox image is approximately 2.4 GB compressed. During image push, the Doc |------------|----------------------------------| | Node.js | 22.16 or later | | npm | 10 or later | +| Docker | Docker Engine, Docker Desktop, or Colima on a tested platform | | Platform | See [Platforms](#platforms) below | +On Linux, the installer can install Docker, start the Docker service, and add your user to the `docker` group. +If the group change is not active in the current shell, the installer exits with `newgrp docker` guidance before it starts onboarding. + :::{warning} OpenShell Lifecycle For NemoClaw-managed environments, use `nemoclaw onboard` when you need to create or recreate the OpenShell gateway or sandbox. Avoid `openshell self-update`, `npm update -g openshell`, `openshell gateway start --recreate`, or `openshell sandbox create` directly unless you intend to manage OpenShell separately and then rerun `nemoclaw onboard`. diff --git a/docs/get-started/quickstart.md b/docs/get-started/quickstart.md index 9672fe65d1..f47cc55b9a 100644 --- a/docs/get-started/quickstart.md +++ b/docs/get-started/quickstart.md @@ -53,6 +53,19 @@ $ curl -fsSL https://www.nvidia.com/nemoclaw.sh | NEMOCLAW_NON_INTERACTIVE=1 NEM If you use nvm or fnm to manage Node.js, the installer might not update your current shell's PATH. If `nemoclaw` is not found after install, run `source ~/.bashrc` (or `source ~/.zshrc` for zsh) or open a new terminal. +On Linux, the installer checks Docker before it installs NemoClaw. 
+If Docker is missing, the installer downloads the official Docker convenience script, asks for `sudo`, installs Docker, and starts the Docker service when systemd is available. +If Docker is installed but your current shell cannot use the Docker socket yet, the installer adds your user to the `docker` group when needed and exits with a recovery command. + +```console +$ newgrp docker +$ curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash +``` + +On DGX Spark and DGX Station, an interactive installer can offer express install after you accept the third-party software notice. +Express install switches onboarding to non-interactive mode, applies the suggested security policy, and selects the managed local inference path for that platform. +Set `NEMOCLAW_NO_EXPRESS=1` to skip the express prompt, or set `NEMOCLAW_PROVIDER` before launching the installer when you want to choose a provider yourself. + The installer auto-launches `nemoclaw onboard` when it can locate the freshly-installed binary. If it cannot locate the binary, or if blocking host preflight checks fail, it does not launch the wizard automatically. In that case, the installer prints the relevant diagnostics and a `To finish setup, run:` block with the explicit `nemoclaw onboard` command. diff --git a/docs/inference/use-local-inference.md b/docs/inference/use-local-inference.md index 6e93bac860..31ebb5bb62 100644 --- a/docs/inference/use-local-inference.md +++ b/docs/inference/use-local-inference.md @@ -70,6 +70,8 @@ When NemoClaw runs inside WSL, the provider menu can include Windows-host Ollama - **Install Ollama on Windows host** when Windows does not have Ollama installed. The install and restart paths set `OLLAMA_HOST=0.0.0.0:11434` on the Windows side so Docker and WSL can reach the daemon through `host.docker.internal`. 
+After an install or restart action, NemoClaw relaunches Ollama from the detected Windows tray app or verified `ollama.exe` path and waits until `host.docker.internal:11434` responds. +If the daemon does not become reachable, onboarding prints PowerShell commands you can run to inspect the Windows-side process and port state. Use one Ollama instance on port `11434` at a time. If both WSL and Windows-host Ollama are running, pick the intended menu entry during onboarding so NemoClaw validates and pulls models against the right daemon. @@ -292,11 +294,14 @@ $ NEMOCLAW_EXPERIMENTAL=1 nemoclaw onboard Select **Local NVIDIA NIM [experimental]** from the provider list. NemoClaw filters available models by GPU VRAM, pulls the NIM container image, starts it, and waits for it to become healthy before continuing. +On hosts with mixed NVIDIA GPU models, the preflight summary shows each detected GPU model and the total VRAM so you can confirm which device class the model selection used. NIM container images are hosted on `nvcr.io` and require NGC registry authentication before `docker pull` succeeds. If Docker is not already logged in to `nvcr.io`, onboard prompts for an [NGC API key](https://org.ngc.nvidia.com/setup/api-key) and runs `docker login nvcr.io` over `--password-stdin` so the key is never written to disk or shell history. The prompt masks the key during input and retries once on a bad key before failing. In non-interactive mode, onboard exits with login instructions if Docker is not already authenticated; run `docker login nvcr.io` yourself, then re-run `nemoclaw onboard --non-interactive`. +If `NGC_API_KEY` or `NVIDIA_API_KEY` is already exported, NemoClaw passes it into the managed NIM container through the process environment instead of command-line arguments. +If the NIM container exits before the health endpoint becomes ready, onboarding stops early and prints the last container log lines. :::{note} NIM uses vLLM internally. 
diff --git a/docs/project.json b/docs/project.json index 105e1a8bc7..0f18029178 100644 --- a/docs/project.json +++ b/docs/project.json @@ -1 +1 @@ -{"name": "nemoclaw", "version": "0.0.38"} +{"name": "nemoclaw", "version": "0.0.39"} diff --git a/docs/reference/commands.md b/docs/reference/commands.md index eb896286b0..9d6caf35ff 100644 --- a/docs/reference/commands.md +++ b/docs/reference/commands.md @@ -1099,6 +1099,7 @@ These flags toggle optional behaviors during onboarding; set them before running | Variable | Format | Effect | |----------|--------|--------| | `NEMOCLAW_YES` | `1` to enable | Auto-accepts confirmation prompts (`--yes` equivalent) including in helpers like the Ollama proxy auth setup. | +| `NEMOCLAW_NO_EXPRESS` | `1` to enable | Installer-only. Skips the DGX Spark and DGX Station express install prompt and continues with the normal interactive onboarding flow. | | `NEMOCLAW_EXPERIMENTAL` | `1` to enable | Surfaces experimental providers and flows in onboarding. | | `NEMOCLAW_IGNORE_RUNTIME_RESOURCES` | `1` to enable | Suppresses the under-provisioned runtime warning during preflight. Use only when you know the sandbox host meets the minimums. | | `NEMOCLAW_DISABLE_OVERLAY_FIX` | `1` to enable | Skips the Docker overlay-fix step during sandbox build. For environments where the fix is incompatible. | diff --git a/docs/reference/troubleshooting.md b/docs/reference/troubleshooting.md index ecb42f4f95..b3d7c68573 100644 --- a/docs/reference/troubleshooting.md +++ b/docs/reference/troubleshooting.md @@ -89,6 +89,7 @@ On macOS with Docker Desktop, open the Docker Desktop application and wait for i ### Docker permission denied on Linux On Linux, if the Docker daemon is running but you see "permission denied" errors, your user may not be in the `docker` group. +The installer can add your user to the group, but Linux does not activate that membership in the current shell automatically. 
Add your user and activate the group in the current shell: ```console @@ -97,6 +98,12 @@ $ newgrp docker ``` Then retry `nemoclaw onboard`. +If the installer stopped after printing `newgrp docker`, run that command and then re-run the installer: + +```console +$ newgrp docker +$ curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash +``` ### macOS first-run failures @@ -1057,16 +1064,16 @@ $ nemoclaw rebuild After the rebuild completes, return to the Skills page to confirm the skill status has changed from `blocked` to `ready`. -### `openclaw config set` fails with a permission error on Brev (Shields Up) +### `openclaw config set` fails with a permission error on Brev -When `nemoclaw shields up` has been run, `openclaw.json` is owned by root and mounted read-only inside the sandbox. +When the sandbox config has been locked from the host, `openclaw.json` is owned by root and mounted read-only inside the sandbox. Running `openclaw config set` inside the sandbox then returns: ```text EACCES: permission denied, open '/sandbox/.openclaw/openclaw.json' ``` -In the default sandbox state (before `shields up`), `openclaw.json` is writable by the sandbox user. +In the default sandbox state, `openclaw.json` is writable by the sandbox user. If you see this error, use the host-side config command instead: ```console diff --git a/docs/security/best-practices.md b/docs/security/best-practices.md index b67d4ea215..ce936e4d06 100644 --- a/docs/security/best-practices.md +++ b/docs/security/best-practices.md @@ -283,19 +283,21 @@ See the [Process Controls](https://docs.nvidia.com/openshell/latest/security/bes The entrypoint drops dangerous Linux capabilities from the bounding set at startup using `capsh`. This limits what capabilities any child process (gateway, sandbox, agent) can ever acquire. 
+When the entrypoint switches from root to the `sandbox` and `gateway` users, it uses `setpriv` when available to remove the remaining privilege-separation capabilities from the child process at the same time as the user change. -The entrypoint drops these capabilities: `cap_net_raw`, `cap_dac_override`, `cap_sys_chroot`, `cap_fsetid`, `cap_setfcap`, `cap_mknod`, `cap_audit_write`, `cap_net_bind_service`. -The entrypoint keeps these because it needs them for privilege separation using gosu: `cap_chown`, `cap_setuid`, `cap_setgid`, `cap_fowner`, `cap_kill`. +The initial entrypoint drop removes `cap_sys_admin`, `cap_sys_ptrace`, `cap_net_raw`, `cap_dac_override`, `cap_sys_chroot`, `cap_fsetid`, `cap_setfcap`, `cap_mknod`, `cap_audit_write`, and `cap_net_bind_service`. +During `setpriv` step-down, the child process also loses `cap_setuid`, `cap_setgid`, `cap_fowner`, `cap_chown`, and `cap_kill`. This is best-effort: if `capsh` is not available or `CAP_SETPCAP` is not in the bounding set, the entrypoint logs a warning and continues with the default capability set. +If `setpriv` is unavailable, the entrypoint falls back to `gosu` and logs a warning that the remaining bounding-set capabilities were retained for the child process. For additional protection, pass `--cap-drop=ALL` with `docker run` or Compose (see [Sandbox Hardening](../deployment/sandbox-hardening.md)). | Aspect | Detail | |---|---| -| Default | The entrypoint drops dangerous capabilities at startup using `capsh`. Best-effort. | +| Default | The entrypoint drops dangerous capabilities at startup using `capsh`, then uses `setpriv` during user step-down when possible. Best-effort. | | What you can change | When launching with `docker run` directly, pass `--cap-drop=ALL --cap-add=NET_BIND_SERVICE` for stricter enforcement. In the standard NemoClaw flow (with `nemoclaw onboard`), the entrypoint handles capability dropping automatically. 
| -| Risk if relaxed | `CAP_NET_RAW` allows raw socket access for network sniffing. `CAP_DAC_OVERRIDE` bypasses filesystem permission checks. Attackers can use `CAP_SYS_CHROOT` in container escape chains. If `capsh` is unavailable, the container runs with the default Docker capability set. | -| Recommendation | Run on an image that includes `capsh` (the NemoClaw image includes it through `libcap2-bin`). For defense-in-depth, also pass `--cap-drop=ALL` at the container runtime level. | +| Risk if relaxed | `CAP_SYS_ADMIN` and `CAP_SYS_PTRACE` expand kernel and process attack surface. `CAP_NET_RAW` allows raw socket access for network sniffing. `CAP_DAC_OVERRIDE` bypasses filesystem permission checks. If `capsh` or `setpriv` cannot run, the container retains more of the runtime-provided capability set. | +| Recommendation | Run on an image that includes `capsh` and `setpriv` (the NemoClaw image includes them). For defense-in-depth, also pass `--cap-drop=ALL` at the container runtime level. | ### Gateway Process Isolation diff --git a/docs/versions1.json b/docs/versions1.json index 753003454e..81d18a34f1 100644 --- a/docs/versions1.json +++ b/docs/versions1.json @@ -1,6 +1,10 @@ [ { "preferred": true, + "version": "0.0.39", + "url": "https://docs.nvidia.com/nemoclaw/0.0.39/" + }, + { "version": "0.0.38", "url": "https://docs.nvidia.com/nemoclaw/0.0.38/" },