chore: skills signing batch 4 #4465
Merged
120 changes: 12 additions & 108 deletions
.agents/skills/nemoclaw-user-deploy-remote/evals/evals.json
Large diffs are not rendered by default.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,70 @@ | ||
| # Evaluation Report | ||
|
|
||
| Evaluation of the `nemoclaw-user-deploy-remote` skill before publication through NVSkills-Eval. | ||
|
|
||
| This benchmark summarizes the three-tier NVSkills-Eval results for the skill. The goal is to document whether the skill is safe, discoverable, effective, and useful for agents before it is published for broader workflow use. | ||
|
|
||
| ## Evaluation Summary | ||
|
|
||
| - Skill: `nemoclaw-user-deploy-remote` | ||
| - Evaluation date: 2026-05-28 | ||
| - NVSkills-Eval profile: `external` | ||
| - Overall verdict: FAIL | ||
| - Tier 3 live agent evaluation: not available in this report | ||
|
|
||
| ## Agents Used | ||
|
|
||
| - Tier 3 agent details were not available in this report. | ||
|
|
||
| ## Metrics Used | ||
|
|
||
| Reported benchmark dimensions: | ||
|
|
||
| - Security: checks whether skill-assisted execution avoids unsafe behavior such as secret leakage, destructive commands, or unauthorized access. | ||
| - Correctness: checks whether the agent follows the expected workflow and produces the correct final output. | ||
| - Discoverability: checks whether the agent loads the skill when relevant and avoids using it when irrelevant. | ||
| - Effectiveness: checks whether the agent performs measurably better with the skill than without it. | ||
| - Efficiency: checks whether the agent uses fewer tokens and avoids redundant work. | ||
|
|
||
| Underlying evaluation signals used in this run: | ||
|
|
||
| - No Tier 3 evaluation signal details were available in this report. | ||
|
|
||
| ## Test Tasks | ||
|
|
||
| Tier 3 evaluation task details were not available in this report. | ||
|
|
||
| ## Results | ||
|
|
||
| Tier 3 dimension rollup was not available in this report. | ||
|
|
||
| ## Tier 1: Static Validation Summary | ||
|
|
||
| Tier 1 validation passed with observations. NVSkills-Eval ran 9 checks and found 13 total findings. | ||
|
|
||
| Top findings: | ||
|
|
||
| - MEDIUM QUALITY/quality_correctness: SKILL_SPEC recommended field missing: 'metadata.author' (`skills/nemoclaw-user-deploy-remote/SKILL.md`) | ||
| - MEDIUM QUALITY/quality_correctness: SKILL_SPEC recommended field missing: 'metadata.tags' (`skills/nemoclaw-user-deploy-remote/SKILL.md`) | ||
| - MEDIUM QUALITY/quality_efficiency: Deeply nested references in brev-web-ui.md (`skills/nemoclaw-user-deploy-remote/SKILL.md`) | ||
| - MEDIUM SCHEMA/body_recommended_section: Missing recommended section: '## Instructions' (`skills/nemoclaw-user-deploy-remote/SKILL.md`) | ||
| - MEDIUM SCHEMA/body_recommended_section: Missing recommended section: '## Examples' (`skills/nemoclaw-user-deploy-remote/SKILL.md`) | ||
|
|
||
| ## Tier 2: Deduplication Summary | ||
|
|
||
| Tier 2 validation reported findings. NVSkills-Eval ran 2 checks and found 2 total findings. | ||
|
|
||
| Top findings: | ||
|
|
||
| - HIGH DUPLICATE/duplicate: Duplicate content found within references/install-openclaw-plugins.md: | ||
| "## Network Access" in references/install-openclaw-plugins.md (lines 64-73) | ||
| vs "## Next Steps" in references/install-openclaw-plugins.md (lines 86-93) (`references/install-openclaw-plugins.md:64`) | ||
| - HIGH DUPLICATE/duplicate: Duplicate content found across SKILL.md and references/brev-web-ui.md and references/install-openclaw-plugins.md and references/sandbox-hardening.md: | ||
| "(preamble)" in SKILL.md (lines 1-3) | ||
| vs "(preamble)" in references/brev-web-ui.md (lines 1-2) | ||
| vs "(preamble)" in references/install-openclaw-plugins.md (lines 1-2) | ||
| vs "(preamble)" in references/sandbox-hardening.md (lines 1-2) (`SKILL.md:1`) | ||
|
|
||
| ## Publication Recommendation | ||
|
|
||
| The skill should be reviewed before NVSkills-Eval publication. Skill owners should address the findings above and rerun NVSkills-Eval to refresh this benchmark. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,177 @@ | ||
| --- | ||
| name: "nemoclaw-user-deploy-remote" | ||
| description: "Explains how to run NemoClaw on a remote GPU instance, including the deprecated Brev compatibility path and the preferred installer plus onboard flow. Use when deploying NemoClaw to a remote VM, onboarding a Brev instance, or migrating away from the legacy `nemoclaw deploy` wrapper. Trigger keywords - deploy nemoclaw remote gpu, nemoclaw brev cloud deployment, nemoclaw plugins, openclaw plugins, install openclaw plugin, nemoclaw onboard from dockerfile, nemoclaw brev web ui, nemoclaw getting started, brev quickstart, nvidia nemotron agent, nemoclaw sandbox hardening, container security, docker capabilities, process limits." | ||
| license: "Apache-2.0" | ||
| --- | ||
|
|
||
| <!-- SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. --> | ||
| <!-- SPDX-License-Identifier: Apache-2.0 --> | ||
|
|
||
| # Deploy NemoClaw to a Remote GPU Instance | ||
|
|
||
| ## Gotchas | ||
|
|
||
| - The `nemoclaw deploy` command is deprecated. | ||
| - On Brev, set `CHAT_UI_URL` in the launchable environment configuration so it is available when the installer builds the sandbox image. | ||
|
|
||
| ## Prerequisites | ||
|
|
||
| - The [Brev CLI](https://brev.nvidia.com) installed and authenticated. | ||
| - A provider credential for the inference backend you want to use during onboarding. | ||
| - `HF_TOKEN` or `HUGGING_FACE_HUB_TOKEN` exported when your remote vLLM or Hugging Face workflow needs access to gated models. | ||
| - NemoClaw installed locally if you plan to use the deprecated `nemoclaw deploy` wrapper. Otherwise, install NemoClaw directly on the remote host after provisioning it. | ||
|
|
||
| Run NemoClaw on a remote GPU instance through [Brev](https://brev.nvidia.com). | ||
| The preferred path is to provision the VM, run the standard NemoClaw installer on that host, and then run `nemoclaw onboard`. | ||
|
|
||
| ## Quick Start | ||
|
|
||
| If your Brev instance is already up and has already been onboarded with a sandbox, start with the standard sandbox chat flow: | ||
|
|
||
| ```console | ||
| $ nemoclaw my-assistant connect | ||
| $ openclaw tui | ||
| ``` | ||
|
|
||
| The first command opens a shell inside the sandbox, and the second starts the OpenClaw chat UI from that shell. | ||
| If the VM is fresh, run the standard installer on that host and then run `nemoclaw onboard` before trying `nemoclaw my-assistant connect`. | ||
|
|
||
| If you are connecting from your local machine and still need to provision the remote VM, you can use `nemoclaw deploy <instance-name>`, the legacy compatibility path described below. | ||
|
|
||
| ## Deploy the Instance | ||
|
|
||
| **Warning:** | ||
|
|
||
| The `nemoclaw deploy` command is deprecated. | ||
| Prefer provisioning the remote host separately, then running the standard NemoClaw installer and `nemoclaw onboard` on that host. | ||
|
|
||
| Create a Brev instance and run the legacy compatibility flow: | ||
|
|
||
| ```console | ||
| $ nemoclaw deploy <instance-name> | ||
| ``` | ||
|
|
||
| Replace `<instance-name>` with a name for your remote instance, for example `my-gpu-box`. | ||
| The sandbox created on the remote VM uses `NEMOCLAW_SANDBOX_NAME`, or `my-assistant` when the variable is unset. | ||
| Sandbox names must be lowercase, start with a letter, contain only letters, numbers, and internal hyphens, and end with a letter or number. | ||
| The deploy wrapper validates the sandbox name before it provisions the Brev instance, opens SSH, or starts the remote installer. | ||
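The naming rules above can be checked locally before any provisioning starts. The following is a minimal sketch in POSIX shell, not the wrapper's actual validation code: `is_valid_sandbox_name` is a hypothetical helper, and it assumes that repeated internal hyphens are allowed and that there is no length limit, since the text states neither.

```shell
# Hypothetical pre-flight helper; not part of the NemoClaw CLI.
# Returns 0 when the name satisfies the rules stated above, 1 otherwise.
is_valid_sandbox_name() (
  LC_ALL=C  # keep the a-z and 0-9 ranges byte-based regardless of locale
  case "$1" in
    # empty, bad first character, disallowed character, or trailing hyphen
    '' | [!a-z]* | *[!a-z0-9-]* | *-) exit 1 ;;
    *) exit 0 ;;
  esac
)

# Check the name the sandbox would get, using the documented default.
name="${NEMOCLAW_SANDBOX_NAME:-my-assistant}"
is_valid_sandbox_name "$name" || echo "invalid sandbox name: $name" >&2
```

The function body runs in a subshell, so the `LC_ALL` override does not leak into the calling shell.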
|
|
||
| The legacy compatibility flow performs the following steps on the VM: | ||
|
|
||
| 1. Installs Docker and the NVIDIA Container Toolkit if a GPU is present. | ||
| 2. Installs the OpenShell CLI. | ||
| 3. Runs `nemoclaw onboard` (the setup wizard) to create the gateway, register providers, and launch the sandbox. | ||
| 4. Starts optional host auxiliary services (for example the cloudflared tunnel) when `cloudflared` is available. Channel messaging is configured during onboarding and runs through OpenShell-managed processes, not through `nemoclaw tunnel start`. | ||
|
|
||
| By default, the compatibility wrapper asks Brev to provision on `gcp`. Override this with `NEMOCLAW_BREV_PROVIDER` if you need a different Brev cloud provider. | ||
| If you export `HF_TOKEN` or `HUGGING_FACE_HUB_TOKEN`, the wrapper forwards those values to the VM so remote setup can pull gated Hugging Face model repositories. | ||
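Because the wrapper only forwards a token that is present in your environment, a quick local check can catch a missing token before the remote setup tries to pull a gated model. This is a sketch with a hypothetical helper name, `hf_token_is_set`. It only tests that one of the two variables is non-empty in the current shell. It does not confirm that the variable is exported or that the token is valid.

```shell
# Hypothetical helper; not part of the NemoClaw CLI.
# Succeeds when either Hugging Face token variable is non-empty.
hf_token_is_set() {
  [ -n "${HF_TOKEN:-}" ] || [ -n "${HUGGING_FACE_HUB_TOKEN:-}" ]
}

# Warn early instead of failing later during the remote model pull.
hf_token_is_set || echo "warning: no Hugging Face token set; gated models will not be accessible" >&2
```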
|
|
||
| ## Connect to the Remote Sandbox | ||
|
|
||
| After deployment finishes, the deploy command opens an interactive shell inside the remote sandbox. | ||
| To reconnect after closing the session, run the command again: | ||
|
|
||
| ```console | ||
| $ nemoclaw deploy <instance-name> | ||
| ``` | ||
|
|
||
| ## Monitor the Remote Sandbox | ||
|
|
||
| SSH to the instance and run the OpenShell TUI to monitor activity and approve network requests: | ||
|
|
||
| ```console | ||
| $ ssh <instance-name> 'cd ~/nemoclaw && set -a && . .env && set +a && openshell term' | ||
| ``` | ||
|
|
||
| ## Verify Inference | ||
|
|
||
| Run a test agent prompt inside the remote sandbox: | ||
|
|
||
| ```console | ||
| $ openclaw agent --agent main -m "Hello from the remote sandbox" --session-id test | ||
| ``` | ||
|
|
||
| ## Remote Dashboard Access | ||
|
|
||
| The NemoClaw dashboard validates the browser origin against an allowlist baked | ||
| into the sandbox image at build time. By default the allowlist only contains | ||
| `http://127.0.0.1:18789`. When accessing the dashboard from a remote browser | ||
| (for example through a Brev public URL or an SSH port-forward), set | ||
| `CHAT_UI_URL` to the origin the browser will use **before** running setup: | ||
|
|
||
| ```console | ||
| $ export CHAT_UI_URL="https://openclaw0-<id>.brevlab.com" | ||
| $ nemoclaw deploy <instance-name> | ||
| ``` | ||
|
|
||
| For SSH port-forwarding, the origin is typically `http://127.0.0.1:18789` (the | ||
| default), so no extra configuration is needed. | ||
|
|
||
| **Warning:** | ||
|
|
||
| On Brev, set `CHAT_UI_URL` in the launchable environment configuration so it is | ||
| available when the installer builds the sandbox image. If `CHAT_UI_URL` is not | ||
| set on a headless host, the compatibility wrapper prints a warning. | ||
|
|
||
| `NEMOCLAW_DISABLE_DEVICE_AUTH` is also evaluated at image build time. | ||
| When `CHAT_UI_URL` points at a non-loopback origin, NemoClaw disables OpenClaw device pairing in the generated sandbox configuration because browser-only remote users cannot complete terminal-based pairing. | ||
| Any device that can reach the configured dashboard origin can connect without pairing, so avoid exposing that origin on internet-reachable or shared-network deployments. | ||
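Since pairing is dropped whenever `CHAT_UI_URL` is a non-loopback origin, it can help to classify the value before building the image. The sketch below is an approximation under stated assumptions: `is_loopback_origin` is a hypothetical helper, and treating `127.0.0.1`, `localhost`, and `[::1]` as the loopback hosts is a guess at the rule, not NemoClaw's actual check.

```shell
# Hypothetical helper; the loopback host list is an assumption.
# Succeeds when the origin's host is a loopback address.
is_loopback_origin() (
  rest=${1#*://}        # drop the scheme, e.g. "http://"
  hostport=${rest%%/*}  # drop any path after host[:port]
  case "$hostport" in
    127.0.0.1 | 127.0.0.1:* | localhost | localhost:* | "[::1]" | "[::1]":*) exit 0 ;;
    *) exit 1 ;;
  esac
)

# Default to the documented allowlist entry when CHAT_UI_URL is unset.
origin="${CHAT_UI_URL:-http://127.0.0.1:18789}"
is_loopback_origin "$origin" || echo "note: $origin is not loopback; device pairing will be disabled" >&2
```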
|
|
||
| ## First-Run Readiness Budget | ||
|
|
||
| On a remote GPU host, the first `nemoclaw onboard` typically does the slowest work of the lifecycle: the sandbox image is built locally and uploaded into the OpenShell gateway, which can stream hundreds of MiB over the VM's link before the readiness wait even starts. | ||
| The post-create readiness wait defaults to 180 seconds (`NEMOCLAW_SANDBOX_READY_TIMEOUT`), which is sized for warm-cache, workstation-class onboarding and can be exceeded on: | ||
|
|
||
| - DGX Station first runs with large quantised models (70B+ parameter footprints, NVFP4 weights). | ||
| - Cloud VMs where the local image-build cache is cold and the upload runs over the public network. | ||
| - Hosts onboarding the Brave Web Search preset on the first run (the egress policy stack adds boot work). | ||
|
|
||
| Raise the budget before re-running onboard: | ||
|
|
||
| ```console | ||
| $ export NEMOCLAW_SANDBOX_READY_TIMEOUT=600 | ||
| $ nemoclaw onboard | ||
| ``` | ||
|
|
||
| If onboard ends with `Sandbox '<name>' was created but did not become ready within 180s`, onboard deletes the partially-created sandbox first, so the next attempt with the raised budget starts from a clean state. | ||
| For the inference-probe budget that runs earlier in onboarding, see `NEMOCLAW_LOCAL_INFERENCE_TIMEOUT` (use the `nemoclaw-user-configure-inference` skill). | ||
|
|
||
| ## Proxy Configuration | ||
|
|
||
| NemoClaw routes sandbox traffic through a gateway proxy that defaults to `10.200.0.1:3128`. | ||
| If your network requires a different proxy, set `NEMOCLAW_PROXY_HOST` and `NEMOCLAW_PROXY_PORT` before onboarding: | ||
|
|
||
| ```console | ||
| $ export NEMOCLAW_PROXY_HOST=proxy.example.com | ||
| $ export NEMOCLAW_PROXY_PORT=8080 | ||
| $ nemoclaw onboard | ||
| ``` | ||
|
|
||
| These values are baked into the sandbox image at build time. | ||
| They are also forwarded into the runtime container during sandbox creation, so `/tmp/nemoclaw-proxy-env.sh` uses the same host and port that the image build used. | ||
| Only alphanumeric characters, dots, hyphens, and colons are accepted for the host. | ||
| The port must be numeric (0-65535). | ||
| Changing the proxy after onboarding requires re-running `nemoclaw onboard`. | ||
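Because a rejected proxy value means re-running onboarding, the stated host and port constraints can be checked up front. The helpers below are a hypothetical sketch of those constraints, not NemoClaw's own validation. They assume that an empty host or port is rejected, which the text does not state.

```shell
# Hypothetical helpers; not part of the NemoClaw CLI.
# Host: only alphanumeric characters, dots, hyphens, and colons.
is_valid_proxy_host() (
  LC_ALL=C
  case "$1" in
    '' | *[!A-Za-z0-9.:-]*) exit 1 ;;
    *) exit 0 ;;
  esac
)

# Port: digits only, within 0-65535.
is_valid_proxy_port() (
  LC_ALL=C
  case "$1" in
    '' | *[!0-9]*) exit 1 ;;
  esac
  # Cap the length first so the numeric comparison stays in range.
  [ "${#1}" -le 5 ] && [ "$1" -le 65535 ]
)

# Check the values that onboarding would use, with the documented defaults.
is_valid_proxy_host "${NEMOCLAW_PROXY_HOST:-10.200.0.1}" || echo "invalid proxy host" >&2
is_valid_proxy_port "${NEMOCLAW_PROXY_PORT:-3128}" || echo "invalid proxy port" >&2
```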
|
|
||
| ## GPU Configuration | ||
|
|
||
| The deploy script uses the `NEMOCLAW_GPU` environment variable to select the GPU type. | ||
| The default value is `a2-highgpu-1g:nvidia-tesla-a100:1`. | ||
| Set this variable before running `nemoclaw deploy` to use a different GPU configuration: | ||
|
|
||
| ```console | ||
| $ export NEMOCLAW_GPU="a2-highgpu-1g:nvidia-tesla-a100:2" | ||
| $ nemoclaw deploy <instance-name> | ||
| ``` | ||
|
|
||
| ## References | ||
|
|
||
| - **Load [references/install-openclaw-plugins.md](references/install-openclaw-plugins.md)** when users ask how to install, build, or configure OpenClaw plugins under NemoClaw. Explains the difference between OpenClaw plugins and agent skills, and shows the current Dockerfile-based workflow for baking a plugin into a NemoClaw sandbox. | ||
| - **Load [references/brev-web-ui.md](references/brev-web-ui.md)** when a user wants to try NemoClaw without installing the CLI, or asks how to get started on Brev. Guides users through deploying NemoClaw with the Brev web UI. | ||
| - **Load [references/sandbox-hardening.md](references/sandbox-hardening.md)** when reviewing sandbox image security controls, auditing capability drops, or looking up the runtime resource limits. Includes the sandbox container image hardening reference, covering Docker capabilities and process limits. | ||
|
|
||
| ## Related Skills | ||
|
|
||
| - `nemoclaw-user-manage-sandboxes` — Set Up Messaging Channels (use the `nemoclaw-user-manage-sandboxes` skill) to connect Telegram, Discord, or Slack through OpenShell-managed channel messaging | ||
| - `nemoclaw-user-monitor-sandbox` — Monitor Sandbox Activity (use the `nemoclaw-user-monitor-sandbox` skill) for sandbox monitoring tools | ||
| - `nemoclaw-user-reference` — Commands (use the `nemoclaw-user-reference` skill) for the full `deploy` command reference | ||
Fix product-name casing in metadata keywords/description.
Line 3 uses `nvidia` and `openclaw` in lowercase. Please normalize to `NVIDIA` and `OpenClaw` (and keep `NemoClaw`/`OpenShell` exact casing everywhere in metadata too). As per coding guidelines, "NVIDIA must be all caps (not Nvidia, nvidia)." and "NemoClaw, OpenClaw, and OpenShell must use correct casing."