From 23cdaa40cb4ea37133365e37cef182a93190d9b0 Mon Sep 17 00:00:00 2001
From: ColinM-sys
Date: Thu, 9 Apr 2026 01:18:29 -0400
Subject: [PATCH 1/4] docs(k8s): document evaluation-only patterns and production alternatives
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The k8s/README.md calls the manifest "experimental" but does not spell
out which specific patterns are unsafe in production. A user deploying
to a real cluster has no way to know that `privileged: true`,
`DOCKER_TLS_CERTDIR=""`, `POLICY_MODE=skip`, `curl | bash` at pod
start, the `dummy` placeholder API key, the absence of any
NetworkPolicy, and the absence of resource limits are all
*intentional* tradeoffs for a kubectl-apply-and-try-it-out flow — and
not a production blueprint.

Add k8s/SECURITY.md, walking through every risky pattern in the
manifest, why it is unsafe in production, and what a production
alternative would look like. Cross-link from k8s/README.md so the
warning is discoverable from the existing entry point.

Refs: #1442
Signed-off-by: ColinM-sys
---
 k8s/README.md   |   2 +
 k8s/SECURITY.md | 188 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 190 insertions(+)
 create mode 100644 k8s/SECURITY.md

diff --git a/k8s/README.md b/k8s/README.md
index 183c8f18ca..add8096d6d 100644
--- a/k8s/README.md
+++ b/k8s/README.md
@@ -1,6 +1,8 @@
 # NemoClaw on Kubernetes

 > **⚠️ Experimental**: This deployment method is intended for **trying out NemoClaw on Kubernetes**, not for production use. It requires a **privileged pod** running **Docker-in-Docker (DinD)** to create isolated sandbox environments. Operational requirements (storage, runtime, security policies) vary by cluster configuration.
+>
+> See **[SECURITY.md](./SECURITY.md)** for the specific patterns that make this manifest evaluation-only and what a production-ready deployment would look like instead.

 The sample manifest now uses a few safer defaults out of the box:

diff --git a/k8s/SECURITY.md b/k8s/SECURITY.md
new file mode 100644
index 0000000000..498cceb4b2
--- /dev/null
+++ b/k8s/SECURITY.md
@@ -0,0 +1,188 @@
+
+
+
+# Kubernetes Deployment — Security Considerations
+
+> **The manifest in [`nemoclaw-k8s.yaml`](./nemoclaw-k8s.yaml) is for evaluation only. Do not run it as-is in a production cluster.**
+
+The existing `k8s/README.md` already calls the deployment "experimental",
+but the specific patterns that make it experimental are not spelled out.
+This page lists each one, why it is unsafe in production, and what a
+production-ready alternative would look like. It addresses the gap
+flagged in [#1442](https://github.com/NVIDIA/NemoClaw/issues/1442).
+
+## What the evaluation manifest does
+
+The pod runs **two containers** plus an init container:
+
+| Container | Image | Purpose |
+|---|---|---|
+| `dind` | `docker:24-dind` | Docker-in-Docker daemon. Required because OpenShell sandboxes are Docker containers and a sandbox-on-sandbox needs a real daemon. |
+| `workspace` | `node:22` | Runs the official NemoClaw installer over the DinD socket. |
+| `init-docker-config` | `busybox` | Writes `daemon.json` so DinD uses host cgroup namespacing. |
+
+That arrangement is the simplest possible way to get NemoClaw onto a
+Kubernetes cluster — and also the most dangerous one. The patterns
+below are intentional for an *evaluation* deployment but would be
+unacceptable in *production*.
+
+## Security risks in the evaluation manifest
+
+### 1. `privileged: true` on the DinD container
+
+```yaml
+securityContext:
+  privileged: true
+```
+
+A privileged container has effectively **no isolation from the node**.
+It can load kernel modules, mount the host filesystem, access every
+device, and (with a single misstep) escalate to full node compromise.
+This is required to run a nested Docker daemon — the daemon needs
+unrestricted access to cgroups, namespaces, and `/var/lib/docker` —
+but it means a successful exploit inside the sandbox escalates not
+just to the pod but to the entire node.
+
+**Production alternative:** run the sandbox container directly on the
+host's container runtime via a CRI runtime or a runtime class
+(`runc`, `kata`, `gvisor`), and skip DinD entirely. NemoClaw's
+OpenShell runtime does not require Docker-in-Docker if the host
+already has a compatible runtime.
+
+### 2. Docker TLS disabled
+
+```yaml
+env:
+  - name: DOCKER_TLS_CERTDIR
+    value: ""
+```
+
+Setting `DOCKER_TLS_CERTDIR=""` makes the DinD daemon listen on a
+plain Unix socket with no client authentication. Any process inside
+the workspace container that can reach `/var/run/docker.sock` can
+issue arbitrary Docker API calls — including `docker run -v /:/host`
+to escape the sandbox.
+
+**Production alternative:** leave `DOCKER_TLS_CERTDIR` at its default
+so the daemon issues client certs, then mount only the certs (not the
+socket) into the workspace container.
+
+### 3. `NEMOCLAW_POLICY_MODE=skip`
+
+```yaml
+- name: NEMOCLAW_POLICY_MODE
+  value: "skip"
+```
+
+`POLICY_MODE=skip` disables NemoClaw's network policy enforcement
+inside the sandbox. The agent inside the sandbox can reach **any**
+host on the cluster network, exfiltrate data, or pivot to other
+services. Policies (`pypi`, `npm`, `github`, `huggingface`, etc.)
+have zero effect.
+
+**Production alternative:** drop the env var (or set
+`NEMOCLAW_POLICY_MODE=enforce`) and pick the smallest set of policy
+presets the agent actually needs during onboard.
+
+### 4. `curl | bash` installer over the network
+
+```yaml
+command:
+  - bash
+  - -c
+  - |
+    ...
+    curl -fsSL https://nvidia.com/nemoclaw.sh | bash
+```
+
+Pulling the installer over the network at pod start time means the
+deployed version of NemoClaw is whatever is live on
+`nvidia.com/nemoclaw.sh` at the moment the pod boots. There is no
+checksum verification, no version pinning, and no offline path. A
+compromise of the installer URL or a transient redirect is a one-shot
+supply-chain compromise of every pod that ever restarts.
+
+**Production alternative:** build a NemoClaw image at a known tag,
+publish it to your own registry pinned by digest (see #1438), and
+deploy that image instead of running the installer at pod start.
+
+### 5. Placeholder API key
+
+```yaml
+- name: COMPATIBLE_API_KEY
+  value: "dummy"
+```
+
+The manifest hardcodes a placeholder credential. In a production
+deployment this needs to be a real key, sourced from a Kubernetes
+`Secret`, not an environment variable in plain YAML.
+
+**Production alternative:**
+
+```yaml
+- name: COMPATIBLE_API_KEY
+  valueFrom:
+    secretKeyRef:
+      name: nemoclaw-credentials
+      key: compatible-api-key
+```
+
+### 6. No `NetworkPolicy`
+
+The pod has no Kubernetes `NetworkPolicy` attached. With the default
+"allow all" cluster behavior, the workspace container can reach any
+service in the cluster — including the kube-apiserver — via the
+node's cluster network, and `POLICY_MODE=skip` removes the
+NemoClaw-side guardrail too.
+
+**Production alternative:** ship a default-deny `NetworkPolicy` for
+the `nemoclaw` namespace and explicitly allow only the inference
+endpoint and DNS.
+
+### 7. No `limits` (only `requests`)
+
+```yaml
+resources:
+  requests:
+    memory: "8Gi"
+    cpu: "2"
+```
+
+Without `resources.limits`, a runaway agent or a memory leak in the
+sandbox can consume unbounded CPU and memory on the node, causing
+OOMKills of unrelated workloads. This is the gap flagged in
+[#1447](https://github.com/NVIDIA/NemoClaw/issues/1447).
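A namespace-level `LimitRange` can act as a backstop for the unbounded-consumption risk described here, because it gives any container that declares no limits a default set. The fragment below is only a sketch and is not part of `nemoclaw-k8s.yaml`: it assumes the pod runs in a namespace named `nemoclaw`, the object name `nemoclaw-default-limits` is hypothetical, and the values are placeholders rather than tuned recommendations.

```yaml
# Illustrative sketch only; not shipped with the evaluation manifest.
apiVersion: v1
kind: LimitRange
metadata:
  name: nemoclaw-default-limits   # hypothetical name
  namespace: nemoclaw             # assumed namespace
spec:
  limits:
    - type: Container
      # Used as resources.limits for containers that set none.
      default:
        memory: "16Gi"
        cpu: "4"
      # Used as resources.requests for containers that set none.
      defaultRequest:
        memory: "8Gi"
        cpu: "2"
```

These defaults apply to every container created in the namespace, including `dind` and the init container, so explicit per-container values may still be the better fit for workloads with very different footprints.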
+
+**Production alternative:**
+
+```yaml
+resources:
+  requests:
+    memory: "8Gi"
+    cpu: "2"
+  limits:
+    memory: "16Gi"
+    cpu: "4"
+```
+
+## Minimum bar for production
+
+If you need to run NemoClaw on a real Kubernetes cluster, none of the
+above is acceptable as-is. At a minimum:
+
+1. **Drop `privileged: true`.** Use a runtime class instead of DinD.
+2. **Build and pin a NemoClaw image** by digest. Do not `curl | bash`
+   at pod start.
+3. **Source credentials from `Secret` resources**, not env vars.
+4. **Set `NEMOCLAW_POLICY_MODE=enforce`** and select only the policy
+   presets the agent actually needs.
+5. **Attach a default-deny `NetworkPolicy`** to the `nemoclaw`
+   namespace.
+6. **Set `resources.limits`** so a sandbox cannot starve the node.
+7. **Add `livenessProbe` / `readinessProbe`** so kubelet can detect
+   and restart unhealthy pods.
+
+The current manifest deliberately ships **none** of those because it
+optimizes for "kubectl apply and try it out". That tradeoff is fine
+for evaluation, dangerous for production, and the reason this page
+exists.
From 71b896f183dbabc5bd2c9e7eb059f40a7682ff9e Mon Sep 17 00:00:00 2001
From: ColinM-sys
Date: Tue, 14 Apr 2026 22:25:41 -0400
Subject: [PATCH 2/4] fix: capitalize GitHub in SECURITY.md per CodeRabbit

Signed-off-by: ColinM-sys
---
 k8s/SECURITY.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/k8s/SECURITY.md b/k8s/SECURITY.md
index 498cceb4b2..8b6f13071a 100644
--- a/k8s/SECURITY.md
+++ b/k8s/SECURITY.md
@@ -77,7 +77,7 @@ socket) into the workspace container.
 `POLICY_MODE=skip` disables NemoClaw's network policy enforcement
 inside the sandbox. The agent inside the sandbox can reach **any**
 host on the cluster network, exfiltrate data, or pivot to other
-services. Policies (`pypi`, `npm`, `github`, `huggingface`, etc.)
+services. Policies (`pypi`, `npm`, `GitHub`, `huggingface`, etc.)
 have zero effect.

 **Production alternative:** drop the env var (or set
From 9af82530294b93fdcb9cd562806c6e85e3ee9eb8 Mon Sep 17 00:00:00 2001
From: ColinM-sys
Date: Wed, 15 Apr 2026 11:06:12 -0400
Subject: [PATCH 3/4] docs(k8s): update SECURITY.md to reflect current manifest improvements per CodeRabbit

Signed-off-by: ColinM-sys
---
 k8s/SECURITY.md | 81 ++++++++++++++++++++++++++-----------------------
 1 file changed, 43 insertions(+), 38 deletions(-)

diff --git a/k8s/SECURITY.md b/k8s/SECURITY.md
index 8b6f13071a..b26bc8341a 100644
--- a/k8s/SECURITY.md
+++ b/k8s/SECURITY.md
@@ -67,66 +67,69 @@ to escape the sandbox.
 so the daemon issues client certs, then mount only the certs (not the
 socket) into the workspace container.

-### 3. `NEMOCLAW_POLICY_MODE=skip`
+### 3. `NEMOCLAW_POLICY_MODE=suggested`

 ```yaml
 - name: NEMOCLAW_POLICY_MODE
-  value: "skip"
+  value: "suggested"
 ```

-`POLICY_MODE=skip` disables NemoClaw's network policy enforcement
-inside the sandbox. The agent inside the sandbox can reach **any**
-host on the cluster network, exfiltrate data, or pivot to other
-services. Policies (`pypi`, `npm`, `GitHub`, `huggingface`, etc.)
-have zero effect.
-
-**Production alternative:** drop the env var (or set
-`NEMOCLAW_POLICY_MODE=enforce`) and pick the smallest set of policy
+The current manifest uses `suggested` — a permissive mode that
+applies NemoClaw's suggested policy presets without strictly
+enforcing them. This is a meaningful improvement over the previous
+`skip` default (which disabled policy enforcement entirely), but it
+is still not the strictest setting. For production workloads
+handling sensitive data, reduce the allowed policy set to only the
 presets the agent actually needs during onboard.

-### 4. `curl | bash` installer over the network
+### 4. Installer pulled over the network at pod start
+
+The manifest now downloads the installer to a local file with
+HTTPS-only curl flags before executing:

 ```yaml
-command:
-  - bash
-  - -c
-  - |
-    ...
-    curl -fsSL https://nvidia.com/nemoclaw.sh | bash
+curl --proto '=https' --tlsv1.2 --fail --show-error --silent \
+  --location \
+  --output /tmp/nemoclaw-install.sh \
+  https://www.nvidia.com/nemoclaw.sh
+chmod 700 /tmp/nemoclaw-install.sh
+bash /tmp/nemoclaw-install.sh
 ```

-Pulling the installer over the network at pod start time means the
-deployed version of NemoClaw is whatever is live on
-`nvidia.com/nemoclaw.sh` at the moment the pod boots. There is no
-checksum verification, no version pinning, and no offline path. A
-compromise of the installer URL or a transient redirect is a one-shot
-supply-chain compromise of every pod that ever restarts.
+This is better than the original `curl | bash` — the download and
+execute are now separate steps, TLS 1.2+ is enforced, and HTTP is
+rejected. However, the installer script itself is still pulled at
+pod start with no checksum verification and no version pinning. A
+compromise of the installer URL or a transient redirect is still a
+one-shot supply-chain compromise of every pod that ever restarts.

 **Production alternative:** build a NemoClaw image at a known tag,
 publish it to your own registry pinned by digest (see #1438), and
 deploy that image instead of running the installer at pod start.

-### 5. Placeholder API key
+### 5. API key handling

-```yaml
-- name: COMPATIBLE_API_KEY
-  value: "dummy"
-```
-
-The manifest hardcodes a placeholder credential. In a production
-deployment this needs to be a real key, sourced from a Kubernetes
-`Secret`, not an environment variable in plain YAML.
-
-**Production alternative:**
+The manifest now loads `COMPATIBLE_API_KEY` from an optional
+Kubernetes `Secret` with a `dummy` fallback in startup shell logic
+for unauthenticated endpoints like local Dynamo/vLLM:

 ```yaml
 - name: COMPATIBLE_API_KEY
   valueFrom:
     secretKeyRef:
-      name: nemoclaw-credentials
-      key: compatible-api-key
+      name: nemoclaw-compatible-api-key
+      key: api-key
+      optional: true
 ```

+This is the correct pattern for production. The `optional: true`
+flag allows the manifest to deploy without the Secret (useful for
+evaluation against open endpoints), and the startup shell assigns
+`dummy` when the Secret is absent so the CLI's credential
+validation does not block startup. For production, create the
+Secret with a real key before applying the manifest — see the
+step-by-step in [README.md](./README.md).
+
 ### 6. No `NetworkPolicy`

 The pod has no Kubernetes `NetworkPolicy` attached. With the default
@@ -174,8 +177,10 @@ above is acceptable as-is. At a minimum:
 1. **Drop `privileged: true`.** Use a runtime class instead of DinD.
 2. **Build and pin a NemoClaw image** by digest. Do not `curl | bash`
    at pod start.
 3. **Source credentials from `Secret` resources**, not env vars.
-4. **Set `NEMOCLAW_POLICY_MODE=enforce`** and select only the policy
-   presets the agent actually needs.
+4. **Reduce the policy preset set.** The manifest already uses
+   `NEMOCLAW_POLICY_MODE=suggested` (a permissive but non-skip
+   default). Narrow the suggested presets to only what the agent
+   actually needs during onboard.
 5. **Attach a default-deny `NetworkPolicy`** to the `nemoclaw`
    namespace.
 6. **Set `resources.limits`** so a sandbox cannot starve the node.
From 01c2f937dc78c245102c815cd4cd87bb0f1f4343 Mon Sep 17 00:00:00 2001
From: ColinM-sys
Date: Wed, 15 Apr 2026 11:08:05 -0400
Subject: [PATCH 4/4] docs(k8s): clarify POLICY_MODE=suggested semantics in NetworkPolicy section

Signed-off-by: ColinM-sys
---
 k8s/SECURITY.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/k8s/SECURITY.md b/k8s/SECURITY.md
index b26bc8341a..253d4b71ef 100644
--- a/k8s/SECURITY.md
+++ b/k8s/SECURITY.md
@@ -135,8 +135,10 @@ step-by-step in [README.md](./README.md).
 The pod has no Kubernetes `NetworkPolicy` attached. With the default
 "allow all" cluster behavior, the workspace container can reach any
 service in the cluster — including the kube-apiserver — via the
-node's cluster network, and `POLICY_MODE=skip` removes the
-NemoClaw-side guardrail too.
+node's cluster network. `NEMOCLAW_POLICY_MODE=suggested` (the
+current default) weakens the NemoClaw-side guardrails but does not
+fully disable them, so the remaining gap is at the cluster network
+layer.

 **Production alternative:** ship a default-deny `NetworkPolicy` for
 the `nemoclaw` namespace and explicitly allow only the inference