From 23cdaa40cb4ea37133365e37cef182a93190d9b0 Mon Sep 17 00:00:00 2001
From: ColinM-sys <cmcdonough@50words.com>
Date: Thu, 9 Apr 2026 01:18:29 -0400
Subject: [PATCH 1/4] docs(k8s): document evaluation-only patterns and
 production alternatives
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The k8s/README.md calls the manifest "experimental" but does not
spell out which specific patterns are unsafe in production. A user
deploying to a real cluster has no way to know that
`privileged: true`, `DOCKER_TLS_CERTDIR=""`, `POLICY_MODE=skip`,
`curl | bash` at pod start, the `dummy` placeholder API key, the
absence of any NetworkPolicy, and the absence of resource limits
are all *intentional* tradeoffs for a kubectl-apply-and-try-it-out
flow — and not a production blueprint.

Add k8s/SECURITY.md, walking through every risky pattern in the
manifest, why it is unsafe in production, and what a production
alternative would look like. Cross-link from k8s/README.md so the
warning is discoverable from the existing entry point.

Refs: #1442
Signed-off-by: ColinM-sys <cmcdonough@50words.com>
---
 k8s/README.md   |   2 +
 k8s/SECURITY.md | 188 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 190 insertions(+)
 create mode 100644 k8s/SECURITY.md
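
Note for reviewers (ignored by `git am`, not part of the commit): section 6
of the new SECURITY.md recommends a default-deny `NetworkPolicy` that allows
only DNS and the inference endpoint, but gives no example. A minimal sketch
of what that could look like follows. The policy name, the assumption that
cluster DNS runs in `kube-system`, and the inference endpoint CIDR and port
are all placeholders, and the policy only takes effect on clusters whose
CNI plugin enforces `NetworkPolicy`.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: nemoclaw-default-deny   # hypothetical name
  namespace: nemoclaw
spec:
  podSelector: {}               # applies to every pod in the namespace
  policyTypes:
    - Ingress                   # no ingress rules listed, so all ingress is denied
    - Egress                    # only the egress rules below are allowed
  egress:
    # Allow DNS lookups (assumes cluster DNS runs in kube-system).
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
    # Allow the inference endpoint only; CIDR and port are placeholders.
    - to:
        - ipBlock:
            cidr: 203.0.113.10/32
      ports:
        - protocol: TCP
          port: 443
```

Because both policy types are listed and only two egress rules are given,
everything else, including traffic to the kube-apiserver, is dropped.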

diff --git a/k8s/README.md b/k8s/README.md
index 183c8f18ca..add8096d6d 100644
--- a/k8s/README.md
+++ b/k8s/README.md
@@ -1,6 +1,8 @@
 # NemoClaw on Kubernetes
 
 > **⚠️ Experimental**: This deployment method is intended for **trying out NemoClaw on Kubernetes**, not for production use. It requires a **privileged pod** running **Docker-in-Docker (DinD)** to create isolated sandbox environments. Operational requirements (storage, runtime, security policies) vary by cluster configuration.
+>
+> See **[SECURITY.md](./SECURITY.md)** for the specific patterns that make this manifest evaluation-only and what a production-ready deployment would look like instead.
 
 The sample manifest now uses a few safer defaults out of the box:
 
diff --git a/k8s/SECURITY.md b/k8s/SECURITY.md
new file mode 100644
index 0000000000..498cceb4b2
--- /dev/null
+++ b/k8s/SECURITY.md
@@ -0,0 +1,188 @@
+<!-- SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. -->
+<!-- SPDX-License-Identifier: Apache-2.0 -->
+
+# Kubernetes Deployment — Security Considerations
+
+> **The manifest in [`nemoclaw-k8s.yaml`](./nemoclaw-k8s.yaml) is for evaluation only. Do not run it as-is in a production cluster.**
+
+The existing `k8s/README.md` already calls the deployment "experimental",
+but the specific patterns that make it experimental are not spelled out.
+This page lists each one, why it is unsafe in production, and what a
+production-ready alternative would look like. It addresses the gap
+flagged in [#1442](https://github.com/NVIDIA/NemoClaw/issues/1442).
+
+## What the evaluation manifest does
+
+The pod runs **two containers** plus an init container:
+
+| Container | Image | Purpose |
+|---|---|---|
+| `dind` | `docker:24-dind` | Docker-in-Docker daemon. Required because OpenShell sandboxes are Docker containers and a sandbox-on-sandbox needs a real daemon. |
+| `workspace` | `node:22` | Runs the official NemoClaw installer over the DinD socket. |
+| `init-docker-config` | `busybox` | Writes `daemon.json` so DinD uses host cgroup namespacing. |
+
+That arrangement is the simplest possible way to get NemoClaw onto a
+Kubernetes cluster — and also the most dangerous one. The patterns
+below are intentional for an *evaluation* deployment but would be
+unacceptable in *production*.
+
+## Security risks in the evaluation manifest
+
+### 1. `privileged: true` on the DinD container
+
+```yaml
+securityContext:
+  privileged: true
+```
+
+A privileged container has effectively **no isolation from the node**.
+It can load kernel modules, mount the host filesystem, access every
+device, and (with a single misstep) escalate to full node compromise.
+This is required to run a nested Docker daemon — the daemon needs
+unrestricted access to cgroups, namespaces, and `/var/lib/docker` —
+but it means a successful exploit inside the sandbox escalates not
+just to the pod but to the entire node.
+
+**Production alternative:** run the sandbox container directly on the
+host's container runtime, selected through a Kubernetes `RuntimeClass`
+(for example `kata` or `gvisor`), and skip DinD entirely. NemoClaw's
+OpenShell runtime does not require Docker-in-Docker if the host
+already has a compatible runtime.
+
+### 2. Docker TLS disabled
+
+```yaml
+env:
+  - name: DOCKER_TLS_CERTDIR
+    value: ""
+```
+
+Setting `DOCKER_TLS_CERTDIR=""` stops the DinD image from generating
+TLS certificates, leaving its API unauthenticated. Any process inside
+the workspace container that can reach `/var/run/docker.sock` can
+issue arbitrary Docker API calls — including `docker run -v /:/host`
+to escape the sandbox.
+
+**Production alternative:** leave `DOCKER_TLS_CERTDIR` at its default
+so the daemon issues client certs, then mount only the certs (not the
+socket) into the workspace container.
+
+### 3. `NEMOCLAW_POLICY_MODE=skip`
+
+```yaml
+- name: NEMOCLAW_POLICY_MODE
+  value: "skip"
+```
+
+`POLICY_MODE=skip` disables NemoClaw's network policy enforcement
+inside the sandbox. The agent inside the sandbox can reach **any**
+host on the cluster network, exfiltrate data, or pivot to other
+services. Policies (`pypi`, `npm`, `github`, `huggingface`, etc.)
+have zero effect.
+
+**Production alternative:** drop the env var (or set
+`NEMOCLAW_POLICY_MODE=enforce`) and pick the smallest set of policy
+presets the agent actually needs during onboard.
+
+### 4. `curl | bash` installer over the network
+
+```yaml
+command:
+  - bash
+  - -c
+  - |
+      ...
+      curl -fsSL https://nvidia.com/nemoclaw.sh | bash
+```
+
+Pulling the installer over the network at pod start time means the
+deployed version of NemoClaw is whatever is live on
+`nvidia.com/nemoclaw.sh` at the moment the pod boots. There is no
+checksum verification, no version pinning, and no offline path. A
+compromise of the installer URL or a transient redirect is a one-shot
+supply-chain compromise of every pod that ever restarts.
+
+**Production alternative:** build a NemoClaw image at a known tag,
+publish it to your own registry pinned by digest (see #1438), and
+deploy that image instead of running the installer at pod start.
+
+### 5. Placeholder API key
+
+```yaml
+- name: COMPATIBLE_API_KEY
+  value: "dummy"
+```
+
+The manifest hardcodes a placeholder credential. In a production
+deployment this needs to be a real key, sourced from a Kubernetes
+`Secret`, not an environment variable in plain YAML.
+
+**Production alternative:**
+
+```yaml
+- name: COMPATIBLE_API_KEY
+  valueFrom:
+    secretKeyRef:
+      name: nemoclaw-credentials
+      key: compatible-api-key
+```
+
+### 6. No `NetworkPolicy`
+
+The pod has no Kubernetes `NetworkPolicy` attached. With the default
+"allow all" cluster behavior, the workspace container can reach any
+service in the cluster — including the kube-apiserver — via the
+node's cluster network, and `POLICY_MODE=skip` removes the
+NemoClaw-side guardrail too.
+
+**Production alternative:** ship a default-deny `NetworkPolicy` for
+the `nemoclaw` namespace and explicitly allow only the inference
+endpoint and DNS.
+
+### 7. No `limits` (only `requests`)
+
+```yaml
+resources:
+  requests:
+    memory: "8Gi"
+    cpu: "2"
+```
+
+Without `resources.limits`, a runaway agent or a memory leak in the
+sandbox can consume unbounded CPU and memory on the node, starving
+unrelated workloads or getting them OOM-killed. This is the gap flagged in
+[#1447](https://github.com/NVIDIA/NemoClaw/issues/1447).
+
+**Production alternative:**
+
+```yaml
+resources:
+  requests:
+    memory: "8Gi"
+    cpu: "2"
+  limits:
+    memory: "16Gi"
+    cpu: "4"
+```
+
+## Minimum bar for production
+
+If you need to run NemoClaw on a real Kubernetes cluster, none of the
+above is acceptable as-is. At a minimum:
+
+1. **Drop `privileged: true`.** Use a runtime class instead of DinD.
+2. **Build and pin a NemoClaw image** by digest. Do not `curl | bash`
+   at pod start.
+3. **Source credentials from `Secret` resources**, not env vars.
+4. **Set `NEMOCLAW_POLICY_MODE=enforce`** and select only the policy
+   presets the agent actually needs.
+5. **Attach a default-deny `NetworkPolicy`** to the `nemoclaw`
+   namespace.
+6. **Set `resources.limits`** so a sandbox cannot starve the node.
+7. **Add `livenessProbe` / `readinessProbe`** so kubelet can detect
+   and restart unhealthy pods.
+
+The current manifest deliberately ships **none** of those because it
+optimizes for "kubectl apply and try it out". That tradeoff is fine
+for evaluation, dangerous for production, and the reason this page
+exists.

From 71b896f183dbabc5bd2c9e7eb059f40a7682ff9e Mon Sep 17 00:00:00 2001
From: ColinM-sys <cmcdonough@50words.com>
Date: Tue, 14 Apr 2026 22:25:41 -0400
Subject: [PATCH 2/4] fix: capitalize GitHub in SECURITY.md per CodeRabbit

Signed-off-by: ColinM-sys <cmcdonough@50words.com>
---
 k8s/SECURITY.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/k8s/SECURITY.md b/k8s/SECURITY.md
index 498cceb4b2..8b6f13071a 100644
--- a/k8s/SECURITY.md
+++ b/k8s/SECURITY.md
@@ -77,7 +77,7 @@ socket) into the workspace container.
 `POLICY_MODE=skip` disables NemoClaw's network policy enforcement
 inside the sandbox. The agent inside the sandbox can reach **any**
 host on the cluster network, exfiltrate data, or pivot to other
-services. Policies (`pypi`, `npm`, `github`, `huggingface`, etc.)
+services. Policies (`pypi`, `npm`, `GitHub`, `huggingface`, etc.)
 have zero effect.
 
 **Production alternative:** drop the env var (or set

From 9af82530294b93fdcb9cd562806c6e85e3ee9eb8 Mon Sep 17 00:00:00 2001
From: ColinM-sys <cmcdonough@50words.com>
Date: Wed, 15 Apr 2026 11:06:12 -0400
Subject: [PATCH 3/4] docs(k8s): update SECURITY.md to reflect current manifest
 improvements per CodeRabbit

Signed-off-by: ColinM-sys <cmcdonough@50words.com>
---
 k8s/SECURITY.md | 81 ++++++++++++++++++++++++++-----------------------
 1 file changed, 43 insertions(+), 38 deletions(-)
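
Note for reviewers (ignored by `git am`, not part of the commit): the updated
section 5 tells production users to create the Secret before applying the
manifest. A minimal sketch of such a Secret follows. The Secret name and key
are the ones referenced by `secretKeyRef` in this patch; the `nemoclaw`
namespace is assumed from section 6, and the value is a placeholder to be
replaced with a real key.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: nemoclaw-compatible-api-key   # must match secretKeyRef.name
  namespace: nemoclaw                 # assumed namespace
type: Opaque
stringData:
  api-key: "<your-real-key>"          # placeholder; must match secretKeyRef.key
```

With `optional: true` on the reference, leaving this Secret out still lets
the pod start, in which case the startup shell falls back to `dummy`.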

diff --git a/k8s/SECURITY.md b/k8s/SECURITY.md
index 8b6f13071a..b26bc8341a 100644
--- a/k8s/SECURITY.md
+++ b/k8s/SECURITY.md
@@ -67,66 +67,69 @@ to escape the sandbox.
 so the daemon issues client certs, then mount only the certs (not the
 socket) into the workspace container.
 
-### 3. `NEMOCLAW_POLICY_MODE=skip`
+### 3. `NEMOCLAW_POLICY_MODE=suggested`
 
 ```yaml
 - name: NEMOCLAW_POLICY_MODE
-  value: "skip"
+  value: "suggested"
 ```
 
-`POLICY_MODE=skip` disables NemoClaw's network policy enforcement
-inside the sandbox. The agent inside the sandbox can reach **any**
-host on the cluster network, exfiltrate data, or pivot to other
-services. Policies (`pypi`, `npm`, `GitHub`, `huggingface`, etc.)
-have zero effect.
-
-**Production alternative:** drop the env var (or set
-`NEMOCLAW_POLICY_MODE=enforce`) and pick the smallest set of policy
+The current manifest uses `suggested` — a permissive mode that
+applies NemoClaw's suggested policy presets without strictly
+enforcing them. This is a meaningful improvement over the previous
+`skip` default (which disabled policy enforcement entirely), but it
+is still not the strictest setting. For production workloads
+handling sensitive data, reduce the allowed policy set to only the
 presets the agent actually needs during onboard.
 
-### 4. `curl | bash` installer over the network
+### 4. Installer pulled over the network at pod start
+
+The manifest now downloads the installer to a local file with
+HTTPS-only curl flags before executing:
 
 ```yaml
-command:
-  - bash
-  - -c
-  - |
-      ...
-      curl -fsSL https://nvidia.com/nemoclaw.sh | bash
+curl --proto '=https' --tlsv1.2 --fail --show-error --silent \
+  --location \
+  --output /tmp/nemoclaw-install.sh \
+  https://www.nvidia.com/nemoclaw.sh
+chmod 700 /tmp/nemoclaw-install.sh
+bash /tmp/nemoclaw-install.sh
 ```
 
-Pulling the installer over the network at pod start time means the
-deployed version of NemoClaw is whatever is live on
-`nvidia.com/nemoclaw.sh` at the moment the pod boots. There is no
-checksum verification, no version pinning, and no offline path. A
-compromise of the installer URL or a transient redirect is a one-shot
-supply-chain compromise of every pod that ever restarts.
+This is better than the original `curl | bash` — the download and
+execute are now separate steps, TLS 1.2+ is enforced, and HTTP is
+rejected. However, the installer script itself is still pulled at
+pod start with no checksum verification and no version pinning. A
+compromise of the installer URL or a transient redirect is still a
+one-shot supply-chain compromise of every pod that ever restarts.
 
 **Production alternative:** build a NemoClaw image at a known tag,
 publish it to your own registry pinned by digest (see #1438), and
 deploy that image instead of running the installer at pod start.
 
-### 5. Placeholder API key
+### 5. API key handling
 
-```yaml
-- name: COMPATIBLE_API_KEY
-  value: "dummy"
-```
-
-The manifest hardcodes a placeholder credential. In a production
-deployment this needs to be a real key, sourced from a Kubernetes
-`Secret`, not an environment variable in plain YAML.
-
-**Production alternative:**
+The manifest now loads `COMPATIBLE_API_KEY` from an optional
+Kubernetes `Secret` with a `dummy` fallback in startup shell logic
+for unauthenticated endpoints like local Dynamo/vLLM:
 
 ```yaml
 - name: COMPATIBLE_API_KEY
   valueFrom:
     secretKeyRef:
-      name: nemoclaw-credentials
-      key: compatible-api-key
+      name: nemoclaw-compatible-api-key
+      key: api-key
+      optional: true
 ```
 
+This is the correct pattern for production. The `optional: true`
+flag allows the manifest to deploy without the Secret (useful for
+evaluation against open endpoints), and the startup shell assigns
+`dummy` when the Secret is absent so the CLI's credential
+validation does not block startup. For production, create the
+Secret with a real key before applying the manifest — see the
+step-by-step in [README.md](./README.md).
+
 ### 6. No `NetworkPolicy`
 
 The pod has no Kubernetes `NetworkPolicy` attached. With the default
@@ -174,8 +177,10 @@ above is acceptable as-is. At a minimum:
 2. **Build and pin a NemoClaw image** by digest. Do not `curl | bash`
    at pod start.
 3. **Source credentials from `Secret` resources**, not env vars.
-4. **Set `NEMOCLAW_POLICY_MODE=enforce`** and select only the policy
-   presets the agent actually needs.
+4. **Reduce the policy preset set.** The manifest already uses
+   `NEMOCLAW_POLICY_MODE=suggested` (a permissive but non-skip
+   default). Narrow the suggested presets to only what the agent
+   actually needs during onboard.
 5. **Attach a default-deny `NetworkPolicy`** to the `nemoclaw`
    namespace.
 6. **Set `resources.limits`** so a sandbox cannot starve the node.

From 01c2f937dc78c245102c815cd4cd87bb0f1f4343 Mon Sep 17 00:00:00 2001
From: ColinM-sys <cmcdonough@50words.com>
Date: Wed, 15 Apr 2026 11:08:05 -0400
Subject: [PATCH 4/4] docs(k8s): clarify POLICY_MODE=suggested semantics in
 NetworkPolicy section

Signed-off-by: ColinM-sys <cmcdonough@50words.com>
---
 k8s/SECURITY.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/k8s/SECURITY.md b/k8s/SECURITY.md
index b26bc8341a..253d4b71ef 100644
--- a/k8s/SECURITY.md
+++ b/k8s/SECURITY.md
@@ -135,8 +135,10 @@ step-by-step in [README.md](./README.md).
 The pod has no Kubernetes `NetworkPolicy` attached. With the default
 "allow all" cluster behavior, the workspace container can reach any
 service in the cluster — including the kube-apiserver — via the
-node's cluster network, and `POLICY_MODE=skip` removes the
-NemoClaw-side guardrail too.
+node's cluster network. `NEMOCLAW_POLICY_MODE=suggested` (the
+current default) weakens the NemoClaw-side guardrails but does not
+fully disable them, so the remaining gap is at the cluster network
+layer.
 
 **Production alternative:** ship a default-deny `NetworkPolicy` for
 the `nemoclaw` namespace and explicitly allow only the inference