From 66a23966e4c16d89b6950ccfd057155812ac2904 Mon Sep 17 00:00:00 2001
From: latenighthackathon
Date: Thu, 21 May 2026 05:01:44 +0000
Subject: [PATCH] docs(reference): document Docker-driver compute path alongside legacy k3s

Closes #3432.

`docs/reference/architecture.mdx` describes only the legacy `k3s` path even though every supported platform now selects the Docker driver (see `src/lib/onboard/docker-driver-platform.ts`: `isLinuxDockerDriverGatewayEnabled = platform === "linux" || (platform === "darwin" && arch === "arm64")`). Operators who do not already know the dispatch rule cannot tell from the docs whether their sandbox runs as a Kubernetes pod or as a sibling Docker container, which matters for runtime debugging, log paths, and the resource model.

Add a platform / compute-path table at the top of the "What runs where" section that names every platform in [`ci/platform-matrix.json`] with its compute path and links the dispatch function so the source of truth is one click away. The table matches the current dispatch exactly:

| Platform                                      | Compute path  |
|-----------------------------------------------|---------------|
| Linux (any arch, incl. DGX Spark / Station)   | Docker driver |
| Windows WSL2 (Docker Desktop)                 | Docker driver |
| macOS Apple Silicon (Colima / Docker Desktop) | Docker driver |

Mark the legacy `k3s` diagram below as historical context; clarify the "What runs where" intro and layering table so the Sandbox container and Sandbox pod variants are documented side by side rather than the pod being the only documented shape.

Update the prerequisites note that attributed the fuse-overlayfs limitation to `k3s` so it points at the OpenShell gateway image (the limitation surfaces on both compute paths on hosts that use Docker's containerd image store).

`prek` clean. `markdownlint-cli2` + `Verify docs-to-skills output` pass. No source code changes; docs-only.

Signed-off-by: latenighthackathon
---
 docs/get-started/prerequisites.mdx |  8 ++++++--
 docs/reference/architecture.mdx    | 23 +++++++++++++++++------
 2 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/docs/get-started/prerequisites.mdx b/docs/get-started/prerequisites.mdx
index 434a2fe4c0..f1125ee875 100644
--- a/docs/get-started/prerequisites.mdx
+++ b/docs/get-started/prerequisites.mdx
@@ -19,7 +19,11 @@ Before getting started, check the prerequisites to ensure you have the necessary
 | RAM | 8 GB | 16 GB |
 | Disk | 20 GB free | 40 GB free |
 
-The sandbox image is approximately 2.4 GB compressed. During image push, the Docker daemon, k3s, and the OpenShell gateway run alongside the export pipeline. The pipeline buffers decompressed layers in memory. On machines with less than 8 GB of RAM, this combined usage can trigger the OOM killer. If you cannot add memory, configuring at least 8 GB of swap can work around the issue at the cost of slower performance.
+The sandbox image is approximately 2.4 GB compressed.
+During image push, the Docker daemon, the OpenShell gateway container, and the sandbox runtime run alongside the export pipeline.
+The pipeline buffers decompressed layers in memory.
+On machines with less than 8 GB of RAM, this combined usage can trigger the OOM killer.
+If you cannot add memory, configuring at least 8 GB of swap can work around the issue at the cost of slower performance.
 
 ## Software
 
@@ -45,7 +49,7 @@ Avoid `openshell self-update`, `npm update -g openshell`, `openshell gateway sta
-On Linux hosts running Docker 26 or later with the [containerd image store](https://docs.docker.com/engine/storage/containerd/) enabled (the install-time default for fresh `docker-ce` installations on Ubuntu 24.04 and similar distros), `nemoclaw onboard` transparently builds a `fuse-overlayfs`-enabled cluster image to bypass a kernel-level nested-overlay limitation in k3s.
+On Linux hosts running Docker 26 or later with the [containerd image store](https://docs.docker.com/engine/storage/containerd/) enabled (the install-time default for fresh `docker-ce` installations on Ubuntu 24.04 and similar distros), `nemoclaw onboard` transparently builds a `fuse-overlayfs`-enabled cluster image to bypass a kernel-level nested-overlay limitation in the OpenShell gateway image.
 No manual setup is required. See the [troubleshooting guide](/reference/troubleshooting) for the override knobs and a manual `daemon.json` alternative.
 
diff --git a/docs/reference/architecture.mdx b/docs/reference/architecture.mdx
index 0bde7c8a6c..66ddc8ca19 100644
--- a/docs/reference/architecture.mdx
+++ b/docs/reference/architecture.mdx
@@ -76,9 +76,19 @@ graph LR
 The logical diagram above shows how components relate.
 This section shows what actually runs where on the host.
 
-NemoClaw uses a Docker daemon.
-The OpenShell gateway runs as a container that embeds a k3s cluster.
-The sandbox runs as a Kubernetes pod inside that embedded cluster.
+NemoClaw uses a Docker daemon, and the OpenShell gateway runs as a container.
+The sandbox runs as a sibling Docker container or as a pod inside an embedded `k3s` cluster, depending on the host platform.
+
+| Platform | Compute path | Sandbox runs as |
+|---|---|---|
+| Linux (any arch, including DGX Spark / Station) | Docker driver | Sibling Docker container |
+| Windows WSL2 (Docker Desktop, WSL backend) | Docker driver | Sibling Docker container (WSL2 reports `platform === "linux"`) |
+| macOS Apple Silicon (Colima or Docker Desktop) | Docker driver | Sibling Docker container |
+
+Every platform currently listed in [`ci/platform-matrix.json`](https://github.com/NVIDIA/NemoClaw/blob/main/ci/platform-matrix.json) selects the Docker driver compute path; the dispatch is `platform === "linux" || (platform === "darwin" && arch === "arm64")` in [`src/lib/onboard/docker-driver-platform.ts`](https://github.com/NVIDIA/NemoClaw/blob/main/src/lib/onboard/docker-driver-platform.ts).
+On the Docker driver path, the embedded `k3s` cluster and the sandbox pod are replaced by a sibling Docker container that hosts the OpenClaw agent under the same `Landlock`, `seccomp`, and `netns` confinement.
+
+The diagram below describes the legacy `k3s` path, kept for historical context: when a platform fell outside the dispatch rule above, the gateway embedded a `k3s` cluster and scheduled the sandbox as a pod inside it.
 
 ```mermaid
 graph TB
@@ -132,9 +142,10 @@ Layering from top to bottom:
 |---|---|---|
 | Host CLI | Host process (`nemoclaw` on Node.js) | Orchestrates OpenShell via `openshell` CLI calls. |
 | Docker daemon | Host service | Runs the OpenShell gateway container. |
-| Gateway container | Docker container | Hosts the credential store, the L7 proxy, and the embedded k3s control plane. |
-| k3s | Process tree inside the gateway container | Kubernetes control plane that schedules the sandbox pod. |
-| Sandbox pod | Pod in the embedded k3s cluster | Runs the OpenClaw agent and the NemoClaw plugin under Landlock + seccomp + netns. |
+| Gateway container | Docker container | Hosts the credential store and the L7 proxy. On the legacy `k3s` path, also hosts the embedded `k3s` control plane. |
+| `k3s` (legacy path only) | Process tree inside the gateway container | Kubernetes control plane that schedules the sandbox pod. |
+| Sandbox container (Docker driver path) | Sibling Docker container managed by the gateway | Runs the OpenClaw agent and the NemoClaw plugin under `Landlock` + `seccomp` + `netns`. |
+| Sandbox pod (legacy `k3s` path) | Pod in the embedded `k3s` cluster | Runs the OpenClaw agent and the NemoClaw plugin under `Landlock` + `seccomp` + `netns`. |
 | OpenShell L7 proxy | Process in the gateway container | Intercepts agent egress and rewrites `Authorization` headers (Bearer/Bot) and URL-path segments to inject the real credential at the network boundary. |
 
 NemoClaw never gives the sandbox a raw provider key.
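
Reviewer note: the dispatch rule quoted in this patch can be checked against the platform table with a small sketch. Only the boolean expression comes from the patch; `selectsDockerDriver`, `computePathFor`, and the `ComputePath` type are hypothetical names and are not the exports of `docker-driver-platform.ts`. The `legacy-k3s` fallback is an assumption based on the description of the historical path, not confirmed behavior.

```typescript
// Hypothetical sketch of the dispatch rule; names are illustrative only.
type ComputePath = "docker-driver" | "legacy-k3s";

// The expression quoted in the patch: Linux on any arch, or macOS on Apple Silicon.
function selectsDockerDriver(platform: string, arch: string): boolean {
  return platform === "linux" || (platform === "darwin" && arch === "arm64");
}

// Assumption: anything outside the rule falls back to the legacy k3s path.
function computePathFor(platform: string, arch: string): ComputePath {
  return selectsDockerDriver(platform, arch) ? "docker-driver" : "legacy-k3s";
}

// Rows from the documented table, plus one host outside the rule for contrast.
// WSL2 is listed as "linux" because the docs state that it reports that platform.
const hosts: Array<[string, string, string]> = [
  ["Linux x64", "linux", "x64"],
  ["Linux arm64 (e.g. DGX Spark)", "linux", "arm64"],
  ["Windows WSL2", "linux", "x64"],
  ["macOS Apple Silicon", "darwin", "arm64"],
  ["macOS Intel (outside the rule)", "darwin", "x64"],
];

for (const [label, platform, arch] of hosts) {
  console.log(`${label}: ${computePathFor(platform, arch)}`);
}
```

In a Node.js process the two inputs would typically come from `process.platform` and `process.arch`, which lets an operator confirm which row of the table applies to the current host.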