Run code-executor unprivileged with seccomp on k8s >= 1.33 #311
base: r2
Changes from all commits
ad5e598
2cc7c2e
e256eb1
ef7ac7d
8e2af46
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| {{- if include "retool.workflows.enabled" . }} | ||
| {{- if and (not .Values.codeExecutor.securityContext) (semverCompare ">=1.33-0" .Capabilities.KubeVersion.Version) }} | ||
| apiVersion: v1 | ||
| kind: ConfigMap | ||
| metadata: | ||
| name: {{ template "retool.fullname" . }}-code-executor-seccomp | ||
| labels: | ||
| {{- include "retool.labels" . | nindent 4 }} | ||
| data: | ||
| nsjail-seccomp.json: | | ||
| {{- .Files.Get "files/nsjail-seccomp.json" | nindent 4 }} | ||
| {{- end }} | ||
| {{- end }} |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,4 +1,18 @@ | ||
| {{- if include "retool.workflows.enabled" . }} | ||
| {{- /* | ||
| Workflows are run securely within Code Executor via nsjail sandboxes. Creating | ||
| those sandboxes requires the container to have elevated privileges. By default | ||
| these privileges are granted by running the container as privileged | ||
| (securityContext.privileged: true). On Kubernetes 1.33+ we can instead grant only | ||
| what nsjail needs, far more granularly than the privileged flag: a slightly | ||
| relaxed version of Docker's default seccomp profile, the NET_ADMIN capability for | ||
| network isolation, and an unmasked /proc for process resource monitoring. | ||
| $useSecComp selects this less-privileged path when the operator has not pinned a | ||
| securityContext and the cluster is >= 1.33; otherwise we fall back to privileged. | ||
| To use the more fine-grained privileges, please upgrade your k8s cluster to 1.33 | ||
| or higher. | ||
| */ -}} | ||
| {{- $useSecComp := and (not .Values.codeExecutor.securityContext) (semverCompare ">=1.33-0" .Capabilities.KubeVersion.Version) -}} | ||
| apiVersion: apps/v1 | ||
| kind: Deployment | ||
| metadata: | ||
|
|
@@ -23,6 +37,9 @@ spec: | |
| template: | ||
| metadata: | ||
| annotations: | ||
| {{- if $useSecComp }} | ||
| checksum/seccomp: {{ .Files.Get "files/nsjail-seccomp.json" | sha256sum }} | ||
| {{- end }} | ||
| {{- if .Values.podAnnotations }} | ||
| {{ toYaml .Values.podAnnotations | indent 8 }} | ||
| {{- end }} | ||
|
|
@@ -44,23 +61,62 @@ spec: | |
| {{- if .Values.priorityClassName }} | ||
| priorityClassName: "{{ .Values.priorityClassName }}" | ||
| {{- end }} | ||
| {{- if .Values.initContainers }} | ||
| {{- if $useSecComp }} | ||
| hostUsers: false | ||
| {{- end }} | ||
| {{- if or $useSecComp .Values.initContainers }} | ||
| initContainers: | ||
| {{- if $useSecComp }} | ||
| - name: install-seccomp | ||
| image: busybox:1.37.0@sha256:b3255e7dfbcd10cb367af0d409747d511aeb66dfac98cf30e97e87e4207dd76f | ||
| securityContext: | ||
| allowPrivilegeEscalation: false | ||
| readOnlyRootFilesystem: true | ||
| capabilities: | ||
| drop: ["ALL"] | ||
| resources: | ||
| requests: | ||
| cpu: 1m | ||
| memory: 4Mi | ||
| limits: | ||
| cpu: 10m | ||
| memory: 16Mi | ||
| command: | ||
| - /bin/sh | ||
| - -c | ||
| - | | ||
| DEST="/host-seccomp/{{ .Values.codeExecutor.seccompLocalhostProfile }}" | ||
| mkdir -p "$(dirname "$DEST")" | ||
| cp /seccomp-profile/nsjail-seccomp.json "$DEST" | ||
| echo "seccomp profile installed at $DEST" | ||
| volumeMounts: | ||
| - name: seccomp-profile | ||
| mountPath: /seccomp-profile | ||
| - name: host-seccomp | ||
| mountPath: /host-seccomp | ||
| {{- end }} | ||
| {{- range $key, $value := .Values.initContainers }} | ||
| - name: "{{ $key }}" | ||
| {{ toYaml $value | indent 8 }} | ||
| - name: "{{ $key }}" | ||
| {{ toYaml $value | indent 10 }} | ||
| {{- end }} | ||
| {{- end }} | ||
| containers: | ||
| - name: code-executor | ||
| image: "{{ .Values.codeExecutor.image.repository }}:{{ include "retool.codeExecutor.image.tag" . }}" | ||
| imagePullPolicy: {{ .Values.image.pullPolicy }} | ||
| securityContext: | ||
| {{ if .Values.codeExecutor.securityContext }} | ||
| {{- if .Values.codeExecutor.securityContext }} | ||
| {{ toYaml .Values.codeExecutor.securityContext | indent 10 }} | ||
| {{ else }} | ||
| {{- else if $useSecComp }} | ||
| capabilities: | ||
| add: ["NET_ADMIN"] | ||
| procMount: Unmasked | ||
| seccompProfile: | ||
| type: Localhost | ||
| localhostProfile: {{ .Values.codeExecutor.seccompLocalhostProfile }} | ||
| {{- else }} | ||
| privileged: true | ||
| {{ end }} | ||
| {{- end }} | ||
|
Comment on lines 107 to +119
The seccomp security context adds |
||
| {{- if .Values.securityContext.extraContainerSecurityContext }} | ||
| {{ toYaml .Values.securityContext.extraContainerSecurityContext | indent 10 }} | ||
| {{- end }} | ||
|
|
@@ -128,6 +184,15 @@ spec: | |
| {{ tpl . $ | indent 6 }} | ||
| {{- end }} | ||
| volumes: | ||
| {{- if $useSecComp }} | ||
| - name: seccomp-profile | ||
| configMap: | ||
| name: {{ template "retool.fullname" . }}-code-executor-seccomp | ||
| - name: host-seccomp | ||
| hostPath: | ||
| path: /var/lib/kubelet/seccomp | ||
| type: DirectoryOrCreate | ||
| {{- end }} | ||
| {{- if .Values.codeExecutor.volumes }} | ||
| {{ toYaml .Values.codeExecutor.volumes | indent 8 }} | ||
| {{- end }} | ||
|
|
||
`hostUsers: false` creates a write-permission dependency on idmap mounts

The pod-level `hostUsers: false` enables user namespaces for the entire pod, including the `install-seccomp` init container. With user namespaces active, UID 0 inside the container maps to a non-privileged UID on the host (e.g. 65536), so the `/var/lib/kubelet/seccomp` directory, typically owned by host root, is only writable if the kubelet is using idmap mounts (Linux kernel ≥ 5.12 for partial support, ≥ 6.3 for full hostPath support). The jsExecutor's identical init container works without issue because it does not set `hostUsers: false`.

If the node kernel is older (even on a k8s 1.33 cluster), the init container will fail with a permission-denied error and the pod will never reach the running state. The two live-cluster validation items in the test plan remain unchecked; this is the exact failure mode they would surface.
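One possible stopgap for clusters whose nodes run an older kernel follows from the template itself: `$useSecComp` is only true when `codeExecutor.securityContext` is unset, so pinning a security context in values should select the first branch of the container `securityContext` block and skip the seccomp path entirely. A minimal sketch, assuming the values layout implied by the diff (the file name is hypothetical, and this has not been validated on a live cluster):

```yaml
# values-override.yaml (hypothetical file name)
# Setting codeExecutor.securityContext makes $useSecComp evaluate to false,
# so the chart renders this block as-is and omits hostUsers: false, the
# install-seccomp init container, the seccomp ConfigMap, and the hostPath volume.
codeExecutor:
  securityContext:
    privileged: true
```

This restores the pre-1.33 privileged behavior, so it trades the tighter sandbox for compatibility and is best treated as a temporary override.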