Merged
6 changes: 3 additions & 3 deletions .agents/skills/debug-openshell-cluster/SKILL.md
@@ -132,7 +132,7 @@ Common findings:
helm -n openshell status openshell
helm -n openshell get values openshell
kubectl -n openshell get statefulset,pod,svc,pvc
-kubectl -n openshell logs statefulset/openshell --tail=200
+kubectl -n openshell logs statefulset/openshell -c openshell-gateway --tail=200
kubectl -n openshell rollout status statefulset/openshell
```

@@ -238,7 +238,7 @@ If the gateway is healthy but sandbox creation fails:
```bash
kubectl -n openshell get pods
kubectl -n openshell get events --sort-by=.lastTimestamp | tail -n 50
-kubectl -n openshell logs statefulset/openshell --tail=200
+kubectl -n openshell logs statefulset/openshell -c openshell-gateway --tail=200
```

Check the configured sandbox namespace:
@@ -286,7 +286,7 @@ openshell logs <sandbox-name>
| Docker or Podman sandbox never registers | Wrong callback endpoint or supervisor startup failure | Gateway logs and sandbox container logs |
| Docker GPU e2e fails before GPU sandbox comparison | NVIDIA CDI specs are missing or Docker has not discovered them | `docker info --format '{{json .DiscoveredDevices}}'`, `/etc/cdi`, `/var/run/cdi`, `nvidia-cdi-refresh.service` |
| Kubernetes gateway pod pending | PVC unbound, taint, selector, or insufficient resources | `kubectl -n openshell describe pod <pod>` |
-| Kubernetes gateway pod crash loops | Missing secret, bad DB URL, bad TLS config | `kubectl -n openshell logs statefulset/openshell` |
+| Kubernetes gateway pod crash loops | Missing secret, bad DB URL, bad TLS config | `kubectl -n openshell logs statefulset/openshell -c openshell-gateway` |
| CLI TLS error | Local mTLS bundle does not match server cert/CA | Check `~/.config/openshell/gateways/<name>/mtls/` |
| Image pull failure | Gateway or sandbox image cannot be pulled | Runtime events and image pull credentials |
| `K8s namespace not ready` with `envoy-gateway-openshell.yaml: the server could not find the requested resource` | Optional Gateway API manifest was applied without Envoy Gateway CRDs, or k3s Helm controller startup exceeded the namespace wait | Apply `deploy/kube/manifests/envoy-gateway-openshell.yaml` manually only after Envoy Gateway is installed and `grpcRoute` is enabled |
2 changes: 1 addition & 1 deletion deploy/helm/openshell/templates/statefulset.yaml
@@ -46,7 +46,7 @@ spec:
       securityContext:
         {{- toYaml .Values.podSecurityContext | nindent 8 }}
       containers:
-        - name: {{ .Chart.Name }}
+        - name: openshell-gateway
           securityContext:
             {{- toYaml .Values.securityContext | nindent 12 }}
           image: {{ include "openshell.image" . | quote }}
7 changes: 7 additions & 0 deletions deploy/helm/openshell/tests/gateway_config_test.yaml
@@ -18,6 +18,13 @@ tests:
       - exists:
           path: spec.template.metadata.annotations["checksum/gateway-config"]
 
+  - it: uses a stable gateway container name
+    template: templates/statefulset.yaml
+    asserts:
+      - equal:
+          path: spec.template.spec.containers[0].name
+          value: openshell-gateway
+
   - it: mounts the OIDC CA bundle when TLS is disabled
     template: templates/statefulset.yaml
     set:
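Reviewer note: the updated `kubectl logs` commands depend on the container being named `openshell-gateway` in the live StatefulSet. A minimal sketch of how that could be confirmed on a cluster before tailing logs is below. The helper names `gateway_container_name` and `gateway_logs` are hypothetical and not part of this change; the namespace and StatefulSet name default to `openshell`, as in the skill document, and the first container is assumed to be the gateway, matching the new chart test.

```shell
# Hypothetical helpers, not part of the PR.
# Args: [namespace] [statefulset]; both default to "openshell".

gateway_container_name() {
  # Print the name of the first container in the StatefulSet pod template.
  kubectl -n "${1:-openshell}" get statefulset "${2:-openshell}" \
    -o jsonpath='{.spec.template.spec.containers[0].name}'
}

gateway_logs() {
  # Tail logs from the container the live StatefulSet actually reports,
  # so the command also works on releases that predate the rename.
  name="$(gateway_container_name "$@")" || return 1
  kubectl -n "${1:-openshell}" logs "statefulset/${2:-openshell}" \
    -c "$name" --tail=200
}
```

On a release that includes this change, `gateway_container_name` should print `openshell-gateway`; on an older release it would presumably print the chart name instead, which is why `gateway_logs` looks the name up rather than hard-coding it.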