Merged
17 changes: 15 additions & 2 deletions deployments/charts/service/README.md
Original file line number Diff line number Diff line change
@@ -403,11 +403,24 @@ Envoy uses filesystem-based dynamic configuration (LDS/CDS). When the ConfigMap
| `gateway.networkPolicies.enabled` | Deploy NetworkPolicies restricting ingress to upstream pods | `false` |
| `gateway.networkPolicies.upstreams` | List of upstream pods to protect (name, podSelector, port) | See values.yaml |

#### TLS
#### Gateway → Upstream TLS

Traffic between the Envoy gateway and the upstream services (`osmo-service`, `osmo-router`, `osmo-agent`, `osmo-logger`) is encrypted by default. The UI intentionally stays on plain HTTP behind NetworkPolicy — Next.js does not natively serve TLS.

**Default — encryption without validation.** Each upstream service mints its own ephemeral self-signed cert in-process at startup (ECDSA P-256, ~1ms) and loads it into uvicorn's SSLContext via `--ssl_self_signed true`. Envoy connects with TLS but does *not* validate the cert. The wire is encrypted; identity verification is delegated to NetworkPolicy + Kubernetes RBAC. No CA management, no Secrets, no rotation — cert lifecycle is tied to process lifecycle.

**Externally-provisioned certs.** Point `gateway.tls.upstreamCerts.<service>` at an existing `kubernetes.io/tls` Secret containing `tls.crt` + `tls.key`. That Secret is mounted at `/etc/osmo/tls` and uvicorn loads it instead of self-signing. To make Envoy validate against a CA, set `gateway.tls.caSecret` to a Secret containing `ca.crt`. The chart does not create these Secrets; provision them however suits your environment (cert-manager, Vault CSI, sealed-secrets, manual `kubectl create secret tls`, etc.). The two knobs are independent: you can use external certs without validation, or validation alone (rarely useful, since Envoy would then reject the ephemeral self-signed certs, which do not chain to that CA), but typical "real" TLS sets both.

| Parameter | Description | Default |
|-----------|-------------|---------|
| `gateway.tls.enabled` | Generate self-signed certs for upstream TLS | `false` |
| `gateway.tls.enabled` | Encrypt gateway → upstream traffic. | `true` |
| `gateway.tls.upstreamCerts.service` | Existing `kubernetes.io/tls` Secret for `osmo-service`. Empty string ⇒ self-signed. | `""` |
| `gateway.tls.upstreamCerts.router` | Same, for `osmo-router`. | `""` |
| `gateway.tls.upstreamCerts.agent` | Same, for `osmo-agent`. | `""` |
| `gateway.tls.upstreamCerts.logger` | Same, for `osmo-logger`. | `""` |
| `gateway.tls.caSecret` | Existing Secret containing `ca.crt`. When set, Envoy validates upstreams against this CA; when empty, TLS is encryption-only. | `""` |

NetworkPolicy and TLS are independent: NetworkPolicy controls *who* can connect at L3/L4; TLS encrypts the bytes on the wire, above the transport layer. Run them together for defense in depth.
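
As an illustration of the externally-provisioned mode, a values override might look like the sketch below. The file name and every Secret name (`osmo-service-tls`, `osmo-upstream-ca`, and so on) are hypothetical placeholders; the chart does not create them, so they are assumed to already exist in the release namespace.

```yaml
# values-tls.yaml (hypothetical override file; all Secret names are placeholders)
gateway:
  tls:
    enabled: true                  # encrypt gateway -> upstream traffic (chart default)
    upstreamCerts:
      service: osmo-service-tls    # kubernetes.io/tls Secret with tls.crt + tls.key
      router: osmo-router-tls
      agent: osmo-agent-tls
      logger: osmo-logger-tls
    caSecret: osmo-upstream-ca     # Secret with ca.crt; Envoy validates upstreams against it
```

All four upstreams are given a cert here because `caSecret` applies to every TLS-enabled upstream cluster; an upstream left on `""` would keep self-signing and would be expected to fail validation against the CA. Dropping `caSecret` (leaving it `""`) keeps the same certs but returns Envoy to encryption-only mode.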

### Extensibility

89 changes: 73 additions & 16 deletions deployments/charts/service/templates/_gateway-envoy-config.tpl
@@ -70,7 +70,7 @@ data:
filename: /etc/ssl/envoy-certs/tls.key
{{- end }}

{{- if $gw.tls.enabled }}
{{- if and $gw.tls.enabled $gw.tls.caSecret }}
sds_upstream_ca.yaml: |
resources:
- "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret
@@ -578,14 +578,24 @@ data:
name: envoy.transport_sockets.tls
typed_config:
"@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
sni: {{ $gw.upstreams.service.host }}
common_tls_context:
{{/* Envoy 1.29 upstream defaults to TLS 1.2 max. uvicorn's
SSLContext uses Python defaults (TLS 1.2 floor, 1.3 if
the openssl version supports it). Allow up to 1.3 so
negotiation can pick the most compatible option. */}}
tls_params:
tls_minimum_protocol_version: TLSv1_2
tls_maximum_protocol_version: TLSv1_3
{{- if $gw.tls.caSecret }}
validation_context_sds_secret_config:
name: upstream_ca
sds_config:
path_config_source:
path: /var/config/sds_upstream_ca.yaml
watched_directory:
path: /var/config
{{- end }}
{{- end }}

{{- if $gw.upstreams.router.enabled }}
@@ -611,14 +621,24 @@ data:
name: envoy.transport_sockets.tls
typed_config:
"@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
sni: {{ $gw.upstreams.router.host }}
common_tls_context:
{{/* Envoy 1.29 upstream defaults to TLS 1.2 max. uvicorn's
SSLContext uses Python defaults (TLS 1.2 floor, 1.3 if
the openssl version supports it). Allow up to 1.3 so
negotiation can pick the most compatible option. */}}
tls_params:
tls_minimum_protocol_version: TLSv1_2
tls_maximum_protocol_version: TLSv1_3
{{- if $gw.tls.caSecret }}
validation_context_sds_secret_config:
name: upstream_ca
sds_config:
path_config_source:
path: /var/config/sds_upstream_ca.yaml
watched_directory:
path: /var/config
{{- end }}
{{- end }}
{{- end }}

@@ -638,20 +658,12 @@ data:
socket_address:
address: {{ $gw.upstreams.ui.host }}
port_value: {{ $gw.upstreams.ui.port }}
{{- if $gw.tls.enabled }}
transport_socket:
name: envoy.transport_sockets.tls
typed_config:
"@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
common_tls_context:
validation_context_sds_secret_config:
name: upstream_ca
sds_config:
path_config_source:
path: /var/config/sds_upstream_ca.yaml
watched_directory:
path: /var/config
{{- end }}
{{/*
UI traffic stays HTTP — Next.js does not natively serve HTTPS and
the UI sits behind NetworkPolicy. Confidentiality of the UI HTML
relies on browser → gateway TLS (gateway.envoy.ssl.enabled), not on
Envoy → upstream TLS.
*/}}
{{- end }}

{{- if $gw.upstreams.agent.enabled }}
@@ -675,14 +687,24 @@ data:
name: envoy.transport_sockets.tls
typed_config:
"@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
sni: {{ $gw.upstreams.agent.host }}
common_tls_context:
{{/* Envoy 1.29 upstream defaults to TLS 1.2 max. uvicorn's
SSLContext uses Python defaults (TLS 1.2 floor, 1.3 if
the openssl version supports it). Allow up to 1.3 so
negotiation can pick the most compatible option. */}}
tls_params:
tls_minimum_protocol_version: TLSv1_2
tls_maximum_protocol_version: TLSv1_3
{{- if $gw.tls.caSecret }}
validation_context_sds_secret_config:
name: upstream_ca
sds_config:
path_config_source:
path: /var/config/sds_upstream_ca.yaml
watched_directory:
path: /var/config
{{- end }}
{{- end }}
{{- end }}

@@ -707,14 +729,24 @@ data:
name: envoy.transport_sockets.tls
typed_config:
"@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
sni: {{ $gw.upstreams.logger.host }}
common_tls_context:
{{/* Envoy 1.29 upstream defaults to TLS 1.2 max. uvicorn's
SSLContext uses Python defaults (TLS 1.2 floor, 1.3 if
the openssl version supports it). Allow up to 1.3 so
negotiation can pick the most compatible option. */}}
tls_params:
tls_minimum_protocol_version: TLSv1_2
tls_maximum_protocol_version: TLSv1_3
{{- if $gw.tls.caSecret }}
validation_context_sds_secret_config:
name: upstream_ca
sds_config:
path_config_source:
path: /var/config/sds_upstream_ca.yaml
watched_directory:
path: /var/config
{{- end }}
{{- end }}
{{- end }}

@@ -805,6 +837,7 @@ data:
{{- end }}

{{- if $envoy.internalJwks.enabled }}
{{- $jwksHost := $envoy.internalJwks.host | default $gw.upstreams.service.host }}
- "@type": type.googleapis.com/envoy.config.cluster.v3.Cluster
name: {{ $envoy.internalJwks.cluster }}
connect_timeout: 3s
@@ -818,8 +851,32 @@ data:
- endpoint:
address:
socket_address:
address: {{ $envoy.internalJwks.host | default $gw.upstreams.service.host }}
address: {{ $jwksHost }}
port_value: {{ $envoy.internalJwks.port | default $gw.upstreams.service.port }}
{{- if $gw.tls.enabled }}
transport_socket:
name: envoy.transport_sockets.tls
typed_config:
"@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
sni: {{ $jwksHost }}
common_tls_context:
{{/* Envoy 1.29 upstream defaults to TLS 1.2 max. uvicorn's
SSLContext uses Python defaults (TLS 1.2 floor, 1.3 if
the openssl version supports it). Allow up to 1.3 so
negotiation can pick the most compatible option. */}}
tls_params:
tls_minimum_protocol_version: TLSv1_2
tls_maximum_protocol_version: TLSv1_3
{{- if $gw.tls.caSecret }}
validation_context_sds_secret_config:
name: upstream_ca
sds_config:
path_config_source:
path: /var/config/sds_upstream_ca.yaml
watched_directory:
path: /var/config
{{- end }}
{{- end }}
{{- end }}

{{- end }}
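
For reference, a sketch of what a TLS-enabled upstream cluster's transport socket would be expected to render to in the default encryption-only mode (`gateway.tls.enabled: true`, `gateway.tls.caSecret: ""`). The SNI value `osmo-service` is an assumed host for illustration; the real value comes from `gateway.upstreams.service.host`.

```yaml
# Rendered sketch: encryption-only upstream TLS (no caSecret configured).
# "osmo-service" is a placeholder for gateway.upstreams.service.host.
transport_socket:
  name: envoy.transport_sockets.tls
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
    sni: osmo-service
    common_tls_context:
      tls_params:
        tls_minimum_protocol_version: TLSv1_2
        tls_maximum_protocol_version: TLSv1_3
      # No validation context here: Envoy encrypts but does not verify the upstream cert.
```

When `gateway.tls.caSecret` is set, the same block additionally carries `validation_context_sds_secret_config` under `common_tls_context`, pointing at `/var/config/sds_upstream_ca.yaml`, and the `sds_upstream_ca.yaml` ConfigMap entry is emitted alongside it.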
63 changes: 63 additions & 0 deletions deployments/charts/service/templates/_gateway-helpers.tpl
@@ -30,3 +30,66 @@ app.kubernetes.io/name: {{ include "osmo.gateway-name" .context }}
app.kubernetes.io/instance: {{ .context.Release.Name }}
app.kubernetes.io/component: {{ .component }}
{{- end }}

{{/*
Per-upstream TLS args. Pass a dict with "context" and "secretName".

When secretName is non-empty, that Secret is mounted at /etc/osmo/tls and
uvicorn loads tls.crt + tls.key from there (--ssl_keyfile / --ssl_certfile).
When empty, the Python service mints an ephemeral self-signed cert in
process at startup (--ssl_self_signed true) — no chart-side cert material.
*/}}
{{- define "osmo.upstream-tls-args" -}}
{{- if .context.Values.gateway.tls.enabled }}
{{- if .secretName }}
- --ssl_keyfile
- /etc/osmo/tls/tls.key
- --ssl_certfile
- /etc/osmo/tls/tls.crt
{{- else }}
- --ssl_self_signed
- "true"
{{- end }}
{{- end }}
{{- end }}

{{/*
TLS volume mount for an upstream container. Only emitted when a Secret
name is provided — self-signed mode keeps cert material in an in-process
tempdir, so no mount is needed.
*/}}
{{- define "osmo.upstream-tls-volume-mount" -}}
{{- if and .context.Values.gateway.tls.enabled .secretName }}
- name: tls
mountPath: /etc/osmo/tls
readOnly: true
{{- end }}
{{- end }}

{{/*
TLS volume for an upstream pod. Pass dict with "context" and "secretName".
Only emitted when secretName is non-empty.
*/}}
{{- define "osmo.upstream-tls-volume" -}}
{{- if and .context.Values.gateway.tls.enabled .secretName }}
- name: tls
secret:
secretName: {{ .secretName }}
{{- end }}
{{- end }}

{{/*
Render a probe block, injecting `scheme: HTTPS` into httpGet when TLS is on.
Pass dict with "probe" (the probe value from Values) and "context" ($).

Use:
livenessProbe:
{{- include "osmo.upstream-probe-yaml" (dict "probe" .Values.services.service.livenessProbe "context" .) | nindent 10 }}
*/}}
{{- define "osmo.upstream-probe-yaml" -}}
{{- $probe := .probe }}
{{- if and $probe .context.Values.gateway.tls.enabled (hasKey $probe "httpGet") }}
{{- $probe = mustMergeOverwrite (deepCopy $probe) (dict "httpGet" (dict "scheme" "HTTPS")) }}
{{- end }}
{{- toYaml $probe }}
{{- end }}
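
To make the four helpers above concrete, here is a sketch of the fragments they would be expected to emit for one upstream, assuming `gateway.tls.enabled: true` and `gateway.tls.upstreamCerts.agent: osmo-agent-tls`. The Secret name and the input probe values are hypothetical; the parent keys (`args`, `volumeMounts`, `volumes`, `livenessProbe`) come from the calling templates, not from the helpers themselves.

```yaml
# Rendered sketch; "osmo-agent-tls" is a placeholder Secret name.

# osmo.upstream-tls-args
args:
  - --ssl_keyfile
  - /etc/osmo/tls/tls.key
  - --ssl_certfile
  - /etc/osmo/tls/tls.crt

# osmo.upstream-tls-volume-mount
volumeMounts:
  - name: tls
    mountPath: /etc/osmo/tls
    readOnly: true

# osmo.upstream-tls-volume
volumes:
  - name: tls
    secret:
      secretName: osmo-agent-tls

# osmo.upstream-probe-yaml, for a hypothetical input probe of
#   httpGet: {path: /api/version, port: 8000}
#   periodSeconds: 10
livenessProbe:
  httpGet:
    path: /api/version
    port: 8000
    scheme: HTTPS          # injected because TLS is enabled
  periodSeconds: 10
```

With the Secret name left empty, the args helper emits `--ssl_self_signed` and `"true"` instead, and the volume and volume-mount helpers emit nothing. With `gateway.tls.enabled: false`, none of the helpers add anything and the probe is rendered unchanged.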
8 changes: 7 additions & 1 deletion deployments/charts/service/templates/agent-service.yaml
@@ -130,6 +130,7 @@ spec:
{{- range $arg := .Values.services.agent.extraArgs }}
- {{ $arg | quote }}
{{- end }}
{{- include "osmo.upstream-tls-args" (dict "context" . "secretName" .Values.gateway.tls.upstreamCerts.agent) | nindent 8 }}
env:
{{- if .Values.services.migration.enabled }}
- name: OSMO_SCHEMA_VERSION
@@ -154,7 +155,7 @@ spec:
{{- end }}
imagePullPolicy: {{ .Values.services.agent.imagePullPolicy }}
ports:
{{- if or .Values.services.configFile.enabled .Values.global.logs.enabled .Values.services.configs.enabled .Values.services.agent.extraVolumeMounts }}
{{- if or .Values.services.configFile.enabled .Values.global.logs.enabled .Values.services.configs.enabled .Values.services.agent.extraVolumeMounts (and .Values.gateway.tls.enabled .Values.gateway.tls.upstreamCerts.agent) }}
volumeMounts:
{{- end }}
{{- if .Values.services.configFile.enabled}}
@@ -168,6 +169,7 @@ spec:
mountPath: /logs
{{- end }}
{{- include "osmo.extra-volume-mounts" .Values.services.agent | nindent 8 }}
{{- include "osmo.upstream-tls-volume-mount" (dict "context" . "secretName" .Values.gateway.tls.upstreamCerts.agent) | nindent 8 }}
resources:
{{- toYaml .Values.services.agent.resources | nindent 10 }}

@@ -202,6 +204,9 @@ spec:
httpGet:
port: 8000
path: /health
{{- if .Values.gateway.tls.enabled }}
scheme: HTTPS
{{- end }}
periodSeconds: 45
failureThreshold: 3
timeoutSeconds: 20
@@ -210,6 +215,7 @@ spec:
{{- include "osmo.extra-sidecars" .Values.services.agent | nindent 6 }}
volumes:
{{- include "osmo.extra-volumes" .Values.services.agent | nindent 8 }}
{{- include "osmo.upstream-tls-volume" (dict "context" . "secretName" .Values.gateway.tls.upstreamCerts.agent) | nindent 8 }}
{{- if .Values.global.logs.enabled }}
- name: logs
emptyDir: {}
13 changes: 11 additions & 2 deletions deployments/charts/service/templates/api-service.yaml
@@ -152,6 +152,7 @@ spec:
{{- range $arg := .Values.services.service.extraArgs }}
- {{ $arg | quote }}
{{- end }}
{{- include "osmo.upstream-tls-args" (dict "context" . "secretName" .Values.gateway.tls.upstreamCerts.service) | nindent 8 }}
env:
- name: OSMO_DISABLE_TASK_METRICS
value: {{ .Values.services.service.disableTaskMetrics | quote }}
@@ -193,7 +194,7 @@ spec:
ports:
- name: metrics
containerPort: 9464
{{- if or .Values.services.configFile.enabled .Values.global.logs.enabled .Values.services.configs.enabled .Values.services.service.extraVolumeMounts }}
{{- if or .Values.services.configFile.enabled .Values.global.logs.enabled .Values.services.configs.enabled .Values.services.service.extraVolumeMounts (and .Values.gateway.tls.enabled .Values.gateway.tls.upstreamCerts.service) }}
volumeMounts:
{{- end }}
{{- if .Values.services.configFile.enabled}}
@@ -207,19 +208,23 @@ spec:
mountPath: /logs
{{- end }}
{{- include "osmo.extra-volume-mounts" .Values.services.service | nindent 8 }}
{{- include "osmo.upstream-tls-volume-mount" (dict "context" . "secretName" .Values.gateway.tls.upstreamCerts.service) | nindent 8 }}
resources:
{{- toYaml .Values.services.service.resources | nindent 10 }}

# Any failure to return the version api means the service is in a bad state
livenessProbe:
{{- toYaml .Values.services.service.livenessProbe | nindent 10 }}
{{- include "osmo.upstream-probe-yaml" (dict "probe" .Values.services.service.livenessProbe "context" .) | nindent 10 }}


# Give the container 30 seconds to startup
startupProbe:
httpGet:
port: 8000
path: /api/version
{{- if .Values.gateway.tls.enabled }}
scheme: HTTPS
{{- end }}
failureThreshold: 6
periodSeconds: 5
timeoutSeconds: 3
@@ -230,6 +235,9 @@ spec:
httpGet:
port: 8000
path: /api/workflow?limit=0&all_pools=true
{{- if .Values.gateway.tls.enabled }}
scheme: HTTPS
{{- end }}
httpHeaders:
- name: x-osmo-roles
value: osmo-admin
@@ -240,6 +248,7 @@ spec:
{{- include "osmo.extra-sidecars" .Values.services.service | nindent 6 }}
volumes:
{{- include "osmo.extra-volumes" .Values.services.service | nindent 8 }}
{{- include "osmo.upstream-tls-volume" (dict "context" . "secretName" .Values.gateway.tls.upstreamCerts.service) | nindent 8 }}
{{- if .Values.global.logs.enabled }}
- name: logs
emptyDir: {}
6 changes: 3 additions & 3 deletions deployments/charts/service/templates/gateway.yaml
@@ -84,7 +84,7 @@ spec:
- mountPath: /var/config
name: envoy-config
readOnly: true
{{- if $gw.tls.enabled }}
{{- if and $gw.tls.enabled $gw.tls.caSecret }}
- name: gateway-tls-ca
mountPath: /etc/gateway-tls
readOnly: true
@@ -112,10 +112,10 @@ spec:
- name: envoy-config
configMap:
name: {{ $gwName }}-envoy-config
{{- if $gw.tls.enabled }}
{{- if and $gw.tls.enabled $gw.tls.caSecret }}
- name: gateway-tls-ca
secret:
secretName: {{ $gwName }}-ca-tls
secretName: {{ $gw.tls.caSecret }}
items:
- key: ca.crt
path: ca.crt