---
title: "Cilium mTLS Encryption Deep Dive"
date: 2026-04-24
description: "Learn how Cilium and ztunnel deliver transparent mutual TLS (mTLS) for AKS pod-to-pod traffic with HBONE encryption, workload identity, and no sidecars."
authors: ["quang-nguyen", "apurup-chevuru", "michael-zappa"]
tags:
- cilium
- networking
- security
- open-source
---

<!-- markdownlint-disable MD033 -->

import BrowserOnly from '@docusaurus/BrowserOnly';
import { useColorMode } from '@docusaurus/theme-common';
import animationUrl from './transparent-mtls-animation.html';

export function TransparentMtlsAnimation() {
  const { colorMode } = useColorMode();
  const theme = colorMode === 'light' ? 'light' : 'dark';
  return (
    <iframe
      src={`${animationUrl}?theme=${theme}`}
      width="100%"
      height="700"
      style={{border: "none"}}
      title="Interactive mTLS flow walkthrough"
      sandbox="allow-scripts allow-popups"
      referrerPolicy="no-referrer"
      loading="lazy"
    />
  );
}

[Cilium](https://github.com/cilium/cilium) is the eBPF data plane behind [Azure CNI Powered by Cilium](https://learn.microsoft.com/azure/aks/azure-cni-powered-by-cilium) and most security features in [Advanced Container Networking Services (ACNS)](https://learn.microsoft.com/azure/aks/advanced-container-networking-services-overview). Microsoft has evolved from a consumer of the project into an active upstream contributor, running Cilium at AKS scale and feeding fixes and hardening back to the community. Transport encryption has been one of our focus areas — and there's a gap worth closing: Kubernetes pod-to-pod traffic isn't encrypted by default, and Cilium's existing IPsec and WireGuard options are node-to-node, encrypting the wire without saying anything about which workload sent the packet.

**Cilium mTLS encryption**, now in [public preview](https://techcommunity.microsoft.com/blog/AzureNetworkingBlog/announcing-public-preview-cilium-mtls-encryption-for-azure-kubernetes-service/4504423) in ACNS, closes that gap. Pods get [SPIFFE](https://spiffe.io/) identities derived from their namespace and service account, and HBONE-tunneled traffic terminates at a node-local [ztunnel](https://github.com/istio/ztunnel) — no sidecars. This post walks through how it works end to end.

<!-- truncate -->

## Architecture Overview

The system has two halves: ztunnel as the **data plane**, and Cilium as the **control plane**.

**[ztunnel](https://github.com/istio/ztunnel)** is a lightweight L4 proxy written in Rust, originally built by Istio for [ambient mesh](https://istio.io/latest/docs/ambient/overview/). It runs as a DaemonSet — one instance per node — and transparently intercepts traffic from enrolled pods. Rather than building a new proxy, the Cilium community adopted ztunnel: it already solves the problem well, and reusing it means both ecosystems benefit from improvements. The public preview ships with a [Cilium fork of ztunnel](https://github.com/cilium/ztunnel) that adds SPIRE support and xDS-over-Unix-socket transport — both are in the process of being upstreamed to the Istio project.

Instead of Istio's istiod, the ztunnel data plane is programmed by **Cilium's control plane**, which consists of two components:

- The **Cilium agent** (per-node DaemonSet) handles the node-local relationship with ztunnel — enrolling pods, streaming workload discovery, setting up traffic interception, and optionally signing certificates. All communication with ztunnel happens over local Unix sockets.
- The **Cilium operator** (cluster-wide) handles SPIRE identity registration — when a namespace is enrolled, it creates SPIRE entries for all service accounts in that namespace so ztunnel can obtain certificates on their behalf.

Each workload's identity is derived from its Kubernetes namespace and service account, encoded as a [SPIFFE](https://spiffe.io/) ID:

```text
spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>
```

This identity is embedded in the workload's X.509 certificate and verified during every mTLS handshake. The tunneling protocol is [HBONE](https://istio.io/latest/docs/ambient/architecture/hbone/) (HTTP-Based Overlay Network Encapsulation): TCP streams are encapsulated in HTTP/2 CONNECT tunnels, with mTLS 1.3 securing the outer connection to provide forward secrecy and modern cipher suites by default.

Importantly, ztunnel does not replace Cilium's core datapath. Cilium's eBPF datapath continues to handle service routing and load balancing as it always has — ztunnel layers on top purely to provide encryption and identity for pod-to-pod traffic between enrolled namespaces.

![How transparent mTLS works: ztunnel encrypts traffic between enrolled pods while passing through traffic to non-enrolled destinations as plaintext.](./how-mtls-works.png)

### Interactive Walkthrough

Step through the full mTLS flow with code references in this interactive animation:

<BrowserOnly>
{() => <TransparentMtlsAnimation />}
</BrowserOnly>

## Namespace Enrollment

Enrollment is namespace-scoped. When you label a namespace:

```yaml
metadata:
  labels:
    io.cilium/mtls-enabled: "true"
```

The Cilium agent detects the change via a Kubernetes watch. Internally, enrolled namespaces are tracked in Cilium's in-memory state store (StateDB). A reconciliation controller — following the same watch-and-react pattern familiar from Kubernetes controllers — monitors this state and pushes all existing pod endpoints in newly enrolled namespaces as workload discovery events to ztunnel.

## The Three Control Plane Channels

Cilium and ztunnel communicate over three distinct channels, each with its own socket and protocol. All three are local to the node — there is no cross-node control plane traffic.

### xDS — Workload Address Discovery

The xDS server provides workload and service discovery over a Unix socket at `/var/run/cilium/xds.sock`. It implements the [Envoy Aggregated xDS (ADS) protocol in delta mode](https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/operations/dynamic_configuration#delta-grpc-xds) — the same protocol ztunnel already speaks with istiod, so no ztunnel-side changes were needed. For each enrolled endpoint, Cilium sends a workload record containing the pod's identity and IP addresses, marked with `TunnelProtocol=HBONE`. This is how ztunnel decides whether to encrypt traffic to a given destination or pass it through as plaintext.

### ZDS — Workload Lifecycle

The [Ztunnel Discovery Service (ZDS)](https://github.com/istio/ztunnel/blob/master/proto/zds.proto) is a simplified extension to xDS designed specifically for ztunnel; it manages the lifecycle of enrolled workloads over a Unix socket at `/var/run/cilium/ztunnel.sock`. When ztunnel connects, Cilium sends an initial snapshot of all enrolled pods — one message per pod, carrying its namespace, name, service account, and UID. Crucially, each message includes the pod's **network namespace file descriptor** passed as ancillary data via `unix.UnixRights()`. This is how ztunnel gains access to the pod's network namespace without being a sidecar. After the snapshot, incremental add/delete messages flow as pods are created or deleted.

### CA — Certificate Signing

Workload certificates — the X.509 credentials that enable mTLS — can come from two sources:

**Built-in CA**: The Cilium agent runs a gRPC server implementing `IstioCertificateService` on TCP port 15012 with TLS. Ztunnel sends Certificate Signing Requests (CSRs); the CA signs them using a pre-provisioned CA key. This mode is primarily useful for development and testing — the ACNS public preview does not use it.

**SPIRE**: For production environments, the built-in CA is disabled and ztunnel obtains certificates directly from a SPIRE agent, which provides attestation-based identity verification. SPIRE is the intended CA for production deployments and is what ships with the ACNS public preview. The operator-side SPIRE entry registration was upstreamed in [cilium/cilium#44136](https://github.com/cilium/cilium/pull/44136); ztunnel-side SPIRE support is being upstreamed to the Istio ztunnel project. See the [SPIRE Integration](#spire-integration--production-identity) section below for details.

## Traffic Interception

When a pod is enrolled via ZDS, the Cilium agent enters the pod's network namespace (using the file descriptor received during the ZDS handshake) and installs iptables rules. These rules transparently redirect the application's TCP traffic to ztunnel's listening ports.

The rules use two custom chains — `CILIUM_PREROUTING` and `CILIUM_OUTPUT` — in both the `mangle` and `nat` tables:

**Inbound traffic** (PREROUTING):

- Non-localhost TCP traffic without the inpod mark (`0x539`) is redirected to ztunnel's **inbound plaintext port** (15006). This is how ztunnel intercepts incoming traffic destined for the pod.
- Port 15008 — ztunnel's **HBONE inbound port** — is excluded from redirection since it already receives encrypted traffic directly from remote ztunnel instances.

**Outbound traffic** (OUTPUT):

- The tproxy connmark (`0x111`) is restored from the connection tracker — traffic already processed by ztunnel is allowed through without re-redirection.
- Self-addressed loopback traffic bypasses ztunnel (app-to-app on the same pod).
- DNS traffic is not redirected — the iptables rules only capture TCP, so UDP-based DNS bypasses ztunnel entirely.
- All other non-localhost TCP traffic without the inpod mark is redirected to ztunnel's **outbound port** (15001).

The combination of packet marks (`0x539` for "ztunnel processed" and `0x111` for "already redirected") prevents redirection loops — ztunnel marks its own outbound packets so they pass through the iptables rules untouched.

## SPIRE Integration — Production Identity

[SPIRE](https://spiffe.io/docs/latest/spire-about/) is the reference implementation of the SPIFFE standard — it issues short-lived X.509 certificates to workloads after verifying their identity through a process called attestation. The SPIRE integration is the most significant extension to the ztunnel feature, spanning both the Cilium operator and the ztunnel fork. Where the built-in CA simply signs any CSR it receives, SPIRE provides cryptographic attestation: it verifies the workload's identity by inspecting its running process before issuing a certificate.

### Operator Side — Registering Identities

When a namespace is enrolled, the Cilium operator iterates all ServiceAccounts in that namespace and registers them with the SPIRE server. For each service account, it creates a SPIRE entry:

```text
SpiffeId: spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>
ParentId: spiffe://<trust-domain>/ztunnel
Selectors: [k8s:ns:<namespace>, k8s:sa:<service-account>]
```

The `ParentId` of `/ztunnel` establishes the delegation chain — it scopes which workload entries ztunnel can fetch certificates for through the Delegated Identity API. When a namespace is unenrolled, the operator batch-deletes all associated SPIRE entries.

### Ztunnel Side — Certificate Acquisition via SPIRE

On the ztunnel side, a new `SpireClient` uses SPIRE's [Delegated Identity API](https://github.com/spiffe/spire/blob/main/proto/spire/api/agent/delegatedidentity/v1/delegatedidentity.proto) to fetch X.509 certificates for workloads. When ztunnel needs a certificate, it resolves the workload's pod UID to a container PID via the Container Runtime Interface (CRI) API, then passes that PID to SPIRE for attestation. SPIRE verifies the PID belongs to the claimed identity and returns an X.509 SVID (certificate + private key). As a security measure, ztunnel re-verifies the PID after attestation to guard against PID reuse races.

Certificates are cached per pod in SPIRE mode (since each pod has a distinct PID) and per identity with the built-in CA. In both modes, certificates are fetched on demand, when ztunnel first needs them for a connection.

## Permissive Mode

ztunnel operates in **permissive mode** by default, enabling incremental rollout without disrupting existing traffic:

- **Enrolled → Enrolled**: Traffic is encrypted via HBONE mTLS. Both sides verify each other's SPIFFE identity.
- **Enrolled → Non-enrolled**: ztunnel proxies the traffic but sends it as **plaintext**. No encryption, no disruption.
- **Non-enrolled → Enrolled**: Traffic arrives as plaintext. ztunnel on the destination node accepts it without requiring mTLS.

This means you can enroll namespaces one at a time. Enrolled pods can still reach external APIs, databases in non-enrolled namespaces, and any other service — traffic flows normally, just without encryption for those paths. A **strict mode** — where enrolled pods reject plaintext from non-enrolled peers — isn't implemented today, but is something we could consider based on customer feedback.

## Current Limitations and What's Next

The initial release focuses on **encryption and identity** — transparent mTLS for all traffic between enrolled pods. This is the foundation that everything else builds on.

What's **not yet supported**:

- **L4/L7 network policy enforcement**: L3 policies (IP-based) work with enrolled traffic, but L4 port-based rules are not yet enforced — ztunnel rewrites destination ports during redirection, so Cilium's eBPF datapath sees the rewritten port rather than the original. L7 policies are similarly not yet supported for ztunnel-proxied connections.
- **Identity-based authorization**: Fine-grained access control based on SPIFFE identity (e.g., allowing only specific service accounts to reach a backend) is not yet supported in Cilium.

The goal is a layered architecture: ztunnel handles encryption and identity-based authorization at L4, Cilium enforces network policy at the eBPF datapath, and [waypoint proxies](https://istio.io/latest/docs/ambient/overview/#layer-7-features) are an area we can explore for L7 traffic management in the future.

:::info Ready to try it?

Follow the [step-by-step guide](https://learn.microsoft.com/azure/aks/container-network-security-cilium-mutual-tls-how-to) to enable Cilium mTLS on your AKS cluster, track upstream development on [GitHub](https://github.com/cilium/cilium), and share feedback on the [AKS GitHub Issues](https://github.com/Azure/AKS/issues) page.

:::