4 changes: 2 additions & 2 deletions kagenti-operator/GETTING_STARTED.md
@@ -131,7 +131,7 @@ EOF
The controller will:
1. Resolve `targetRef` and verify the Deployment exists
2. Apply `kagenti.io/type: agent` and `app.kubernetes.io/managed-by: kagenti-operator` labels
-3. Compute a config hash from cluster/namespace/CR configuration and set it as a `kagenti.io/config-hash` annotation on the PodTemplateSpec
+3. Compute a config hash from cluster/namespace platform configuration and set it as a `kagenti.io/config-hash` annotation on the PodTemplateSpec
4. Trigger a rolling update — new Pods are created with the `kagenti.io/type` label, which the AuthBridge webhook matches to inject sidecars
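
The hash computation in step 3 can be sketched roughly as follows. This is a minimal illustration, not the operator's actual code: the function names `mergeLayers` and `configHash`, the map keys, and the `key=value` serialization are all assumptions made for the example. It only shows the idea of overlaying namespace defaults on cluster defaults and hashing the result deterministically with SHA256.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// mergeLayers overlays namespace defaults on top of cluster defaults.
// Hypothetical helper: namespace values win on key conflicts, and no
// AgentRuntime CR fields take part in the merge.
func mergeLayers(cluster, namespace map[string]string) map[string]string {
	merged := make(map[string]string, len(cluster)+len(namespace))
	for k, v := range cluster {
		merged[k] = v
	}
	for k, v := range namespace {
		merged[k] = v
	}
	return merged
}

// configHash returns a hex-encoded SHA256 over the sorted key/value pairs.
// Sorting the keys makes the result independent of map iteration order.
// The serialization format here is an assumption for illustration only.
func configHash(cfg map[string]string) string {
	keys := make([]string, 0, len(cfg))
	for k := range cfg {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	for _, k := range keys {
		// %q quotes both sides so "a=b" + "c" cannot collide with "a" + "b=c".
		fmt.Fprintf(&b, "%q=%q\n", k, cfg[k])
	}
	sum := sha256.Sum256([]byte(b.String()))
	return hex.EncodeToString(sum[:])
}

func main() {
	// Hypothetical keys and values, not taken from the real ConfigMaps.
	cluster := map[string]string{"trustDomain": "cluster.local", "socketPath": "/run/spire/agent.sock"}
	namespace := map[string]string{"trustDomain": "team1.example.org"}

	before := configHash(mergeLayers(cluster, namespace))
	namespace["trustDomain"] = "team1.example.com" // a namespace default changes
	after := configHash(mergeLayers(cluster, namespace))

	fmt.Println("hash changed:", before != after)
}
```

Because only the two platform layers feed the hash in this sketch, editing a CR field would leave the annotation value untouched, which matches the behavior this change documents.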

### Step 3: Check Status
@@ -156,7 +156,7 @@ kubectl get pods -n team1 -l kagenti.io/type=agent -o jsonpath='{.items[0].spec.

### Updating Configuration

-When you update the AgentRuntime CR (e.g., changing the trust domain), the controller recomputes the config hash and triggers a rolling update automatically:
+When you update the AgentRuntime CR (e.g., changing the trust domain), the webhook picks up the new values at pod CREATE time. CR spec changes do **not** trigger a rolling update — only platform config changes (cluster or namespace ConfigMaps) do:

```bash
kubectl patch agentruntime weather-agent-runtime -n team1 --type merge -p '
9 changes: 5 additions & 4 deletions kagenti-operator/docs/api-reference.md
@@ -398,11 +398,12 @@ Both resources use the shared `TargetRef` type to reference the backing workload

### Configuration Precedence

-The controller merges configuration from three layers (highest priority wins):
+The controller computes the config hash from two platform layers (highest priority wins):

-1. **AgentRuntime CR spec** — per-workload overrides (trust domain, etc.)
-2. **Namespace defaults** — ConfigMap with `kagenti.io/defaults=true` label in the workload's namespace
-3. **Cluster defaults** — `kagenti-platform-config` ConfigMap in `kagenti-system`
+1. **Namespace defaults** — ConfigMap with `kagenti.io/defaults=true` label in the workload's namespace
+2. **Cluster defaults** — `kagenti-platform-config` ConfigMap in `kagenti-system`

+> **Note:** Per-CR overrides (identity, authBridgeMode, mtlsMode) are **not** included in the controller's config hash. The webhook reads these fields at pod CREATE time.

> **Note:** Feature gates (`kagenti-feature-gates`) are platform-wide policy and are **not** overrideable by namespace defaults or AgentRuntime CRs. They control which AuthBridge components (envoy proxy, SPIFFE helper, client registration) are enabled globally, and whether skill discovery (`skillDiscovery`) is active.

10 changes: 5 additions & 5 deletions kagenti-operator/docs/architecture.md
@@ -41,7 +41,7 @@ The Kagenti Operator is a Kubernetes controller that implements the [Operator Pa
#### AgentRuntime CRD
- The declarative way to enroll a workload into the Kagenti platform
- Developer creates an AgentRuntime CR with `targetRef` — the controller applies labels and triggers injection
-- Configures identity (SPIFFE) per workload via 3-layer defaults (cluster → namespace → CR)
+- Platform-level config (cluster → namespace) drives the config hash; per-CR overrides are read by the webhook at pod CREATE time
- Uses `targetRef` to reference backing workloads (Deployment, StatefulSet)
- The `kagenti.io/type` label applied by the controller triggers the webhook's `objectSelector`
- Developer workloads only need a `protocol.kagenti.io/a2a` label — the controller applies `kagenti.io/type` and `managed-by` labels automatically
@@ -69,7 +69,7 @@ The Kagenti Operator is a Kubernetes controller that implements the [Operator Pa
#### AgentRuntime Controller
- Watches AgentRuntime CRs, Deployments, StatefulSets, and ConfigMaps
- Applies `kagenti.io/type` label and `kagenti.io/config-hash` annotation to target workloads
-- Computes config hash from 3-layer merged configuration (cluster defaults → namespace defaults → CR overrides)
+- Computes config hash from 2-layer merged configuration (cluster defaults → namespace defaults)
- Discovers linked skills by reading the `kagenti.io/skills` annotation from target workloads when the `skillDiscovery` feature gate is enabled
- Triggers rolling updates when configuration changes
- On CR deletion: removes type label, managed-by label and config-hash annotation (causing the workload to lose sidecars)
@@ -206,7 +206,7 @@ The AgentRuntime Controller reconciles AgentRuntime CRs by resolving the target
Note: feature gates are platform-wide policy — they are NOT
overrideable by namespace defaults or AgentRuntime CRs.
c. Read namespace defaults (ConfigMap with kagenti.io/defaults=true)
-d. Merge defaults: cluster → namespace → CR spec (CR wins)
+d. Merge defaults: cluster → namespace (2-layer, no CR fields)
e. Hash the merged result (deterministic SHA256)
f. Surface warnings (e.g., multiple namespace defaults ConfigMaps)
as a ConfigResolved condition on the AgentRuntime status
@@ -231,10 +231,10 @@ AgentRuntime CR created/updated

| Concern | Controller | Webhook |
|---------|-----------|---------|
-| Detect config change | Yes (3-layer merge + hash) | No |
+| Detect config change | Yes (2-layer merge + hash) | No |
| Trigger pod restart | Yes (annotation on PodTemplateSpec) | No |
| Read ConfigMap data | Yes (for hash computation) | Yes (for sidecar configuration) |
-| Merge config values | Yes (same 3-layer merge) | Yes (independently) |
+| Merge config values | Yes (2-layer platform config) | Yes (independently) |
| Mutate pod spec | No | Yes (sidecar injection) |

#### Watches
4 changes: 2 additions & 2 deletions kagenti-operator/docs/authbridge-webhook.md
@@ -125,7 +125,7 @@ Pod CREATE request
└─ Mount operator Keycloak Secret (if annotation present) → PATCH
```

-## Configuration Merge (3-Layer Config Resolution)
+## Configuration Merge (Webhook Config Resolution)

When the `perWorkloadConfigResolution` feature gate is enabled, the webhook resolves configuration values at admission time instead of deferring to kubelet's ConfigMapKeyRef/SecretKeyRef resolution. This merge happens in `ResolveConfig()` (`internal/webhook/injector/resolved_config.go`).
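
As a rough illustration of the admission-time merge described above, the sketch below overlays the layers in the documented order (PlatformConfig, then namespace ConfigMaps, then AgentRuntime CR overrides). It is not the real `ResolveConfig()` implementation: the `ResolvedValues` type, its fields, and the `overlay` and `resolve` helpers are hypothetical names chosen for the example, and the "non-empty value wins" rule is an assumption.

```go
package main

import "fmt"

// ResolvedValues is a hypothetical subset of the settings a sidecar needs.
type ResolvedValues struct {
	TrustDomain   string
	TraceEndpoint string
}

// overlay returns dst with every non-empty field of src applied on top.
func overlay(dst, src ResolvedValues) ResolvedValues {
	if src.TrustDomain != "" {
		dst.TrustDomain = src.TrustDomain
	}
	if src.TraceEndpoint != "" {
		dst.TraceEndpoint = src.TraceEndpoint
	}
	return dst
}

// resolve applies the layers from lowest to highest precedence.
// crOverride is nil when no AgentRuntime CR targets the workload,
// in which case only the two platform layers contribute.
func resolve(platform, namespace ResolvedValues, crOverride *ResolvedValues) ResolvedValues {
	out := overlay(platform, namespace)
	if crOverride != nil {
		out = overlay(out, *crOverride)
	}
	return out
}

func main() {
	// All values below are made up for the example.
	platform := ResolvedValues{TrustDomain: "cluster.local", TraceEndpoint: "otel-collector:4317"}
	namespace := ResolvedValues{TraceEndpoint: "team1-collector:4317"}
	cr := &ResolvedValues{TrustDomain: "team1.example.org"}

	fmt.Printf("with CR:    %+v\n", resolve(platform, namespace, cr))
	fmt.Printf("without CR: %+v\n", resolve(platform, namespace, nil))
}
```

In this sketch the CR override only takes effect when `resolve` runs, which mirrors the point that new CR values reach a Pod only when that Pod is created and admitted.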

@@ -285,7 +285,7 @@ Per-workload iptables overrides for proxy-init:
| `internal/webhook/injector/pod_mutator.go` | Central orchestrator — pre-filtering, precedence evaluation, AgentRuntime gate, container/volume injection |
| `internal/webhook/injector/precedence.go` | Per-sidecar 2-layer precedence chain (feature gate > workload label); opt-in semantics for client-registration |
| `internal/webhook/injector/keycloak_client_credentials.go` | Operator-managed Keycloak Secret volume mounts and reinvocation patch logic |
-| `internal/webhook/injector/resolved_config.go` | 3-layer config merge: PlatformConfig < namespace CMs < AgentRuntime CR |
+| `internal/webhook/injector/resolved_config.go` | Webhook config merge: PlatformConfig < namespace CMs < AgentRuntime CR overrides (at admission time) |
| `internal/webhook/injector/agentruntime_config.go` | Typed AgentRuntime CR lookup and override extraction |
| `internal/webhook/injector/namespace_config.go` | Reads well-known ConfigMaps from workload namespace |
| `internal/webhook/injector/container_builder.go` | Dual-mode container construction (ValueFrom vs literal env vars) |
46 changes: 20 additions & 26 deletions kagenti-operator/docs/controller-webhook-interaction.md
@@ -33,7 +33,7 @@ sequenceDiagram
Ctrl->>API: Get cluster defaults ConfigMap (kagenti-platform-config)
Ctrl->>API: Get feature gates ConfigMap (kagenti-feature-gates)
Ctrl->>API: List namespace defaults ConfigMaps (kagenti.io/defaults=true)
-Note over Ctrl: Merge config: cluster → namespace → CR spec
+Note over Ctrl: Merge config: cluster → namespace (2-layer)
Note over Ctrl: Compute SHA256 config hash

Ctrl->>API: Patch Deployment:<br/>+ kagenti.io/type label<br/>+ managed-by label<br/>+ config-hash annotation on PodTemplateSpec
@@ -63,16 +63,11 @@ sequenceDiagram
API-->>Ctrl: Reconcile event

Ctrl->>API: Get target Deployment
-Note over Ctrl: Recompute config hash with updated CR values
-Note over Ctrl: New hash ≠ old hash → update needed
+Note over Ctrl: Recompute config hash (2-layer, no CR fields)
+Note over Ctrl: Hash unchanged → no rolling update needed

-Ctrl->>API: Patch Deployment config-hash annotation
-API-->>K8s: PodTemplateSpec changed → rolling update
-K8s->>API: Create new Pod
-API->>WH: Mutating admission (Pod CREATE)
-Note over WH: Resolve config with updated AgentRuntime values
-WH-->>API: Patch Pod with updated sidecar config
-API-->>K8s: Pod created with new config
+Note over Ctrl: CR spec changes do NOT trigger rolling updates.
+Note over Ctrl: The webhook reads CR overrides at pod CREATE time.

Ctrl->>API: Update AgentRuntime status
```
@@ -91,12 +86,12 @@ sequenceDiagram

loop For each AgentRuntime targeting a workload
Ctrl->>API: Get target Deployment
-Note over Ctrl: Recompute config hash with new defaults
+Note over Ctrl: Recompute config hash (2-layer)
alt Hash changed
Ctrl->>API: Patch Deployment config-hash annotation
API-->>K8s: Rolling update → new Pods with updated sidecars
else Hash unchanged
-Note over Ctrl: No-op (CR overrides masked the change)
+Note over Ctrl: No-op (defaults unchanged)
end
Ctrl->>API: Update AgentRuntime status
end
@@ -128,39 +123,38 @@ sequenceDiagram

| Concern | Controller | Webhook |
|---------|-----------|---------|
-| Detect config change | Yes (3-layer merge + hash) | No |
+| Detect config change | Yes (2-layer merge + hash) | No |
| Trigger pod restart | Yes (annotation on PodTemplateSpec) | No |
| Read ConfigMap data | Yes (for hash computation) | Yes (for sidecar configuration) |
-| Merge config values | Yes (same 3-layer merge) | Yes (independently, same algorithm) |
+| Merge config values | Yes (2-layer platform config) | Yes (independently, includes CR overrides at admission time) |
| Mutate pod spec | No | Yes (sidecar injection) |
| Read AgentRuntime CR | Yes (primary resource) | Yes (for per-workload overrides) |
| Apply workload labels | Yes | No |
| Decide injection eligibility | No (encodes in labels) | Yes (objectSelector + precedence chain) |

-## 3-Layer Configuration Merge
+## 2-Layer Configuration Merge (Controller)

-Both the controller and webhook perform the same 3-layer configuration merge independently:
+The controller computes the config hash from platform-level configuration only (no CR fields):

```
┌──────────────────────────────────────┐
-│ Layer 3: AgentRuntime CR overrides │ ← highest precedence
-│ (spec.identity) │
-├──────────────────────────────────────┤
-│ Layer 2: Namespace defaults │
+│ Layer 2: Namespace defaults │ ← higher precedence
│ (ConfigMap with │
│ kagenti.io/defaults=true label) │
├──────────────────────────────────────┤
-│ Layer 1: Cluster defaults │ ← lowest precedence
+│ Layer 1: Cluster defaults │ ← lower precedence
│ (kagenti-platform-config in │
│ kagenti-system namespace) │
└──────────────────────────────────────┘
```

+**CR-level overrides** (type, identity, authBridgeMode, mtlsMode, skills) are **not** included in the controller's config hash. The webhook reads these fields at pod CREATE time.

**Feature gates** (`kagenti-feature-gates` ConfigMap) are platform-wide policy and are **not** part of the merge hierarchy. They control which sidecar components are enabled globally and cannot be overridden by namespace defaults or AgentRuntime CRs.

-The controller uses the merged config to compute a deterministic SHA256 hash. This hash is set as the `kagenti.io/config-hash` annotation on the workload's PodTemplateSpec. When any layer changes, the hash changes, which triggers a Kubernetes rolling update.
+The controller uses the merged config to compute a deterministic SHA256 hash. This hash is set as the `kagenti.io/config-hash` annotation on the workload's PodTemplateSpec. When platform config changes (cluster or namespace ConfigMaps), the hash changes, which triggers a Kubernetes rolling update. CR spec changes do **not** change the hash.

-The webhook performs the same merge at Pod CREATE time to resolve the actual configuration values used for sidecar container environment variables.
+The webhook performs its own merge at Pod CREATE time, including CR overrides, to resolve the actual configuration values used for sidecar container environment variables.

> **Note:** The controller and webhook use slightly different sources for layer 1. The controller reads the `kagenti-platform-config` ConfigMap from `kagenti-system` via the API server. The webhook uses compiled defaults overlaid with `/etc/kagenti/config.yaml` (PlatformConfig), which is designed to carry equivalent values. Both produce the same effective defaults in a correctly deployed cluster.

@@ -178,7 +172,7 @@ The webhook loads **PlatformConfig** at startup from compiled defaults overlaid
- SPIFFE trust domain and socket path
- Observability settings (trace endpoint, protocol, sampling)

-PlatformConfig is hot-reloaded via fsnotify when the config file changes. It forms **layer 1** (lowest precedence) of the 3-layer merge.
+PlatformConfig is hot-reloaded via fsnotify when the config file changes. It forms **layer 1** (lowest precedence) of the configuration merge.

### Feature Gates (Global Policy)

@@ -193,7 +187,7 @@ Feature gates are loaded from the `kagenti-feature-gates` ConfigMap (mounted at
| `injectTools` | `false` | Allow injection for `kagenti.io/type=tool` workloads |
| `perWorkloadConfigResolution` | `false` | Switch from ValueFrom refs to literal env var injection |

-Feature gates are **not** part of the 3-layer merge — they cannot be overridden by namespace defaults or AgentRuntime CRs.
+Feature gates are **not** part of the config merge — they cannot be overridden by namespace defaults or AgentRuntime CRs.

### Config Resolution Modes

@@ -206,7 +200,7 @@ When `perWorkloadConfigResolution` is **true**, the webhook resolves all config
When a workload has `kagenti.io/type` labels applied manually (without an AgentRuntime CR):

- The webhook still evaluates the workload for injection using PlatformConfig and feature gates
-- The AgentRuntime override layer (layer 3) is skipped — configuration comes from PlatformConfig (layer 1) and namespace ConfigMaps (layer 2) only
+- Configuration comes from PlatformConfig (layer 1) and namespace ConfigMaps (layer 2) only
- No controller manages the config hash — configuration drift is not detected automatically, and changes to cluster/namespace defaults do not trigger rolling updates
- The controller does not watch or reconcile these workloads
- Per-workload identity (SPIFFE trust domain) overrides are not available