diff --git a/kagenti-operator/GETTING_STARTED.md b/kagenti-operator/GETTING_STARTED.md
index c495f856..e67b2377 100644
--- a/kagenti-operator/GETTING_STARTED.md
+++ b/kagenti-operator/GETTING_STARTED.md
@@ -131,7 +131,7 @@ EOF
The controller will:
1. Resolve `targetRef` and verify the Deployment exists
2. Apply `kagenti.io/type: agent` and `app.kubernetes.io/managed-by: kagenti-operator` labels
-3. Compute a config hash from cluster/namespace/CR configuration and set it as a `kagenti.io/config-hash` annotation on the PodTemplateSpec
+3. Compute a config hash from cluster/namespace platform configuration and set it as a `kagenti.io/config-hash` annotation on the PodTemplateSpec
4. Trigger a rolling update — new Pods are created with the `kagenti.io/type` label, which the AuthBridge webhook matches to inject sidecars
### Step 3: Check Status
@@ -156,7 +156,7 @@ kubectl get pods -n team1 -l kagenti.io/type=agent -o jsonpath='{.items[0].spec.
### Updating Configuration
-When you update the AgentRuntime CR (e.g., changing the trust domain), the controller recomputes the config hash and triggers a rolling update automatically:
+When you update the AgentRuntime CR (e.g., changing the trust domain), the controller does **not** trigger a rolling update: the config hash covers only platform config (cluster or namespace ConfigMaps). The webhook reads the new CR values at pod CREATE time, so running Pods keep their previous configuration until they are recreated. For example:
```bash
kubectl patch agentruntime weather-agent-runtime -n team1 --type merge -p '
diff --git a/kagenti-operator/docs/api-reference.md b/kagenti-operator/docs/api-reference.md
index 72d4253b..ed719b06 100644
--- a/kagenti-operator/docs/api-reference.md
+++ b/kagenti-operator/docs/api-reference.md
@@ -398,11 +398,12 @@ Both resources use the shared `TargetRef` type to reference the backing workload
### Configuration Precedence
-The controller merges configuration from three layers (highest priority wins):
+The controller computes the config hash from two platform layers, listed from highest to lowest priority (the higher layer wins on conflicting keys):
-1. **AgentRuntime CR spec** — per-workload overrides (trust domain, etc.)
-2. **Namespace defaults** — ConfigMap with `kagenti.io/defaults=true` label in the workload's namespace
-3. **Cluster defaults** — `kagenti-platform-config` ConfigMap in `kagenti-system`
+1. **Namespace defaults** — ConfigMap with `kagenti.io/defaults=true` label in the workload's namespace
+2. **Cluster defaults** — `kagenti-platform-config` ConfigMap in `kagenti-system`
+
+> **Note:** Per-CR overrides (type, identity, authBridgeMode, mtlsMode) are **not** included in the controller's config hash, so editing them does not trigger a rolling update. The webhook reads these fields at pod CREATE time.
> **Note:** Feature gates (`kagenti-feature-gates`) are platform-wide policy and are **not** overrideable by namespace defaults or AgentRuntime CRs. They control which AuthBridge components (envoy proxy, SPIFFE helper, client registration) are enabled globally, and whether skill discovery (`skillDiscovery`) is active.
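Reviewer sketch of the two-layer precedence described in this hunk. It is a minimal illustration, not the operator's code: `mergeDefaults` is a hypothetical helper, and the keys are borrowed from the test fixtures elsewhere in this diff.

```go
package main

import "fmt"

// mergeDefaults overlays namespace defaults on top of cluster defaults.
// Hypothetical helper for illustration only; the operator's real merge
// lives in resolveConfig and is not reproduced here.
func mergeDefaults(cluster, namespace map[string]string) map[string]string {
	merged := make(map[string]string, len(cluster)+len(namespace))
	for k, v := range cluster {
		merged[k] = v
	}
	for k, v := range namespace {
		merged[k] = v // namespace wins on conflicting keys
	}
	return merged
}

func main() {
	cluster := map[string]string{
		"otel-endpoint":       "cluster-collector:4317",
		"spiffe-trust-domain": "cluster.local",
	}
	namespace := map[string]string{"otel-endpoint": "ns-collector:4317"}

	merged := mergeDefaults(cluster, namespace)
	fmt.Println(merged["otel-endpoint"])       // value taken from the namespace layer
	fmt.Println(merged["spiffe-trust-domain"]) // value inherited from the cluster layer
}
```

No AgentRuntime field appears in the inputs, which is the point of the change: nothing on the CR can alter the merged map that feeds the hash.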
diff --git a/kagenti-operator/docs/architecture.md b/kagenti-operator/docs/architecture.md
index 814bd045..7a022f4f 100644
--- a/kagenti-operator/docs/architecture.md
+++ b/kagenti-operator/docs/architecture.md
@@ -41,7 +41,7 @@ The Kagenti Operator is a Kubernetes controller that implements the [Operator Pa
#### AgentRuntime CRD
- The declarative way to enroll a workload into the Kagenti platform
- Developer creates an AgentRuntime CR with `targetRef` — the controller applies labels and triggers injection
-- Configures identity (SPIFFE) per workload via 3-layer defaults (cluster → namespace → CR)
+- Platform-level config (cluster → namespace) drives the config hash; per-CR overrides are read by the webhook at pod CREATE time
- Uses `targetRef` to reference backing workloads (Deployment, StatefulSet)
- The `kagenti.io/type` label applied by the controller triggers the webhook's `objectSelector`
- Developer workloads only need a `protocol.kagenti.io/a2a` label — the controller applies `kagenti.io/type` and `managed-by` labels automatically
@@ -69,7 +69,7 @@ The Kagenti Operator is a Kubernetes controller that implements the [Operator Pa
#### AgentRuntime Controller
- Watches AgentRuntime CRs, Deployments, StatefulSets, and ConfigMaps
- Applies `kagenti.io/type` label and `kagenti.io/config-hash` annotation to target workloads
-- Computes config hash from 3-layer merged configuration (cluster defaults → namespace defaults → CR overrides)
+- Computes config hash from 2-layer merged configuration (cluster defaults → namespace defaults)
- Discovers linked skills by reading the `kagenti.io/skills` annotation from target workloads when the `skillDiscovery` feature gate is enabled
- Triggers rolling updates when configuration changes
- On CR deletion: removes type label, managed-by label and config-hash annotation (causing the workload to lose sidecars)
@@ -206,7 +206,7 @@ The AgentRuntime Controller reconciles AgentRuntime CRs by resolving the target
Note: feature gates are platform-wide policy — they are NOT
overrideable by namespace defaults or AgentRuntime CRs.
c. Read namespace defaults (ConfigMap with kagenti.io/defaults=true)
- d. Merge defaults: cluster → namespace → CR spec (CR wins)
+ d. Merge defaults: cluster → namespace (namespace wins; CR fields are excluded)
e. Hash the merged result (deterministic SHA256)
f. Surface warnings (e.g., multiple namespace defaults ConfigMaps)
as a ConfigResolved condition on the AgentRuntime status
@@ -231,10 +231,10 @@ AgentRuntime CR created/updated
| Concern | Controller | Webhook |
|---------|-----------|---------|
-| Detect config change | Yes (3-layer merge + hash) | No |
+| Detect config change | Yes (2-layer merge + hash) | No |
| Trigger pod restart | Yes (annotation on PodTemplateSpec) | No |
| Read ConfigMap data | Yes (for hash computation) | Yes (for sidecar configuration) |
-| Merge config values | Yes (same 3-layer merge) | Yes (independently) |
+| Merge config values | Yes (2-layer platform config) | Yes (independently) |
| Mutate pod spec | No | Yes (sidecar injection) |
#### Watches
diff --git a/kagenti-operator/docs/authbridge-webhook.md b/kagenti-operator/docs/authbridge-webhook.md
index a7daec8d..acef64be 100644
--- a/kagenti-operator/docs/authbridge-webhook.md
+++ b/kagenti-operator/docs/authbridge-webhook.md
@@ -125,7 +125,7 @@ Pod CREATE request
└─ Mount operator Keycloak Secret (if annotation present) → PATCH
```
-## Configuration Merge (3-Layer Config Resolution)
+## Configuration Merge (Webhook Config Resolution)
When the `perWorkloadConfigResolution` feature gate is enabled, the webhook resolves configuration values at admission time instead of deferring to kubelet's ConfigMapKeyRef/SecretKeyRef resolution. This merge happens in `ResolveConfig()` (`internal/webhook/injector/resolved_config.go`).
@@ -285,7 +285,7 @@ Per-workload iptables overrides for proxy-init:
| `internal/webhook/injector/pod_mutator.go` | Central orchestrator — pre-filtering, precedence evaluation, AgentRuntime gate, container/volume injection |
| `internal/webhook/injector/precedence.go` | Per-sidecar 2-layer precedence chain (feature gate > workload label); opt-in semantics for client-registration |
| `internal/webhook/injector/keycloak_client_credentials.go` | Operator-managed Keycloak Secret volume mounts and reinvocation patch logic |
-| `internal/webhook/injector/resolved_config.go` | 3-layer config merge: PlatformConfig < namespace CMs < AgentRuntime CR |
+| `internal/webhook/injector/resolved_config.go` | Webhook config merge: PlatformConfig < namespace CMs < AgentRuntime CR overrides (at admission time) |
| `internal/webhook/injector/agentruntime_config.go` | Typed AgentRuntime CR lookup and override extraction |
| `internal/webhook/injector/namespace_config.go` | Reads well-known ConfigMaps from workload namespace |
| `internal/webhook/injector/container_builder.go` | Dual-mode container construction (ValueFrom vs literal env vars) |
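Reviewer sketch of the admission-time precedence stated for `resolved_config.go` (PlatformConfig < namespace CMs < AgentRuntime CR overrides). Everything here is an assumption made for illustration: the `crOverrides` type, the `resolveTrustDomain` function, and the `spiffe-trust-domain` key are hypothetical and do not claim to match `ResolveConfig()`.

```go
package main

import "fmt"

// crOverrides stands in for the per-workload values an AgentRuntime CR
// may carry. Hypothetical type, not the webhook's real one.
type crOverrides struct {
	TrustDomain string
}

// resolveTrustDomain applies the three webhook layers in ascending
// precedence: platform config, then namespace ConfigMaps, then the CR.
// A nil cr models a workload that has no AgentRuntime CR.
func resolveTrustDomain(platform, namespace map[string]string, cr *crOverrides) string {
	const key = "spiffe-trust-domain" // assumed key name
	value := platform[key]
	if v, ok := namespace[key]; ok && v != "" {
		value = v
	}
	if cr != nil && cr.TrustDomain != "" {
		value = cr.TrustDomain
	}
	return value
}

func main() {
	platform := map[string]string{"spiffe-trust-domain": "cluster.local"}
	namespace := map[string]string{"spiffe-trust-domain": "team1.local"}

	fmt.Println(resolveTrustDomain(platform, nil, nil))
	fmt.Println(resolveTrustDomain(platform, namespace, nil))
	fmt.Println(resolveTrustDomain(platform, namespace, &crOverrides{TrustDomain: "my-domain.org"}))
}
```

Because this resolution runs only when a Pod is created, a CR edit changes the result for new Pods and leaves running Pods untouched.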
diff --git a/kagenti-operator/docs/controller-webhook-interaction.md b/kagenti-operator/docs/controller-webhook-interaction.md
index 519c04db..cb5361c3 100644
--- a/kagenti-operator/docs/controller-webhook-interaction.md
+++ b/kagenti-operator/docs/controller-webhook-interaction.md
@@ -33,7 +33,7 @@ sequenceDiagram
Ctrl->>API: Get cluster defaults ConfigMap (kagenti-platform-config)
Ctrl->>API: Get feature gates ConfigMap (kagenti-feature-gates)
Ctrl->>API: List namespace defaults ConfigMaps (kagenti.io/defaults=true)
- Note over Ctrl: Merge config: cluster → namespace → CR spec
+ Note over Ctrl: Merge config: cluster → namespace (2-layer)
Note over Ctrl: Compute SHA256 config hash
Ctrl->>API: Patch Deployment:
+ kagenti.io/type label
+ managed-by label
+ config-hash annotation on PodTemplateSpec
@@ -63,16 +63,11 @@ sequenceDiagram
API-->>Ctrl: Reconcile event
Ctrl->>API: Get target Deployment
- Note over Ctrl: Recompute config hash with updated CR values
- Note over Ctrl: New hash ≠ old hash → update needed
+ Note over Ctrl: Recompute config hash (2-layer, no CR fields)
+ Note over Ctrl: Hash unchanged → no rolling update needed
- Ctrl->>API: Patch Deployment config-hash annotation
- API-->>K8s: PodTemplateSpec changed → rolling update
- K8s->>API: Create new Pod
- API->>WH: Mutating admission (Pod CREATE)
- Note over WH: Resolve config with updated AgentRuntime values
- WH-->>API: Patch Pod with updated sidecar config
- API-->>K8s: Pod created with new config
+ Note over Ctrl: CR spec changes do NOT trigger rolling updates.
+ Note over Ctrl: The webhook reads CR overrides at pod CREATE time.
Ctrl->>API: Update AgentRuntime status
```
@@ -91,12 +86,12 @@ sequenceDiagram
loop For each AgentRuntime targeting a workload
Ctrl->>API: Get target Deployment
- Note over Ctrl: Recompute config hash with new defaults
+ Note over Ctrl: Recompute config hash (2-layer)
alt Hash changed
Ctrl->>API: Patch Deployment config-hash annotation
API-->>K8s: Rolling update → new Pods with updated sidecars
else Hash unchanged
- Note over Ctrl: No-op (CR overrides masked the change)
+ Note over Ctrl: No-op (merged defaults unchanged)
end
Ctrl->>API: Update AgentRuntime status
end
@@ -128,39 +123,38 @@ sequenceDiagram
| Concern | Controller | Webhook |
|---------|-----------|---------|
-| Detect config change | Yes (3-layer merge + hash) | No |
+| Detect config change | Yes (2-layer merge + hash) | No |
| Trigger pod restart | Yes (annotation on PodTemplateSpec) | No |
| Read ConfigMap data | Yes (for hash computation) | Yes (for sidecar configuration) |
-| Merge config values | Yes (same 3-layer merge) | Yes (independently, same algorithm) |
+| Merge config values | Yes (2-layer platform config) | Yes (independently, includes CR overrides at admission time) |
| Mutate pod spec | No | Yes (sidecar injection) |
| Read AgentRuntime CR | Yes (primary resource) | Yes (for per-workload overrides) |
| Apply workload labels | Yes | No |
| Decide injection eligibility | No (encodes in labels) | Yes (objectSelector + precedence chain) |
-## 3-Layer Configuration Merge
+## 2-Layer Configuration Merge (Controller)
-Both the controller and webhook perform the same 3-layer configuration merge independently:
+The controller computes the config hash from platform-level configuration only (no CR fields):
```
┌──────────────────────────────────────┐
-│ Layer 3: AgentRuntime CR overrides │ ← highest precedence
-│ (spec.identity) │
-├──────────────────────────────────────┤
-│ Layer 2: Namespace defaults │
+│ Layer 2: Namespace defaults │ ← higher precedence
│ (ConfigMap with │
│ kagenti.io/defaults=true label) │
├──────────────────────────────────────┤
-│ Layer 1: Cluster defaults │ ← lowest precedence
+│ Layer 1: Cluster defaults │ ← lower precedence
│ (kagenti-platform-config in │
│ kagenti-system namespace) │
└──────────────────────────────────────┘
```
+**CR-level overrides** (type, identity, authBridgeMode, mtlsMode, skills) are **not** included in the controller's config hash. The webhook reads these fields at pod CREATE time.
+
**Feature gates** (`kagenti-feature-gates` ConfigMap) are platform-wide policy and are **not** part of the merge hierarchy. They control which sidecar components are enabled globally and cannot be overridden by namespace defaults or AgentRuntime CRs.
-The controller uses the merged config to compute a deterministic SHA256 hash. This hash is set as the `kagenti.io/config-hash` annotation on the workload's PodTemplateSpec. When any layer changes, the hash changes, which triggers a Kubernetes rolling update.
+The controller uses the merged config to compute a deterministic SHA256 hash. This hash is set as the `kagenti.io/config-hash` annotation on the workload's PodTemplateSpec. When platform config changes (cluster or namespace ConfigMaps, the feature gates ConfigMap, or the namespace `authbridge-runtime-config` ConfigMap), the hash changes, which triggers a Kubernetes rolling update. CR spec changes do **not** change the hash.
-The webhook performs the same merge at Pod CREATE time to resolve the actual configuration values used for sidecar container environment variables.
+The webhook performs its own merge at Pod CREATE time, including CR overrides, to resolve the actual configuration values used for sidecar container environment variables.
> **Note:** The controller and webhook use slightly different sources for layer 1. The controller reads the `kagenti-platform-config` ConfigMap from `kagenti-system` via the API server. The webhook uses compiled defaults overlaid with `/etc/kagenti/config.yaml` (PlatformConfig), which is designed to carry equivalent values. Both produce the same effective defaults in a correctly deployed cluster.
@@ -178,7 +172,7 @@ The webhook loads **PlatformConfig** at startup from compiled defaults overlaid
- SPIFFE trust domain and socket path
- Observability settings (trace endpoint, protocol, sampling)
-PlatformConfig is hot-reloaded via fsnotify when the config file changes. It forms **layer 1** (lowest precedence) of the 3-layer merge.
+PlatformConfig is hot-reloaded via fsnotify when the config file changes. It forms **layer 1** (lowest precedence) of the configuration merge.
### Feature Gates (Global Policy)
@@ -193,7 +187,7 @@ Feature gates are loaded from the `kagenti-feature-gates` ConfigMap (mounted at
| `injectTools` | `false` | Allow injection for `kagenti.io/type=tool` workloads |
| `perWorkloadConfigResolution` | `false` | Switch from ValueFrom refs to literal env var injection |
-Feature gates are **not** part of the 3-layer merge — they cannot be overridden by namespace defaults or AgentRuntime CRs.
+Feature gates are **not** part of the config merge — they cannot be overridden by namespace defaults or AgentRuntime CRs.
### Config Resolution Modes
@@ -206,7 +200,7 @@ When `perWorkloadConfigResolution` is **true**, the webhook resolves all config
When a workload has `kagenti.io/type` labels applied manually (without an AgentRuntime CR):
- The webhook still evaluates the workload for injection using PlatformConfig and feature gates
-- The AgentRuntime override layer (layer 3) is skipped — configuration comes from PlatformConfig (layer 1) and namespace ConfigMaps (layer 2) only
+- Configuration comes from PlatformConfig (layer 1) and namespace ConfigMaps (layer 2) only
- No controller manages the config hash — configuration drift is not detected automatically, and changes to cluster/namespace defaults do not trigger rolling updates
- The controller does not watch or reconcile these workloads
- Per-workload identity (SPIFFE trust domain) overrides are not available
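Reviewer sketch of how a deterministic hash over platform-only config behaves. The struct fields mirror the ones this PR keeps in `resolvedConfig`; the hashing recipe (JSON encoding followed by SHA-256) is an assumption, since the body of `hashResolvedConfig` is not part of this diff, and `platformConfig` and `hashConfig` are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// platformConfig carries only platform-level inputs. There is no field
// for any AgentRuntime CR value, so CR edits cannot reach the hash.
type platformConfig struct {
	FeatureGates      map[string]string `json:"featureGates,omitempty"`
	Defaults          map[string]string `json:"defaults,omitempty"`
	AuthBridgeRuntime string            `json:"authBridgeRuntime,omitempty"`
}

// hashConfig returns a hex SHA-256 of the JSON encoding. encoding/json
// writes map keys in sorted order, which keeps the output stable across
// runs regardless of map iteration order.
func hashConfig(cfg platformConfig) (string, error) {
	raw, err := json.Marshal(cfg)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(raw)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	before, err := hashConfig(platformConfig{
		Defaults: map[string]string{"otel-endpoint": "collector-v1:4317"},
	})
	if err != nil {
		panic(err)
	}
	after, err := hashConfig(platformConfig{
		Defaults: map[string]string{"otel-endpoint": "collector-v2:4317"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println("hash changed after ConfigMap edit:", before != after)
}
```

Under this model a rollout is triggered only when one of the three platform inputs changes, which matches the behavior the updated docs describe.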
diff --git a/kagenti-operator/internal/controller/agentruntime_config.go b/kagenti-operator/internal/controller/agentruntime_config.go
index 50567823..7c249b86 100644
--- a/kagenti-operator/internal/controller/agentruntime_config.go
+++ b/kagenti-operator/internal/controller/agentruntime_config.go
@@ -27,8 +27,6 @@ import (
"k8s.io/apimachinery/pkg/types"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/log"
-
- agentv1alpha1 "github.com/kagenti/operator/api/v1alpha1"
)
const (
@@ -52,24 +50,12 @@ const (
)
// resolvedConfig is the canonical representation used for hash computation.
-// It captures the merged result of cluster defaults → namespace defaults → CR overrides.
-//
-// Structured fields (Type, TrustDomain) hold CR-level overrides.
-// FeatureGates and Defaults hold the raw ConfigMap data. The hash is computed
-// from the full struct — the webhook performs the same merge independently
-// at Pod CREATE time.
+// It captures the 2-layer merge of cluster defaults → namespace defaults.
+// CR-level fields (type, identity, skills, etc.) are NOT included — the
+// webhook reads those at pod CREATE time (RHAIENG-4936).
type resolvedConfig struct {
- Type string `json:"type"`
- TrustDomain string `json:"trustDomain,omitempty"`
FeatureGates map[string]string `json:"featureGates,omitempty"`
Defaults map[string]string `json:"defaults,omitempty"`
- // AuthBridgeMode and MTLSMode change the injected sidecar shape /
- // transport posture, both of which require a pod restart to take
- // effect. Including them here folds CR-edit changes into the
- // config-hash so applyWorkloadConfig stamps a new hash on the pod
- // template and the Deployment rolls.
- AuthBridgeMode string `json:"authBridgeMode,omitempty"`
- MTLSMode string `json:"mtlsMode,omitempty"`
// AuthBridgeRuntime captures the namespace authbridge-runtime-config
// ConfigMap's config.yaml content so namespace-level edits flow into
@@ -77,15 +63,6 @@ type resolvedConfig struct {
// pipelines/listener/mtls config drift through here in any shape and
// we want any byte change to roll the workload. Empty string when
// the ConfigMap doesn't exist in the namespace.
- //
- // Operational note: a single edit to authbridge-runtime-config
- // re-hashes every AgentRuntime in the namespace and reconciles them
- // in a burst. Kubernetes sequences the actual pod rolls per
- // Deployment, but the controller's reconcile load scales linearly
- // with the number of AgentRuntimes. For typical small namespaces
- // (single-digit agents) this is fine; in larger deployments,
- // formatting / whitespace edits to this CM during peak hours will
- // trigger a noticeable rollout fan-out.
AuthBridgeRuntime string `json:"authBridgeRuntime,omitempty"`
}
@@ -95,11 +72,11 @@ type ConfigResult struct {
Warnings []string
}
-// ComputeConfigHash computes a deterministic SHA256 hash from the 3-layer
-// merged configuration: cluster defaults → namespace defaults → AgentRuntime CR.
-// Both the controller and webhook perform the same merge independently.
-func ComputeConfigHash(ctx context.Context, c client.Reader, namespace string, spec *agentv1alpha1.AgentRuntimeSpec) (ConfigResult, error) {
- resolved, warnings := resolveConfig(ctx, c, namespace, spec)
+// ComputeConfigHash computes a deterministic SHA256 hash from the 2-layer
+// merged configuration: cluster defaults → namespace defaults.
+// CR-level fields are excluded — the webhook reads those at pod CREATE time.
+func ComputeConfigHash(ctx context.Context, c client.Reader, namespace string) (ConfigResult, error) {
+ resolved, warnings := resolveConfig(ctx, c, namespace)
hash, err := hashResolvedConfig(resolved)
if err != nil {
return ConfigResult{}, err
@@ -107,20 +84,10 @@ func ComputeConfigHash(ctx context.Context, c client.Reader, namespace string, s
return ConfigResult{Hash: hash, Warnings: warnings}, nil
}
-// ComputeDefaultsOnlyHash computes a hash using only cluster + namespace defaults
-// (no CR overrides). Used when an AgentRuntime is deleted to trigger a rolling
-// update back to platform defaults.
-func ComputeDefaultsOnlyHash(ctx context.Context, c client.Reader, namespace string) (string, error) {
- resolved, _ := resolveConfig(ctx, c, namespace, nil)
- return hashResolvedConfig(resolved)
-}
-
-// resolveConfig merges the three configuration layers:
+// resolveConfig merges the two platform configuration layers:
// 1. Cluster defaults (ConfigMaps in kagenti-system)
// 2. Namespace defaults (ConfigMap with kagenti.io/defaults=true label)
-// 3. AgentRuntime CR spec (highest priority)
-func resolveConfig(ctx context.Context, c client.Reader, namespace string, spec *agentv1alpha1.AgentRuntimeSpec) (resolvedConfig, []string) {
- logger := log.FromContext(ctx)
+func resolveConfig(ctx context.Context, c client.Reader, namespace string) (resolvedConfig, []string) {
var warnings []string
// Layer 1: cluster defaults
@@ -143,30 +110,11 @@ func resolveConfig(ctx context.Context, c client.Reader, namespace string, spec
abRuntime = data["config.yaml"]
}
- resolved := resolvedConfig{
+ return resolvedConfig{
FeatureGates: featureGates,
Defaults: merged,
AuthBridgeRuntime: abRuntime,
- }
-
- if spec == nil {
- logger.V(2).Info("Resolved config with defaults only", "namespace", namespace)
- return resolved, warnings
- }
-
- // Layer 3: CR overrides (highest priority).
- // Structured fields capture only CR-level overrides so they don't
- // duplicate values already present in the Defaults map.
- resolved.Type = string(spec.Type)
-
- if spec.Identity != nil && spec.Identity.SPIFFE != nil && spec.Identity.SPIFFE.TrustDomain != "" {
- resolved.TrustDomain = spec.Identity.SPIFFE.TrustDomain
- }
-
- resolved.AuthBridgeMode = spec.AuthBridgeMode
- resolved.MTLSMode = spec.MTLSMode
-
- return resolved, warnings
+ }, warnings
}
// readConfigMapData reads a specific ConfigMap by name and namespace.
diff --git a/kagenti-operator/internal/controller/agentruntime_config_test.go b/kagenti-operator/internal/controller/agentruntime_config_test.go
index 3a3dc08d..0c422c89 100644
--- a/kagenti-operator/internal/controller/agentruntime_config_test.go
+++ b/kagenti-operator/internal/controller/agentruntime_config_test.go
@@ -103,61 +103,41 @@ var _ = Describe("AgentRuntime Config", func() {
Context("ComputeConfigHash", func() {
It("should be deterministic", func() {
- spec := &agentv1alpha1.AgentRuntimeSpec{
- Type: agentv1alpha1.RuntimeTypeAgent,
- TargetRef: agentv1alpha1.TargetRef{APIVersion: "apps/v1", Kind: "Deployment", Name: "hash-det"},
- }
-
- result1, err := ComputeConfigHash(ctx, k8sClient, namespace, spec)
+ result1, err := ComputeConfigHash(ctx, k8sClient, namespace)
Expect(err).NotTo(HaveOccurred())
- result2, err := ComputeConfigHash(ctx, k8sClient, namespace, spec)
+ result2, err := ComputeConfigHash(ctx, k8sClient, namespace)
Expect(err).NotTo(HaveOccurred())
Expect(result1.Hash).To(Equal(result2.Hash))
})
- It("should change when spec type changes", func() {
- spec1 := &agentv1alpha1.AgentRuntimeSpec{
- Type: agentv1alpha1.RuntimeTypeAgent,
- TargetRef: agentv1alpha1.TargetRef{APIVersion: "apps/v1", Kind: "Deployment", Name: "hash-type"},
- }
- spec2 := &agentv1alpha1.AgentRuntimeSpec{
- Type: agentv1alpha1.RuntimeTypeTool,
- TargetRef: agentv1alpha1.TargetRef{APIVersion: "apps/v1", Kind: "Deployment", Name: "hash-type"},
- }
-
- r1, _ := ComputeConfigHash(ctx, k8sClient, namespace, spec1)
- r2, _ := ComputeConfigHash(ctx, k8sClient, namespace, spec2)
-
- Expect(r1.Hash).NotTo(Equal(r2.Hash))
+ It("should NOT change when spec type changes", func() {
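+ r1, _ := ComputeConfigHash(ctx, k8sClient, namespace) // the CR spec is no longer an argument, so the "should NOT change" cases reduce to a determinism check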
+ r2, _ := ComputeConfigHash(ctx, k8sClient, namespace)
+ Expect(r1.Hash).To(Equal(r2.Hash))
})
- It("should change when identity changes", func() {
- spec1 := &agentv1alpha1.AgentRuntimeSpec{
- Type: agentv1alpha1.RuntimeTypeAgent,
- TargetRef: agentv1alpha1.TargetRef{APIVersion: "apps/v1", Kind: "Deployment", Name: "hash-id"},
- Identity: &agentv1alpha1.IdentitySpec{SPIFFE: &agentv1alpha1.SPIFFEIdentity{TrustDomain: "example.org"}},
- }
- spec2 := &agentv1alpha1.AgentRuntimeSpec{
- Type: agentv1alpha1.RuntimeTypeAgent,
- TargetRef: agentv1alpha1.TargetRef{APIVersion: "apps/v1", Kind: "Deployment", Name: "hash-id"},
- Identity: &agentv1alpha1.IdentitySpec{SPIFFE: &agentv1alpha1.SPIFFEIdentity{TrustDomain: "other.org"}},
- }
+ It("should NOT change when identity/TrustDomain changes", func() {
+ r1, _ := ComputeConfigHash(ctx, k8sClient, namespace)
+ r2, _ := ComputeConfigHash(ctx, k8sClient, namespace)
+ Expect(r1.Hash).To(Equal(r2.Hash))
+ })
- r1, _ := ComputeConfigHash(ctx, k8sClient, namespace, spec1)
- r2, _ := ComputeConfigHash(ctx, k8sClient, namespace, spec2)
+ It("should NOT change when MTLSMode changes", func() {
+ r1, _ := ComputeConfigHash(ctx, k8sClient, namespace)
+ r2, _ := ComputeConfigHash(ctx, k8sClient, namespace)
+ Expect(r1.Hash).To(Equal(r2.Hash))
+ })
- Expect(r1.Hash).NotTo(Equal(r2.Hash))
+ It("should NOT change when AuthBridgeMode changes", func() {
+ r1, _ := ComputeConfigHash(ctx, k8sClient, namespace)
+ r2, _ := ComputeConfigHash(ctx, k8sClient, namespace)
+ Expect(r1.Hash).To(Equal(r2.Hash))
})
It("should produce a non-empty hash even with missing ConfigMaps", func() {
- spec := &agentv1alpha1.AgentRuntimeSpec{
- Type: agentv1alpha1.RuntimeTypeAgent,
- TargetRef: agentv1alpha1.TargetRef{APIVersion: "apps/v1", Kind: "Deployment", Name: "hash-missing"},
- }
-
- result, err := ComputeConfigHash(ctx, k8sClient, namespace, spec)
+ result, err := ComputeConfigHash(ctx, k8sClient, namespace)
Expect(err).NotTo(HaveOccurred())
Expect(result.Hash).NotTo(BeEmpty())
})
@@ -166,18 +146,12 @@ var _ = Describe("AgentRuntime Config", func() {
cm := createClusterDefaults(ctx, map[string]string{"otel-endpoint": "collector-v1:4317"})
defer func() { _ = k8sClient.Delete(ctx, cm) }()
- spec := &agentv1alpha1.AgentRuntimeSpec{
- Type: agentv1alpha1.RuntimeTypeAgent,
- TargetRef: agentv1alpha1.TargetRef{APIVersion: "apps/v1", Kind: "Deployment", Name: "hash-cluster"},
- }
-
- r1, _ := ComputeConfigHash(ctx, k8sClient, namespace, spec)
+ r1, _ := ComputeConfigHash(ctx, k8sClient, namespace)
- // Update ConfigMap
cm.Data["otel-endpoint"] = "collector-v2:4317"
Expect(k8sClient.Update(ctx, cm)).To(Succeed())
- r2, _ := ComputeConfigHash(ctx, k8sClient, namespace, spec)
+ r2, _ := ComputeConfigHash(ctx, k8sClient, namespace)
Expect(r1.Hash).NotTo(Equal(r2.Hash))
})
@@ -185,17 +159,12 @@ var _ = Describe("AgentRuntime Config", func() {
fg := createClusterFeatureGates(ctx, map[string]string{"globalEnabled": "true"})
defer func() { _ = k8sClient.Delete(ctx, fg) }()
- spec := &agentv1alpha1.AgentRuntimeSpec{
- Type: agentv1alpha1.RuntimeTypeAgent,
- TargetRef: agentv1alpha1.TargetRef{APIVersion: "apps/v1", Kind: "Deployment", Name: "hash-fg"},
- }
-
- r1, _ := ComputeConfigHash(ctx, k8sClient, namespace, spec)
+ r1, _ := ComputeConfigHash(ctx, k8sClient, namespace)
fg.Data["globalEnabled"] = "false"
Expect(k8sClient.Update(ctx, fg)).To(Succeed())
- r2, _ := ComputeConfigHash(ctx, k8sClient, namespace, spec)
+ r2, _ := ComputeConfigHash(ctx, k8sClient, namespace)
Expect(r1.Hash).NotTo(Equal(r2.Hash))
})
@@ -206,18 +175,13 @@ var _ = Describe("AgentRuntime Config", func() {
})
defer func() { _ = k8sClient.Delete(ctx, fg) }()
- spec := &agentv1alpha1.AgentRuntimeSpec{
- Type: agentv1alpha1.RuntimeTypeAgent,
- TargetRef: agentv1alpha1.TargetRef{APIVersion: "apps/v1", Kind: "Deployment", Name: "hash-inject-tools"},
- }
-
- r1, err := ComputeConfigHash(ctx, k8sClient, namespace, spec)
+ r1, err := ComputeConfigHash(ctx, k8sClient, namespace)
Expect(err).NotTo(HaveOccurred())
fg.Data["injectTools"] = "true"
Expect(k8sClient.Update(ctx, fg)).To(Succeed())
- r2, err := ComputeConfigHash(ctx, k8sClient, namespace, spec)
+ r2, err := ComputeConfigHash(ctx, k8sClient, namespace)
Expect(err).NotTo(HaveOccurred())
Expect(r1.Hash).NotTo(Equal(r2.Hash))
})
@@ -226,65 +190,16 @@ var _ = Describe("AgentRuntime Config", func() {
nsCM := createNamespaceDefaults(ctx, "ns-defaults-hash", namespace, map[string]string{"sampling-rate": "0.1"})
defer func() { _ = k8sClient.Delete(ctx, nsCM) }()
- spec := &agentv1alpha1.AgentRuntimeSpec{
- Type: agentv1alpha1.RuntimeTypeAgent,
- TargetRef: agentv1alpha1.TargetRef{APIVersion: "apps/v1", Kind: "Deployment", Name: "hash-ns"},
- }
-
- r1, _ := ComputeConfigHash(ctx, k8sClient, namespace, spec)
+ r1, _ := ComputeConfigHash(ctx, k8sClient, namespace)
nsCM.Data["sampling-rate"] = "1.0"
Expect(k8sClient.Update(ctx, nsCM)).To(Succeed())
- r2, _ := ComputeConfigHash(ctx, k8sClient, namespace, spec)
- Expect(r1.Hash).NotTo(Equal(r2.Hash))
- })
-
- It("should change when MTLSMode flips on the CR", func() {
- // CR-side parallel to the CM-edit test below: spec.mtlsMode
- // must feed the hash so flipping disabled→strict on a CR
- // rolls the workload. Without an explicit assertion a
- // future refactor that drops MTLSMode from resolvedConfig
- // would silently regress rollout-on-CR-edit.
- specOff := &agentv1alpha1.AgentRuntimeSpec{
- Type: agentv1alpha1.RuntimeTypeAgent,
- TargetRef: agentv1alpha1.TargetRef{APIVersion: "apps/v1", Kind: "Deployment", Name: "hash-mtls-cr"},
- MTLSMode: "disabled",
- }
- specOn := &agentv1alpha1.AgentRuntimeSpec{
- Type: agentv1alpha1.RuntimeTypeAgent,
- TargetRef: agentv1alpha1.TargetRef{APIVersion: "apps/v1", Kind: "Deployment", Name: "hash-mtls-cr"},
- MTLSMode: "strict",
- }
-
- r1, _ := ComputeConfigHash(ctx, k8sClient, namespace, specOff)
- r2, _ := ComputeConfigHash(ctx, k8sClient, namespace, specOn)
- Expect(r1.Hash).NotTo(Equal(r2.Hash))
- })
-
- It("should change when AuthBridgeMode flips on the CR", func() {
- // Bonus: AuthBridgeMode rollouts had a pre-existing gap
- // (not in resolvedConfig) that this PR closed. Lock it.
- specA := &agentv1alpha1.AgentRuntimeSpec{
- Type: agentv1alpha1.RuntimeTypeAgent,
- TargetRef: agentv1alpha1.TargetRef{APIVersion: "apps/v1", Kind: "Deployment", Name: "hash-abm-cr"},
- AuthBridgeMode: "proxy-sidecar",
- }
- specB := &agentv1alpha1.AgentRuntimeSpec{
- Type: agentv1alpha1.RuntimeTypeAgent,
- TargetRef: agentv1alpha1.TargetRef{APIVersion: "apps/v1", Kind: "Deployment", Name: "hash-abm-cr"},
- AuthBridgeMode: "lite",
- }
- r1, _ := ComputeConfigHash(ctx, k8sClient, namespace, specA)
- r2, _ := ComputeConfigHash(ctx, k8sClient, namespace, specB)
+ r2, _ := ComputeConfigHash(ctx, k8sClient, namespace)
Expect(r1.Hash).NotTo(Equal(r2.Hash))
})
It("should change when authbridge-runtime-config edits", func() {
- // Edits to the namespace authbridge-runtime-config (which the
- // admission webhook reads at pod creation) must roll
- // affected workloads. The hash captures its config.yaml
- // content as a raw string so any byte change registers.
abCM := &corev1.ConfigMap{
ObjectMeta: metav1.ObjectMeta{
Name: AuthBridgeRuntimeConfigMapName,
@@ -297,44 +212,18 @@ var _ = Describe("AgentRuntime Config", func() {
Expect(k8sClient.Create(ctx, abCM)).To(Succeed())
defer func() { _ = k8sClient.Delete(ctx, abCM) }()
- spec := &agentv1alpha1.AgentRuntimeSpec{
- Type: agentv1alpha1.RuntimeTypeAgent,
- TargetRef: agentv1alpha1.TargetRef{APIVersion: "apps/v1", Kind: "Deployment", Name: "hash-abruntime"},
- }
-
- r1, _ := ComputeConfigHash(ctx, k8sClient, namespace, spec)
+ r1, _ := ComputeConfigHash(ctx, k8sClient, namespace)
abCM.Data["config.yaml"] = "mode: proxy-sidecar\nmtls:\n mode: strict\n"
Expect(k8sClient.Update(ctx, abCM)).To(Succeed())
- r2, _ := ComputeConfigHash(ctx, k8sClient, namespace, spec)
+ r2, _ := ComputeConfigHash(ctx, k8sClient, namespace)
Expect(r1.Hash).NotTo(Equal(r2.Hash))
})
})
- Context("ComputeDefaultsOnlyHash", func() {
- It("should differ from spec hash", func() {
- spec := &agentv1alpha1.AgentRuntimeSpec{
- Type: agentv1alpha1.RuntimeTypeAgent,
- TargetRef: agentv1alpha1.TargetRef{APIVersion: "apps/v1", Kind: "Deployment", Name: "def-diff"},
- }
-
- specResult, _ := ComputeConfigHash(ctx, k8sClient, namespace, spec)
- defaultsHash, _ := ComputeDefaultsOnlyHash(ctx, k8sClient, namespace)
-
- Expect(specResult.Hash).NotTo(Equal(defaultsHash))
- })
-
- It("should be deterministic", func() {
- hash1, _ := ComputeDefaultsOnlyHash(ctx, k8sClient, namespace)
- hash2, _ := ComputeDefaultsOnlyHash(ctx, k8sClient, namespace)
-
- Expect(hash1).To(Equal(hash2))
- })
- })
-
- Context("resolveConfig three-layer merge", func() {
- It("should merge cluster → namespace → CR with correct precedence", func() {
+ Context("resolveConfig two-layer merge", func() {
+ It("should merge cluster → namespace with correct precedence", func() {
clusterCM := createClusterDefaults(ctx, map[string]string{
"otel-endpoint": "cluster-collector:4317",
"spiffe-trust-domain": "cluster.local",
@@ -354,17 +243,7 @@ var _ = Describe("AgentRuntime Config", func() {
})
defer func() { _ = k8sClient.Delete(ctx, nsCM) }()
- spec := &agentv1alpha1.AgentRuntimeSpec{
- Type: agentv1alpha1.RuntimeTypeAgent,
- TargetRef: agentv1alpha1.TargetRef{APIVersion: "apps/v1", Kind: "Deployment", Name: "merge-test"},
- Identity: &agentv1alpha1.IdentitySpec{SPIFFE: &agentv1alpha1.SPIFFEIdentity{TrustDomain: "my-domain.org"}},
- }
-
- resolved, _ := resolveConfig(ctx, k8sClient, namespace, spec)
-
- // CR overrides
- Expect(resolved.Type).To(Equal("agent"))
- Expect(resolved.TrustDomain).To(Equal("my-domain.org"))
+ resolved, _ := resolveConfig(ctx, k8sClient, namespace)
// Namespace overrides cluster
Expect(resolved.Defaults["otel-endpoint"]).To(Equal("ns-collector:4317"))
@@ -379,25 +258,13 @@ var _ = Describe("AgentRuntime Config", func() {
Expect(resolved.FeatureGates["globalEnabled"]).To(Equal("true"))
})
- It("should return defaults only when spec is nil", func() {
- resolved, _ := resolveConfig(ctx, k8sClient, namespace, nil)
-
- Expect(resolved.Type).To(BeEmpty())
- Expect(resolved.TrustDomain).To(BeEmpty())
- })
-
- It("should not duplicate CR overrides in Defaults map", func() {
+ It("should not include CR fields in resolved config", func() {
clusterCM := createClusterDefaults(ctx, map[string]string{
"otel-endpoint": "cluster-collector:4317",
})
defer func() { _ = k8sClient.Delete(ctx, clusterCM) }()
- spec := &agentv1alpha1.AgentRuntimeSpec{
- Type: agentv1alpha1.RuntimeTypeAgent,
- TargetRef: agentv1alpha1.TargetRef{APIVersion: "apps/v1", Kind: "Deployment", Name: "no-dup"},
- }
-
- resolved, _ := resolveConfig(ctx, k8sClient, namespace, spec)
+ resolved, _ := resolveConfig(ctx, k8sClient, namespace)
// ConfigMap value untouched in Defaults
Expect(resolved.Defaults["otel-endpoint"]).To(Equal("cluster-collector:4317"))
@@ -661,9 +528,7 @@ var _ = Describe("AgentRuntime Config", func() {
It("hashResolvedConfig should be deterministic and produce 64-char hex", func() {
config := resolvedConfig{
- Type: "agent",
- TrustDomain: "example.org",
- Defaults: map[string]string{"b": "2", "a": "1"},
+ Defaults: map[string]string{"b": "2", "a": "1"},
}
hash1, _ := hashResolvedConfig(config)
hash2, _ := hashResolvedConfig(config)
diff --git a/kagenti-operator/internal/controller/agentruntime_controller.go b/kagenti-operator/internal/controller/agentruntime_controller.go
index f264d7d3..da80a021 100644
--- a/kagenti-operator/internal/controller/agentruntime_controller.go
+++ b/kagenti-operator/internal/controller/agentruntime_controller.go
@@ -205,8 +205,8 @@ func (r *AgentRuntimeReconciler) Reconcile(ctx context.Context, req ctrl.Request
fmt.Sprintf("Namespace %s opted out of Istio mesh enrollment", rt.Namespace))
}
- // 5. Compute config hash from merged configuration (cluster → namespace → CR)
- configResult, err := ComputeConfigHash(ctx, r.Client, rt.Namespace, &rt.Spec)
+ // 5. Compute config hash from merged configuration (cluster → namespace)
+ configResult, err := ComputeConfigHash(ctx, r.Client, rt.Namespace)
if err != nil {
logger.Error(err, "Failed to compute config hash")
r.setPhase(rt, agentv1alpha1.RuntimePhaseError)
diff --git a/kagenti-operator/internal/controller/agentruntime_controller_test.go b/kagenti-operator/internal/controller/agentruntime_controller_test.go
index 3ae6c0aa..69178b89 100644
--- a/kagenti-operator/internal/controller/agentruntime_controller_test.go
+++ b/kagenti-operator/internal/controller/agentruntime_controller_test.go
@@ -459,7 +459,7 @@ var _ = Describe("AgentRuntime Controller", func() {
_ = k8sClient.Delete(ctx, dep)
})
- It("should produce a different config-hash than a minimal AgentRuntime", func() {
+ It("should produce the same config-hash as a minimal AgentRuntime (CR fields excluded from hash)", func() {
r := newReconciler()
// Reconcile the override RT
@@ -474,14 +474,11 @@ var _ = Describe("AgentRuntime Controller", func() {
Expect(k8sClient.Get(ctx, types.NamespacedName{Name: "override-deploy", Namespace: namespace}, overrideDep)).To(Succeed())
overrideHash := overrideDep.Spec.Template.Annotations[AnnotationConfigHash]
- // Compute hash for a minimal spec (no overrides)
- minimalResult, err := ComputeConfigHash(ctx, k8sClient, namespace, &agentv1alpha1.AgentRuntimeSpec{
- Type: agentv1alpha1.RuntimeTypeAgent,
- TargetRef: agentv1alpha1.TargetRef{APIVersion: "apps/v1", Kind: "Deployment", Name: "x"},
- })
+ // Compute the hash directly: it excludes CR fields, so every CR in the same namespace yields the same hash
+ minimalResult, err := ComputeConfigHash(ctx, k8sClient, namespace)
Expect(err).NotTo(HaveOccurred())
- Expect(overrideHash).NotTo(Equal(minimalResult.Hash), "CR with overrides should have a different hash")
+ Expect(overrideHash).To(Equal(minimalResult.Hash), "CR with overrides should have the same hash (CR fields excluded)")
})
})
diff --git a/kagenti-operator/internal/webhook/injector/defaults_config_reconciler.go b/kagenti-operator/internal/webhook/injector/defaults_config_reconciler.go
index f274a3fd..a7fcdd6f 100644
--- a/kagenti-operator/internal/webhook/injector/defaults_config_reconciler.go
+++ b/kagenti-operator/internal/webhook/injector/defaults_config_reconciler.go
@@ -165,10 +165,11 @@ func (r *DefaultsConfigReconciler) reconcileWorkloadsInNamespace(ctx context.Con
func (r *DefaultsConfigReconciler) updateConfigHash(ctx context.Context, namespace, name, kind string) error {
logger := log.FromContext(ctx).WithValues("workload", name, "kind", kind)
- newHash, err := controller.ComputeDefaultsOnlyHash(ctx, r.Client, namespace)
+ configResult, err := controller.ComputeConfigHash(ctx, r.Client, namespace)
if err != nil {
return err
}
+ newHash := configResult.Hash
return retry.RetryOnConflict(retry.DefaultRetry, func() error {
key := types.NamespacedName{Name: name, Namespace: namespace}
diff --git a/kagenti-operator/test/e2e/README.md b/kagenti-operator/test/e2e/README.md
index 21531932..8eba5015 100644
--- a/kagenti-operator/test/e2e/README.md
+++ b/kagenti-operator/test/e2e/README.md
@@ -83,11 +83,11 @@ kind delete cluster
| Apply labels and config-hash | Agent lifecycle | AgentRuntime controller adds `kagenti.io/type=agent`, `managed-by`, config-hash, and triggers AgentCard auto-creation |
| Phase=Active and Ready=True | Agent lifecycle | AgentRuntime CR reaches Active phase with Ready=True condition |
| Idempotent re-reconcile | Agent lifecycle | Deployment generation stays stable over 30s (no spurious updates) |
-| Clean up on deletion | Agent lifecycle | Deletion preserves `kagenti.io/type`, removes `managed-by`, updates config-hash to defaults-only |
+| Clean up on deletion | Agent lifecycle | Deletion preserves `kagenti.io/type`, removes `managed-by`, and leaves config-hash unchanged (the hash excludes CR fields) |
| Missing target error | Error cases | AgentRuntime targeting non-existent Deployment sets Phase=Error |
| Tool type label | Tool type | AgentRuntime with type=tool applies `kagenti.io/type=tool` label and no AgentCard is created |
| StatefulSet target | StatefulSet target | AgentRuntime applies labels, config-hash, and reaches Active for a StatefulSet workload |
-| Identity/trace overrides | Identity and trace overrides | AgentRuntime with identity+trace spec produces a different config-hash than a minimal CR |
+| Identity/trace overrides | Identity and trace overrides | AgentRuntime with identity+trace spec produces the same config-hash as a minimal CR (CR fields excluded from hash) |
## Architecture
### What gets installed
@@ -177,7 +177,7 @@ kind load docker-image ▼ kubectl apply --server-si
The AgentRuntime E2E tests use a separate namespace (`e2e-agentruntime-test`) and lightweight
`pause:3.9` containers (no HTTP serving needed). The test creates ConfigMap fixtures to exercise
-the 3-layer config merge:
+the 2-layer config merge:
```
BeforeAll (AgentRuntime E2E)
@@ -198,7 +198,7 @@ AgentRuntime controller flow:
│
┌─ AgentRuntime CR ─────────┐ ┌─ Controller ────────────┐ │ ┌─ e2e-agentruntime-test ────────┐
│ spec.type: agent │────▶│ Resolve target │◀───┘ │ runtime-ns-defaults ConfigMap │
-│ spec.targetRef: │ │ Resolve config (3-layer)│◀────────│ (namespace defaults, layer 2) │
+│ spec.targetRef: │ │ Resolve config (2-layer)│◀────────│ (namespace defaults, layer 2) │
│ name: runtime-agent- │ │ Apply labels + hash │ │ │
│ target │ │ Set Phase=Active │ │ runtime-agent-target Deployment │
└───────────────────────────┘ └──────────┬───────────────┘ │ runtime-tool-target Deployment │
@@ -304,7 +304,7 @@ Test verifies: `SignatureVerified=True` (reason `SignatureValid`),
Deploys `runtime-agent-target` (pause container with `protocol.kagenti.io/a2a` label) and
creates an AgentRuntime CR with `type: agent` targeting it. The controller resolves the target,
-merges 3-layer config (cluster ConfigMap `kagenti-platform-config` in `kagenti-system` +
-namespace ConfigMap with `kagenti.io/defaults=true` + CR-level overrides), and applies labels
+merges 2-layer config (cluster ConfigMap `kagenti-platform-config` in `kagenti-system` +
+namespace ConfigMap with `kagenti.io/defaults=true`), and applies labels
to the Deployment. Test verifies `kagenti.io/type=agent` on both workload metadata and pod
template, `app.kubernetes.io/managed-by=kagenti-operator` on workload metadata, and
@@ -333,9 +333,8 @@ Deletes the AgentRuntime CR and verifies the finalizer (`kagenti.io/cleanup`) ru
1. **Target Deployment still exists** — the controller cleans up labels, not the workload
2. **`kagenti.io/type=agent` preserved** — workload remains classified after runtime removal
3. **`app.kubernetes.io/managed-by` removed** — workload is no longer operator-managed
-4. **`kagenti.io/config-hash` changes** — updated to a defaults-only hash (cluster + namespace
- defaults without CR-level overrides), which differs from the initial hash and triggers a
- rolling update
+4. **`kagenti.io/config-hash` stays the same** — no CR fields in hash, so deletion does not
+ change the hash or trigger a rolling update
5. **AgentRuntime CR returns 404** — finalizer completed and CR was fully deleted
#### Missing target error
@@ -365,9 +364,8 @@ StatefulSet metadata, Phase=Active, and a valid 64-char config-hash on the pod t
Deploys two target Deployments and creates two AgentRuntime CRs: one minimal (no overrides)
and one with `spec.identity.spiffe.trustDomain` and `spec.trace` (endpoint, protocol, sampling
rate). Both CRs reach Phase=Active. The test records each Deployment's `kagenti.io/config-hash`
-annotation and asserts they differ, proving that identity and trace overrides are included in
-the config hash computation. This validates the full CRD → controller → config merge path for
-optional spec fields.
+annotation and asserts they are the same, confirming that CR-level overrides are excluded from
+the config hash (2-layer merge). The webhook reads CR overrides at pod CREATE time instead.
## Troubleshooting
diff --git a/kagenti-operator/test/e2e/e2e_test.go b/kagenti-operator/test/e2e/e2e_test.go
index 39c08d38..3e090b4e 100644
--- a/kagenti-operator/test/e2e/e2e_test.go
+++ b/kagenti-operator/test/e2e/e2e_test.go
@@ -1459,9 +1459,9 @@ rules:
overridesHash = hash
}).Should(Succeed())
- By("verifying config-hashes differ")
- Expect(overridesHash).NotTo(Equal(minimalHash),
- "identity overrides should produce a different config-hash")
+ By("verifying config-hashes are the same (CR fields excluded from 2-layer hash)")
+ Expect(overridesHash).To(Equal(minimalHash),
+ "identity overrides should NOT affect config-hash (CR fields excluded)")
By("cleaning up")
cmd := exec.Command("kubectl", "delete", "agentruntime", "test-minimal-runtime", "test-overrides-runtime",