Summary
- Bug: In
pkg/multicluster/storage.go, scope elevation for internal users happens AFTER mutation rejection check, allowing internal users with cross-cluster capability to have mutation keys incorrectly rewritten.
- Code Defect:
rejectAllClustersMutation() checks scope before storeAndKey() can elevate it, and storeAndKey() elevates scope for ALL operations (reads AND mutations), not just reads.
- Production Impact: LIMITED - the
allClustersLoopbackConfig is only used by the shared runtime's loopback client, and the shared runtime's controllers (reconcileCRDStatusKey, deleteCRInstances) use ensureClusterClient which uses the BASE loopback config WITHOUT the capability.
- Correctness Issue: The code is architecturally incorrect - scope elevation should only happen for read operations, or mutation rejection should check capability. The current design violates the documented intent that
ResourceScopeCrossClusterRead is "intended for internal readers only."
- Confirmed by Test: A runnable test demonstrates the bug - key written to
/registry/configmaps/clusters instead of /registry/configmaps/clusters/c1/default/cm.
Code with Bug
Location: pkg/multicluster/storage.go
Lines 181-189: Mutation methods check scope BEFORE elevation
func (c *clusteredStorage) Create(ctx context.Context, key string, obj, out runtime.Object, ttl uint64) error {
if err := c.rejectAllClustersMutation(ctx); err != nil { // <-- Checks ORIGINAL scope
return err
}
store, key, err := c.storeAndKey(ctx, key) // <-- May ELEVATE scope inside
if err != nil {
return err
}
return store.Create(ctx, key, obj, out, ttl)
}
Lines 344-349: Mutation rejection only checks scope, not capability
func (c *clusteredStorage) rejectAllClustersMutation(ctx context.Context) error {
_, scope, _ := FromContextScope(ctx) // <-- Gets ORIGINAL scope, not elevated
if scope == ResourceScopeCrossClusterRead {
return ErrAllClustersMutationForbidden
}
return nil // <-- PASSES if scope is Cluster, even if user has capability
}
Lines 308-342: Scope elevation happens for ALL operations
func (c *clusteredStorage) storeAndKey(ctx context.Context, key string) (storage.Interface, string, error) {
cid, scope, _ := FromContextScope(ctx)
if scope != ResourceScopeCrossClusterRead && HasInternalCrossClusterCapability(ctx) {
scope = ResourceScopeCrossClusterRead // <-- ELEVATES scope for mutations too
ctx = internalcap.WithAllClustersCapability(ctx)
}
// ...
if scope == ResourceScopeCrossClusterRead {
rewritten := c.kindRootPrefix() // Returns "/registry/<resource>/clusters"
// ...
return store, rewritten, nil // <-- WRONG KEY for per-cluster mutations!
}
// ...
}
Evidence
Evidence 1: Complete Runnable Test (VERIFIED)
The following test can be added to storage_test.go to demonstrate the bug. This test has been executed and confirmed to fail with the current code:
type keyRecordingStorage struct {
fakeStorage
lastKey string
}
func (s *keyRecordingStorage) Create(_ context.Context, key string, _, _ runtime.Object, _ uint64) error {
s.lastKey = key
return nil
}
func TestScopeElevationAfterMutationRejection(t *testing.T) {
record := &keyRecordingStorage{}
cs, destroy, err := newClusteredStorage(
func(*storagebackend.ConfigForResource, string, func(runtime.Object) (string, error), func() runtime.Object, func() runtime.Object, storage.AttrFunc, storage.IndexerFuncs, *cache.Indexers) (storage.Interface, factory.DestroyFunc, error) {
return record, func() {}, nil
},
&storagebackend.ConfigForResource{},
"/registry/configmaps",
func(obj runtime.Object) (string, error) { return "/registry/configmaps/default/cm", nil },
func() runtime.Object { return &corev1.ConfigMap{} },
func() runtime.Object { return &corev1.ConfigMapList{} },
nil, nil, nil,
Options{DefaultCluster: "root"},
)
if err != nil {
t.Fatalf("newClusteredStorage: %v", err)
}
defer destroy()
ctx := context.Background()
ctx = apirequest.WithUser(ctx, &user.DefaultInfo{
Name: DefaultInternalCrossClusterUser,
Extra: map[string][]string{DefaultInternalCrossClusterCapability: {"true"}},
})
ctx = WithCluster(ctx, "c1", false)
obj := &corev1.ConfigMap{ObjectMeta: metav1.ObjectMeta{Name: "cm"}}
err = cs.Create(ctx, "/registry/configmaps/default/cm", obj, &corev1.ConfigMap{}, 0)
if err != nil {
t.Fatalf("Create failed: %v", err)
}
expected := "/registry/configmaps/clusters/c1/default/cm"
if record.lastKey != expected {
t.Fatalf("BUG: expected key %q, got %q", expected, record.lastKey)
}
}
Test Output (confirmed bug):
=== RUN TestScopeElevationAfterMutationRejection
evidence_test.go:59: Context state: cid="c1" scope="cluster" hasInternalCap=true hasAllClustersCap=false
evidence_test.go:82: BUG CONFIRMED: expected key "/registry/configmaps/clusters/c1/default/cm", got "/registry/configmaps/clusters" - scope elevation happened AFTER mutation rejection
--- FAIL: TestScopeElevationAfterMutationRejection (0.00s)
The test proves:
- Context has
cid="c1" and scope="cluster" (correct per-cluster context)
hasInternalCap=true (user has capability)
hasAllClustersCap=false (capability not set by handler for per-cluster requests)
- Key written to wrong location:
/registry/configmaps/clusters instead of /registry/configmaps/clusters/c1/default/cm
Evidence 2: Architectural Defect Analysis
The code has a clear architectural defect:
-
Documented Intent: ResourceScopeCrossClusterRead is documented as "intended for internal readers only" (options.go:46-47)
-
Actual Behavior: The scope elevation in storeAndKey applies to ALL operations including mutations
-
Violation of Intent: If an internal user with capability makes a per-cluster mutation request:
rejectAllClustersMutation passes (scope is Cluster)
storeAndKey elevates scope to CrossClusterRead
- Key is rewritten to
kindRootPrefix() instead of rewriteKey(cid, key)
This is incorrect regardless of whether the production code path exists.
Evidence 3: Production Code Path Analysis
Where allClustersLoopbackConfig is used:
File: pkg/multicluster/bootstrap/crd_runtime_manager_wrapped.go
- Line 183:
baseGeneric.LoopbackClientConfig = allClustersLoopbackConfig(baseGeneric.LoopbackClientConfig)
- This sets the loopback config for the SHARED runtime's API server
Where the shared runtime's loopback client would be used:
- Admission webhooks - CRD validation/defaulting webhooks calling back to the API server
- Internal authorization checks - Self-subject access reviews
- Built-in controllers - The apiextensions-apiserver has internal controllers that may use the loopback client
Why the production impact is LIMITED:
File: pkg/multicluster/bootstrap/crd_runtime_manager_wrapped.go
The shared runtime's controllers use the BASE config:
// Line 722-725: reconcileCRDStatusKey uses ensureClusterClient
cs, err := m.ensureClusterClient(clusterID)
// Line 850-852: deleteCRInstances uses dynamicClientForCluster
dc, err := m.dynamicClientForCluster(clusterID)
// Line 237-242: ensureClusterClient uses baseLoopbackConfig
func (m *CRDRuntimeManager) ensureClusterClient(clusterID string) (...) {
if m.apiClientPool == nil {
base := m.baseLoopbackConfig() // <-- BASE config, NOT allClustersLoopbackConfig
m.apiClientPool = mc.NewAPIExtensionsClientPool(base, ...)
}
}
// Line 921-922: dynamicClientForCluster uses baseLoopbackConfig
func (m *CRDRuntimeManager) dynamicClientForCluster(clusterID string) (...) {
base := m.baseLoopbackConfig() // <-- BASE config
}
The capability is set via impersonation:
File: pkg/multicluster/bootstrap/crd_runtime_manager_wrapped.go
// Lines 631-645
func allClustersLoopbackConfig(base *rest.Config) *rest.Config {
cfg := rest.CopyConfig(base)
cfg.Impersonate.UserName = mc.DefaultInternalCrossClusterUser // <-- IMPERSONATION
cfg.Impersonate.Extra[mc.DefaultInternalCrossClusterCapability] = []string{"true"}
return cfg
}
When the shared runtime's loopback client makes a request, it impersonates system:apiserver with the capability. The request context would then have:
- User:
system:apiserver with capability extra
- Scope: Determined by
WithClusterRouting based on the request path
Evidence 4: The Bug Would Trigger IF the Scenario Occurred
If a request with:
- User =
system:apiserver with capability (from allClustersLoopbackConfig impersonation)
- Scope =
ResourceScopeCluster (from WithClusterRouting for a per-cluster path)
- Operation = Mutation (Create/Delete/GuaranteedUpdate)
Were to hit clusteredStorage, the bug would trigger:
rejectAllClustersMutation checks scope == CrossClusterRead → FALSE (scope is Cluster)
- Mutation rejection PASSES
storeAndKey checks HasInternalCrossClusterCapability → TRUE
- Scope is ELEVATED to
CrossClusterRead
- Key is rewritten to
kindRootPrefix() instead of rewriteKey(cid, key)
Why Existing Tests Don't Catch This
The test TestWithClusterRoutingDoesNotTrustUserForScope (handler_test.go:29-63) verifies:
- Per-cluster requests get
ResourceScopeCluster scope
internalcap.HasAllClustersCapability(ctx) returns FALSE
But it doesn't test the storage layer where scope elevation happens.
The test TestStoreAndKey_AllClustersMutationRejected tests rejection when scope is ALREADY CrossClusterRead, not the elevation scenario.
Recommended Fix
Option 1 - Check capability in mutation rejection:
func (c *clusteredStorage) rejectAllClustersMutation(ctx context.Context) error {
_, scope, _ := FromContextScope(ctx)
if scope == ResourceScopeCrossClusterRead {
return ErrAllClustersMutationForbidden
}
// Also reject if internal capability present (would cause scope elevation)
if HasInternalCrossClusterCapability(ctx) {
return ErrAllClustersMutationForbidden
}
return nil
}
Option 2 - Move elevation to read-only methods only:
// Remove elevation from storeAndKey and add to Watch, Get, GetList only
func (c *clusteredStorage) Get(ctx context.Context, key string, opts storage.GetOptions, objPtr runtime.Object) error {
ctx = c.elevateScopeForRead(ctx) // Only elevate for reads
store, rewrittenKey, err := c.storeAndKey(ctx, key)
// ...
}
Conclusion
The code has a clear architectural defect: scope elevation happens after mutation rejection, which violates the documented intent that ResourceScopeCrossClusterRead is for readers only. While the production impact is currently limited (the shared runtime's controllers use the base config without capability), the code is incorrect and should be fixed for correctness and to prevent future bugs if the code paths change.
History
This bug was introduced in commit efcfa44 (@zachsmith1, 2026-02-16, PR #10). The commit added shared informers support with internal cross-cluster read capability, but the mutation rejection logic only checked the original scope (not the capability) while storeAndKey simultaneously elevated scope for internal users. The developer added scope elevation for internal users but forgot to check capability in mutation rejection, creating the order-of-checks bug where mutation rejection passes before scope gets elevated.
Summary
pkg/multicluster/storage.go, scope elevation for internal users happens AFTER mutation rejection check, allowing internal users with cross-cluster capability to have mutation keys incorrectly rewritten.rejectAllClustersMutation()checks scope beforestoreAndKey()can elevate it, andstoreAndKey()elevates scope for ALL operations (reads AND mutations), not just reads.allClustersLoopbackConfigis only used by the shared runtime's loopback client, and the shared runtime's controllers (reconcileCRDStatusKey,deleteCRInstances) useensureClusterClientwhich uses the BASE loopback config WITHOUT the capability.ResourceScopeCrossClusterReadis "intended for internal readers only."/registry/configmaps/clustersinstead of/registry/configmaps/clusters/c1/default/cm.Code with Bug
Location:
pkg/multicluster/storage.goLines 181-189: Mutation methods check scope BEFORE elevation
Lines 344-349: Mutation rejection only checks scope, not capability
Lines 308-342: Scope elevation happens for ALL operations
Evidence
Evidence 1: Complete Runnable Test (VERIFIED)
The following test can be added to
storage_test.goto demonstrate the bug. This test has been executed and confirmed to fail with the current code:Test Output (confirmed bug):
The test proves:
cid="c1"andscope="cluster"(correct per-cluster context)hasInternalCap=true(user has capability)hasAllClustersCap=false(capability not set by handler for per-cluster requests)/registry/configmaps/clustersinstead of/registry/configmaps/clusters/c1/default/cmEvidence 2: Architectural Defect Analysis
The code has a clear architectural defect:
Documented Intent:
ResourceScopeCrossClusterReadis documented as "intended for internal readers only" (options.go:46-47)Actual Behavior: The scope elevation in
storeAndKeyapplies to ALL operations including mutationsViolation of Intent: If an internal user with capability makes a per-cluster mutation request:
rejectAllClustersMutationpasses (scope isCluster)storeAndKeyelevates scope toCrossClusterReadkindRootPrefix()instead ofrewriteKey(cid, key)This is incorrect regardless of whether the production code path exists.
Evidence 3: Production Code Path Analysis
Where
allClustersLoopbackConfigis used:File:
pkg/multicluster/bootstrap/crd_runtime_manager_wrapped.gobaseGeneric.LoopbackClientConfig = allClustersLoopbackConfig(baseGeneric.LoopbackClientConfig)Where the shared runtime's loopback client would be used:
Why the production impact is LIMITED:
File:
pkg/multicluster/bootstrap/crd_runtime_manager_wrapped.goThe shared runtime's controllers use the BASE config:
The capability is set via impersonation:
File:
pkg/multicluster/bootstrap/crd_runtime_manager_wrapped.goWhen the shared runtime's loopback client makes a request, it impersonates
system:apiserverwith the capability. The request context would then have:system:apiserverwith capability extraWithClusterRoutingbased on the request pathEvidence 4: The Bug Would Trigger IF the Scenario Occurred
If a request with:
system:apiserverwith capability (fromallClustersLoopbackConfigimpersonation)ResourceScopeCluster(fromWithClusterRoutingfor a per-cluster path)Were to hit
clusteredStorage, the bug would trigger:rejectAllClustersMutationchecksscope == CrossClusterRead→ FALSE (scope is Cluster)storeAndKeychecksHasInternalCrossClusterCapability→ TRUECrossClusterReadkindRootPrefix()instead ofrewriteKey(cid, key)Why Existing Tests Don't Catch This
The test
TestWithClusterRoutingDoesNotTrustUserForScope(handler_test.go:29-63) verifies:ResourceScopeClusterscopeinternalcap.HasAllClustersCapability(ctx)returns FALSEBut it doesn't test the storage layer where scope elevation happens.
The test
TestStoreAndKey_AllClustersMutationRejectedtests rejection when scope is ALREADYCrossClusterRead, not the elevation scenario.Recommended Fix
Option 1 - Check capability in mutation rejection:
Option 2 - Move elevation to read-only methods only:
Conclusion
The code has a clear architectural defect: scope elevation happens after mutation rejection, which violates the documented intent that
ResourceScopeCrossClusterReadis for readers only. While the production impact is currently limited (the shared runtime's controllers use the base config without capability), the code is incorrect and should be fixed for correctness and to prevent future bugs if the code paths change.History
This bug was introduced in commit efcfa44 (@zachsmith1, 2026-02-16, PR #10). The commit added shared informers support with internal cross-cluster read capability, but the mutation rejection logic only checked the original scope (not the capability) while
storeAndKeysimultaneously elevated scope for internal users. The developer added scope elevation for internal users but forgot to check capability in mutation rejection, creating the order-of-checks bug where mutation rejection passes before scope gets elevated.