Skip to content

[Detail Bug] Multicluster typed listers return non-NotFound errors for missing objects, breaking consumers that check IsNotFound #21

@detail-app

Description

@detail-app

Bug Report: Get Methods Return Wrong Error Type

Summary

The Get methods in pkg/multicluster/typedinformer/adapters.go return fmt.Errorf() instead of apierrors.NewNotFound() when objects are not found. This violates the Kubernetes lister interface contract and breaks consumers that check apierrors.IsNotFound(err) to distinguish "object doesn't exist" from "cache error."

The Bug

All 9 Get methods return plain string errors:

// adapters.go:66
func (l *secretNamespaceLister) Get(name string) (*corev1.Secret, error) {
    obj, ok := l.parent.mci.Get(l.parent.cluster, l.namespace, name)
    if !ok {
        return nil, fmt.Errorf("secret %s/%s not found", l.namespace, name)  // BUG
    }
    return convert[corev1.Secret](obj)
}

The same pattern exists on lines 66, 142, 198, 252, 288, 342, 396, 431, 466 for all resource types.

Evidence

1. Exact Source Code From Pinned Dependency

The pinned k8s dependency (go.mod line 32):

k8s.io/kubernetes => github.com/kplane-dev/kubernetes v0.0.0-20260404055358-aac72e7d04ad

The bootstrap authenticator source at this exact path:
/root/go/pkg/mod/github.com/kplane-dev/kubernetes@v0.0.0-20260404055358-aac72e7d04ad/plugin/pkg/auth/authenticator/token/bootstrap/bootstrap.go

Lines 98-105:

secret, err := t.lister.Get(secretName)
if err != nil {
    if errors.IsNotFound(err) {
        klog.V(3).Infof("No secret of name %s to match bootstrap bearer token", secretName)
        return nil, false, nil  // GRACEFULLY SKIP MISSING TOKENS
    }
    return nil, false, err  // PROPAGATE OTHER ERRORS
}

With the bug: errors.IsNotFound(err) returns false → error is propagated → authentication fails
With the fix: errors.IsNotFound(err) returns true → gracefully skip → normal behavior

2. Test Against Actual Code

=== RUN   TestAllListerGetMethods_ReturnNotFoundError/SecretNamespaceLister
    adapters_notfound_test.go:206: BUG: SecretNamespaceLister.Get returns error not detectable via IsNotFound
    adapters_notfound_test.go:207:   got error: secret ns/name not found (type: *errors.errorString)
    adapters_notfound_test.go:208:   expected error: secrets "name" not found (type: *errors.StatusError)
    adapters_notfound_test.go:209:  IsNotFound(got) = false, IsNotFound(expected) = true
--- FAIL: TestAllListerGetMethods_ReturnNotFoundError/SecretNamespaceLister

All 9 lister Get methods fail this test.

3. Complete Call Chain

Step 1: InformerRegistry creates MCI (informer_registry.go:72-78):

mci := informer.New(informer.Config{
    Storage:        store,
    ResourcePrefix: cs.kindRootPrefix(),
    GroupResource:  gr,
})
go mci.Run(r.ctx)

Step 2: Auth Manager gets MCI and creates lister (auth/manager.go:161-186):

func (m *Manager) coreListersForCluster(clusterID string) (*coreAuthListers, error) {
    secretsMCI, err := m.opts.InformerRegistry.Get(schema.GroupResource{Resource: "secrets"})
    // ...
    return &coreAuthListers{
        secrets: typedinformer.NewSecretLister(secretsMCI, clusterID),
        // ...
    }, nil
}

Step 3: Bootstrap authenticator receives lister (auth/manager.go:378-382):

if authConfig.BootstrapToken {
    authConfig.BootstrapTokenAuthenticator = bootstrap.NewTokenAuthenticator(
        listers.secrets.Secrets(metav1.NamespaceSystem),
    )
}

Step 4: Bootstrap authenticator calls Get and checks IsNotFound (see lines 98-105 above)

Addressing Reviewer Concerns

"Cache Miss != Object Does Not Exist"

The HasSynced() contract: after sync returns true, the cache is authoritative. This is identical to standard Kubernetes informers. From k8s.io/client-go/listers/generic_helpers.go:

func Get[T runtime.Object](indexer cache.Indexer, resource string, name string) (T, error) {
    obj, exists, err := indexer.GetByKey(name)
    if err != nil {
        return obj, err
    }
    if !exists {
        return obj, errors.NewNotFound(schema.GroupResource{Resource: resource}, name)
    }
    return obj, nil
}

Standard informers return NewNotFound for cache misses after sync. The multi-cluster informer should follow the same contract.

"ServiceAccountTokenGetter Fallback Is Intentional"

The fallback defeats caching for ALL cache misses. With proper NewNotFound errors, consumers could optimize:

  • IsNotFound → object confirmed to not exist, no API call needed
  • Other errors → cache issue, fallback to API

"HasSynced Guards Already Solve the Problem"

The codebase waits for HasSynced for RBAC resources (auth/manager.go:230-240) but NOT for core auth listers. Regardless, the error type bug exists in both synced and unsynced states - the error is never detectable via IsNotFound.

"Bootstrap Token Auth Is Disabled by Default"

A bug that manifests when a feature is enabled is still a bug. Bootstrap token auth is a standard Kubernetes feature that should work in multi-cluster setups.

"The Tests Are Contrived"

The tests prove the interface contract violation: Get() returns an error, but IsNotFound() cannot detect it. This is the core issue regardless of cache state.

Why This Bug Is Real

  1. Interface Contract Violation: Kubernetes listers return NewNotFound for missing objects. This contract enables consumers to distinguish "not found" from "error."

  2. Consumer Code Breaks: The pinned k8s bootstrap authenticator explicitly checks IsNotFound and fails with the current implementation.

  3. Testable: Tests demonstrate the bug against actual code.

  4. Pinned Dependency: The exact version containing the IsNotFound check is pinned in go.mod.

Recommended Fix

import (
    apierrors "k8s.io/apimachinery/pkg/api/errors"
    "k8s.io/apimachinery/pkg/runtime/schema"
)

func (l *secretNamespaceLister) Get(name string) (*corev1.Secret, error) {
    obj, ok := l.parent.mci.Get(l.parent.cluster, l.namespace, name)
    if !ok {
        return nil, apierrors.NewNotFound(schema.GroupResource{Group: "", Resource: "secrets"}, name)
    }
    return convert[corev1.Secret](obj)
}

Apply to all 9 Get methods with appropriate GroupResource values.

History

This bug was introduced in commit 12c4811 (@zachsmith1, 2026-03-03, PR #14). The commit created pkg/multicluster/typedinformer/adapters.go as a new file implementing typed lister adapters for the multi-cluster informer. The Get methods were written with fmt.Errorf("... not found") instead of apierrors.NewNotFound(). Notably, the same commit's author correctly used apierrors.NewNotFound for CRD lookups elsewhere (e.g., pkg/multicluster/bootstrap/crd_runtime_manager_wrapped.go), indicating awareness of the proper error type — the oversight was specific to the new typed informer adapters.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingdetail

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions