Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
9a9e632
OCPBUGS-77056: make external cert validation asynchronous
bentito Mar 3, 2026
5f7fe27
Fix async SAR to use callback and re-enqueue route
bentito Mar 9, 2026
1119f93
Address PR feedback for async SAR
bentito Mar 9, 2026
2ea5f83
OCPBUGS-77056: Throttle concurrent async SAR checks
bentito Mar 12, 2026
d26ab28
Address CodeRabbit feedback: non-blocking semaphore and handle missin…
bentito Mar 12, 2026
47ac0a8
Address CodeRabbit feedback: make onComplete required and add SAR tim…
bentito Mar 12, 2026
e17092f
Update vendor tree to remove unused library-go and apiserver dependen…
bentito Mar 13, 2026
4e1eae4
Fix RouteSecretManager panic and update library-go dependency
bentito Mar 17, 2026
3fe8e4a
fix: resolve async SAR race condition that prevented external cert se…
bentito Mar 26, 2026
324594a
fix: remove stray blank line to pass gofmt check
bentito Mar 26, 2026
4f0a9b4
chore: update library-go to grab async SAR fixes
bentito Mar 26, 2026
277448b
chore: bump library-go to grab test fixes
bentito Mar 26, 2026
676251b
chore: revert local library-go replace directive
bentito Mar 27, 2026
db2142c
chore: sync vendor directory after removing authorizationutil
bentito Mar 27, 2026
b107829
Fix Async SAR robustness, cache TTL, and synchronous short-circuit
bentito Mar 30, 2026
aa7af73
fix(async-sar): introduce dynamic fast-path based on semaphore availa…
bentito Mar 31, 2026
152fb7c
fix(async-sar): do not cache validation failures
bentito Apr 2, 2026
9bc422f
fix(async-sar): handle library-go async registration and zombie callb…
bentito Apr 3, 2026
22858a9
Refactor: Replace asynchronous Subject Access Review with synchronous…
bentito Apr 4, 2026
c8a93a4
Fix formatting in route_secret_manager_test.go
bentito Apr 6, 2026
2741705
build: fix debug makefile to explicitly cross-compile linux/amd64
bentito Apr 15, 2026
5ff1268
perf: enable true parallel route status and SAR validation
bentito May 4, 2026
9a962f9
perf: implement asynchronous external certificate validation and non-…
bentito May 11, 2026
f8db4ec
test: remove scale-test script from PR
bentito May 11, 2026
d1c3df2
test: fix data race and timeout in route secret manager tests
bentito May 12, 2026
115bab8
Optimize external certificate secret monitoring using SharedInformer
bentito May 15, 2026
de2f6a2
gofmt shared_secret_manager.go
bentito May 15, 2026
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -117,3 +117,5 @@ require (
sigs.k8s.io/structured-merge-diff/v6 v6.3.0 // indirect
sigs.k8s.io/yaml v1.6.0 // indirect
)

replace github.com/openshift/library-go => github.com/bentito/library-go v0.0.0-20260326212156-ef7716e77898
4 changes: 2 additions & 2 deletions go.sum
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,8 @@ github.com/antlr4-go/antlr/v4 v4.13.0 h1:lxCg3LAv+EUK6t1i0y1V6/SLeUi0eKEKdhQAlS8
github.com/antlr4-go/antlr/v4 v4.13.0/go.mod h1:pfChB/xh/Unjila75QW7+VU4TSnWnnk9UTnmpPaOR2g=
github.com/bcicen/go-haproxy v0.0.0-20180203142132-ff5824fe38be h1:7b8/p7wJ3N/OjXbZmwJdyR9eOoCla5SbUIe18WHio5s=
github.com/bcicen/go-haproxy v0.0.0-20180203142132-ff5824fe38be/go.mod h1:MxVpaKTkNjZu5awzzr6mk6CIKaZYUFGxbmNwMvyVfeM=
github.com/bentito/library-go v0.0.0-20260326212156-ef7716e77898 h1:nzgD5pjb18eOkkf/FeXaNofDNHsG5JqJl3JsbFFgH34=
github.com/bentito/library-go v0.0.0-20260326212156-ef7716e77898/go.mod h1:K3FoNLgNBFYbFuG+Kr8usAnQxj1w84XogyUp2M8rK8k=
github.com/beorn7/perks v1.0.1 h1:VlbKKnNfV8bJzeqoa4cOKqO6bYr3WgKZxO8Z16+hsOM=
github.com/beorn7/perks v1.0.1/go.mod h1:G2ZrVWU2WbWT9wwq4/hrbKbnv/1ERSJQ0ibhJ6rlkpw=
github.com/blang/semver/v4 v4.0.0 h1:1PFHFE6yCCTv8C1TeyNNarDzntLi7wMI5i/pzqYIsAM=
Expand Down Expand Up @@ -184,8 +186,6 @@ github.com/openshift/api v0.0.0-20260212193555-c06ab675261f h1:l1IgsK48Ym/nED30y
github.com/openshift/api v0.0.0-20260212193555-c06ab675261f/go.mod h1:d5uzF0YN2nQQFA0jIEWzzOZ+edmo6wzlGLvx5Fhz4uY=
github.com/openshift/client-go v0.0.0-20260108185524-48f4ccfc4e13 h1:6rd4zSo2UaWQcAPZfHK9yzKVqH0BnMv1hqMzqXZyTds=
github.com/openshift/client-go v0.0.0-20260108185524-48f4ccfc4e13/go.mod h1:YvOmPmV7wcJxpfhTDuFqqs2Xpb3M3ovsM6Qs/i2ptq4=
github.com/openshift/library-go v0.0.0-20260223145824-7b234b47a906 h1:PkG3CmlU8+HtlW1rspnhwhbKki8rrwYN+L26aH11t2E=
github.com/openshift/library-go v0.0.0-20260223145824-7b234b47a906/go.mod h1:K3FoNLgNBFYbFuG+Kr8usAnQxj1w84XogyUp2M8rK8k=
github.com/orisano/pixelmatch v0.0.0-20220722002657-fb0b55479cde/go.mod h1:nZgzbfBr3hhjoZnS66nKrHmduYNpc34ny7RK4z5/HM0=
github.com/pkg/profile v1.7.0 h1:hnbDkaNWPCLMO9wGLdBFTIZvzDrDfBM2072E1S9gJkA=
github.com/pkg/profile v1.7.0/go.mod h1:8Uer0jas47ZQMJ7VD+OHknK4YDY07LPUC6dEvqDjvNo=
Expand Down
3 changes: 2 additions & 1 deletion hack/Makefile.debug
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
# -*- mode: makefile -*-

export GOOS=linux
export GOARCH ?= amd64
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is not needed for this PR, but we should do it so that people building debug images on non-amd64 arch machines still get it right.


REGISTRY ?= quay.io
IMAGE ?= openshift/openshift-router
Expand All @@ -11,7 +12,7 @@ REMOTE_IMAGE ?= $(REGISTRY)/$(IMAGE):$(TAG)
OPENSHIFT_ENDPOINT ?= $(shell oc config view --minify --template '{{(index .clusters 0).cluster.server}}' | grep -o '//[^ :]*' | sed 's/^..//')

new-openshift-router-image:
GO111MODULE=on CGO_ENABLED=0 GOFLAGS=-mod=vendor go build -o openshift-router -gcflags=all="-N -l" ./cmd/openshift-router
GO111MODULE=on CGO_ENABLED=0 GOOS=$(GOOS) GOARCH=$(GOARCH) GOFLAGS=-mod=vendor go build -o openshift-router -gcflags=all="-N -l" ./cmd/openshift-router
$(IMAGEBUILDER) build -t $(LOCAL_IMAGE) -f hack/Dockerfile.debug .

push:
Expand Down
7 changes: 7 additions & 0 deletions pkg/cmd/infra/router/clientcmd.go
Original file line number Diff line number Diff line change
Expand Up @@ -60,6 +60,13 @@ func (cfg *Config) KubeConfig() (*restclient.Config, string, error) {
if err != nil {
return nil, "", err
}

// Increase client-side rate limiting to support higher throughput during
// router startup, especially when many external certificate routes are
// present.
clientConfig.QPS = 50
clientConfig.Burst = 100

return clientConfig, namespace, nil
}

Expand Down
6 changes: 3 additions & 3 deletions pkg/cmd/infra/router/template.go
Original file line number Diff line number Diff line change
Expand Up @@ -36,12 +36,12 @@ import (
routelisters "github.com/openshift/client-go/route/listers/route/v1"
"github.com/openshift/library-go/pkg/crypto"
"github.com/openshift/library-go/pkg/proc"
"github.com/openshift/library-go/pkg/route/secretmanager"

"github.com/openshift/router/pkg/router"
"github.com/openshift/router/pkg/router/controller"
"github.com/openshift/router/pkg/router/metrics"
"github.com/openshift/router/pkg/router/metrics/haproxy"
"github.com/openshift/router/pkg/router/routeapihelpers"
"github.com/openshift/router/pkg/router/shutdown"
templateplugin "github.com/openshift/router/pkg/router/template"
haproxyconfigmanager "github.com/openshift/router/pkg/router/template/configmanager/haproxy"
Expand Down Expand Up @@ -755,7 +755,7 @@ func (o *TemplateRouterOptions) Run(stopCh <-chan struct{}) error {
return err
}

secretManager := secretmanager.NewManager(kc, nil)
secretManager := controller.NewSharedSecretManager(kc, nil)

pluginCfg := templateplugin.TemplatePluginConfig{
WorkingDir: o.WorkingDir,
Expand Down Expand Up @@ -800,7 +800,7 @@ func (o *TemplateRouterOptions) Run(stopCh <-chan struct{}) error {
informer := factory.CreateRoutesSharedInformer()
routeLister := routelisters.NewRouteLister(informer.GetIndexer())
if o.UpdateStatus {
lease := writerlease.New(time.Minute, 3*time.Second)
lease := writerlease.New(time.Minute, 3*time.Second, routeapihelpers.MaxConcurrentSARChecks)
go lease.Run(stopCh)
tracker := controller.NewSimpleContentionTracker(informer, o.RouterName, o.ResyncInterval/10)
tracker.SetConflictMessage(fmt.Sprintf("The router detected another process is writing conflicting updates to route status with name %q. Please ensure that the configuration of all routers is consistent. Route status will not be updated as long as conflicts are detected.", o.RouterName))
Expand Down
100 changes: 75 additions & 25 deletions pkg/router/controller/route_secret_manager.go
Original file line number Diff line number Diff line change
Expand Up @@ -111,10 +111,15 @@ func (p *RouteSecretManager) HandleRoute(eventType watch.EventType, route *route
case watch.Added:
// register with secret monitor
if hasExternalCertificate(route) {
log.V(4).Info("Validating and registering external certificate", "namespace", route.Namespace, "secret", route.Spec.TLS.ExternalCertificate.Name, "route", route.Name)
if err := p.validateAndRegister(route); err != nil {
return err
}
go func(route *routev1.Route) {
log.V(4).Info("Validating and registering external certificate (async)", "namespace", route.Namespace, "secret", route.Spec.TLS.ExternalCertificate.Name, "route", route.Name)
if err := p.validateAndRegister(route); err != nil {
log.Error(err, "failed to validate and register external certificate", "namespace", route.Namespace, "route", route.Name)
return
}
p.plugin.HandleRoute(watch.Added, route)
}(route.DeepCopy())
return nil
}

case watch.Modified:
Expand All @@ -127,14 +132,21 @@ func (p *RouteSecretManager) HandleRoute(eventType watch.EventType, route *route
// Both new and old routes have externalCertificate
if oldSecret != route.Spec.TLS.ExternalCertificate.Name {
// ExternalCertificate is updated
log.V(4).Info("Validating and registering updated external certificate", "namespace", route.Namespace, "oldSecret", oldSecret, "newSecret", route.Spec.TLS.ExternalCertificate.Name, "route", route.Name)
log.V(4).Info("Unregistering old external certificate", "namespace", route.Namespace, "oldSecret", oldSecret, "route", route.Name)
// Unregister the old and register the new external certificate
if err := p.unregister(route); err != nil {
return err
}
if err := p.validateAndRegister(route); err != nil {
return err
}

go func(route *routev1.Route) {
log.V(4).Info("Validating and registering updated external certificate (async)", "namespace", route.Namespace, "newSecret", route.Spec.TLS.ExternalCertificate.Name, "route", route.Name)
if err := p.validateAndRegister(route); err != nil {
log.Error(err, "failed to validate and register updated external certificate", "namespace", route.Namespace, "route", route.Name)
return
}
p.plugin.HandleRoute(watch.Modified, route)
}(route.DeepCopy())
return nil
} else {
// ExternalCertificate is not updated
// Re-validate and update the in-memory TLS certificate and key (even if ExternalCertificate remains unchanged)
Expand All @@ -156,24 +168,35 @@ func (p *RouteSecretManager) HandleRoute(eventType watch.EventType, route *route
//
// Therefore, it is essential to re-sync the secret to ensure the plugin chain correctly handles the route.

log.V(4).Info("Re-validating existing external certificate", "namespace", route.Namespace, "secret", oldSecret, "route", route.Name)
// re-validate
if err := p.validate(route); err != nil {
return err
}
// read referenced secret and update TLS certificate and key
if err := p.populateRouteTLSFromSecret(route); err != nil {
return err
}
go func(route *routev1.Route) {
log.V(4).Info("Re-validating existing external certificate (async)", "namespace", route.Namespace, "secret", oldSecret, "route", route.Name)
// re-validate (synchronous, throttled by semaphore)
if err := p.validate(route); err != nil {
log.Error(err, "failed to re-validate external certificate", "namespace", route.Namespace, "route", route.Name)
return
}
// read referenced secret and update TLS certificate and key
if err := p.populateRouteTLSFromSecret(route); err != nil {
log.Error(err, "failed to populate TLS from secret", "namespace", route.Namespace, "route", route.Name)
return
}
p.plugin.HandleRoute(watch.Modified, route)
}(route.DeepCopy())
return nil
}

case newHasExt && !oldHadExt:
// New route has externalCertificate, old route did not
log.V(4).Info("Validating and registering new external certificate", "namespace", route.Namespace, "secret", route.Spec.TLS.ExternalCertificate.Name, "route", route.Name)
// register with secret monitor
if err := p.validateAndRegister(route); err != nil {
return err
}
go func(route *routev1.Route) {
log.V(4).Info("Validating and registering new external certificate (async)", "namespace", route.Namespace, "secret", route.Spec.TLS.ExternalCertificate.Name, "route", route.Name)
// register with secret monitor
if err := p.validateAndRegister(route); err != nil {
log.Error(err, "failed to validate and register new external certificate", "namespace", route.Namespace, "route", route.Name)
return
}
p.plugin.HandleRoute(watch.Modified, route)
}(route.DeepCopy())
return nil

case !newHasExt && oldHadExt:
// Old route had externalCertificate, new route does not
Expand Down Expand Up @@ -204,15 +227,18 @@ func (p *RouteSecretManager) HandleRoute(eventType watch.EventType, route *route
// validateAndRegister validates the route's externalCertificate configuration and registers it with the secret manager.
// It also updates the in-memory TLS certificate and key after reading from secret informer's cache.
func (p *RouteSecretManager) validateAndRegister(route *routev1.Route) error {
// validate
// validate (synchronous, throttled by semaphore)
if err := p.validate(route); err != nil {
return err
}
// register route with secretManager
handler := p.generateSecretHandler(route.Namespace, route.Name)
if err := p.secretManager.RegisterRoute(context.TODO(), route.Namespace, route.Name, route.Spec.TLS.ExternalCertificate.Name, handler); err != nil {
p.recorder.RecordRouteRejection(route, ExtCrtStatusReasonGetFailed, err.Error())
p.plugin.HandleRoute(watch.Deleted, route)
return fmt.Errorf("failed to register router: %w", err)
}

// read referenced secret and update TLS certificate and key
if err := p.populateRouteTLSFromSecret(route); err != nil {
return err
Expand Down Expand Up @@ -260,6 +286,7 @@ func (p *RouteSecretManager) generateSecretHandler(namespace, routeName string)
AddFunc: func(obj interface{}) {
secret := obj.(*kapi.Secret)
log.V(4).Info("Secret added for route", "namespace", namespace, "secret", secret.Name, "route", routeName)
routeapihelpers.InvalidateAsyncSARCache(namespace, secret.Name)

// Secret re-creation scenario
// Check if the route key exists in the deletedSecrets map, indicating that the secret was previously deleted for this route.
Expand Down Expand Up @@ -287,8 +314,12 @@ func (p *RouteSecretManager) generateSecretHandler(namespace, routeName string)
UpdateFunc: func(old interface{}, new interface{}) {
secretOld := old.(*kapi.Secret)
secretNew := new.(*kapi.Secret)
if secretOld.ResourceVersion == secretNew.ResourceVersion {
return
}
key := generateKey(namespace, routeName)
log.V(4).Info("Secret updated for route", "namespace", namespace, "secret", secretNew.Name, "oldSecretVersion", secretOld.ResourceVersion, "newSecretVersion", secretNew.ResourceVersion, "route", routeName)
routeapihelpers.InvalidateAsyncSARCache(namespace, secretNew.Name)

// Ensure fetching the updated route
route, err := p.routelister.Routes(namespace).Get(routeName)
Expand All @@ -309,10 +340,23 @@ func (p *RouteSecretManager) generateSecretHandler(namespace, routeName string)
},

DeleteFunc: func(obj interface{}) {
secret := obj.(*kapi.Secret)
secret, ok := obj.(*kapi.Secret)
if !ok {
tombstone, ok := obj.(cache.DeletedFinalStateUnknown)
if !ok {
log.Error(nil, "Couldn't get object from tombstone", "obj", obj)
return
}
secret, ok = tombstone.Obj.(*kapi.Secret)
if !ok {
log.Error(nil, "Tombstone contained object that is not a secret", "obj", tombstone.Obj)
return
}
}
key := generateKey(namespace, routeName)
msg := fmt.Sprintf("secret %q deleted for route %q", secret.Name, key)
log.V(4).Info(msg)
routeapihelpers.InvalidateAsyncSARCache(namespace, secret.Name)

// keep the secret monitor active and mark the secret as deleted for this route.
p.deletedSecrets.Store(key, true)
Expand All @@ -334,10 +378,14 @@ func (p *RouteSecretManager) generateSecretHandler(namespace, routeName string)
// If the validation fails, it records the route rejection and triggers
// the deletion of the route by calling the HandleRoute method with a watch.Deleted event.
//
// This function is synchronous: it blocks until the SAR check completes.
// Concurrency is throttled by the semaphore in ValidateTLSExternalCertificate.
//
// NOTE: TLS data validation and sanitization are handled by the next plugin `ExtendedValidator`,
// by reading the "tls.crt" and "tls.key" added by populateRouteTLSFromSecret.
func (p *RouteSecretManager) validate(route *routev1.Route) error {
fldPath := field.NewPath("spec").Child("tls").Child("externalCertificate")

if err := routeapihelpers.ValidateTLSExternalCertificate(route, fldPath, p.sarClient, p.secretsGetter).ToAggregate(); err != nil {
log.Error(err, "skipping route due to invalid externalCertificate configuration", "namespace", route.Namespace, "route", route.Name)
p.recorder.RecordRouteRejection(route, ExtCrtStatusReasonValidationFailed, err.Error())
Expand All @@ -352,7 +400,9 @@ func (p *RouteSecretManager) validate(route *routev1.Route) error {
// the deletion of the route by calling the HandleRoute method with a watch.Deleted event.
// Note: This function performs an in-place update of the route. The caller should be aware that the route's TLS configuration will be modified directly.
func (p *RouteSecretManager) populateRouteTLSFromSecret(route *routev1.Route) error {
// read referenced secret
// read referenced secret from the informer cache.
// RegisterRoute blocks until cache sync completes, so the secret should be
// available immediately after registration.
secret, err := p.secretManager.GetSecret(context.TODO(), route.Namespace, route.Name)
if err != nil {
log.Error(err, "failed to get referenced secret")
Expand Down
Loading