Added ValidatingAdmissionPolicy check to find policies which block cluster upgrades #180

Open
ericsmith-do wants to merge 1 commit into master from ericsmith/admission-policy-check

ericsmith-do commented Jun 18, 2026

Overview

DOKS clusters can get stuck mid-upgrade when a ValidatingAdmissionPolicy denies the operations our reconciler needs to roll the worker nodes. We hit this recently when a cluster sat in reconcile pending for ages with nothing useful in the reconciler logs, and it turned out a safe-upgrades.gateway.networking.k8s.io policy was quietly rejecting the reconciler's requests. Someone had to manually poke around the cluster's CRDs to figure that out.

clusterlint already catches the equivalent problem for admission webhooks, but it had no idea ValidatingAdmissionPolicies existed. This PR adds a new check so we catch this class of problem automatically instead of debugging it by hand.

The scoping is intentionally conservative, matching the webhook check: a policy is flagged only when it is enforced with Deny, fails closed, and applies to core resources in kube-system. The literal policy from the incident targets Gateway API resources cluster-wide, so it wouldn't be caught as-is. Happy to broaden this if we'd rather err toward more findings.
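
To make the resource side of that scoping concrete, here is a minimal Go sketch. It is not the code in this PR: `matchRule` and `targetsCoreResources` are hypothetical stand-ins, and it assumes "core resources" means a rule in the core API group (`""`) or a wildcard group (`"*"`). The real check works on `admissionregistration.k8s.io/v1` objects and may draw the line differently.

```go
package main

import "fmt"

// matchRule is a hypothetical, trimmed-down stand-in for one resource rule
// in a ValidatingAdmissionPolicy's match constraints.
type matchRule struct {
	APIGroups []string
}

// targetsCoreResources reports whether any rule covers the core API group.
// Assumption: "core" means the empty group "" or the wildcard "*".
func targetsCoreResources(rules []matchRule) bool {
	for _, r := range rules {
		for _, g := range r.APIGroups {
			if g == "" || g == "*" {
				return true
			}
		}
	}
	return false
}

func main() {
	// A rule shaped like the incident policy: Gateway API group only.
	incident := []matchRule{{APIGroups: []string{"gateway.networking.k8s.io"}}}
	// A rule shaped like what the check is meant to flag: the core group.
	core := []matchRule{{APIGroups: []string{""}}}

	fmt.Println(targetsCoreResources(incident)) // false
	fmt.Println(targetsCoreResources(core))     // true
}
```

Under this assumption a Gateway-only rule never reaches the other filters, which is why the incident policy would slip through. Broadening the check would mostly mean widening the set of groups accepted here.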

Testing

Unit tests cover the meta/registration plus every filter branch. I also ran it against a real cluster to confirm the fetch wiring and RBAC, not just the logic:

  1. kind create cluster --image kindest/node:v1.30.0
  2. Applied a Deny-enforcing policy + binding scoped to core resources in kube-system → clusterlint run -c validating-admission-policy reported the error as expected.
  3. Then flipped each condition one at a time and re-ran:
    • binding set to Warn → no finding
    • failurePolicy: Ignore → no finding
    • binding deleted → no finding
    • policy excludes kube-system → no finding
    • policy scoped to batch/jobs → no finding

So only the genuinely risky shape gets flagged, and breaking any single condition silences it.
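
The matrix above amounts to a conjunction of conditions. The sketch below models it in Go so the "break one condition, lose the finding" behaviour is easy to see. The `policyShape` type, its field names, and `blocksUpgrade` are hypothetical, chosen for illustration; the PR's actual types and helpers may differ.

```go
package main

import "fmt"

// policyShape is a hypothetical summary of one policy and its bindings,
// reduced to the conditions exercised in the manual test.
type policyShape struct {
	// BindingActions holds the validationActions of each binding;
	// an empty slice means the policy has no binding at all.
	BindingActions   [][]string
	FailurePolicy    string // "Fail" or "Ignore"; unset is treated as "Fail"
	CoversKubeSystem bool   // false when the policy excludes kube-system
	TargetsCore      bool   // false for e.g. a policy scoped to batch/jobs
}

// hasDenyBinding reports whether any binding enforces the policy with Deny.
func hasDenyBinding(p policyShape) bool {
	for _, actions := range p.BindingActions {
		for _, a := range actions {
			if a == "Deny" {
				return true
			}
		}
	}
	return false
}

// blocksUpgrade is true only when every risky condition holds at once.
func blocksUpgrade(p policyShape) bool {
	return hasDenyBinding(p) &&
		p.FailurePolicy != "Ignore" &&
		p.CoversKubeSystem &&
		p.TargetsCore
}

func main() {
	risky := policyShape{
		BindingActions:   [][]string{{"Deny"}},
		FailurePolicy:    "Fail",
		CoversKubeSystem: true,
		TargetsCore:      true,
	}
	flips := []struct {
		name   string
		mutate func(*policyShape)
	}{
		{"binding set to Warn", func(p *policyShape) { p.BindingActions = [][]string{{"Warn"}} }},
		{"failurePolicy: Ignore", func(p *policyShape) { p.FailurePolicy = "Ignore" }},
		{"binding deleted", func(p *policyShape) { p.BindingActions = nil }},
		{"policy excludes kube-system", func(p *policyShape) { p.CoversKubeSystem = false }},
		{"policy scoped to batch/jobs", func(p *policyShape) { p.TargetsCore = false }},
	}

	fmt.Printf("%-30s finding=%v\n", "baseline", blocksUpgrade(risky))
	for _, f := range flips {
		p := risky // copy the baseline, then break exactly one condition
		f.mutate(&p)
		fmt.Printf("%-30s finding=%v\n", f.name, blocksUpgrade(p))
	}
}
```

Keeping each condition as its own term mirrors the manual steps one-to-one, so a future change to the scoping should show up as a change to exactly one line of the predicate.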

ericsmith-do force-pushed the ericsmith/admission-policy-check branch from c9ca212 to d61182c on June 19, 2026 16:36