fix(base): replace ensureLease init container with pre-install hook Job #149

Merged: davidf-null merged 1 commit into main from fix/ratelimit on May 21, 2026

@davidf-null

Collaborator

Problem

The ensure-lease init container in the nullplatform-log-controller DaemonSet pulls bitnami/kubectl from Docker Hub on every pod startup. Since a DaemonSet runs one pod per node, a cluster with N nodes triggers N image pulls per deploy, which causes the Docker Hub rate-limit errors reported by customers.

Additionally, lease.yaml was a regular Helm template, so Helm's server-side apply (SSA) conflicted with the k8s-logs-controller over holderIdentity on every upgrade (revisions 3, 5 in the minikube history showed this exact failure).

Changes

  • daemonset.yaml: Remove the ensure-lease init container entirely. This eliminates all Docker Hub pulls from the DaemonSet
  • pre-install-lease.yaml (new): Hook Job (pre-install,pre-upgrade) that creates the Lease once per Helm operation, regardless of node count. Controlled by the existing logging.ensureLease flag
  • lease.yaml: Convert from a regular template to a pre-install hook, so Helm creates it on install only and never SSA-applies it on upgrade. Remove holderIdentity from the spec so the controller has full SSA ownership of that field
  • serviceaccount.yaml / clusterroles.yaml / clusterrolebindings.yaml: Add hook RBAC (hook-weight: -5) for the new Job's ServiceAccount, so it exists before the Job runs
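
For readers who have not seen the templates, the hook layout described above might look roughly like the sketch below. It is not the chart's actual content: the `lease-installer` resource name, the delete policies, the Job's weight of `0`, `backoffLimit`, and the shell command are assumptions for illustration. Only the hook events, the `-5` RBAC weight, the `bitnami/kubectl` image, and the Lease name come from this PR. The ClusterRole and ClusterRoleBinding are omitted for brevity, as is any namespace handling.

```yaml
# Hypothetical sketch. In the chart this would presumably be wrapped in a
# conditional on .Values.logging.ensureLease.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: lease-installer                 # assumed name
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-weight": "-5"         # lower weight runs first, so RBAC precedes the Job
    "helm.sh/hook-delete-policy": before-hook-creation   # assumed policy
---
apiVersion: batch/v1
kind: Job
metadata:
  name: lease-installer                 # assumed name
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-weight": "0"          # assumed; any weight above -5 works
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded   # assumed policy
spec:
  backoffLimit: 2                       # assumed
  template:
    spec:
      serviceAccountName: lease-installer
      restartPolicy: Never
      containers:
        - name: ensure-lease
          image: bitnami/kubectl        # tag intentionally left unspecified
          command: ["/bin/sh", "-c"]
          args:
            - |
              # Create the Lease only if it does not exist yet (assumed logic).
              kubectl get lease nullplatform-metrics-extractor >/dev/null 2>&1 \
                || kubectl apply -f - <<'EOF'
              apiVersion: coordination.k8s.io/v1
              kind: Lease
              metadata:
                name: nullplatform-metrics-extractor
              EOF
```

Because the image is now pulled by a single Job pod rather than by every DaemonSet pod, the number of Docker Hub pulls per Helm operation no longer scales with node count.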

ArgoCD compatibility

  • Hook resources are excluded from ArgoCD's desired state — no more drift loop where ArgoCD fights the controller over holderIdentity
  • The hook Job maps to ArgoCD's PreSync phase, which is the correct behavior (Lease exists before DaemonSet pods start)
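
The holderIdentity ownership point can be illustrated with a sketch of what lease.yaml might look like after the change. This is an assumption about the rendered shape, not a copy of the template; the Lease name is taken from the test output in this PR, and everything else is illustrative.

```yaml
# Hypothetical rendering of lease.yaml after this PR.
apiVersion: coordination.k8s.io/v1
kind: Lease
metadata:
  name: nullplatform-metrics-extractor
  annotations:
    # Hook resources are created by Helm on install but are not tracked as
    # part of the release manifest, so upgrades do not re-apply this object.
    "helm.sh/hook": pre-install
# No spec.holderIdentity here: the controller sets that field itself and
# remains its only field manager.
spec: {}
```

With no holderIdentity in the manifest, neither Helm nor ArgoCD has a desired value for that field to reconcile against the controller's.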

Test plan

  • helm template --set logging.ensureLease=true renders Job hook + RBAC, no initContainers in DaemonSet
  • helm upgrade --set logging.ensureLease=true on minikube — Job completes, Lease created (lease.coordination.k8s.io/nullplatform-metrics-extractor configured)
  • Two consecutive upgrades succeed without SSA conflict (previously failed on rev 3, 5, 8)
  • helm template (default ensureLease=false) renders zero lease-installer resources

🤖 Generated with Claude Code

The kubectl init container in the log-controller DaemonSet was pulling
bitnami/kubectl from Docker Hub once per node per deploy, causing rate
limit errors on clusters with many nodes.

- Replace init container with a pre-install,pre-upgrade hook Job so
  kubectl is pulled only once per Helm operation regardless of node count
- Convert lease.yaml to a pre-install hook to prevent Helm SSA from
  fighting the logs controller over holderIdentity on every upgrade
- Add dedicated RBAC (SA, ClusterRole, ClusterRoleBinding) as hook
  resources with weight -5 so they exist before the Job runs
- Remove holderIdentity from the lease spec so the controller has full
  SSA ownership of that field

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@davidf-null davidf-null merged commit e172807 into main May 21, 2026
3 checks passed
@davidf-null davidf-null deleted the fix/ratelimit branch May 21, 2026 13:02