feat(driver/kubernetes): support pod affinity/topology spread via driver-opt or pod template ConfigMap #3919

Description

@regressEdo

Problem

When several docker buildx build invocations run in parallel with the Kubernetes driver (one builder per image context), the Kubernetes scheduler tends to bin-pack all BuildKit pods onto the same cluster node. This causes disk I/O contention during layer extraction: every pod decompresses and writes the same base image layers simultaneously to shared ephemeral storage, which multiplies extraction time by 3–4×.

Example: extracting a 180 MB layer takes ~15s in isolation but ~55s when 3 pods share the same node.

Requested Feature

Expose a mechanism to inject pod affinity / anti-affinity / topology spread constraints into the BuildKit pod spec created by the Kubernetes driver.

Option A: New driver-opt parameters

Add direct driver-opt support for the most common scheduling hints:

--driver-opt=topology-spread.topologyKey=kubernetes.io/hostname
--driver-opt=topology-spread.maxSkew=1
--driver-opt=topology-spread.whenUnsatisfiable=ScheduleAnyway

This would generate a topologySpreadConstraints entry in the pod spec whose label selector matches the BuildKit pods by their app label.
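To make the mapping concrete, here is a minimal sketch of how the three proposed options could be turned into one topologySpreadConstraints entry. It is written in Python for brevity, not as buildx code. The function name, the validation rules, and the choice to default whenUnsatisfiable to DoNotSchedule (mirroring the Kubernetes default) are assumptions, and the label value buildkitd is taken from the Option B example rather than from the driver.

```python
# Hypothetical sketch: the "topology-spread." prefix and its keys come from
# the proposal in this issue; none of this is existing buildx behaviour.
PREFIX = "topology-spread."
ALLOWED_POLICIES = {"DoNotSchedule", "ScheduleAnyway"}


def topology_spread_from_driver_opts(driver_opts, app_label):
    """Build one topologySpreadConstraints entry from the proposed driver-opts.

    Returns None when no topology-spread.* option is present.
    """
    # Keep only the proposed options and strip the common prefix.
    opts = {k[len(PREFIX):]: v for k, v in driver_opts.items() if k.startswith(PREFIX)}
    if not opts:
        return None
    if "topologyKey" not in opts:
        raise ValueError("topology-spread.topologyKey is required")

    # Assumption: fall back to the Kubernetes default policy when unset.
    policy = opts.get("whenUnsatisfiable", "DoNotSchedule")
    if policy not in ALLOWED_POLICIES:
        raise ValueError(f"unsupported whenUnsatisfiable value: {policy!r}")

    # driver-opt values arrive as strings, so maxSkew has to be parsed.
    max_skew = int(opts.get("maxSkew", "1"))
    if max_skew < 1:
        raise ValueError("topology-spread.maxSkew must be at least 1")

    return {
        "maxSkew": max_skew,
        "topologyKey": opts["topologyKey"],
        "whenUnsatisfiable": policy,
        # Select the BuildKit pods by their app label, as described above.
        "labelSelector": {"matchLabels": {"app": app_label}},
    }


# The three options from this issue, next to an unrelated driver-opt.
constraint = topology_spread_from_driver_opts(
    {
        "namespace": "ci",
        "topology-spread.topologyKey": "kubernetes.io/hostname",
        "topology-spread.maxSkew": "1",
        "topology-spread.whenUnsatisfiable": "ScheduleAnyway",
    },
    app_label="buildkitd",
)
```

The resulting dictionary uses the same field names as the Kubernetes pod spec, so it could be appended to spec.topologySpreadConstraints unchanged.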

Option B: Pod template ConfigMap (more flexible)

Allow specifying the name of a ConfigMap containing a partial pod spec YAML that gets merged into the generated pod spec:

--driver-opt=podTemplateConfigMap=my-buildkit-template

The ConfigMap (in the same namespace) would contain arbitrary pod spec fields — affinity, topologySpreadConstraints, securityContext, priorityClassName, etc. — giving operators full control without requiring new driver-opt parameters for every Kubernetes scheduling feature.

# ConfigMap data
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                app: buildkitd
            topologyKey: kubernetes.io/hostname
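The merge semantics for Option B are left open above. As one possible reading, the sketch below deep-merges a partial pod spec into the generated one: nested mappings are merged key by key, while scalars and lists from the template replace the generated values. The function name and the sample generated spec are hypothetical, and the sketch is in Python for brevity rather than being driver code.

```python
import copy


def merge_pod_spec(generated, template):
    """Return a copy of `generated` with the partial `template` merged in.

    Nested mappings merge recursively; any other template value (scalar or
    list) replaces the generated value under the same key.
    """
    merged = copy.deepcopy(generated)
    for key, value in template.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_pod_spec(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical stand-in for the pod spec the driver generates.
generated_spec = {
    "containers": [{"name": "buildkitd"}],
    "securityContext": {"runAsNonRoot": True},
}

# The "spec" section of the ConfigMap example above, as parsed YAML.
template_spec = {
    "affinity": {
        "podAntiAffinity": {
            "preferredDuringSchedulingIgnoredDuringExecution": [
                {
                    "weight": 100,
                    "podAffinityTerm": {
                        "labelSelector": {"matchLabels": {"app": "buildkitd"}},
                        "topologyKey": "kubernetes.io/hostname",
                    },
                }
            ]
        }
    }
}

merged_spec = merge_pod_spec(generated_spec, template_spec)
```

Replacing lists wholesale keeps the sketch short, but it would let a template that sets containers overwrite the BuildKit container. An implementation would likely need something closer to a Kubernetes strategic merge patch, which merges containers by name.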

Current Workaround

None available without a MutatingAdmissionWebhook or third-party policy engine (e.g. Kyverno), which adds significant operational overhead just to influence BuildKit pod scheduling.

Environment

  • docker buildx version: v0.34.1
  • Kubernetes driver
  • GKE Spot nodes with cluster autoscaler
