
helm: allow Service creation and custom annotations for Prometheus scraping #1267

Open
lvg-dexters wants to merge 1 commit into aws:main from lvg-dexters:helm-service-annotations-imds-mode

lvg-dexters commented Apr 24, 2026

Issue #, if available: Closes #1266

Description of changes:

Adds two opt-in values to the Helm chart so the metrics Service can be used for Prometheus scrape discovery in IMDS/DaemonSet mode, and so any chart-created Service can carry custom annotations.

Motivation

Today the Service template at config/helm/aws-node-termination-handler/templates/service.yaml is rendered only when enableSqsTerminationDraining and enablePrometheusServer are both true, and the Service metadata has no annotations block. Users running NTH in IMDS mode with enablePrometheusServer: true therefore have no chart-native way to expose metrics for EndpointSlice-based discovery (vanilla Prometheus, Grafana Alloy, grafana/k8s-monitoring). podMonitor.create covers Prometheus Operator users, but there is no equivalent option for annotation-based discovery.

The full problem statement and design rationale are in the linked issue (#1266).

Changes

Two files, 30 additions total.

config/helm/aws-node-termination-handler/values.yaml

  • New service.create: false (boolean, default false). When true in IMDS mode, creates a headless Service.
  • New service.annotations: {} (default empty). Applied to the Service metadata when present.
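
As an illustration of how the two new values might be combined, here is a minimal values override for IMDS mode. The file name is hypothetical, and the prometheus.io/port annotation and the port number 9092 are assumptions (the port is assumed to match the chart's prometheusServerPort); only service.create, service.annotations, enablePrometheusServer, and enableSqsTerminationDraining come from this PR.

```yaml
# values-imds-metrics.yaml (hypothetical file name)
enableSqsTerminationDraining: false   # IMDS / DaemonSet mode
enablePrometheusServer: true          # required for any Service to be rendered

service:
  create: true                        # opt in to the headless Service in IMDS mode
  annotations:
    # Kubernetes annotation values must be strings, hence the quotes.
    prometheus.io/scrape: "true"
    prometheus.io/port: "9092"        # assumption: equals prometheusServerPort
```

When passing annotations on the command line instead, helm's --set-string flag keeps values such as true typed as strings, whereas --set parses them as booleans.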

config/helm/aws-node-termination-handler/templates/service.yaml

  • Condition expanded: if and .Values.enablePrometheusServer (or .Values.enableSqsTerminationDraining .Values.service.create).
  • Metadata labels and spec.selector branch on mode: labelsDeployment / selectorLabelsDeployment in SQS mode (unchanged), labelsDaemonset / selectorLabelsDaemonset in IMDS mode. Both helpers already exist in _helpers.tpl.
  • spec.clusterIP: None added in IMDS mode only (headless). SQS mode keeps existing type: ClusterIP behavior exactly.
  • metadata.annotations block threads .Values.service.annotations when set.
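
The four template changes above could fit together roughly as follows. This is a sketch reconstructed from the description, not the actual diff: the fully qualified helper names (aws-node-termination-handler.*), the http-metrics port name, and the use of .Values.prometheusServerPort are assumptions about the existing chart.

```yaml
{{- /* Sketch only: reconstructed from the PR description, not the real diff. */ -}}
{{- if and .Values.enablePrometheusServer (or .Values.enableSqsTerminationDraining .Values.service.create) }}
apiVersion: v1
kind: Service
metadata:
  name: {{ include "aws-node-termination-handler.fullname" . }}
  namespace: {{ .Release.Namespace }}
  labels:
    {{- if .Values.enableSqsTerminationDraining }}
    {{- include "aws-node-termination-handler.labelsDeployment" . | nindent 4 }}
    {{- else }}
    {{- include "aws-node-termination-handler.labelsDaemonset" . | nindent 4 }}
    {{- end }}
  {{- with .Values.service.annotations }}
  annotations:
    {{- toYaml . | nindent 4 }}
  {{- end }}
spec:
  type: ClusterIP
  {{- if not .Values.enableSqsTerminationDraining }}
  clusterIP: None   # headless, IMDS mode only
  {{- end }}
  selector:
    {{- if .Values.enableSqsTerminationDraining }}
    {{- include "aws-node-termination-handler.selectorLabelsDeployment" . | nindent 4 }}
    {{- else }}
    {{- include "aws-node-termination-handler.selectorLabelsDaemonset" . | nindent 4 }}
    {{- end }}
  ports:
    - name: http-metrics            # assumption: existing port name
      port: {{ .Values.prometheusServerPort }}
      targetPort: http-metrics
      protocol: TCP
{{- end }}
```

Branching on enableSqsTerminationDraining rather than on service.create is what makes service.create a no-op in SQS mode: the Deployment selector is chosen whenever SQS draining is enabled.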

Backward compatibility

Strictly additive. Behavior matrix:

| enableSqsTerminationDraining | enablePrometheusServer | service.create | Service created? | vs current chart |
|---|---|---|---|---|
| any | false | any | No | same |
| true | true | false | Yes, ClusterIP, Deployment selector | same |
| true | true | true | Yes, ClusterIP, Deployment selector | same (opt-in no-op in SQS mode) |
| false | true | false | No | same |
| false | true | true | Yes, headless, DaemonSet selector | new (opt-in only) |

Existing SQS users: zero change. Existing IMDS users who do not set service.create: zero change. Only IMDS users who explicitly opt in see new behavior.

How you tested your changes:
Environment (Linux / Windows): Linux (macOS host, Helm 3.19.0)
Kubernetes Version: Not applied to a cluster for this change (pure chart templating). Expected to work on any Kubernetes version supporting v1 Service (all supported versions).

Ran helm lint (passes) and helm template across the full behavior matrix:

cd config/helm/aws-node-termination-handler

# 1. enablePrometheusServer=false: 0 Service in all modes
helm template nth . --set enablePrometheusServer=false | grep -cE '^kind: Service$'
# -> 0

# 2. IMDS + PromServer + service.create=false (backward compat IMDS)
helm template nth . --set enableSqsTerminationDraining=false \
  --set enablePrometheusServer=true --set service.create=false | grep -cE '^kind: Service$'
# -> 0

# 3. IMDS + PromServer + service.create=true (new, headless)
helm template nth . --set enableSqsTerminationDraining=false \
  --set enablePrometheusServer=true --set service.create=true | grep -cE '^kind: Service$'
# -> 1  (verified: clusterIP: None, component: daemonset)

# 4. SQS + PromServer (backward compat SQS)
helm template nth . --set enableSqsTerminationDraining=true \
  --set enablePrometheusServer=true --set queueURL='https://sqs.example.com/test' \
  | grep -cE '^kind: Service$'
# -> 1  (verified: type: ClusterIP, component: deployment, no clusterIP: None)

# 5. Both SQS=true AND service.create=true (verify coherent behavior)
helm template nth . --set enableSqsTerminationDraining=true --set enablePrometheusServer=true \
  --set service.create=true --set queueURL='https://sqs.example.com/test'
# -> Service matches Deployment workload (SQS mode wins, since only Deployment workload exists)

# 6. service.annotations propagates
helm template nth . --set enableSqsTerminationDraining=false \
  --set enablePrometheusServer=true --set service.create=true \
  --set 'service.annotations.prometheus\.io/scrape=true'
# -> Service metadata.annotations contains prometheus.io/scrape: true

All matrix rows produce expected output. Annotations propagate correctly.
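
For context on what consumes these annotations, here is a consumer-side sketch of a Prometheus scrape job that keeps only targets whose Service carries prometheus.io/scrape: "true". It is not part of this PR. The job name is hypothetical, and the prometheus.io/* annotation names are a community convention that this job assumes rather than something Prometheus itself defines.

```yaml
scrape_configs:
  - job_name: kubernetes-service-endpoints   # hypothetical job name
    kubernetes_sd_configs:
      - role: endpointslice
    relabel_configs:
      # Keep only endpoints whose parent Service opts in via annotation.
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # If prometheus.io/port is set, scrape that port instead of the discovered one.
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        regex: (.+?)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
```

With a headless Service selecting the DaemonSet pods, a job like this would discover one target per NTH pod.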

Checklist


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

…raping

Adds two opt-in values to the chart:

- service.create: when true in IMDS/DaemonSet mode, creates a headless
  Service (clusterIP: None) selecting DaemonSet pods. Default false
  preserves existing behavior. SQS mode continues to auto-create a Service
  as before.
- service.annotations: applied to the Service metadata. Enables attaching
  prometheus.io/* annotations for scrape discovery via endpointslice-based
  mechanisms (vanilla Prometheus, Grafana Alloy, grafana/k8s-monitoring).

Strictly additive and backward-compatible. Existing SQS users see no
change. Existing IMDS users who do not set service.create see no change.
Only IMDS users who explicitly opt in get the new Service.

Verified via helm template across behavior matrix:
- enablePrometheusServer=false: no Service (all modes)
- SQS=false, PromServer=true, service.create=false: no Service (legacy IMDS)
- SQS=false, PromServer=true, service.create=true: headless Service with
  DaemonSet selector
- SQS=true, PromServer=true: ClusterIP Service with Deployment selector
  (legacy SQS, unchanged)

Refs aws#1266
lvg-dexters requested a review from a team as a code owner on April 24, 2026, 13:01

Development

Successfully merging this pull request may close these issues.

[Helm] Allow Service creation and custom annotations for Prometheus scraping (non-SQS mode)
