From 14f43ad3376e883f9067a374e7214ee7f734f6fb Mon Sep 17 00:00:00 2001
From: robfrank
Date: Wed, 17 Jun 2026 18:25:55 +0200
Subject: [PATCH 01/12] docs: design spec for observability support (#4463)

Adds the brainstorming design for exposing ArcadeDB 26.7.1 observability
(health probes, OTLP metrics, structured logging, tracing) in the Helm chart.

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 ...2026-06-17-observability-support-design.md | 214 ++++++++++++++++++
 1 file changed, 214 insertions(+)
 create mode 100644 docs/superpowers/specs/2026-06-17-observability-support-design.md

diff --git a/docs/superpowers/specs/2026-06-17-observability-support-design.md b/docs/superpowers/specs/2026-06-17-observability-support-design.md
new file mode 100644
index 0000000..25e3917
--- /dev/null
+++ b/docs/superpowers/specs/2026-06-17-observability-support-design.md
@@ -0,0 +1,214 @@
+# Design: Observability support for the ArcadeDB Helm chart
+
+**Date:** 2026-06-17
+**Status:** Approved
+**Tracking:** ArcadeData/arcadedb#4463 (sub-issues #4464, #4465, #4466, #4467)
+
+## Summary
+
+ArcadeDB 26.7.1 adds an opt-in, behavior-preserving observability stack across four
+pillars: dependency-free health probes (#4464), metrics depth with OTLP export
+(#4465), structured JSON logging with correlation/trace IDs (#4466), and distributed
+tracing (#4467). All features are present in the `latest` image today and ship in the
+`26.7.1` release.
+
+This chart is a thin wrapper that passes ArcadeDB settings as `-D` JVM args in the
+StatefulSet `command`, wires plugins through `_helpers.tpl`, and exposes ports through
+the services. Supporting observability therefore means: a first-class `observability:`
+values surface, helpers that translate it into `-D` args, a new ServiceMonitor template,
+optional scrape annotations, a revised liveness probe default, and tests/docs.
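+
+As a hedged illustration of that translation, the override below turns on JSON logging
+and OTLP metrics export. The key names and the `-D` args in the comments are taken from
+the Components section of this spec; the file name and the collector address are
+made-up placeholders.
+
+```yaml
+# values-observability.yaml (hypothetical file name)
+observability:
+  logging:
+    format: json      # expected to render -Darcadedb.server.logFormat=json
+  metrics:
+    otlp:
+      enabled: true   # expected to render -Darcadedb.serverMetrics.otlp.enabled=true
+      endpoint: http://otel-collector.monitoring:4317   # placeholder collector address
+```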
+
+Everything is default-off and behavior-preserving except the liveness probe default,
+which changes together with the `appVersion` bump to `26.7.1` (see Health pillar).
+
+## Goals
+
+- Expose all four observability pillars as documented, validated, first-class chart knobs.
+- Provide Kubernetes-native Prometheus discovery the raw `-D` args cannot: a
+  ServiceMonitor CRD and scrape annotations.
+- Improve the liveness probe to a dependency-free endpoint that cannot trigger
+  restart loops on slow startup.
+- Keep existing deployments unchanged when the new knobs are left at their defaults.
+
+## Non-goals
+
+- Standing up an OTel collector or Prometheus Operator inside the chart — users point
+  the chart at their own.
+- Heavy end-to-end CI for collectors/operators in kind. Rendering is covered by unit
+  tests; a single `/api/v1/health` smoke assertion is added to the existing kind job.
+- A generic settings passthrough. The existing `arcadedb.extraCommands` remains the
+  escape hatch for any setting not promoted to a first-class knob.
+
+## Design choice
+
+The values surface is a **dedicated, feature-grouped `observability:` section** rather
+than a 1:1 mirror of ArcadeDB's internal key hierarchy or a raw passthrough map. This
+matches how the chart already abstracts plugins and persistence into purpose-built
+sections, gives a cleanly testable surface, and keeps documentation focused. Templates
+translate the chart-level knobs into the underlying ArcadeDB `-D` settings.
+
+**Boundary:** the Prometheus *plugin* (what loads `/prometheus` inside ArcadeDB) stays
+under the existing `arcadedb.plugins.prometheus`, consistent with every other plugin.
+The `observability.metrics.prometheus.*` block handles only the Kubernetes-side
+integration (ServiceMonitor + annotations). A template guard fails with a clear message
+if a ServiceMonitor or scrape annotations are enabled while the prometheus plugin is off.
+
+## Components
+
+### 1. Values surface (`values.yaml`)
+
+New top-level block; all knobs default-off / behavior-preserving:
+
+```yaml
+## @section observability
+observability:
+  metrics:
+    prometheus:
+      serviceMonitor:
+        enabled: false
+        interval: 30s
+        scrapeTimeout: ""
+        path: /prometheus
+        labels: {}  # e.g. release: kube-prometheus-stack
+        annotations: {}
+        relabelings: []
+        metricRelabelings: []
+        basicAuth:
+          enabled: false  # scrape with root creds from the chart-managed secret
+      podAnnotations:
+        enabled: false  # classic annotation-based discovery (no Operator)
+        path: /prometheus
+        # port defaults to service.http.port
+    otlp:
+      enabled: false
+      endpoint: http://localhost:4317
+  tracing:
+    enabled: false
+    endpoint: http://localhost:4317
+    samplingRate: 0.0  # parent-based ratio [0.0, 1.0]
+  logging:
+    format: text  # text | json
+    includeTrace: false  # append [traceId=…] to text logs
+  health:
+    readinessRequiresHA: false  # /api/v1/ready waits for Raft join on HA clusters
+```
+
+The existing `arcadedb.plugins.prometheus` example is extended with a
+`requireAuthentication` knob.
+
+### 2. Template translation (`_helpers.tpl`, `statefulset.yaml`)
+
+A new helper `arcadedb.observability.args` emits the `-D` args, included in the
+StatefulSet `command` after the existing plugin parameters:
+
+| Knob | Emitted `-D` arg(s) |
+|------|---------------------|
+| `logging.format: json` | `-Darcadedb.server.logFormat=json` |
+| `logging.includeTrace: true` | `-Darcadedb.server.logIncludeTrace=true` |
+| `metrics.otlp.enabled: true` | `-Darcadedb.serverMetrics.otlp.enabled=true` + `...otlp.endpoint=` |
+| `tracing.enabled: true` | `...tracing.enabled=true` + `...tracing.endpoint=` + `...tracing.samplingRate=` |
+| `health.readinessRequiresHA: true` | `-Darcadedb.server.readinessRequiresHA=true` |
+
+Plus:
+- `_arcadedb.plugin.ports` / `arcadedb.plugin.parameters` extended so
+  `arcadedb.plugins.prometheus.requireAuthentication: false` emits
+  `-Darcadedb.serverMetrics.prometheus.requireAuthentication=false`.
+- StatefulSet pod-template `metadata.annotations` merges `.Values.podAnnotations` with
+  computed `prometheus.io/scrape|port|path` annotations when
+  `observability.metrics.prometheus.podAnnotations.enabled` is true (port defaults to
+  `service.http.port`).
+
+### 3. ServiceMonitor (`templates/servicemonitor.yaml`, new)
+
+Gated on `observability.metrics.prometheus.serviceMonitor.enabled`. Selects the existing
+`{fullname}-http` Service via the chart's standard selector labels and scrapes its named
+`http` port at the configured path — no new Service or port. Optional `basicAuth`
+references the chart's root credential secret (username `root`) so an authenticated
+`/prometheus` can still be scraped; default off, with the documented happy path being
+`arcadedb.plugins.prometheus.requireAuthentication: false` + unauthenticated scrape.
+Supports `labels`, `annotations`, `interval`, `scrapeTimeout`, `relabelings`,
+`metricRelabelings`. Fully gated, so clusters without the Operator CRD are unaffected
+when disabled.
+
+### 4. Guard helper
+
+A small helper invoked from both the ServiceMonitor template and the StatefulSet that
+calls `fail` with a clear message if `serviceMonitor.enabled` or
+`podAnnotations.enabled` is true while `arcadedb.plugins.prometheus.enabled` is not.
+
+### 5. Health probes (`values.yaml`)
+
+```yaml
+livenessProbe:
+  httpGet:
+    path: /api/v1/health  # was /api/v1/ready — liveness must not depend on DB state
+    port: http
+readinessProbe:
+  httpGet:
+    path: /api/v1/ready  # unchanged — gates traffic until ONLINE
+    port: http
+```
+
+`/api/v1/health` does no DB I/O and never returns 503, eliminating the restart-loop risk
+of liveness depending on `/api/v1/ready`. This endpoint exists only in `26.7.1`+, so the
+default change ships together with the `appVersion` bump. It is the single
+default-changing item in this design; every other knob is default-off and harmless on
+older images.
+
+### 6. Version (`Chart.yaml`)
+
+`version` and `appVersion` → `26.7.1`. The default `image.tag` then resolves to an image
+that serves `/api/v1/health`, keeping chart and image consistent. Prepared against
+`latest` (== 26.7.1 content), released when 26.7.1 publishes.
+
+## Testing
+
+helm-unittest, one file per concern (existing chart convention):
+
+- `tests/observability_test.yaml` (new): each knob renders the correct `-D` arg and
+  nothing when default-off — logging json/text, otlp on/off, tracing args incl.
+  `samplingRate`, `readinessRequiresHA`, the prometheus `requireAuthentication` arg, and
+  the pod scrape annotations.
+- `tests/servicemonitor_test.yaml` (new): CRD absent when disabled; correct
+  selector/port/path/interval, `basicAuth` block, labels; and the `fail` guard when
+  enabled without the prometheus plugin.
+- `tests/statefulset_test.yaml` (update): liveness default `/api/v1/health`, readiness
+  unchanged.
+- `tests/helpers_test.yaml` (update): plugin `requireAuthentication` rendering.
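+
+A minimal sketch of what the `tests/statefulset_test.yaml` update could look like,
+assuming the probes are rendered straight from values onto the first container. The
+suite name, test title, and container index are illustrative assumptions, not the
+chart's actual test:
+
+```yaml
+suite: StatefulSet probes (illustrative)
+templates:
+  - statefulset.yaml
+tests:
+  - it: defaults liveness to /api/v1/health and keeps readiness on /api/v1/ready
+    asserts:
+      - equal:
+          path: spec.template.spec.containers[0].livenessProbe.httpGet.path
+          value: /api/v1/health
+      - equal:
+          path: spec.template.spec.containers[0].readinessProbe.httpGet.path
+          value: /api/v1/ready
+```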
+
+CI: the existing kind integration job already runs `latest`; add one assertion phase
+that curls `/api/v1/health` and expects HTTP 204, locking in the new liveness contract.
+Collector/Operator end-to-end scenarios are out of scope.
+
+## Documentation
+
+- `@section`/`@param` annotations across the new values so the readme-generator emits
+  them; regenerate `charts/arcadedb/README.md`.
+- A short "Observability" subsection in the top-level `README.md` with enable recipes:
+  Prometheus + ServiceMonitor, OTLP metrics, distributed tracing, and JSON logging.
+
+## Files touched
+
+- `charts/arcadedb/values.yaml` — new `observability:` section; `requireAuthentication`
+  on the prometheus plugin example; revised liveness probe default.
+- `charts/arcadedb/templates/_helpers.tpl` — `arcadedb.observability.args` helper,
+  plugin `requireAuthentication`, guard helper.
+- `charts/arcadedb/templates/statefulset.yaml` — include observability args; merge scrape
+  pod annotations.
+- `charts/arcadedb/templates/servicemonitor.yaml` — new, gated.
+- `charts/arcadedb/Chart.yaml` — `version`/`appVersion` → `26.7.1`.
+- `charts/arcadedb/tests/observability_test.yaml`, `tests/servicemonitor_test.yaml` — new.
+- `charts/arcadedb/tests/statefulset_test.yaml`, `tests/helpers_test.yaml` — updated.
+- `charts/arcadedb/README.md`, `README.md` — regenerated/updated docs.
+- kind integration job — one `/api/v1/health` smoke assertion.
+
+## Risks & mitigations
+
+- **Version skew (liveness 404 on < 26.7.1):** mitigated by bumping `appVersion` to
+  `26.7.1` so the default image serves the endpoint; release the chart only when the
+  image is published.
+- **ServiceMonitor without the Operator CRD:** template fully gated; no effect when
+  disabled.
+- **Scrape config without the prometheus plugin:** guard helper fails fast with guidance.
+- **Authenticated `/prometheus` blocking scrapes:** documented happy path disables
+  prometheus auth; optional `basicAuth` covers the authenticated case.
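+
+For the version-skew risk, users who deliberately pin an older image could restore the
+previous liveness path themselves. A sketch, assuming `image.tag` and `livenessProbe`
+remain overridable as described in this spec; the tag value is a placeholder, not a
+real release:
+
+```yaml
+image:
+  tag: "OLDER_TAG"       # placeholder: whichever pre-26.7.1 tag is pinned
+livenessProbe:
+  httpGet:
+    path: /api/v1/ready  # previous default; /api/v1/health exists only in 26.7.1+
+    port: http
+```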
From 5a22573bf14c05c2db98b685d6e8746e368cae1e Mon Sep 17 00:00:00 2001
From: robfrank
Date: Wed, 17 Jun 2026 18:45:16 +0200
Subject: [PATCH 02/12] docs: implementation plan for observability support (#4463)

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 .../plans/2026-06-17-observability-support.md | 980 ++++++++++++++++++
 1 file changed, 980 insertions(+)
 create mode 100644 docs/superpowers/plans/2026-06-17-observability-support.md

diff --git a/docs/superpowers/plans/2026-06-17-observability-support.md b/docs/superpowers/plans/2026-06-17-observability-support.md
new file mode 100644
index 0000000..34e667b
--- /dev/null
+++ b/docs/superpowers/plans/2026-06-17-observability-support.md
@@ -0,0 +1,980 @@
+# Observability Support Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Expose ArcadeDB 26.7.1's opt-in observability stack (health probes, OTLP metrics, structured logging, distributed tracing, Prometheus discovery) as first-class Helm chart knobs.
+
+**Architecture:** The chart passes ArcadeDB settings as `-D` JVM args in the StatefulSet `command` and wires Kubernetes resources from `values.yaml`. A new feature-grouped `observability:` values section is translated by helpers into `-D` args, a gated ServiceMonitor template, optional scrape pod annotations, and a revised liveness probe default. Every knob is default-off and behavior-preserving except the liveness probe default, which ships with the `appVersion` bump to 26.7.1.
+
+**Tech Stack:** Helm 3 (Go templating / Sprig), helm-unittest 0.5.2, bash + kind for integration tests.
+
+**Design spec:** `docs/superpowers/specs/2026-06-17-observability-support-design.md`
+
+---
+
+## Conventions for every task
+
+- Run a single unit-test suite with: `helm unittest -f 'tests/.yaml' charts/arcadedb`
+  (the helm-unittest plugin auto-installs via `make plugin-install`; run that once first if `helm unittest` reports an unknown command).
+- Run the whole suite with: `make test-unit`
+- Lint with: `make lint`
+- helm-unittest `contains`/`notContains` on a `command` list match exact strings; rendered YAML formatting/whitespace does not affect assertions.
+- Commit after each green task.
+
+---
+
+## Task 1: Observability `-D` args helper (logging, OTLP, tracing, readiness)
+
+Adds the four settings-only pillars: JSON logging, OTLP metrics export, tracing, and HA-aware readiness. One helper emits all of them; nothing renders when defaults are unchanged.
+
+**Files:**
+- Modify: `charts/arcadedb/values.yaml` (add `observability:` section)
+- Modify: `charts/arcadedb/templates/_helpers.tpl` (add `arcadedb.observability.args`)
+- Modify: `charts/arcadedb/templates/statefulset.yaml:70` (include the helper)
+- Test: `charts/arcadedb/tests/observability_test.yaml` (create)
+
+- [ ] **Step 1: Write the failing test**
+
+Create `charts/arcadedb/tests/observability_test.yaml`:
+
+```yaml
+suite: Observability args (asserted via StatefulSet)
+templates:
+  - statefulset.yaml
+release:
+  name: test
+  namespace: default
+tests:
+  - it: emits no observability args by default
+    asserts:
+      - notContains:
+          path: spec.template.spec.containers[0].command
+          content: "-Darcadedb.server.logFormat=json"
+      - notContains:
+          path: spec.template.spec.containers[0].command
+          content: "-Darcadedb.serverMetrics.otlp.enabled=true"
+      - notContains:
+          path: spec.template.spec.containers[0].command
+          content: "-Darcadedb.serverMetrics.tracing.enabled=true"
+      - notContains:
+          path: spec.template.spec.containers[0].command
+          content: "-Darcadedb.server.readinessRequiresHA=true"
+
+  - it: logging.format=json emits the logFormat arg
+    set:
+      observability.logging.format: json
+    asserts:
+      - contains:
+          path: spec.template.spec.containers[0].command
+          content: "-Darcadedb.server.logFormat=json"
+
+  - it: text logging (default) emits no logFormat arg
+    asserts:
+      - notContains:
+          path: spec.template.spec.containers[0].command
+          content: "-Darcadedb.server.logFormat=json"
+
+  - it: logging.includeTrace=true emits the logIncludeTrace arg
+    set:
+      observability.logging.includeTrace: true
+    asserts:
+      - contains:
+          path: spec.template.spec.containers[0].command
+          content: "-Darcadedb.server.logIncludeTrace=true"
+
+  - it: otlp metrics enabled emits enable + endpoint args
+    set:
+      observability.metrics.otlp.enabled: true
+      observability.metrics.otlp.endpoint: http://otel-collector:4317
+    asserts:
+      - contains:
+          path: spec.template.spec.containers[0].command
+          content: "-Darcadedb.serverMetrics.otlp.enabled=true"
+      - contains:
+          path: spec.template.spec.containers[0].command
+          content: "-Darcadedb.serverMetrics.otlp.endpoint=http://otel-collector:4317"
+
+  - it: tracing enabled emits enable + endpoint + samplingRate args
+    set:
+      observability.tracing.enabled: true
+      observability.tracing.endpoint: http://otel-collector:4317
+      observability.tracing.samplingRate: 0.1
+    asserts:
+      - contains:
+          path: spec.template.spec.containers[0].command
+          content: "-Darcadedb.serverMetrics.tracing.enabled=true"
+      - contains:
+          path: spec.template.spec.containers[0].command
+          content: "-Darcadedb.serverMetrics.tracing.endpoint=http://otel-collector:4317"
+      - contains:
+          path: spec.template.spec.containers[0].command
+          content: "-Darcadedb.serverMetrics.tracing.samplingRate=0.1"
+
+  - it: readinessRequiresHA emits the arg
+    set:
+      observability.health.readinessRequiresHA: true
+    asserts:
+      - contains:
+          path: spec.template.spec.containers[0].command
+          content: "-Darcadedb.server.readinessRequiresHA=true"
+```
+
+- [ ] **Step 2: Run test to verify it fails**
+
+Run: `helm unittest -f 'tests/observability_test.yaml' charts/arcadedb`
+Expected: FAIL — the `set` cases fail their `contains` assertions because no template emits the args yet (the extra values are simply ignored); the default-off cases ("emits no observability args by default" and the text-logging case) pass trivially.
+
+- [ ] **Step 3: Add the `observability:` section to values.yaml**
+
+Append to `charts/arcadedb/values.yaml` (after the `networkPolicy` section at the end):
+
+```yaml
+## @section observability
+## Opt-in, behavior-preserving observability (ArcadeDB 26.7.1+).
+## Every knob below defaults off; existing deployments are unchanged.
+observability:
+  ## @section observability.metrics
+  metrics:
+    prometheus:
+      ## Prometheus Operator ServiceMonitor. Requires the prometheus plugin
+      ## (arcadedb.plugins.prometheus.enabled=true) so /prometheus is served.
+      serviceMonitor:
+        ## @param observability.metrics.prometheus.serviceMonitor.enabled Create a ServiceMonitor CRD
+        enabled: false
+        ## @param observability.metrics.prometheus.serviceMonitor.interval Scrape interval
+        interval: 30s
+        ## @param observability.metrics.prometheus.serviceMonitor.scrapeTimeout Scrape timeout (empty = Prometheus default)
+        scrapeTimeout: ""
+        ## @param observability.metrics.prometheus.serviceMonitor.path Metrics path
+        path: /prometheus
+        ## @param observability.metrics.prometheus.serviceMonitor.labels Extra labels (e.g. release: kube-prometheus-stack)
+        labels: {}
+        ## @param observability.metrics.prometheus.serviceMonitor.annotations Extra annotations
+        annotations: {}
+        ## @param observability.metrics.prometheus.serviceMonitor.relabelings Prometheus relabelings
+        relabelings: []
+        ## @param observability.metrics.prometheus.serviceMonitor.metricRelabelings Prometheus metric relabelings
+        metricRelabelings: []
+        basicAuth:
+          ## @param observability.metrics.prometheus.serviceMonitor.basicAuth.enabled Scrape with basic auth
+          enabled: false
+          ## @param observability.metrics.prometheus.serviceMonitor.basicAuth.secretName Secret holding scrape credentials (username + password keys)
+          secretName: ""
+          ## @param observability.metrics.prometheus.serviceMonitor.basicAuth.usernameKey Key in the secret holding the username
+          usernameKey: username
+          ## @param observability.metrics.prometheus.serviceMonitor.basicAuth.passwordKey Key in the secret holding the password
+          passwordKey: password
+      ## Annotation-based discovery (classic Prometheus, no Operator).
+      podAnnotations:
+        ## @param observability.metrics.prometheus.podAnnotations.enabled Add prometheus.io/* scrape annotations to pods
+        enabled: false
+        ## @param observability.metrics.prometheus.podAnnotations.path Scrape path annotation value
+        path: /prometheus
+        ## @param observability.metrics.prometheus.podAnnotations.port Scrape port (empty = service.http.port)
+        port: ""
+    ## Push metrics to an OpenTelemetry collector alongside /prometheus.
+    otlp:
+      ## @param observability.metrics.otlp.enabled Enable the OTLP metrics registry
+      enabled: false
+      ## @param observability.metrics.otlp.endpoint OTLP/gRPC metrics endpoint
+      endpoint: http://localhost:4317
+  ## @section observability.tracing
+  tracing:
+    ## @param observability.tracing.enabled Enable distributed tracing (plugin ships in the standard image)
+    enabled: false
+    ## @param observability.tracing.endpoint OTLP/gRPC trace endpoint
+    endpoint: http://localhost:4317
+    ## @param observability.tracing.samplingRate Parent-based sampling ratio [0.0, 1.0]
+    samplingRate: 0.0
+  ## @section observability.logging
+  logging:
+    ## @param observability.logging.format Log format: text or json
+    format: text
+    ## @param observability.logging.includeTrace Append [traceId=…] to text logs while a trace is active
+    includeTrace: false
+  ## @section observability.health
+  health:
+    ## @param observability.health.readinessRequiresHA /api/v1/ready waits for Raft join on HA clusters
+    readinessRequiresHA: false
+```
+
+- [ ] **Step 4: Add the args helper to _helpers.tpl**
+
+Append to `charts/arcadedb/templates/_helpers.tpl`:
+
+```
+{{/*
+Observability -D JVM args (logging, OTLP metrics, tracing, readiness).
+All opt-in; emits nothing when defaults are unchanged.
+*/}}
+{{- define "arcadedb.observability.args" -}}
+{{- $o := .Values.observability -}}
+{{- if eq $o.logging.format "json" }}
+- -Darcadedb.server.logFormat=json
+{{- end }}
+{{- if $o.logging.includeTrace }}
+- -Darcadedb.server.logIncludeTrace=true
+{{- end }}
+{{- if $o.metrics.otlp.enabled }}
+- -Darcadedb.serverMetrics.otlp.enabled=true
+- -Darcadedb.serverMetrics.otlp.endpoint={{ $o.metrics.otlp.endpoint }}
+{{- end }}
+{{- if $o.tracing.enabled }}
+- -Darcadedb.serverMetrics.tracing.enabled=true
+- -Darcadedb.serverMetrics.tracing.endpoint={{ $o.tracing.endpoint }}
+- -Darcadedb.serverMetrics.tracing.samplingRate={{ $o.tracing.samplingRate }}
+{{- end }}
+{{- if $o.health.readinessRequiresHA }}
+- -Darcadedb.server.readinessRequiresHA=true
+{{- end }}
+{{- end -}}
+```
+
+- [ ] **Step 5: Include the helper in the StatefulSet command**
+
+In `charts/arcadedb/templates/statefulset.yaml`, find line 70:
+
+```
+            {{- include "arcadedb.plugin.parameters" . | nindent 12 }}
+```
+
+Add the observability include immediately after it:
+
+```
+            {{- include "arcadedb.plugin.parameters" . | nindent 12 }}
+            {{- include "arcadedb.observability.args" . | nindent 12 }}
+```
+
+- [ ] **Step 6: Run test to verify it passes**
+
+Run: `helm unittest -f 'tests/observability_test.yaml' charts/arcadedb`
+Expected: PASS (all cases green)
+
+- [ ] **Step 7: Lint and commit**
+
+```bash
+helm lint charts/arcadedb
+git add charts/arcadedb/values.yaml charts/arcadedb/templates/_helpers.tpl charts/arcadedb/templates/statefulset.yaml charts/arcadedb/tests/observability_test.yaml
+git commit -m "feat(helm): observability -D args (logging, OTLP, tracing, readiness)"
+```
+
+---
+
+## Task 2: Prometheus plugin `requireAuthentication`
+
+The Prometheus plugin enables `/prometheus`; scraping it from Prometheus needs unauthenticated access. Add a `requireAuthentication` knob that emits the corresponding `-D` arg.
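+
+A sketch of the intended usage, assuming the knob lands under the existing plugin block.
+The `-D` arg named in the comment is the one this task's tests assert; everything else
+mirrors keys already used in this plan:
+
+```yaml
+arcadedb:
+  plugins:
+    prometheus:
+      enabled: true
+      # expected to render -Darcadedb.serverMetrics.prometheus.requireAuthentication=false
+      requireAuthentication: false
+```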
+
+**Files:**
+- Modify: `charts/arcadedb/templates/_helpers.tpl` (the `arcadedb.plugin.parameters` define)
+- Modify: `charts/arcadedb/values.yaml` (prometheus plugin example comment)
+- Test: `charts/arcadedb/tests/helpers_test.yaml` (add cases)
+
+- [ ] **Step 1: Write the failing test**
+
+Append these tests to the `tests:` list in `charts/arcadedb/tests/helpers_test.yaml`:
+
+```yaml
+  - it: prometheus plugin emits requireAuthentication=false when set
+    set:
+      arcadedb.plugins.prometheus.enabled: true
+      arcadedb.plugins.prometheus.requireAuthentication: false
+    asserts:
+      - contains:
+          path: spec.template.spec.containers[0].command
+          content: "-Darcadedb.serverMetrics.prometheus.requireAuthentication=false"
+
+  - it: prometheus plugin omits requireAuthentication arg when not set
+    set:
+      arcadedb.plugins.prometheus.enabled: true
+    asserts:
+      - notContains:
+          path: spec.template.spec.containers[0].command
+          content: "-Darcadedb.serverMetrics.prometheus.requireAuthentication=false"
+      - notContains:
+          path: spec.template.spec.containers[0].command
+          content: "-Darcadedb.serverMetrics.prometheus.requireAuthentication=true"
+```
+
+- [ ] **Step 2: Run test to verify it fails**
+
+Run: `helm unittest -f 'tests/helpers_test.yaml' charts/arcadedb`
+Expected: FAIL — the requireAuthentication arg is not emitted yet.
+
+- [ ] **Step 3: Emit the arg in the prometheus branch of `arcadedb.plugin.parameters`**
+
+In `charts/arcadedb/templates/_helpers.tpl`, find the prometheus branch inside `arcadedb.plugin.parameters` (currently):
+
+```
+  {{- else if eq $plugin "prometheus" -}}
+    {{- $plugins = append $plugins "Prometheus:com.arcadedb.metrics.prometheus.PrometheusMetricsPlugin" -}}
+```
+
+Replace it with a version that also reads `requireAuthentication` from the original values (the helper iterates over the port-map produced by `_arcadedb.plugin.ports`, which does not carry `requireAuthentication`, so read it from `.Values.arcadedb.plugins.prometheus`):
+
+```
+  {{- else if eq $plugin "prometheus" -}}
+    {{- $plugins = append $plugins "Prometheus:com.arcadedb.metrics.prometheus.PrometheusMetricsPlugin" -}}
+    {{- with $.Values.arcadedb.plugins.prometheus -}}
+      {{- if hasKey . "requireAuthentication" -}}
+        {{- $params = append $params (printf "-Darcadedb.serverMetrics.prometheus.requireAuthentication=%v" .requireAuthentication) -}}
+      {{- end -}}
+    {{- end -}}
+```
+
+Note: `$` is the root context. Inside the `range` the dot is the map entry, so the root values are reached via `$.Values`. `hasKey` ensures the arg is emitted only when the user explicitly sets the field (true or false), preserving the "omit when unset" behavior the test asserts.
+
+- [ ] **Step 4: Run test to verify it passes**
+
+Run: `helm unittest -f 'tests/helpers_test.yaml' charts/arcadedb`
+Expected: PASS
+
+- [ ] **Step 5: Document the knob in the values example**
+
+In `charts/arcadedb/values.yaml`, update the commented prometheus plugin example (currently lines ~57-58):
+
+```yaml
+    # prometheus:
+    #   enabled: false
+```
+
+to:
+
+```yaml
+    # prometheus:
+    #   enabled: false
+    #   # Set false to allow unauthenticated scraping of /prometheus (needed for
+    #   # ServiceMonitor / annotation-based discovery without basic auth).
+    #   requireAuthentication: true
+```
+
+- [ ] **Step 6: Lint and commit**
+
+```bash
+helm lint charts/arcadedb
+git add charts/arcadedb/templates/_helpers.tpl charts/arcadedb/values.yaml charts/arcadedb/tests/helpers_test.yaml
+git commit -m "feat(helm): prometheus plugin requireAuthentication knob"
+```
+
+---
+
+## Task 3: Scrape pod annotations + plugin guard
+
+Add a guard that fails fast when scrape discovery is enabled without the prometheus plugin, and merge `prometheus.io/*` annotations into the pod template when enabled.
+
+**Files:**
+- Modify: `charts/arcadedb/templates/_helpers.tpl` (add `arcadedb.observability.validate` and `arcadedb.podAnnotations`)
+- Modify: `charts/arcadedb/templates/statefulset.yaml` (call guard; use merged annotations)
+- Test: `charts/arcadedb/tests/observability_test.yaml` (add cases)
+
+- [ ] **Step 1: Write the failing test**
+
+Append to the `tests:` list in `charts/arcadedb/tests/observability_test.yaml`:
+
+```yaml
+  - it: scrape pod annotations render when enabled with the prometheus plugin
+    set:
+      arcadedb.plugins.prometheus.enabled: true
+      observability.metrics.prometheus.podAnnotations.enabled: true
+    asserts:
+      - equal:
+          path: spec.template.metadata.annotations["prometheus.io/scrape"]
+          value: "true"
+      - equal:
+          path: spec.template.metadata.annotations["prometheus.io/port"]
+          value: "2480"
+      - equal:
+          path: spec.template.metadata.annotations["prometheus.io/path"]
+          value: /prometheus
+
+  - it: scrape pod annotation port honours service.http.port and custom path
+    set:
+      arcadedb.plugins.prometheus.enabled: true
+      service.http.port: 9090
+      observability.metrics.prometheus.podAnnotations.enabled: true
+      observability.metrics.prometheus.podAnnotations.path: /metrics
+    asserts:
+      - equal:
+          path: spec.template.metadata.annotations["prometheus.io/port"]
+          value: "9090"
+      - equal:
+          path: spec.template.metadata.annotations["prometheus.io/path"]
+          value: /metrics
+
+  - it: scrape pod annotation port can be overridden explicitly
+    set:
+      arcadedb.plugins.prometheus.enabled: true
+      observability.metrics.prometheus.podAnnotations.enabled: true
+      observability.metrics.prometheus.podAnnotations.port: 1234
+    asserts:
+      - equal:
+          path: spec.template.metadata.annotations["prometheus.io/port"]
+          value: "1234"
+
+  - it: user podAnnotations are preserved alongside scrape annotations
+    set:
+      arcadedb.plugins.prometheus.enabled: true
+      podAnnotations:
+        my-team: data
+      observability.metrics.prometheus.podAnnotations.enabled: true
+    asserts:
+      - equal:
+          path: spec.template.metadata.annotations["my-team"]
+          value: data
+      - equal:
+          path: spec.template.metadata.annotations["prometheus.io/scrape"]
+          value: "true"
+
+  - it: no scrape annotations when podAnnotations discovery is disabled
+    asserts:
+      - notExists:
+          path: spec.template.metadata.annotations["prometheus.io/scrape"]
+
+  - it: fails when scrape annotations enabled without the prometheus plugin
+    set:
+      observability.metrics.prometheus.podAnnotations.enabled: true
+    asserts:
+      - failedTemplate:
+          errorPattern: "require arcadedb.plugins.prometheus.enabled=true"
+```
+
+- [ ] **Step 2: Run test to verify it fails**
+
+Run: `helm unittest -f 'tests/observability_test.yaml' charts/arcadedb`
+Expected: FAIL — annotations are not merged and no guard exists.
+
+- [ ] **Step 3: Add guard and annotation helpers to _helpers.tpl**
+
+Append to `charts/arcadedb/templates/_helpers.tpl`:
+
+```
+{{/*
+Guard: scrape discovery (ServiceMonitor or pod annotations) needs the
+prometheus plugin so /prometheus is actually served.
+*/}}
+{{- define "arcadedb.observability.validate" -}}
+{{- $p := .Values.observability.metrics.prometheus -}}
+{{- if or $p.serviceMonitor.enabled $p.podAnnotations.enabled -}}
+  {{- $promEnabled := false -}}
+  {{- with .Values.arcadedb.plugins.prometheus -}}
+    {{- if .enabled -}}{{- $promEnabled = true -}}{{- end -}}
+  {{- end -}}
+  {{- if not $promEnabled -}}
+    {{- fail "observability.metrics.prometheus serviceMonitor/podAnnotations require arcadedb.plugins.prometheus.enabled=true (the /prometheus endpoint must be served)" -}}
+  {{- end -}}
+{{- end -}}
+{{- end -}}
+
+{{/*
+Merge user-supplied podAnnotations with computed prometheus.io/* scrape
+annotations. Returns YAML (possibly empty).
+*/}}
+{{- define "arcadedb.podAnnotations" -}}
+{{- $annotations := deepCopy (default dict .Values.podAnnotations) -}}
+{{- $pa := .Values.observability.metrics.prometheus.podAnnotations -}}
+{{- if $pa.enabled -}}
+  {{- $port := int .Values.service.http.port -}}
+  {{- if $pa.port -}}{{- $port = int $pa.port -}}{{- end -}}
+  {{- $_ := set $annotations "prometheus.io/scrape" "true" -}}
+  {{- $_ := set $annotations "prometheus.io/port" (printf "%d" $port) -}}
+  {{- $_ := set $annotations "prometheus.io/path" $pa.path -}}
+{{- end -}}
+{{- if $annotations -}}
+{{- toYaml $annotations -}}
+{{- end -}}
+{{- end -}}
+```
+
+- [ ] **Step 4: Wire guard and merged annotations into statefulset.yaml**
+
+In `charts/arcadedb/templates/statefulset.yaml`, add the guard as the very first line of the file (before `apiVersion: apps/v1`):
+
+```
+{{- include "arcadedb.observability.validate" . -}}
+apiVersion: apps/v1
+```
+
+Then replace the pod-template annotations block (currently lines 18-21):
+
+```
+      {{- with .Values.podAnnotations }}
+      annotations:
+        {{- toYaml . | nindent 8 }}
+      {{- end }}
+```
+
+with:
+
+```
+      {{- with (include "arcadedb.podAnnotations" . | fromYaml) }}
+      annotations:
+        {{- toYaml . | nindent 8 }}
+      {{- end }}
+```
+
+(`fromYaml` of the empty string yields an empty map, so `with` correctly skips the block when there are no annotations.)
+
+- [ ] **Step 5: Run test to verify it passes**
+
+Run: `helm unittest -f 'tests/observability_test.yaml' charts/arcadedb`
+Expected: PASS
+
+Also re-run the existing StatefulSet suite to confirm the annotations refactor didn't regress the existing `podAnnotations flow through` test:
+
+Run: `helm unittest -f 'tests/statefulset_test.yaml' charts/arcadedb`
+Expected: PASS
+
+- [ ] **Step 6: Lint and commit**
+
+```bash
+helm lint charts/arcadedb
+git add charts/arcadedb/templates/_helpers.tpl charts/arcadedb/templates/statefulset.yaml charts/arcadedb/tests/observability_test.yaml
+git commit -m "feat(helm): prometheus scrape pod annotations + plugin guard"
+```
+
+---
+
+## Task 4: ServiceMonitor template
+
+A gated `ServiceMonitor` for Prometheus Operator setups, selecting the existing `{fullname}-http` Service.
+
+**Files:**
+- Create: `charts/arcadedb/templates/servicemonitor.yaml`
+- Modify: `charts/arcadedb/templates/_helpers.tpl` (extend guard for basicAuth secret)
+- Test: `charts/arcadedb/tests/servicemonitor_test.yaml` (create)
+
+- [ ] **Step 1: Write the failing test**
+
+Create `charts/arcadedb/tests/servicemonitor_test.yaml`:
+
+```yaml
+suite: ServiceMonitor
+templates:
+  - servicemonitor.yaml
+release:
+  name: test
+  namespace: default
+tests:
+  - it: is not rendered by default
+    asserts:
+      - hasDocuments: { count: 0 }
+
+  - it: renders when enabled with the prometheus plugin
+    set:
+      arcadedb.plugins.prometheus.enabled: true
+      observability.metrics.prometheus.serviceMonitor.enabled: true
+    asserts:
+      - hasDocuments: { count: 1 }
+      - isKind: { of: ServiceMonitor }
+      - equal: { path: metadata.name, value: test-arcadedb }
+      - equal:
+          path: spec.selector.matchLabels["app.kubernetes.io/name"]
+          value: arcadedb
+      - equal: { path: spec.endpoints[0].port, value: http }
+      - equal: { path: spec.endpoints[0].path, value: /prometheus }
+      - equal: { path: spec.endpoints[0].interval, value: 30s }
+
+  - it: honours interval, scrapeTimeout, path, and extra labels
+    set:
+      arcadedb.plugins.prometheus.enabled: true
+      observability.metrics.prometheus.serviceMonitor.enabled: true
+      observability.metrics.prometheus.serviceMonitor.interval: 15s
+      observability.metrics.prometheus.serviceMonitor.scrapeTimeout: 10s
+      observability.metrics.prometheus.serviceMonitor.path: /metrics
+      observability.metrics.prometheus.serviceMonitor.labels:
+        release: kube-prometheus-stack
+    asserts:
+      - equal: { path: spec.endpoints[0].interval, value: 15s }
+      - equal: { path: spec.endpoints[0].scrapeTimeout, value: 10s }
+      - equal: { path: spec.endpoints[0].path, value: /metrics }
+      - equal:
+          path: metadata.labels.release
+          value: kube-prometheus-stack
+
+  - it: renders basicAuth referencing the supplied secret
+    set:
+      arcadedb.plugins.prometheus.enabled: true
+      observability.metrics.prometheus.serviceMonitor.enabled: true
+      observability.metrics.prometheus.serviceMonitor.basicAuth.enabled: true
+      observability.metrics.prometheus.serviceMonitor.basicAuth.secretName: scrape-creds
+    asserts:
+      - equal:
+          path: spec.endpoints[0].basicAuth.username.name
+          value: scrape-creds
+      - equal:
+          path: spec.endpoints[0].basicAuth.username.key
+          value: username
+      - equal:
+          path: spec.endpoints[0].basicAuth.password.name
+          value: scrape-creds
+      - equal:
+          path: spec.endpoints[0].basicAuth.password.key
+          value: password
+
+  - it: fails when ServiceMonitor enabled without the prometheus plugin
+    set:
+      observability.metrics.prometheus.serviceMonitor.enabled: true
+    asserts:
+      - failedTemplate:
+          errorPattern: "require arcadedb.plugins.prometheus.enabled=true"
+
+  - it: fails when basicAuth enabled without a secretName
+    set:
+      arcadedb.plugins.prometheus.enabled: true
+      observability.metrics.prometheus.serviceMonitor.enabled: true
+      observability.metrics.prometheus.serviceMonitor.basicAuth.enabled:
true + asserts: + - failedTemplate: + errorPattern: "requires basicAuth.secretName" +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `helm unittest -f 'tests/servicemonitor_test.yaml' charts/arcadedb` +Expected: FAIL — `servicemonitor.yaml` does not exist (0 documents always). + +- [ ] **Step 3: Create the ServiceMonitor template** + +Create `charts/arcadedb/templates/servicemonitor.yaml`: + +```yaml +{{- if .Values.observability.metrics.prometheus.serviceMonitor.enabled }} +{{- include "arcadedb.observability.validate" . -}} +{{- $sm := .Values.observability.metrics.prometheus.serviceMonitor -}} +{{- if and $sm.basicAuth.enabled (not $sm.basicAuth.secretName) -}} +{{- fail "serviceMonitor.basicAuth.enabled requires basicAuth.secretName (a secret with username + password keys)" -}} +{{- end -}} +apiVersion: monitoring.coreos.com/v1 +kind: ServiceMonitor +metadata: + name: {{ include "arcadedb.fullname" . }} + labels: + {{- include "arcadedb.labels" . | nindent 4 }} + {{- with $sm.labels }} + {{- toYaml . | nindent 4 }} + {{- end }} + {{- with $sm.annotations }} + annotations: + {{- toYaml . | nindent 4 }} + {{- end }} +spec: + selector: + matchLabels: + {{- include "arcadedb.selectorLabels" . | nindent 6 }} + endpoints: + - port: http + path: {{ $sm.path }} + interval: {{ $sm.interval }} + {{- with $sm.scrapeTimeout }} + scrapeTimeout: {{ . }} + {{- end }} + {{- if $sm.basicAuth.enabled }} + basicAuth: + username: + name: {{ $sm.basicAuth.secretName }} + key: {{ $sm.basicAuth.usernameKey }} + password: + name: {{ $sm.basicAuth.secretName }} + key: {{ $sm.basicAuth.passwordKey }} + {{- end }} + {{- with $sm.relabelings }} + relabelings: + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with $sm.metricRelabelings }} + metricRelabelings: + {{- toYaml . 
| nindent 8 }} + {{- end }} +{{- end }} +``` + +- [ ] **Step 4: Run test to verify it passes** + +Run: `helm unittest -f 'tests/servicemonitor_test.yaml' charts/arcadedb` +Expected: PASS + +- [ ] **Step 5: Lint and commit** + +```bash +helm lint charts/arcadedb +git add charts/arcadedb/templates/servicemonitor.yaml charts/arcadedb/tests/servicemonitor_test.yaml +git commit -m "feat(helm): optional Prometheus Operator ServiceMonitor" +``` + +--- + +## Task 5: Liveness probe default + appVersion bump to 26.7.1 + +Switch the default liveness probe to the dependency-free `/api/v1/health` (exists in 26.7.1+) and move chart `version`/`appVersion` to 26.7.1 so the default image serves it. Readiness stays on `/api/v1/ready`. + +**Files:** +- Modify: `charts/arcadedb/values.yaml` (livenessProbe path) +- Modify: `charts/arcadedb/Chart.yaml` (version + appVersion) +- Modify: `charts/arcadedb/tests/statefulset_test.yaml` (probe + image-literal assertions) + +- [ ] **Step 1: Update the failing assertions in statefulset_test.yaml** + +In `charts/arcadedb/tests/statefulset_test.yaml`, replace the probe test (currently the `it: liveness and readiness probes default to /api/v1/ready on http port` block, lines ~104-117) with: + +```yaml + - it: liveness probe defaults to /api/v1/health and readiness to /api/v1/ready + asserts: + - equal: + path: "spec.template.spec.containers[0].livenessProbe.httpGet.path" + value: /api/v1/health + - equal: + path: "spec.template.spec.containers[0].livenessProbe.httpGet.port" + value: http + - equal: + path: "spec.template.spec.containers[0].readinessProbe.httpGet.path" + value: /api/v1/ready + - equal: + path: "spec.template.spec.containers[0].readinessProbe.httpGet.port" + value: http +``` + +In the same file, update the two pinned image literals from `26.6.1` to `26.7.1`: +- `it: image string composes registry/repository:tag, defaulting tag to AppVersion` → `value: arcadedata/arcadedb:26.7.1` +- `it: image.registry and image.repository overrides 
flow through` → `value: my-registry.example.com/arcadedb-fork:26.7.1` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `helm unittest -f 'tests/statefulset_test.yaml' charts/arcadedb` +Expected: FAIL — liveness still renders `/api/v1/ready` and image is still `26.6.1`. + +- [ ] **Step 3: Change the liveness probe default** + +In `charts/arcadedb/values.yaml`, update the `livenessProbe` section: + +```yaml +## @section livenessProbe +## Liveness uses the dependency-free /api/v1/health endpoint (no DB I/O, +## never returns 503). Requires ArcadeDB 26.7.1+. +livenessProbe: + httpGet: + path: /api/v1/health + port: http +``` + +Leave `readinessProbe` pointing at `/api/v1/ready` unchanged. + +- [ ] **Step 4: Bump the chart version and appVersion** + +In `charts/arcadedb/Chart.yaml`, change both: + +```yaml +version: 26.7.1 +appVersion: "26.7.1" +``` + +- [ ] **Step 5: Run tests to verify they pass** + +Run: `helm unittest -f 'tests/statefulset_test.yaml' charts/arcadedb` +Expected: PASS + +Run: `make test-unit` +Expected: PASS (full suite green) + +- [ ] **Step 6: Lint and commit** + +```bash +helm lint charts/arcadedb +git add charts/arcadedb/values.yaml charts/arcadedb/Chart.yaml charts/arcadedb/tests/statefulset_test.yaml +git commit -m "feat(helm): default liveness to /api/v1/health; bump to 26.7.1" +``` + +> **Merge gate:** the PR integration job pulls the default `image.tag` (now `26.7.1`). It goes green only once `arcadedata/arcadedb:26.7.1` is published. Until then, the `latest-image` guard (which pins `:latest`, where the features already live) validates the new behavior. Do not merge/release the chart before the 26.7.1 image exists. + +--- + +## Task 6: Integration test — `/api/v1/health` liveness smoke + +Add one phase to the shared kind integration script asserting the new liveness endpoint returns HTTP 204. It runs in both the PR job and the latest-image guard. 
+ +**Files:** +- Modify: `ci/integration-test.sh` + +- [ ] **Step 1: Add an unauthenticated 204 health-probe helper** + +In `ci/integration-test.sh`, after the `api()` helper (around line 42), add a helper that asserts the liveness endpoint returns 204 with no auth: + +```bash +assert_health_204() { # assert_health_204 <port> + local port=$1 code + code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 \ + "http://localhost:${port}/api/v1/health") + if [[ "$code" != "204" ]]; then + echo "ERROR: /api/v1/health returned ${code}, expected 204" + return 1 + fi + echo " /api/v1/health -> 204 (unauthenticated liveness OK)" + return 0 +} +``` + +- [ ] **Step 2: Renumber phase headers and insert the health phase** + +The script currently runs 6 phases labelled `[1/6]`…`[6/6]`. Bump the count to 7 and insert the health phase as phase 2 (right after rollout, before Raft formation). Update the six existing `==> [N/6]` echo lines to `[N/7]`, shifting the number up by one for every phase after the new one. Concretely: + +- Keep phase 1 (rollout) as `==> [1/7] Waiting for StatefulSet rollout ...`. +- Insert immediately after the rollout phase (after its `echo " All 3 pods Ready."` line, ~line 137): + +```bash +# ── phase 2: liveness health probe ──────────────────────────────────────────── + +echo "==> [2/7] Asserting /api/v1/health liveness endpoint..." +HP_PID=$(pf_start 0 "$HTTP_PORT") +pf_wait "$HTTP_PORT" || { echo "ERROR: port-forward to pod-0 failed"; exit 1; } +assert_health_204 "$HTTP_PORT" || exit 1 +pf_stop "$HP_PID" +``` + +- Renumber the remaining phase banners: Raft formation `[2/6]`→`[3/7]`, write `[3/6]`→`[4/7]`, read `[4/6]`→`[5/7]`, STATUS `[5/6]`→`[6/7]`, leadership transfer `[6/6]`→`[7/7]`. + +- [ ] **Step 3: Verify the script is syntactically valid** + +Run: `bash -n ci/integration-test.sh` +Expected: no output, exit 0. + +(The full kind run executes in CI; locally it requires Docker + kind. `bash -n` confirms syntax without a cluster.) 
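
`bash -n` only checks syntax, so it may be worth exercising both branches of the Step 1 helper without a cluster as well. The sketch below is one possible way to do that, not part of the planned change: it shadows `curl` with a shell function that prints a canned status code. `STUB_CODE` is a hypothetical variable invented for this sketch, and port `2480` is assumed to be the chart's default HTTP port.

```bash
#!/usr/bin/env bash
# Offline sketch: shadow curl with a shell function so assert_health_204 can be
# exercised without a cluster. STUB_CODE is a hypothetical variable that exists
# only in this sketch; it is not part of ci/integration-test.sh.

curl() { printf '%s' "${STUB_CODE:-000}"; }  # ignores its arguments, prints a fake status

# Same helper as in Step 1.
assert_health_204() {  # assert_health_204 <port>
  local port=$1 code
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 \
    "http://localhost:${port}/api/v1/health")
  if [[ "$code" != "204" ]]; then
    echo "ERROR: /api/v1/health returned ${code}, expected 204"
    return 1
  fi
  echo "    /api/v1/health -> 204 (unauthenticated liveness OK)"
  return 0
}

# Success branch: the stub reports 204, so the helper returns 0.
STUB_CODE=204
assert_health_204 2480 && echo "success branch OK"

# Failure branch: any other status makes the helper return 1.
STUB_CODE=503
assert_health_204 2480 || echo "failure branch OK"
```

Because the stub never touches the network, this only confirms the comparison and return codes; the real endpoint is still covered by the kind run in CI. Run `unset -f curl` afterwards if the same shell session needs the real `curl` again.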
+ +- [ ] **Step 4: Commit** + +```bash +git add ci/integration-test.sh +git commit -m "test(integration): assert /api/v1/health liveness returns 204" +``` + +--- + +## Task 7: Documentation + +Document the new knobs in the chart README param table and add enable recipes to the top-level README. + +**Files:** +- Modify: `charts/arcadedb/README.md` (params table) +- Modify: `README.md` (Observability section) + +- [ ] **Step 1: Add the observability params to the chart README** + +Open `charts/arcadedb/README.md`, locate the parameters table, and add an `### observability` subsection that follows the existing table style (`| Name | Description | Value |`). Use these rows: + +```markdown +### observability + +| Name | Description | Value | +| ------------------------------------------------------------------------- | ---------------------------------------------------------------- | ------------------------ | +| `observability.metrics.prometheus.serviceMonitor.enabled` | Create a Prometheus Operator ServiceMonitor | `false` | +| `observability.metrics.prometheus.serviceMonitor.interval` | Scrape interval | `30s` | +| `observability.metrics.prometheus.serviceMonitor.scrapeTimeout` | Scrape timeout (empty = Prometheus default) | `""` | +| `observability.metrics.prometheus.serviceMonitor.path` | Metrics path | `/prometheus` | +| `observability.metrics.prometheus.serviceMonitor.labels` | Extra labels (e.g. 
release: kube-prometheus-stack) | `{}` | +| `observability.metrics.prometheus.serviceMonitor.annotations` | Extra annotations | `{}` | +| `observability.metrics.prometheus.serviceMonitor.relabelings` | Prometheus relabelings | `[]` | +| `observability.metrics.prometheus.serviceMonitor.metricRelabelings` | Prometheus metric relabelings | `[]` | +| `observability.metrics.prometheus.serviceMonitor.basicAuth.enabled` | Scrape with basic auth | `false` | +| `observability.metrics.prometheus.serviceMonitor.basicAuth.secretName` | Secret with scrape credentials (username + password keys) | `""` | +| `observability.metrics.prometheus.serviceMonitor.basicAuth.usernameKey` | Secret key holding the username | `username` | +| `observability.metrics.prometheus.serviceMonitor.basicAuth.passwordKey` | Secret key holding the password | `password` | +| `observability.metrics.prometheus.podAnnotations.enabled` | Add prometheus.io/* scrape annotations to pods | `false` | +| `observability.metrics.prometheus.podAnnotations.path` | Scrape path annotation value | `/prometheus` | +| `observability.metrics.prometheus.podAnnotations.port` | Scrape port (empty = service.http.port) | `""` | +| `observability.metrics.otlp.enabled` | Enable the OTLP metrics registry | `false` | +| `observability.metrics.otlp.endpoint` | OTLP/gRPC metrics endpoint | `http://localhost:4317` | +| `observability.tracing.enabled` | Enable distributed tracing | `false` | +| `observability.tracing.endpoint` | OTLP/gRPC trace endpoint | `http://localhost:4317` | +| `observability.tracing.samplingRate` | Parent-based sampling ratio [0.0, 1.0] | `0.0` | +| `observability.logging.format` | Log format: text or json | `text` | +| `observability.logging.includeTrace` | Append [traceId=…] to text logs while a trace is active | `false` | +| `observability.health.readinessRequiresHA` | /api/v1/ready waits for Raft join on HA clusters | `false` | +``` + +If the repo has a README-generator config (e.g. 
a `readme-generator` step), regenerate instead of hand-editing; otherwise hand-edit to match the table format above. + +- [ ] **Step 2: Add an Observability section to the top-level README** + +In `README.md`, after the `## Configuration` section, add: + +````markdown +## Observability + +ArcadeDB 26.7.1+ exposes opt-in, behavior-preserving observability. All knobs +default off. + +**Prometheus scraping (Operator):** + +```bash +helm install my-arcadedb arcadedb/arcadedb \ + --set arcadedb.plugins.prometheus.enabled=true \ + --set arcadedb.plugins.prometheus.requireAuthentication=false \ + --set observability.metrics.prometheus.serviceMonitor.enabled=true \ + --set observability.metrics.prometheus.serviceMonitor.labels.release=kube-prometheus-stack +``` + +For non-Operator Prometheus, use annotation discovery instead: +`--set observability.metrics.prometheus.podAnnotations.enabled=true`. + +**OTLP metrics export** (alongside /prometheus): + +```bash +--set observability.metrics.otlp.enabled=true \ +--set observability.metrics.otlp.endpoint=http://otel-collector:4317 +``` + +**Distributed tracing:** + +```bash +--set observability.tracing.enabled=true \ +--set observability.tracing.endpoint=http://otel-collector:4317 \ +--set observability.tracing.samplingRate=0.1 +``` + +**Structured JSON logging:** `--set observability.logging.format=json` + +The liveness probe uses the dependency-free `/api/v1/health` endpoint; +readiness stays on `/api/v1/ready`. Set +`observability.health.readinessRequiresHA=true` to gate readiness on Raft +membership in HA clusters. +```` + +- [ ] **Step 3: Lint and commit** + +```bash +helm lint charts/arcadedb +git add charts/arcadedb/README.md README.md +git commit -m "docs: document observability configuration" +``` + +--- + +## Final verification + +- [ ] **Run the full unit suite and lint** + +Run: `make lint && make test-unit` +Expected: lint clean; all suites pass. 
+ +- [ ] **Confirm the rendered manifests for a fully-enabled config** + +Run: +```bash +helm template t charts/arcadedb \ + --set arcadedb.plugins.prometheus.enabled=true \ + --set arcadedb.plugins.prometheus.requireAuthentication=false \ + --set observability.metrics.prometheus.serviceMonitor.enabled=true \ + --set observability.metrics.prometheus.podAnnotations.enabled=true \ + --set observability.metrics.otlp.enabled=true \ + --set observability.tracing.enabled=true \ + --set observability.logging.format=json \ + --set observability.health.readinessRequiresHA=true +``` +Expected: a ServiceMonitor document renders; the StatefulSet command contains the otlp/tracing/logFormat/readinessRequiresHA/requireAuthentication `-D` args; pod template has `prometheus.io/*` annotations; liveness path is `/api/v1/health`. + +--- + +## Notes for the implementer + +- **TDD discipline:** each task writes/updates the test first, watches it fail, then implements. Do not skip the "verify it fails" step — it proves the test exercises the new behavior. +- **The guard runs on every render.** `arcadedb.observability.validate` is invoked from both `statefulset.yaml` (top of file) and `servicemonitor.yaml`. A misconfiguration (`serviceMonitor`/`podAnnotations` enabled without the prometheus plugin) fails `helm template`/`install` fast with a clear message — this is intended. +- **basicAuth uses a user-supplied secret**, not the chart's managed `arcadedb-credentials-secret` (which holds only `rootPassword`, no username key). The documented happy path is unauthenticated scraping via `requireAuthentication=false`. +- **Version coupling:** Task 5 is the only one changing a default. Keep it as a single commit so the liveness-default change and the 26.7.1 bump move together, and respect the merge gate. 
+``` From cd4658ab8a33f85483286b8ecd760180c6f9fb38 Mon Sep 17 00:00:00 2001 From: robfrank Date: Wed, 17 Jun 2026 18:48:09 +0200 Subject: [PATCH 03/12] feat(helm): observability -D args (logging, OTLP, tracing, readiness) --- charts/arcadedb/templates/_helpers.tpl | 26 ++++++ charts/arcadedb/templates/statefulset.yaml | 1 + charts/arcadedb/tests/observability_test.yaml | 88 +++++++++++++++++++ charts/arcadedb/values.yaml | 68 ++++++++++++++ 4 files changed, 183 insertions(+) create mode 100644 charts/arcadedb/tests/observability_test.yaml diff --git a/charts/arcadedb/templates/_helpers.tpl b/charts/arcadedb/templates/_helpers.tpl index 6f81e93..0439632 100644 --- a/charts/arcadedb/templates/_helpers.tpl +++ b/charts/arcadedb/templates/_helpers.tpl @@ -170,3 +170,29 @@ Create service configuration for the enabled plugins {{- end -}} {{- end -}} {{- end -}} + +{{/* +Observability -D JVM args (logging, OTLP metrics, tracing, readiness). +All opt-in; emits nothing when defaults are unchanged. 
+*/}} +{{- define "arcadedb.observability.args" -}} +{{- $o := .Values.observability -}} +{{- if eq $o.logging.format "json" }} +- -Darcadedb.server.logFormat=json +{{- end }} +{{- if $o.logging.includeTrace }} +- -Darcadedb.server.logIncludeTrace=true +{{- end }} +{{- if $o.metrics.otlp.enabled }} +- -Darcadedb.serverMetrics.otlp.enabled=true +- -Darcadedb.serverMetrics.otlp.endpoint={{ $o.metrics.otlp.endpoint }} +{{- end }} +{{- if $o.tracing.enabled }} +- -Darcadedb.serverMetrics.tracing.enabled=true +- -Darcadedb.serverMetrics.tracing.endpoint={{ $o.tracing.endpoint }} +- -Darcadedb.serverMetrics.tracing.samplingRate={{ $o.tracing.samplingRate }} +{{- end }} +{{- if $o.health.readinessRequiresHA }} +- -Darcadedb.server.readinessRequiresHA=true +{{- end }} +{{- end -}} diff --git a/charts/arcadedb/templates/statefulset.yaml b/charts/arcadedb/templates/statefulset.yaml index 641c3a3..935f50c 100644 --- a/charts/arcadedb/templates/statefulset.yaml +++ b/charts/arcadedb/templates/statefulset.yaml @@ -68,6 +68,7 @@ spec: {{- toYaml . | nindent 12 }} {{- end }} {{- include "arcadedb.plugin.parameters" . | nindent 12 }} + {{- include "arcadedb.observability.args" . | nindent 12 }} {{- with .Values.livenessProbe }} livenessProbe: {{- toYaml . 
| nindent 12 }} diff --git a/charts/arcadedb/tests/observability_test.yaml b/charts/arcadedb/tests/observability_test.yaml new file mode 100644 index 0000000..7476379 --- /dev/null +++ b/charts/arcadedb/tests/observability_test.yaml @@ -0,0 +1,88 @@ +suite: Observability args (asserted via StatefulSet) +templates: + - statefulset.yaml +release: + name: test + namespace: default +tests: + - it: emits no observability args by default + asserts: + - notContains: + path: spec.template.spec.containers[0].command + content: "-Darcadedb.server.logFormat=json" + - notContains: + path: spec.template.spec.containers[0].command + content: "-Darcadedb.serverMetrics.otlp.enabled=true" + - notContains: + path: spec.template.spec.containers[0].command + content: "-Darcadedb.serverMetrics.tracing.enabled=true" + - notContains: + path: spec.template.spec.containers[0].command + content: "-Darcadedb.server.readinessRequiresHA=true" + - notContains: + path: spec.template.spec.containers[0].command + content: "-Darcadedb.server.logIncludeTrace=true" + - notContains: + path: spec.template.spec.containers[0].command + content: "-Darcadedb.serverMetrics.otlp.endpoint=http://localhost:4317" + - notContains: + path: spec.template.spec.containers[0].command + content: "-Darcadedb.serverMetrics.tracing.samplingRate=0.0" + + - it: logging.format=json emits the logFormat arg + set: + observability.logging.format: json + asserts: + - contains: + path: spec.template.spec.containers[0].command + content: "-Darcadedb.server.logFormat=json" + + - it: text logging (default) emits no logFormat arg + asserts: + - notContains: + path: spec.template.spec.containers[0].command + content: "-Darcadedb.server.logFormat=json" + + - it: logging.includeTrace=true emits the logIncludeTrace arg + set: + observability.logging.includeTrace: true + asserts: + - contains: + path: spec.template.spec.containers[0].command + content: "-Darcadedb.server.logIncludeTrace=true" + + - it: otlp metrics enabled emits enable + 
endpoint args + set: + observability.metrics.otlp.enabled: true + observability.metrics.otlp.endpoint: http://otel-collector:4317 + asserts: + - contains: + path: spec.template.spec.containers[0].command + content: "-Darcadedb.serverMetrics.otlp.enabled=true" + - contains: + path: spec.template.spec.containers[0].command + content: "-Darcadedb.serverMetrics.otlp.endpoint=http://otel-collector:4317" + + - it: tracing enabled emits enable + endpoint + samplingRate args + set: + observability.tracing.enabled: true + observability.tracing.endpoint: http://otel-collector:4317 + observability.tracing.samplingRate: 0.1 + asserts: + - contains: + path: spec.template.spec.containers[0].command + content: "-Darcadedb.serverMetrics.tracing.enabled=true" + - contains: + path: spec.template.spec.containers[0].command + content: "-Darcadedb.serverMetrics.tracing.endpoint=http://otel-collector:4317" + - contains: + path: spec.template.spec.containers[0].command + content: "-Darcadedb.serverMetrics.tracing.samplingRate=0.1" + + - it: readinessRequiresHA emits the arg + set: + observability.health.readinessRequiresHA: true + asserts: + - contains: + path: spec.template.spec.containers[0].command + content: "-Darcadedb.server.readinessRequiresHA=true" diff --git a/charts/arcadedb/values.yaml b/charts/arcadedb/values.yaml index 5224131..7f09949 100644 --- a/charts/arcadedb/values.yaml +++ b/charts/arcadedb/values.yaml @@ -268,3 +268,71 @@ networkPolicy: ## When enabled: HTTP (2480) is open to all cluster pods; Raft gRPC (2434) is restricted ## to ArcadeDB pods only. Recommended for production multi-tenant clusters. enabled: false + +## @section observability +## Opt-in, behavior-preserving observability (ArcadeDB 26.7.1+). +## Every knob below defaults off; existing deployments are unchanged. +observability: + ## @section observability.metrics + metrics: + prometheus: + ## Prometheus Operator ServiceMonitor. 
Requires the prometheus plugin + ## (arcadedb.plugins.prometheus.enabled=true) so /prometheus is served. + serviceMonitor: + ## @param observability.metrics.prometheus.serviceMonitor.enabled Create a ServiceMonitor CRD + enabled: false + ## @param observability.metrics.prometheus.serviceMonitor.interval Scrape interval + interval: 30s + ## @param observability.metrics.prometheus.serviceMonitor.scrapeTimeout Scrape timeout (empty = Prometheus default) + scrapeTimeout: "" + ## @param observability.metrics.prometheus.serviceMonitor.path Metrics path + path: /prometheus + ## @param observability.metrics.prometheus.serviceMonitor.labels Extra labels (e.g. release: kube-prometheus-stack) + labels: {} + ## @param observability.metrics.prometheus.serviceMonitor.annotations Extra annotations + annotations: {} + ## @param observability.metrics.prometheus.serviceMonitor.relabelings Prometheus relabelings + relabelings: [] + ## @param observability.metrics.prometheus.serviceMonitor.metricRelabelings Prometheus metric relabelings + metricRelabelings: [] + basicAuth: + ## @param observability.metrics.prometheus.serviceMonitor.basicAuth.enabled Scrape with basic auth + enabled: false + ## @param observability.metrics.prometheus.serviceMonitor.basicAuth.secretName Secret holding scrape credentials (username + password keys) + secretName: "" + ## @param observability.metrics.prometheus.serviceMonitor.basicAuth.usernameKey Key in the secret holding the username + usernameKey: username + ## @param observability.metrics.prometheus.serviceMonitor.basicAuth.passwordKey Key in the secret holding the password + passwordKey: password + ## Annotation-based discovery (classic Prometheus, no Operator). 
+ podAnnotations: + ## @param observability.metrics.prometheus.podAnnotations.enabled Add prometheus.io/* scrape annotations to pods + enabled: false + ## @param observability.metrics.prometheus.podAnnotations.path Scrape path annotation value + path: /prometheus + ## @param observability.metrics.prometheus.podAnnotations.port Scrape port (empty = service.http.port) + port: "" + ## Push metrics to an OpenTelemetry collector alongside /prometheus. + otlp: + ## @param observability.metrics.otlp.enabled Enable the OTLP metrics registry + enabled: false + ## @param observability.metrics.otlp.endpoint OTLP/gRPC metrics endpoint + endpoint: http://localhost:4317 + ## @section observability.tracing + tracing: + ## @param observability.tracing.enabled Enable distributed tracing (plugin ships in the standard image) + enabled: false + ## @param observability.tracing.endpoint OTLP/gRPC trace endpoint + endpoint: http://localhost:4317 + ## @param observability.tracing.samplingRate Parent-based sampling ratio [0.0, 1.0] + samplingRate: 0.0 + ## @section observability.logging + logging: + ## @param observability.logging.format Log format: text or json + format: text + ## @param observability.logging.includeTrace Append [traceId=…] to text logs while a trace is active + includeTrace: false + ## @section observability.health + health: + ## @param observability.health.readinessRequiresHA /api/v1/ready waits for Raft join on HA clusters + readinessRequiresHA: false From 14392e8d8b65e7ebbfb9728a9958626252a1be02 Mon Sep 17 00:00:00 2001 From: robfrank Date: Wed, 17 Jun 2026 18:52:39 +0200 Subject: [PATCH 04/12] feat(helm): prometheus plugin requireAuthentication knob --- charts/arcadedb/templates/_helpers.tpl | 5 +++++ charts/arcadedb/tests/helpers_test.yaml | 29 +++++++++++++++++++++++++ charts/arcadedb/values.yaml | 3 +++ 3 files changed, 37 insertions(+) diff --git a/charts/arcadedb/templates/_helpers.tpl b/charts/arcadedb/templates/_helpers.tpl index 0439632..470dbfd 100644 --- 
a/charts/arcadedb/templates/_helpers.tpl +++ b/charts/arcadedb/templates/_helpers.tpl @@ -144,6 +144,11 @@ Create a comma separated list of plugins to be enabled in arcadedb {{- $params = append $params (printf "-Darcadedb.redis.port=%d" (int $config.port)) -}} {{- else if eq $plugin "prometheus" -}} {{- $plugins = append $plugins "Prometheus:com.arcadedb.metrics.prometheus.PrometheusMetricsPlugin" -}} + {{- with $.Values.arcadedb.plugins.prometheus -}} + {{- if hasKey . "requireAuthentication" -}} + {{- $params = append $params (printf "-Darcadedb.serverMetrics.prometheus.requireAuthentication=%v" .requireAuthentication) -}} + {{- end -}} + {{- end -}} {{- else -}} {{- $plugins = append $plugins (printf "%s:%s" $plugin $config.class) -}} {{- end -}} diff --git a/charts/arcadedb/tests/helpers_test.yaml b/charts/arcadedb/tests/helpers_test.yaml index b0797cf..6c9d3f8 100644 --- a/charts/arcadedb/tests/helpers_test.yaml +++ b/charts/arcadedb/tests/helpers_test.yaml @@ -101,3 +101,32 @@ tests: - contains: path: spec.template.spec.containers[0].command content: "-Darcadedb.server.plugins=myplugin:com.example.MyPlugin" + + - it: prometheus plugin emits requireAuthentication=false when set + set: + arcadedb.plugins.prometheus.enabled: true + arcadedb.plugins.prometheus.requireAuthentication: false + asserts: + - contains: + path: spec.template.spec.containers[0].command + content: "-Darcadedb.serverMetrics.prometheus.requireAuthentication=false" + + - it: prometheus plugin omits requireAuthentication arg when not set + set: + arcadedb.plugins.prometheus.enabled: true + asserts: + - notContains: + path: spec.template.spec.containers[0].command + content: "-Darcadedb.serverMetrics.prometheus.requireAuthentication=false" + - notContains: + path: spec.template.spec.containers[0].command + content: "-Darcadedb.serverMetrics.prometheus.requireAuthentication=true" + + - it: prometheus plugin emits requireAuthentication=true when explicitly set + set: + 
arcadedb.plugins.prometheus.enabled: true + arcadedb.plugins.prometheus.requireAuthentication: true + asserts: + - contains: + path: spec.template.spec.containers[0].command + content: "-Darcadedb.serverMetrics.prometheus.requireAuthentication=true" diff --git a/charts/arcadedb/values.yaml b/charts/arcadedb/values.yaml index 7f09949..bafa1d8 100644 --- a/charts/arcadedb/values.yaml +++ b/charts/arcadedb/values.yaml @@ -56,6 +56,9 @@ arcadedb: # port: 6379 # prometheus: # enabled: false + # # Set false to allow unauthenticated scraping of /prometheus (needed for + # # ServiceMonitor / annotation-based discovery without basic auth). + # requireAuthentication: true ## Custom plugin example: # myPlugin: # enabled: true From 63ddf321960ccc3139786448d66058caac9942b1 Mon Sep 17 00:00:00 2001 From: robfrank Date: Wed, 17 Jun 2026 18:57:01 +0200 Subject: [PATCH 05/12] feat(helm): prometheus scrape pod annotations + plugin guard --- charts/arcadedb/templates/_helpers.tpl | 36 ++++++++++ charts/arcadedb/templates/statefulset.yaml | 3 +- charts/arcadedb/tests/observability_test.yaml | 72 +++++++++++++++++++ 3 files changed, 110 insertions(+), 1 deletion(-) diff --git a/charts/arcadedb/templates/_helpers.tpl b/charts/arcadedb/templates/_helpers.tpl index 470dbfd..134eb14 100644 --- a/charts/arcadedb/templates/_helpers.tpl +++ b/charts/arcadedb/templates/_helpers.tpl @@ -201,3 +201,39 @@ All opt-in; emits nothing when defaults are unchanged. - -Darcadedb.server.readinessRequiresHA=true {{- end }} {{- end -}} + +{{/* +Guard: scrape discovery (ServiceMonitor or pod annotations) needs the +prometheus plugin so /prometheus is actually served. 
+*/}} +{{- define "arcadedb.observability.validate" -}} +{{- $p := .Values.observability.metrics.prometheus -}} +{{- if or $p.serviceMonitor.enabled $p.podAnnotations.enabled -}} + {{- $promEnabled := false -}} + {{- with .Values.arcadedb.plugins.prometheus -}} + {{- if .enabled -}}{{- $promEnabled = true -}}{{- end -}} + {{- end -}} + {{- if not $promEnabled -}} + {{- fail "observability.metrics.prometheus serviceMonitor/podAnnotations require arcadedb.plugins.prometheus.enabled=true (the /prometheus endpoint must be served)" -}} + {{- end -}} +{{- end -}} +{{- end -}} + +{{/* +Merge user-supplied podAnnotations with computed prometheus.io/* scrape +annotations. Returns YAML (possibly empty). +*/}} +{{- define "arcadedb.podAnnotations" -}} +{{- $annotations := deepCopy (default dict .Values.podAnnotations) -}} +{{- $pa := .Values.observability.metrics.prometheus.podAnnotations -}} +{{- if $pa.enabled -}} + {{- $port := int .Values.service.http.port -}} + {{- if $pa.port -}}{{- $port = int $pa.port -}}{{- end -}} + {{- $_ := set $annotations "prometheus.io/scrape" "true" -}} + {{- $_ := set $annotations "prometheus.io/port" (printf "%d" $port) -}} + {{- $_ := set $annotations "prometheus.io/path" (default "/prometheus" $pa.path) -}} +{{- end -}} +{{- if $annotations -}} +{{- toYaml $annotations -}} +{{- end -}} +{{- end -}} diff --git a/charts/arcadedb/templates/statefulset.yaml b/charts/arcadedb/templates/statefulset.yaml index 935f50c..fe092ad 100644 --- a/charts/arcadedb/templates/statefulset.yaml +++ b/charts/arcadedb/templates/statefulset.yaml @@ -1,3 +1,4 @@ +{{- include "arcadedb.observability.validate" . -}} apiVersion: apps/v1 kind: StatefulSet metadata: @@ -15,7 +16,7 @@ spec: {{- include "arcadedb.selectorLabels" . | nindent 6 }} template: metadata: - {{- with .Values.podAnnotations }} + {{- with (include "arcadedb.podAnnotations" . | fromYaml) }} annotations: {{- toYaml . 
| nindent 8 }} {{- end }} diff --git a/charts/arcadedb/tests/observability_test.yaml b/charts/arcadedb/tests/observability_test.yaml index 7476379..883fe7e 100644 --- a/charts/arcadedb/tests/observability_test.yaml +++ b/charts/arcadedb/tests/observability_test.yaml @@ -86,3 +86,75 @@ tests: - contains: path: spec.template.spec.containers[0].command content: "-Darcadedb.server.readinessRequiresHA=true" + + - it: scrape pod annotations render when enabled with the prometheus plugin + set: + arcadedb.plugins.prometheus.enabled: true + observability.metrics.prometheus.podAnnotations.enabled: true + asserts: + - equal: + path: spec.template.metadata.annotations["prometheus.io/scrape"] + value: "true" + - equal: + path: spec.template.metadata.annotations["prometheus.io/port"] + value: "2480" + - equal: + path: spec.template.metadata.annotations["prometheus.io/path"] + value: /prometheus + + - it: scrape pod annotation port honours service.http.port and custom path + set: + arcadedb.plugins.prometheus.enabled: true + service.http.port: 9090 + observability.metrics.prometheus.podAnnotations.enabled: true + observability.metrics.prometheus.podAnnotations.path: /metrics + asserts: + - equal: + path: spec.template.metadata.annotations["prometheus.io/port"] + value: "9090" + - equal: + path: spec.template.metadata.annotations["prometheus.io/path"] + value: /metrics + + - it: scrape pod annotation port can be overridden explicitly + set: + arcadedb.plugins.prometheus.enabled: true + observability.metrics.prometheus.podAnnotations.enabled: true + observability.metrics.prometheus.podAnnotations.port: 1234 + asserts: + - equal: + path: spec.template.metadata.annotations["prometheus.io/port"] + value: "1234" + + - it: user podAnnotations are preserved alongside scrape annotations + set: + arcadedb.plugins.prometheus.enabled: true + podAnnotations: + my-team: data + observability.metrics.prometheus.podAnnotations.enabled: true + asserts: + - equal: + path: 
spec.template.metadata.annotations["my-team"] + value: data + - equal: + path: spec.template.metadata.annotations["prometheus.io/scrape"] + value: "true" + + - it: no scrape annotations when podAnnotations discovery is disabled + asserts: + - notExists: + path: spec.template.metadata.annotations["prometheus.io/scrape"] + + - it: fails when scrape annotations enabled without the prometheus plugin + set: + observability.metrics.prometheus.podAnnotations.enabled: true + asserts: + - failedTemplate: + errorPattern: "require arcadedb.plugins.prometheus.enabled=true" + + - it: fails when serviceMonitor enabled without the prometheus plugin + set: + observability.metrics.prometheus.serviceMonitor.enabled: true + asserts: + - failedTemplate: + errorPattern: "require arcadedb.plugins.prometheus.enabled=true" From 41205f8aeed73b9b6d4d8169ae5ca49f68ddd0ed Mon Sep 17 00:00:00 2001 From: robfrank Date: Wed, 17 Jun 2026 19:02:18 +0200 Subject: [PATCH 06/12] feat(helm): optional Prometheus Operator ServiceMonitor --- charts/arcadedb/templates/service.yaml | 1 + charts/arcadedb/templates/servicemonitor.yaml | 49 +++++++++ charts/arcadedb/tests/service_test.yaml | 10 ++ .../arcadedb/tests/servicemonitor_test.yaml | 100 ++++++++++++++++++ 4 files changed, 160 insertions(+) create mode 100644 charts/arcadedb/templates/servicemonitor.yaml create mode 100644 charts/arcadedb/tests/servicemonitor_test.yaml diff --git a/charts/arcadedb/templates/service.yaml b/charts/arcadedb/templates/service.yaml index aa9b7a4..d5a4df1 100644 --- a/charts/arcadedb/templates/service.yaml +++ b/charts/arcadedb/templates/service.yaml @@ -4,6 +4,7 @@ metadata: name: {{ include "arcadedb.fullname" . }}-http labels: {{- include "arcadedb.labels" . 
| nindent 4 }} + app.kubernetes.io/component: http spec: type: {{ .Values.service.http.type }} ports: diff --git a/charts/arcadedb/templates/servicemonitor.yaml b/charts/arcadedb/templates/servicemonitor.yaml new file mode 100644 index 0000000..fff4c02 --- /dev/null +++ b/charts/arcadedb/templates/servicemonitor.yaml @@ -0,0 +1,49 @@ +{{- if .Values.observability.metrics.prometheus.serviceMonitor.enabled }} +{{- include "arcadedb.observability.validate" . -}} +{{- $sm := .Values.observability.metrics.prometheus.serviceMonitor -}} +{{- if and $sm.basicAuth.enabled (not $sm.basicAuth.secretName) -}} +{{- fail "serviceMonitor.basicAuth.enabled requires basicAuth.secretName (a secret with username + password keys)" -}} +{{- end -}} +apiVersion: monitoring.coreos.com/v1 +kind: ServiceMonitor +metadata: + name: {{ include "arcadedb.fullname" . }} + labels: + {{- include "arcadedb.labels" . | nindent 4 }} + {{- with $sm.labels }} + {{- toYaml . | nindent 4 }} + {{- end }} + {{- with $sm.annotations }} + annotations: + {{- toYaml . | nindent 4 }} + {{- end }} +spec: + selector: + matchLabels: + {{- include "arcadedb.selectorLabels" . | nindent 6 }} + app.kubernetes.io/component: http + endpoints: + - port: http + path: {{ $sm.path }} + interval: {{ $sm.interval }} + {{- with $sm.scrapeTimeout }} + scrapeTimeout: {{ . }} + {{- end }} + {{- if $sm.basicAuth.enabled }} + basicAuth: + username: + name: {{ $sm.basicAuth.secretName }} + key: {{ $sm.basicAuth.usernameKey }} + password: + name: {{ $sm.basicAuth.secretName }} + key: {{ $sm.basicAuth.passwordKey }} + {{- end }} + {{- with $sm.relabelings }} + relabelings: + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with $sm.metricRelabelings }} + metricRelabelings: + {{- toYaml . 
| nindent 8 }} + {{- end }} +{{- end }} diff --git a/charts/arcadedb/tests/service_test.yaml b/charts/arcadedb/tests/service_test.yaml index 420d2f9..839cb47 100644 --- a/charts/arcadedb/tests/service_test.yaml +++ b/charts/arcadedb/tests/service_test.yaml @@ -25,6 +25,16 @@ tests: path: spec.selector["app.kubernetes.io/instance"] value: test + - it: only the http Service carries the component=http label + asserts: + - equal: + path: metadata.labels["app.kubernetes.io/component"] + value: http + documentIndex: 0 + - notExists: + path: metadata.labels["app.kubernetes.io/component"] + documentIndex: 1 + - it: client service type can be overridden to LoadBalancer set: service.http.type: LoadBalancer diff --git a/charts/arcadedb/tests/servicemonitor_test.yaml b/charts/arcadedb/tests/servicemonitor_test.yaml new file mode 100644 index 0000000..4984184 --- /dev/null +++ b/charts/arcadedb/tests/servicemonitor_test.yaml @@ -0,0 +1,100 @@ +suite: ServiceMonitor +templates: + - servicemonitor.yaml +release: + name: test + namespace: default +tests: + - it: is not rendered by default + asserts: + - hasDocuments: { count: 0 } + + - it: renders when enabled with the prometheus plugin + set: + arcadedb.plugins.prometheus.enabled: true + observability.metrics.prometheus.serviceMonitor.enabled: true + asserts: + - hasDocuments: { count: 1 } + - isKind: { of: ServiceMonitor } + - equal: { path: metadata.name, value: test-arcadedb } + - equal: + path: spec.selector.matchLabels["app.kubernetes.io/name"] + value: arcadedb + - equal: + path: spec.selector.matchLabels["app.kubernetes.io/component"] + value: http + - equal: { path: "spec.endpoints[0].port", value: http } + - equal: { path: "spec.endpoints[0].path", value: /prometheus } + - equal: { path: "spec.endpoints[0].interval", value: 30s } + + - it: scrapeTimeout omitted by default; annotations and relabelings pass through + set: + arcadedb.plugins.prometheus.enabled: true + observability.metrics.prometheus.serviceMonitor.enabled: 
true + observability.metrics.prometheus.serviceMonitor.annotations: + team: data + observability.metrics.prometheus.serviceMonitor.relabelings: + - sourceLabels: [__meta_kubernetes_pod_name] + targetLabel: pod + asserts: + - notExists: + path: spec.endpoints[0].scrapeTimeout + - equal: + path: metadata.annotations.team + value: data + - equal: + path: spec.endpoints[0].relabelings[0].targetLabel + value: pod + + - it: honours interval, scrapeTimeout, path, and extra labels + set: + arcadedb.plugins.prometheus.enabled: true + observability.metrics.prometheus.serviceMonitor.enabled: true + observability.metrics.prometheus.serviceMonitor.interval: 15s + observability.metrics.prometheus.serviceMonitor.scrapeTimeout: 10s + observability.metrics.prometheus.serviceMonitor.path: /metrics + observability.metrics.prometheus.serviceMonitor.labels: + release: kube-prometheus-stack + asserts: + - equal: { path: "spec.endpoints[0].interval", value: 15s } + - equal: { path: "spec.endpoints[0].scrapeTimeout", value: 10s } + - equal: { path: "spec.endpoints[0].path", value: /metrics } + - equal: + path: metadata.labels.release + value: kube-prometheus-stack + + - it: renders basicAuth referencing the supplied secret + set: + arcadedb.plugins.prometheus.enabled: true + observability.metrics.prometheus.serviceMonitor.enabled: true + observability.metrics.prometheus.serviceMonitor.basicAuth.enabled: true + observability.metrics.prometheus.serviceMonitor.basicAuth.secretName: scrape-creds + asserts: + - equal: + path: spec.endpoints[0].basicAuth.username.name + value: scrape-creds + - equal: + path: spec.endpoints[0].basicAuth.username.key + value: username + - equal: + path: spec.endpoints[0].basicAuth.password.name + value: scrape-creds + - equal: + path: spec.endpoints[0].basicAuth.password.key + value: password + + - it: fails when ServiceMonitor enabled without the prometheus plugin + set: + observability.metrics.prometheus.serviceMonitor.enabled: true + asserts: + - 
failedTemplate: + errorPattern: "require arcadedb.plugins.prometheus.enabled=true" + + - it: fails when basicAuth enabled without a secretName + set: + arcadedb.plugins.prometheus.enabled: true + observability.metrics.prometheus.serviceMonitor.enabled: true + observability.metrics.prometheus.serviceMonitor.basicAuth.enabled: true + asserts: + - failedTemplate: + errorPattern: "requires basicAuth.secretName" From b16f1b218aca134e5632f823ee3de468b93fa3a7 Mon Sep 17 00:00:00 2001 From: robfrank Date: Wed, 17 Jun 2026 19:09:18 +0200 Subject: [PATCH 07/12] feat(helm): default liveness to /api/v1/health; bump to 26.7.1 --- charts/arcadedb/Chart.yaml | 4 ++-- charts/arcadedb/tests/statefulset_test.yaml | 8 ++++---- charts/arcadedb/values.yaml | 4 +++- 3 files changed, 9 insertions(+), 7 deletions(-) diff --git a/charts/arcadedb/Chart.yaml b/charts/arcadedb/Chart.yaml index b3aa97f..7cbb077 100644 --- a/charts/arcadedb/Chart.yaml +++ b/charts/arcadedb/Chart.yaml @@ -5,8 +5,8 @@ description: | type: application -version: 26.6.1 +version: 26.7.1 -appVersion: "26.6.1" +appVersion: "26.7.1" annotations: artifacthub.io/repositoryID: "fb85acb7-fb5b-4572-b44b-374a2b52658d" diff --git a/charts/arcadedb/tests/statefulset_test.yaml b/charts/arcadedb/tests/statefulset_test.yaml index 45e01cd..0fbcaeb 100644 --- a/charts/arcadedb/tests/statefulset_test.yaml +++ b/charts/arcadedb/tests/statefulset_test.yaml @@ -47,7 +47,7 @@ tests: asserts: - equal: path: "spec.template.spec.containers[0].image" - value: arcadedata/arcadedb:26.6.1 + value: arcadedata/arcadedb:26.7.1 - it: image.tag override wins over AppVersion default set: @@ -64,7 +64,7 @@ tests: asserts: - equal: path: "spec.template.spec.containers[0].image" - value: my-registry.example.com/arcadedb-fork:26.6.1 + value: my-registry.example.com/arcadedb-fork:26.7.1 - it: image.pullPolicy default is IfNotPresent and is overridable asserts: @@ -101,11 +101,11 @@ tests: path: "spec.template.spec.containers[0].ports" content: { name: 
rpc, containerPort: 5555, protocol: TCP } - - it: liveness and readiness probes default to /api/v1/ready on http port + - it: liveness probe defaults to /api/v1/health and readiness to /api/v1/ready asserts: - equal: path: "spec.template.spec.containers[0].livenessProbe.httpGet.path" - value: /api/v1/ready + value: /api/v1/health - equal: path: "spec.template.spec.containers[0].livenessProbe.httpGet.port" value: http diff --git a/charts/arcadedb/values.yaml b/charts/arcadedb/values.yaml index bafa1d8..8e853c8 100644 --- a/charts/arcadedb/values.yaml +++ b/charts/arcadedb/values.yaml @@ -165,9 +165,11 @@ resources: {} # memory: 4Gi # no CPU limit - avoids throttling JVM GC pauses ## @section livenessProbe +## Liveness uses the dependency-free /api/v1/health endpoint (no DB I/O, +## never returns 503). Requires ArcadeDB 26.7.1+. livenessProbe: httpGet: - path: /api/v1/ready + path: /api/v1/health port: http ## @section readinessProbe From 6ea455692e9ea11c8d581241f6be20ad0d29fb5b Mon Sep 17 00:00:00 2001 From: robfrank Date: Wed, 17 Jun 2026 19:12:53 +0200 Subject: [PATCH 08/12] test(integration): assert /api/v1/health liveness returns 204 --- ci/integration-test.sh | 42 +++++++++++++++++++++++++++++++----------- 1 file changed, 31 insertions(+), 11 deletions(-) diff --git a/ci/integration-test.sh b/ci/integration-test.sh index 76f0e7a..6d2d7bb 100755 --- a/ci/integration-test.sh +++ b/ci/integration-test.sh @@ -41,6 +41,18 @@ api() { # api [body] fi } +assert_health_204() { # assert_health_204 + local port=$1 code + code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 \ + "http://localhost:${port}/api/v1/health" || true) + if [[ "$code" != "204" ]]; then + echo "ERROR: /api/v1/health returned ${code}, expected 204" + return 1 + fi + echo " /api/v1/health -> 204 (unauthenticated liveness OK)" + return 0 +} + cleanup() { [[ -n "${PF_PID:-}" ]] && { kill "$PF_PID" 2>/dev/null || true; } } @@ -131,21 +143,29 @@ PASSWORD=$(kubectl get secret 
arcadedb-credentials-secret \ # ── phase 1: pod readiness ──────────────────────────────────────────────────── -echo "==> [1/6] Waiting for StatefulSet rollout (timeout ${ROLLOUT_TIMEOUT}s)..." +echo "==> [1/7] Waiting for StatefulSet rollout (timeout ${ROLLOUT_TIMEOUT}s)..." kubectl rollout status statefulset/"$RELEASE" \ -n "$NAMESPACE" --timeout="${ROLLOUT_TIMEOUT}s" echo " All 3 pods Ready." -# ── phase 2: raft formation ─────────────────────────────────────────────────── +# ── phase 2: liveness health probe ──────────────────────────────────────────── + +echo "==> [2/7] Asserting /api/v1/health liveness endpoint..." +PF_PID=$(pf_start 0 "$HTTP_PORT") +pf_wait "$HTTP_PORT" || { echo "ERROR: port-forward to pod-0 failed"; exit 1; } +assert_health_204 "$HTTP_PORT" || exit 1 +pf_stop "$PF_PID" + +# ── phase 3: raft formation ─────────────────────────────────────────────────── -echo "==> [2/6] Checking Raft leader consensus (timeout ${RAFT_TIMEOUT}s)..." +echo "==> [3/7] Checking Raft leader consensus (timeout ${RAFT_TIMEOUT}s)..." assert_quorum_n 3 || exit 1 -# ── phase 3: write ──────────────────────────────────────────────────────────── +# ── phase 4: write ──────────────────────────────────────────────────────────── # LEADER_ORDINAL is set by assert_quorum_n above. -echo "==> [3/6] Writing test data via leader pod-${LEADER_ORDINAL}..." +echo "==> [4/7] Writing test data via leader pod-${LEADER_ORDINAL}..." PF_PID=$(pf_start "$LEADER_ORDINAL" "$HTTP_PORT") pf_wait "$HTTP_PORT" || { echo "ERROR: port-forward to leader pod-${LEADER_ORDINAL} failed"; exit 1; } @@ -163,9 +183,9 @@ api "$HTTP_PORT" POST /api/v1/command/integration-test \ echo " Write complete." -# ── phase 4: read and assert ────────────────────────────────────────────────── +# ── phase 5: read and assert ────────────────────────────────────────────────── -echo "==> [4/6] Reading back test data..." +echo "==> [5/7] Reading back test data..." 
RESULT=$(api "$HTTP_PORT" POST /api/v1/query/integration-test \ '{"language":"sql","command":"SELECT name FROM TestDoc WHERE name = '\''hello-kind'\''"}' \ | jq -r '.result[0].name // empty') || { @@ -182,9 +202,9 @@ fi echo " Got: '${RESULT}'" -# ── phase 5: STATUS column ──────────────────────────────────────────────────── +# ── phase 6: STATUS column ──────────────────────────────────────────────────── -echo "==> [5/6] Asserting STATUS=HEALTHY for all peers..." +echo "==> [6/7] Asserting STATUS=HEALTHY for all peers..." PF_PID=$(pf_start "$LEADER_ORDINAL" "$HTTP_PORT") pf_wait "$HTTP_PORT" || { echo "ERROR: port-forward to leader failed"; exit 1; } @@ -192,9 +212,9 @@ cluster_status_assert_healthy "$HTTP_PORT" || exit 1 pf_stop "$PF_PID" -# ── phase 6: leadership transfer ────────────────────────────────────────────── +# ── phase 7: leadership transfer ────────────────────────────────────────────── -echo "==> [6/6] Transferring Raft leadership..." +echo "==> [7/7] Transferring Raft leadership..." PF_PID=$(pf_start "$LEADER_ORDINAL" "$HTTP_PORT") pf_wait "$HTTP_PORT" || { echo "ERROR: port-forward to leader failed"; exit 1; } From e75014e4c8028e1617745550a1533747e1341066 Mon Sep 17 00:00:00 2001 From: robfrank Date: Wed, 17 Jun 2026 19:17:36 +0200 Subject: [PATCH 09/12] docs: document observability configuration --- README.md | 40 ++++++++++++++++++++++++++++++++ charts/arcadedb/README.md | 49 ++++++++++++++++++++++++++++++++++++++- 2 files changed, 88 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index b07890b..9348f21 100644 --- a/README.md +++ b/README.md @@ -16,6 +16,46 @@ helm install my-arcadedb arcadedb/arcadedb See [charts/arcadedb/README.md](charts/arcadedb/README.md) and [charts/arcadedb/values.yaml](charts/arcadedb/values.yaml) for all available options. +## Observability + +ArcadeDB 26.7.1+ exposes opt-in, behavior-preserving observability. All knobs +default off. 
+ +**Prometheus scraping (Operator):** + +```bash +helm install my-arcadedb arcadedb/arcadedb \ + --set arcadedb.plugins.prometheus.enabled=true \ + --set arcadedb.plugins.prometheus.requireAuthentication=false \ + --set observability.metrics.prometheus.serviceMonitor.enabled=true \ + --set observability.metrics.prometheus.serviceMonitor.labels.release=kube-prometheus-stack +``` + +For non-Operator Prometheus, use annotation discovery instead: +`--set observability.metrics.prometheus.podAnnotations.enabled=true`. + +**OTLP metrics export** (alongside /prometheus) — append to the install/upgrade command: + +```bash +--set observability.metrics.otlp.enabled=true \ +--set observability.metrics.otlp.endpoint=http://otel-collector:4317 +``` + +**Distributed tracing** — append to the install/upgrade command: + +```bash +--set observability.tracing.enabled=true \ +--set observability.tracing.endpoint=http://otel-collector:4317 \ +--set observability.tracing.samplingRate=0.1 +``` + +**Structured JSON logging:** `--set observability.logging.format=json` + +The liveness probe uses the dependency-free `/api/v1/health` endpoint; +readiness stays on `/api/v1/ready`. Set +`observability.health.readinessRequiresHA=true` to gate readiness on Raft +membership in HA clusters. 
+ ## Development Run checks locally: diff --git a/charts/arcadedb/README.md b/charts/arcadedb/README.md index d65768a..4bee9f0 100644 --- a/charts/arcadedb/README.md +++ b/charts/arcadedb/README.md @@ -120,7 +120,7 @@ The command removes all the Kubernetes components associated with the chart and | Name | Description | Value | |------------------------------|-------------|-----------------| -| `livenessProbe.httpGet.path` | | `/api/v1/ready` | +| `livenessProbe.httpGet.path` | | `/api/v1/health` | | `livenessProbe.httpGet.port` | | `http` | ### This is to setup the liveness and readiness probes more information can be found here: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/ @@ -172,6 +172,53 @@ The command removes all the Kubernetes components associated with the chart and | `affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution[0].weight` | | `100` | | `extraManifests` | - Include any amount of extra arbitrary manifests | `{}` | +### observability + +Opt-in, behavior-preserving observability (ArcadeDB 26.7.1+). Every knob below defaults off; existing deployments are unchanged. + +### observability.metrics + +| Name | Description | Value | +|-----------------------------------------------------------------------|----------------------------------------------------------|--------------------------| +| `observability.metrics.prometheus.serviceMonitor.enabled` | Create a Prometheus Operator ServiceMonitor | `false` | +| `observability.metrics.prometheus.serviceMonitor.interval` | Scrape interval | `30s` | +| `observability.metrics.prometheus.serviceMonitor.scrapeTimeout` | Scrape timeout (empty = Prometheus default) | `""` | +| `observability.metrics.prometheus.serviceMonitor.path` | Metrics path | `/prometheus` | +| `observability.metrics.prometheus.serviceMonitor.labels` | Extra labels (e.g. 
release: kube-prometheus-stack) | `{}` | +| `observability.metrics.prometheus.serviceMonitor.annotations` | Extra annotations | `{}` | +| `observability.metrics.prometheus.serviceMonitor.relabelings` | Prometheus relabelings | `[]` | +| `observability.metrics.prometheus.serviceMonitor.metricRelabelings` | Prometheus metric relabelings | `[]` | +| `observability.metrics.prometheus.serviceMonitor.basicAuth.enabled` | Scrape with basic auth | `false` | +| `observability.metrics.prometheus.serviceMonitor.basicAuth.secretName` | Secret with scrape credentials (username + password keys) | `""` | +| `observability.metrics.prometheus.serviceMonitor.basicAuth.usernameKey` | Secret key holding the username | `username` | +| `observability.metrics.prometheus.serviceMonitor.basicAuth.passwordKey` | Secret key holding the password | `password` | +| `observability.metrics.prometheus.podAnnotations.enabled` | Add prometheus.io/* scrape annotations to pods | `false` | +| `observability.metrics.prometheus.podAnnotations.path` | Scrape path annotation value | `/prometheus` | +| `observability.metrics.prometheus.podAnnotations.port` | Scrape port (empty = service.http.port) | `""` | +| `observability.metrics.otlp.enabled` | Enable the OTLP metrics registry | `false` | +| `observability.metrics.otlp.endpoint` | OTLP/gRPC metrics endpoint | `http://localhost:4317` | + +### observability.tracing + +| Name | Description | Value | +|-----------------------------------|--------------------------------------|-------------------------| +| `observability.tracing.enabled` | Enable distributed tracing | `false` | +| `observability.tracing.endpoint` | OTLP/gRPC trace endpoint | `http://localhost:4317` | +| `observability.tracing.samplingRate` | Parent-based sampling ratio [0.0, 1.0] | `0.0` | + +### observability.logging + +| Name | Description | Value | +|-------------------------------------|----------------------------------------------------------|--------| +| `observability.logging.format` 
| Log format: text or json | `text` | +| `observability.logging.includeTrace` | Append [traceId=…] to text logs while a trace is active | `false` | + +### observability.health + +| Name | Description | Value | +|--------------------------------------------|---------------------------------------------------|---------| +| `observability.health.readinessRequiresHA` | /api/v1/ready waits for Raft join on HA clusters | `false` | + Specify each parameter using the `--set key=value[,key=value]` argument to `helm install`. For example: ```bash From 9e17bbb9507f6f50481fbd0fb9326178ed18ad6e Mon Sep 17 00:00:00 2001 From: robfrank Date: Thu, 18 Jun 2026 08:53:16 +0200 Subject: [PATCH 10/12] fix(helm): nil-safe navigation of observability values Guards the observability helpers and ServiceMonitor template against an explicitly nulled observability map/sub-map (e.g. --set observability=null), which panicked with a nil pointer dereference. Normal partial overrides and omissions were already safe via Helm's value coalescing; this covers the explicit-null edge. Adds regression tests. Co-Authored-By: Claude Opus 4.8 (1M context) --- charts/arcadedb/templates/_helpers.tpl | 30 +++++++++++-------- charts/arcadedb/templates/servicemonitor.yaml | 17 ++++++----- charts/arcadedb/tests/observability_test.yaml | 25 ++++++++++++++++ .../arcadedb/tests/servicemonitor_test.yaml | 6 ++++ 4 files changed, 58 insertions(+), 20 deletions(-) diff --git a/charts/arcadedb/templates/_helpers.tpl b/charts/arcadedb/templates/_helpers.tpl index 134eb14..3fcb56f 100644 --- a/charts/arcadedb/templates/_helpers.tpl +++ b/charts/arcadedb/templates/_helpers.tpl @@ -181,23 +181,27 @@ Observability -D JVM args (logging, OTLP metrics, tracing, readiness). All opt-in; emits nothing when defaults are unchanged. 
*/}} {{- define "arcadedb.observability.args" -}} -{{- $o := .Values.observability -}} -{{- if eq $o.logging.format "json" }} +{{- $o := .Values.observability | default dict -}} +{{- $logging := $o.logging | default dict -}} +{{- $otlp := (($o.metrics | default dict).otlp) | default dict -}} +{{- $tracing := $o.tracing | default dict -}} +{{- $health := $o.health | default dict -}} +{{- if eq $logging.format "json" }} - -Darcadedb.server.logFormat=json {{- end }} -{{- if $o.logging.includeTrace }} +{{- if $logging.includeTrace }} - -Darcadedb.server.logIncludeTrace=true {{- end }} -{{- if $o.metrics.otlp.enabled }} +{{- if $otlp.enabled }} - -Darcadedb.serverMetrics.otlp.enabled=true -- -Darcadedb.serverMetrics.otlp.endpoint={{ $o.metrics.otlp.endpoint }} +- -Darcadedb.serverMetrics.otlp.endpoint={{ $otlp.endpoint }} {{- end }} -{{- if $o.tracing.enabled }} +{{- if $tracing.enabled }} - -Darcadedb.serverMetrics.tracing.enabled=true -- -Darcadedb.serverMetrics.tracing.endpoint={{ $o.tracing.endpoint }} -- -Darcadedb.serverMetrics.tracing.samplingRate={{ $o.tracing.samplingRate }} +- -Darcadedb.serverMetrics.tracing.endpoint={{ $tracing.endpoint }} +- -Darcadedb.serverMetrics.tracing.samplingRate={{ $tracing.samplingRate }} {{- end }} -{{- if $o.health.readinessRequiresHA }} +{{- if $health.readinessRequiresHA }} - -Darcadedb.server.readinessRequiresHA=true {{- end }} {{- end -}} @@ -207,8 +211,10 @@ Guard: scrape discovery (ServiceMonitor or pod annotations) needs the prometheus plugin so /prometheus is actually served. 
*/}} {{- define "arcadedb.observability.validate" -}} -{{- $p := .Values.observability.metrics.prometheus -}} -{{- if or $p.serviceMonitor.enabled $p.podAnnotations.enabled -}} +{{- $p := (((.Values.observability | default dict).metrics | default dict).prometheus) | default dict -}} +{{- $serviceMonitor := $p.serviceMonitor | default dict -}} +{{- $podAnnotations := $p.podAnnotations | default dict -}} +{{- if or $serviceMonitor.enabled $podAnnotations.enabled -}} {{- $promEnabled := false -}} {{- with .Values.arcadedb.plugins.prometheus -}} {{- if .enabled -}}{{- $promEnabled = true -}}{{- end -}} @@ -225,7 +231,7 @@ annotations. Returns YAML (possibly empty). */}} {{- define "arcadedb.podAnnotations" -}} {{- $annotations := deepCopy (default dict .Values.podAnnotations) -}} -{{- $pa := .Values.observability.metrics.prometheus.podAnnotations -}} +{{- $pa := (((((.Values.observability | default dict).metrics | default dict).prometheus) | default dict).podAnnotations) | default dict -}} {{- if $pa.enabled -}} {{- $port := int .Values.service.http.port -}} {{- if $pa.port -}}{{- $port = int $pa.port -}}{{- end -}} diff --git a/charts/arcadedb/templates/servicemonitor.yaml b/charts/arcadedb/templates/servicemonitor.yaml index fff4c02..166f889 100644 --- a/charts/arcadedb/templates/servicemonitor.yaml +++ b/charts/arcadedb/templates/servicemonitor.yaml @@ -1,7 +1,8 @@ -{{- if .Values.observability.metrics.prometheus.serviceMonitor.enabled }} +{{- $sm := (((((.Values.observability | default dict).metrics | default dict).prometheus) | default dict).serviceMonitor) | default dict -}} +{{- if $sm.enabled }} {{- include "arcadedb.observability.validate" . 
-}} -{{- $sm := .Values.observability.metrics.prometheus.serviceMonitor -}} -{{- if and $sm.basicAuth.enabled (not $sm.basicAuth.secretName) -}} +{{- $basicAuth := $sm.basicAuth | default dict -}} +{{- if and $basicAuth.enabled (not $basicAuth.secretName) -}} {{- fail "serviceMonitor.basicAuth.enabled requires basicAuth.secretName (a secret with username + password keys)" -}} {{- end -}} apiVersion: monitoring.coreos.com/v1 @@ -29,14 +30,14 @@ spec: {{- with $sm.scrapeTimeout }} scrapeTimeout: {{ . }} {{- end }} - {{- if $sm.basicAuth.enabled }} + {{- if $basicAuth.enabled }} basicAuth: username: - name: {{ $sm.basicAuth.secretName }} - key: {{ $sm.basicAuth.usernameKey }} + name: {{ $basicAuth.secretName }} + key: {{ $basicAuth.usernameKey }} password: - name: {{ $sm.basicAuth.secretName }} - key: {{ $sm.basicAuth.passwordKey }} + name: {{ $basicAuth.secretName }} + key: {{ $basicAuth.passwordKey }} {{- end }} {{- with $sm.relabelings }} relabelings: diff --git a/charts/arcadedb/tests/observability_test.yaml b/charts/arcadedb/tests/observability_test.yaml index 883fe7e..648be0d 100644 --- a/charts/arcadedb/tests/observability_test.yaml +++ b/charts/arcadedb/tests/observability_test.yaml @@ -87,6 +87,31 @@ tests: path: spec.template.spec.containers[0].command content: "-Darcadedb.server.readinessRequiresHA=true" + - it: renders without panic when observability is explicitly null + set: + observability: null + asserts: + - notContains: + path: spec.template.spec.containers[0].command + content: "-Darcadedb.server.logFormat=json" + - equal: + path: spec.template.spec.containers[0].livenessProbe.httpGet.path + value: /api/v1/health + + - it: renders without panic when observability sub-maps are explicitly null + set: + observability.logging: null + observability.metrics: null + observability.tracing: null + observability.health: null + asserts: + - notContains: + path: spec.template.spec.containers[0].command + content: "-Darcadedb.server.readinessRequiresHA=true" + - 
notContains: + path: spec.template.spec.containers[0].command + content: "-Darcadedb.serverMetrics.otlp.enabled=true" + - it: scrape pod annotations render when enabled with the prometheus plugin set: arcadedb.plugins.prometheus.enabled: true diff --git a/charts/arcadedb/tests/servicemonitor_test.yaml b/charts/arcadedb/tests/servicemonitor_test.yaml index 4984184..72a7df3 100644 --- a/charts/arcadedb/tests/servicemonitor_test.yaml +++ b/charts/arcadedb/tests/servicemonitor_test.yaml @@ -9,6 +9,12 @@ tests: asserts: - hasDocuments: { count: 0 } + - it: renders nothing without panic when observability is explicitly null + set: + observability: null + asserts: + - hasDocuments: { count: 0 } + - it: renders when enabled with the prometheus plugin set: arcadedb.plugins.prometheus.enabled: true From 5b5308e08232b22a78ce03340c6268f1c1ec0e98 Mon Sep 17 00:00:00 2001 From: robfrank Date: Thu, 18 Jun 2026 09:24:47 +0200 Subject: [PATCH 11/12] ci: temporarily test integration against latest (pre-26.7.1 release) The chart appVersion is bumped to 26.7.1 ahead of the image publish, so the default image.tag=appVersion is not pullable yet and the integration job's helm install --wait times out. Point the PR integration job at the rolling `latest` tag (== 26.7.1 content) until arcadedata/arcadedb:26.7.1 is published. REVERT this `with:` override at release so the job tests the pinned appVersion again (see f3db108). Co-Authored-By: Claude Opus 4.8 (1M context) --- .github/workflows/lint.yml | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/.github/workflows/lint.yml b/.github/workflows/lint.yml index 3c43f91..3781c22 100644 --- a/.github/workflows/lint.yml +++ b/.github/workflows/lint.yml @@ -40,3 +40,12 @@ jobs: integration: uses: ./.github/workflows/integration-reusable.yml + # TEMPORARY (revert at 26.7.1 release): the chart's appVersion is bumped to + # 26.7.1 ahead of the image publish, so the default image.tag=appVersion + # cannot be pulled yet. 
Test against the rolling `latest` tag (== 26.7.1 + # content) until arcadedata/arcadedb:26.7.1 is published, then delete this + # `with:` block so the PR integration job tests the pinned appVersion again + # (see commit f3db108 and PR #12). + with: + imageTag: latest + pullPolicy: Always From 903596216c08077d5a06d84f7f5f49e0b2016103 Mon Sep 17 00:00:00 2001 From: robfrank Date: Thu, 18 Jun 2026 09:41:16 +0200 Subject: [PATCH 12/12] docs: note temporary latest pin + 26.7.1 revert in release checklist Document that appVersion can be bumped ahead of the image publish, that the PR integration job is then temporarily pinned to `latest`, and that the override must be removed once the pinned image ships. Adds a pending callout for the current 26.7.1 state. Co-Authored-By: Claude Opus 4.8 (1M context) --- README.md | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 9348f21..a7c0426 100644 --- a/README.md +++ b/README.md @@ -87,9 +87,22 @@ When a new ArcadeDB version is released: 2. Update the pinned image literal in `charts/arcadedb/tests/statefulset_test.yaml` to the new version, or `helm-unittest` will fail (it cannot reference `Chart.AppVersion` in an assertion). -3. The latest-image guard needs no change — it keeps watching the next cycle's +3. If `appVersion` was bumped **ahead of** the matching image being published + (so a feature can ship as soon as the image lands), the PR integration job — + which installs with `image.tag=appVersion` — cannot pull the image and will + time out. As a stopgap, the integration job in `.github/workflows/lint.yml` + carries a temporary `with: { imageTag: latest, pullPolicy: Always }` override + (`latest` tracks the upcoming release). Once the pinned image is published, + **remove that override** so the PR job tests the pinned `appVersion` again. +4. The latest-image guard needs no change — it keeps watching the next cycle's rolling image. 
+> **Pending — 26.7.1:** the chart is already at `appVersion: 26.7.1` (the
+> observability feature shipped ahead of the image), and the integration job is
+> temporarily pinned to `latest` per step 3. When `arcadedata/arcadedb:26.7.1`
+> is published, remove the `with:` override in `.github/workflows/lint.yml`,
+> re-run CI, and delete this note. Steps 1–2 are already done for this release.
+
 ## Release
 
 New chart versions are published via the GitHub Actions Release workflow: