Add secure file settings support for stateless Elasticsearch#9309
Add secure file settings support for stateless Elasticsearch#9309
Conversation
✅ Snyk checks have passed. No issues have been found so far.
💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse. |
8f549b5 to
17817f9
Compare
There was a problem hiding this comment.
Pull request overview
This PR adds support for delivering secure settings to stateless Elasticsearch via file-based settings (cluster_secrets) instead of the keystore init container used by stateful clusters, while ensuring the StackConfigPolicy (SCP) controller and ES controller can safely co-manage the same file-settings Secret.
Changes:
- Extend file-based settings state with
cluster_secretsand add stateless-aware empty settings behavior (exclude ILM by default). - Add secure-settings expansion (
BuildSecureSettingsData) to convert dotted keys into nestedcluster_secretsmaps. - Introduce
ReconcileClusterSecretsto managecluster_secretsindependently with hash-based change detection and preservation during SCP rebuild/reset flows. - Split reconciliation responsibility: keystore stays in the stateful driver; secure settings move to the stateless driver.
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/controller/stackconfigpolicy/controller.go | Passes stateless flag into file-settings secret generation; adjusts reset log message. |
| pkg/controller/remotecluster/keystore/keystore.go | Adds reusable helper to expose remote API key secret as a secure-settings source. |
| pkg/controller/elasticsearch/filesettings/secret.go | Adds stateless-aware secret generation and preserves cluster_secrets during SCP rebuilds. |
| pkg/controller/elasticsearch/filesettings/secret_test.go | Updates tests for new constructor signature; adds stateless preservation test. |
| pkg/controller/elasticsearch/filesettings/reconciler.go | Passes current secret into empty-secret reconcile to preserve cluster_secrets for stateless. |
| pkg/controller/elasticsearch/filesettings/reconciler_test.go | Fixes expectation: no update when secret content is unchanged. |
| pkg/controller/elasticsearch/filesettings/file_settings.go | Adds ClusterSecrets field; adds stateless flag; excludes ILM from empty stateless state. |
| pkg/controller/elasticsearch/filesettings/file_settings_test.go | Updates expectations for new empty-state signature; adds stateless ILM exclusion test. |
| pkg/controller/elasticsearch/filesettings/cluster_secrets.go | New reconciler for cluster_secrets field with hash/version handling. |
| pkg/controller/elasticsearch/filesettings/cluster_secrets_test.go | Tests for create/update/no-op/preserve/clear behaviors of ReconcileClusterSecrets. |
| pkg/controller/elasticsearch/elasticsearch_controller.go | Selects stateless vs stateful driver at runtime via es.IsStateless(). |
| pkg/controller/elasticsearch/driver/stateless/driver.go | Implements stateless secure settings delivery via cluster_secrets and secret watches. |
| pkg/controller/elasticsearch/driver/stateful/driver.go | Moves file-settings empty-secret logic + keystore reconciliation into stateful driver. |
| pkg/controller/elasticsearch/driver/shared/results.go | Removes keystore resources from shared reconcile state. |
| pkg/controller/elasticsearch/driver/shared/reconcile.go | Removes keystore + file-settings reconciliation from shared path; exports MaybeReconcileEmptyFileSettingsSecret. |
| pkg/controller/elasticsearch/driver/shared/reconcile_test.go | Renames test to match exported MaybeReconcileEmptyFileSettingsSecret. |
| pkg/controller/elasticsearch/driver/shared/reconcile_integration_test.go | Updates integration expectations now that file-settings secret isn’t created in shared reconcile. |
| pkg/controller/common/keystore/user_secret.go | Adds BuildSecureSettingsData helper to expand dotted keys into nested cluster_secrets structure. |
| pkg/controller/common/keystore/user_secret_test.go | Adds unit tests for BuildSecureSettingsData behavior (flat, dotted, merge, empty). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
0235d8a to
5af6685
Compare
|
|
||
| if err := filesettings.ReconcileSecret(ctx, r.Client, expectedSecret, &es); err != nil { | ||
| expectedVersion = candidateExpectedVersion | ||
| return filesettings.ReconcileSecret(ctx, r.Client, expectedSecret, &es) |
There was a problem hiding this comment.
We have a bug here, ReconcileResource does a fresh Get (so it has the latest resourceVersion), but the expected payload can be computed from older state.
6b8812d to
dc834b0
Compare
dc834b0 to
f265b37
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 19 out of 19 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
buildkite test this -f p=gke,t=TestStackConfigPolicy* |
f265b37 to
66bc613
Compare
|
buildkite test this -f p=gke,t=TestStackConfigPolicy* |
Stateless Elasticsearch delivers secure settings (object store credentials, API keys) via cluster_secrets in file-based settings, instead of the keystore init container used by stateful clusters. - Add ClusterSecrets field to SettingsState and IsStateless to Settings - Preserve cluster_secrets when SCP controller rebuilds the Secret - Add BuildSecureSettingsData to expand dotted keys into nested maps - Add ReconcileClusterSecrets to manage cluster_secrets independently - Wire buildClusterSecrets into shared reconciliation for stateless - Guard stateful keystore reconciliation with !IsStateless() - Exclude ILM policies from empty stateless settings state Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…fixes - Use MergeFrom patch with retry for soft owner removal (metadata-only, avoids sending full Data payload and handles conflicts) - Handle AlreadyExists in MaybeReconcileEmptyFileSettingsSecret (the goal is to ensure the Secret exists, not to own its creation) - Fix applyExpectedSecret comment: Data is replaced, not merged Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 20 out of 20 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…gs key - Reset() now delegates to ApplyPolicy(empty) which preserves cluster_secrets for stateless clusters instead of wiping them - Mark missing SettingsSecretKey as corrupted to prevent destructive overwrites from SetClusterSecrets - Remove unreachable IsNotFound check after Load (Load never returns NotFound) - Add tests for Reset behavior and missing settings key Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
d7980b8 to
2a55aa4
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 20 out of 20 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if remainingOwners == 0 { | ||
| log.Info("Resetting file settings Secret to empty, last policy owner removed", | ||
| "es_namespace", namespacedName.Namespace, "es_name", namespacedName.Name, | ||
| "owner_namespace", softOwner.Namespace, "owner_name", softOwner.Name) | ||
|
|
||
| esMeta := metadata.Propagate(&es, metadata.Metadata{Labels: eslabel.NewLabels(namespacedName)}) | ||
| fs, err := filesettings.Load(ctx, c, namespacedName, es.IsStateless(), esMeta) | ||
| if err != nil { | ||
| return err | ||
| } | ||
| if err := fs.Reset().Save(ctx, c, &es); err != nil && !apierrors.IsNotFound(err) { | ||
| return err | ||
| } | ||
| return nil | ||
| } | ||
|
|
||
| // Metadata-only patch: persist the soft owner removal without sending Secret data. | ||
| log.Info("Removing policy soft owner from file settings Secret", | ||
| "es_namespace", namespacedName.Namespace, "es_name", namespacedName.Name, | ||
| "owner_namespace", softOwner.Namespace, "owner_name", softOwner.Name, | ||
| "remaining_owners", remainingOwners) |
There was a problem hiding this comment.
These orphan-reset log lines use log.Info(...), while the rest of this controller uses log.V(1).Info(...) for routine reconciliation messages. This will increase log volume during normal StackConfigPolicy churn (policy add/remove) and may be noisy in production.
Consider downgrading these to V(1) (or matching whatever log level is used for similar reconciliation-path events in this controller).
There was a problem hiding this comment.
This will increase log volume during normal StackConfigPolicy churn (policy add/remove) and may be noisy in production.
I don't think SCPs are added or removed frequently enough to generate much noise. I thought this kind of log was useful?
|
buildkite test this -f p=gke,t=TestStackConfigPolicy* |
pebrc
left a comment
There was a problem hiding this comment.
Clean abstraction for the shared file-settings Secret. The Load → Mutate → Save pattern with WithAdditiveMetadata for the ES controller vs managed cleanup for SCP is well-designed and the concurrent access model is sound.
Two inline comments — the ILM guard for stateless and the unnecessary dotted-key expansion in BuildSecureSettingsData. Neither is blocking.
…-settings # Conflicts: # pkg/controller/elasticsearch/driver/shared/reconcile.go
ee4fc74 to
4e47ac9
Compare
Elasticsearch's SecureClusterStateSettings parses the inner map via Settings.fromXContent which accepts both flat dotted keys and nested objects. Expanding to nested maps added complexity and a failure mode when keys conflict at intermediate paths. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Merge with #9287 is not trivial, still working on it... |
…river Stateless clusters don't use the init container keystore (they deliver secure settings via cluster_secrets in file-based settings) and manage the file-settings Secret themselves via filesettings.ReconcileClusterSecrets. Running either in ReconcileSharedResources is useless work for stateless and, for file-settings, conflicts with SCP-deferral logic. Restore the original per-driver approach: the stateful driver explicitly calls MaybeReconcileEmptyFileSettingsSecret (version-gated) and its own reconcileKeystore (with managed keystore password and remote API keys). Both are stateful-only now. The stateless driver is untouched. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
dcddadd to
7100bef
Compare
ILM is managed server-side on stateless clusters; applying ILM via file settings has no effect and is confusing. When a StackConfigPolicy targeting a stateless cluster includes indexLifecyclePolicies, mark the ES resource status as Error and emit a validation event instead of silently writing ignored settings. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
buildkite test this -f p=gke,E2E_TAGS=es,t=TestFIPS -m s=9.4.0-SNAPSHOT [build : ✅ ] |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 23 out of 23 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
buildkite test this -f p=gke,t=TestStackConfigPolicy* [build: ✅ ] |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 23 out of 23 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Implements secure settings delivery for stateless Elasticsearch via file-based settings (
cluster_secrets), replacing the keystore init container used by stateful clusters.ClusterSecretsfield toSettingsStateandIsStatelessflag toSettingscluster_secretswhen the SCP controller rebuilds the file settings SecretBuildSecureSettingsDatato deliver keystore entries viacluster_secrets(dotted keys likes3.client.default.access_keyare passed through as-is; Elasticsearch'sSecureClusterStateSettingsaccepts flat dotted keys viaSettings.fromXContent)ReconcileClusterSecretsto managecluster_secretsindependently with hash-based change detectionfilesettings.Secretabstraction (Load/modify/Save) that unifies the reconciliation pathsReconcileClusterSecrets)indexLifecyclePoliciesin StackConfigPolicies targeting stateless clusters (ILM is managed server-side on stateless)Controller conflict handling
The file settings Secret (
settings.json) is shared between two controllers, each managing disjoint fields:Both the ES controller (
cluster_secrets) and the SCP controller (cluster_settings, etc.) write to the same file settings Secret. Each controller manages disjoint fields and preserves the other's work. Kubernetes optimistic concurrency (resourceVersion) handles races — the loser retries on the next reconciliation loop.cluster_secretsonlyfilesettings.Loadfilesettings.LoadSetClusterSecrets(patches one field)ApplyPolicy(replaces state, preservescluster_secrets)WithAdditiveMetadata(never remove SCP labels/annotations)retry.OnErroronConflict/AlreadyExistsretry.OnErroronConflict/AlreadyExistsRequired for #9204
Depends on #9290
🤖 Generated with Claude Code