docs(etcd-encryption): document global cluster deterministic installation #773

Open
timonwong wants to merge 2 commits into master from docs/etcd-encryption-global-deterministic
@timonwong
Contributor

@timonwong timonwong commented May 11, 2026

Summary

  • Restructure docs/en/configure/clusters/etcd-encryption.mdx to support the new global-cluster deterministic key strategy alongside the existing workload/DCS random strategy. Removes the stale "Not supported: global cluster" note.
  • Add a Key Strategies section that compares random vs. deterministic and clarifies that Active/Standby roles are detected at runtime, not configured manually.
  • Document the three UI-exposed plugin parameters (key_strategy, replication_group_id, root_secret_name), how to generate a replication_group_id, and how to prepare the root Secret in two explicit steps with an entropy guideline.
  • Surface DR-pair pitfalls as warning callouts: install on both clusters in deterministic mode with identical parameters, etcd Synchronizer v4.3.7+, identical key material across the replication group, and a "use the same value on both clusters" reminder for replication_group_id.
  • Add a collapsed Advanced block covering chart-only fields (activationPolicy, approvalDelay, derivationAlgorithm, dimension, rootSecretRef.namespace) that are not exposed in the plugin form.
  • Append a paragraph in How it Works describing how deterministic mode interacts with the etcd Synchronizer (SeedBundle generation/replication, Standby derivation, auto role detection).
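
The "identical parameters on both clusters" requirement can be checked mechanically before installation. The following is a minimal sketch, assuming each cluster's three plugin parameters have been written to a `key=value` file; that file format and the `params_match` helper are hypothetical and are not part of the plugin or the documented procedure.

```bash
#!/usr/bin/env bash
# Sketch only: the key=value file format and function name are hypothetical.

# Compare the three plugin parameters recorded for each cluster.
# Prints every parameter that is missing or differs; returns non-zero on any mismatch.
params_match() {
  local active_file="$1" standby_file="$2" key a b rc=0
  for key in key_strategy replication_group_id root_secret_name; do
    a="$(grep -E "^${key}=" "$active_file" | cut -d= -f2-)"
    b="$(grep -E "^${key}=" "$standby_file" | cut -d= -f2-)"
    if [ -z "$a" ] || [ "$a" != "$b" ]; then
      echo "mismatch: $key (active='$a' standby='$b')" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Demonstration with throwaway files; the replication_group_id is a made-up value.
tmp="$(mktemp -d)"
cat > "$tmp/active.env" <<'EOF'
key_strategy=deterministic
replication_group_id=0f8b2c1e-5d3a-4e7b-9a6c-1d2e3f4a5b6c
root_secret_name=etcd-derivation-root
EOF
cp "$tmp/active.env" "$tmp/standby.env"
params_match "$tmp/active.env" "$tmp/standby.env" && echo "parameters match"
rm -rf "$tmp"
```

A check like this only covers parameter drift; it says nothing about whether the root key material itself is identical on both clusters.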

Source of truth: the etcd-encryption-manager feat/deterministic-key-derivation branch (already merged to main upstream), with terminology updated by the parallel "rename master→root" change. Chart parameter and CRD field names in this PR use the new root* naming.

Out of scope: docs/en/install/global_dr.mdx — not modified.

Test plan

  • npx cspell --config cspell.config.js docs/en/configure/clusters/etcd-encryption.mdx — 0 issues
  • yarn lint via pre-commit hook — 0 errors, 0 warnings
  • Reviewer visually verifies the rendered page locally (yarn dev):
    • Key Strategies section appears above Installation
    • Anchor #global-deterministic resolves
    • All four <Directive type="warning"> callouts render
    • <details> advanced block collapses by default
    • All tables render with correct columns

…tion

The etcd Encryption Manager now supports global clusters via deterministic
key derivation. Restructure the page to introduce a "Key Strategies"
section and split installation into the existing workload/DCS path and a
new global cluster path that documents the three UI-exposed plugin
parameters (key_strategy, replication_group_id, master_secret_name),
master Secret preparation, and chart-only advanced fields.

Highlight DR-pair requirements (matching parameters across Active/Standby,
etcd Synchronizer v4.3.7+ baseline, same master key material on every
cluster in the replication group, sufficient entropy when generating the
master key) as warning callouts so misconfiguration that would silently
break failover is hard to miss.
@coderabbitai
Contributor

coderabbitai Bot commented May 11, 2026

Walkthrough

The etcd encryption documentation has been expanded from a brief installation overview to a comprehensive guide covering key strategy selection (random vs deterministic), cluster-type-specific installation requirements, deterministic mode prerequisites and parameter configuration, master Secret creation procedures, and Global DR pair behavioral details for deterministic key derivation and replication across failover scenarios.

Changes

etcd Encryption Configuration Guide

| Layer / File(s) | Summary |
| --- | --- |
| Key Strategies and Installation Guide (docs/en/configure/clusters/etcd-encryption.mdx) | New "Key Strategies" section explains random vs deterministic mode selection and when each applies. Installation requirements are now differentiated by cluster type (Workload/DCS vs Global with DR pairing expectations). Deterministic mode prerequisites, plugin parameters (replication_group_id, master_secret_name, fixed namespace), and master Secret creation steps in kube-system are documented. Includes explicit warnings to reuse identical key material across clusters. Advanced chart values section enumerates internal deterministic activation/approval timing and derivation settings. |
| How It Works: Global DR Deterministic Behavior (docs/en/configure/clusters/etcd-encryption.mdx) | "How it Works" section extended to describe deterministic-mode behavior for Global DR pairs: runtime Active/Standby detection, Active-side SeedBundle generation and replication, and Standby-side per-revision key derivation to ensure identical keys after failover. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

  • alauda/acp-docs#39: Both PRs modify etcd-encryption.mdx; this PR expands and replaces the initial documentation with deterministic mode, Global DR behavior, and advanced configuration details.
  • alauda/acp-docs#203: Both PRs address etcd encryption key handling for Global/DR scenarios, with this PR covering deterministic key replication and derivation across failover.

Suggested reviewers

  • chinameok

Poem

🐰 A documentation bloom so bright,
Encryption secrets keyed just right,
Random dances, deterministic reigns,
DR failovers, balanced chains—
From chaos springs order's delight!

🚥 Pre-merge checks | ✅ 5 passed

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly summarizes the main change: adding documentation for global cluster deterministic installation in etcd-encryption, which matches the core purpose of the PR. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

Align with the upstream etcd-encryption-manager API rename: masterSecretRef
→ rootSecretRef, master_secret_name → root_secret_name, default Secret
etcd-derivation-master → etcd-derivation-root, Secret data key master-key
→ root-key, and the example file master-key.bin → root-key.bin.

Drop the cspell ignore for "master" — it is no longer used in this file
and the project's inclusive-language rule (master → control plane) now
applies cleanly. Keep the cspell ignore for `urandom`, which is still
referenced via `/dev/urandom`.
@chinameok
Contributor

Review of the deterministic etcd encryption documentation. Overall this is a strong update — the binary random vs deterministic framing up front, the four <Directive type="warning"> callouts placed where they bite, and the explicit entropy guidance all make the page easy to follow and hard to misuse. A few items worth tightening, split by severity.

P1 — Root key file: secure transport channel is not stated

The Preparing the Root Secret section tells the operator to generate ./root-key.bin once and then transport the same file to the other cluster, but does not say which channel to use. root-key.bin is 32 bytes of long-term high-entropy key material — effectively equivalent to a cluster-wide CA secret. A reader following the doc literally could copy it via Slack DM, plain scp, email, or a shared NFS mount, undoing the careful CSPRNG generation step.

Suggest adding one sentence right after the existing key-material warning:

Transport root-key.bin only through a channel already trusted to carry long-term secrets — for example a secrets vault, sealed-secret, or an out-of-band encrypted file transfer. Do not commit it to source control, paste it into chat, or send it through unencrypted channels. Treat the file with the same care as a cluster CA private key.
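
To make the transport step verifiable, one option is to record a fingerprint of the key file at generation time and compare it after the file arrives on the other side. The following is a minimal sketch, assuming a Linux or macOS shell with `/dev/urandom`; the helper names `generate_root_key` and `key_fingerprint` are hypothetical and do not come from the page under review.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not from the documented procedure.

# Write 32 bytes from the OS CSPRNG to the given path.
# The umask ensures a newly created file is readable only by its owner (mode 0600).
generate_root_key() {
  local path="$1"
  (
    umask 077
    head -c 32 /dev/urandom > "$path"
  )
  # Fail if the read was short.
  [ "$(wc -c < "$path" | tr -d ' ')" -eq 32 ]
}

# Print the SHA-256 of a file as bare hex.
key_fingerprint() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | cut -d' ' -f1
  else
    shasum -a 256 "$1" | cut -d' ' -f1   # fallback where sha256sum is absent (e.g. macOS)
  fi
}

# Demonstration in a throwaway directory.
workdir="$(mktemp -d)"
generate_root_key "$workdir/root-key.bin"
echo "fingerprint: $(key_fingerprint "$workdir/root-key.bin")"
rm -rf "$workdir"
```

Comparing the fingerprint on the receiving side confirms the bytes arrived intact. It does not make an untrusted channel safe, so the channel guidance above still applies.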

P1 — Root key file: lifecycle after Secret creation is not stated

Once kubectl create secret has run on both clusters, the doc does not say what to do with the local ./root-key.bin. The Global DR topology is a fixed Active + Standby pair (no future expansion), so retaining the file on disk only adds an attack surface without any operational benefit. The deletion step belongs in the documented procedure, not in the reader's head.

Suggest adding step 3 to Preparing the Root Secret:

3. After confirming the Secret exists on **both** the Active and Standby global clusters, securely delete the local `./root-key.bin`:

   ```bash
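   shred -u ./root-key.bin   # Linux; overwrite is best-effort on SSDs and copy-on-write filesystems
   # or: rm -P ./root-key.bin    # macOS; on recent releases -P has no effect, so this is a plain delete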
   ```

   The Secret in `kube-system` is now the only copy of the root key material.
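
"Confirming the Secret exists" could be strengthened to confirming that each Secret holds the same bytes as the local file before that file is deleted. The following is a sketch under stated assumptions: the helper names and the kube context names are hypothetical, the Secret name `etcd-derivation-root` and data key `root-key` are the defaults named in this PR, and the live `kubectl` loop is shown only as a comment because it needs access to both clusters.

```bash
#!/usr/bin/env bash
# Sketch only: function names and kube context names are hypothetical.

# Read stdin and print its SHA-256 as bare hex.
sha256_hex() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum | cut -d' ' -f1
  else
    shasum -a 256 | cut -d' ' -f1
  fi
}

# Succeed only if a base64 value (as stored in Secret .data) decodes to the
# same bytes as the local file. Uses `base64 -d` (GNU coreutils, recent macOS).
secret_matches_file() {
  local b64_value="$1" file="$2" remote_sum local_sum
  remote_sum="$(printf '%s' "$b64_value" | base64 -d | sha256_hex)"
  local_sum="$(sha256_hex < "$file")"
  [ "$remote_sum" = "$local_sum" ]
}

# Intended use against a live DR pair (not run here; context names are placeholders):
#   for ctx in active-global standby-global; do
#     value="$(kubectl --context "$ctx" -n kube-system get secret etcd-derivation-root \
#                -o jsonpath='{.data.root-key}')"
#     secret_matches_file "$value" ./root-key.bin || { echo "mismatch on $ctx" >&2; exit 1; }
#   done

# Offline demonstration with a throwaway file standing in for the key.
tmp="$(mktemp -d)"
head -c 32 /dev/urandom > "$tmp/root-key.bin"
stored="$(base64 < "$tmp/root-key.bin" | tr -d '\n')"
if secret_matches_file "$stored" "$tmp/root-key.bin"; then
  echo "secret matches local file"
fi
rm -rf "$tmp"
```

Comparing hashes rather than raw values keeps the key out of shell variables that hold binary data, and a missing Secret (empty value) fails the comparison instead of passing silently.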

A related minor tightening: the warning currently reads transport the same file to every cluster in the replication group. ACP Global DR is a fixed 1-Active / 1-Standby pair, so the wording can be more specific to avoid implying that an N-node replication group is possible:

Transport the file to both the Active and Standby global clusters (Global DR is a fixed Active/Standby pair) and run the kubectl create secret step on each.

If keeping the generic wording for forward compatibility is intentional, ignore this nit.

P2 — Plugin Parameters overview table mixes literal defaults with UI behavior

| Plugin parameter | Required | Default | Purpose |
| --- | --- | --- | --- |
| `key_strategy` | Yes | `random`; dynamically defaults to `deterministic` for `global` clusters. | ... |
| `replication_group_id` | Yes | The plugin UI prefills the current cluster name (not recommended — see below). | ... |
| `root_secret_name` | Yes | `etcd-derivation-root` | ... |

Row 3 has a literal default. Rows 1 and 2 have narrative paragraphs in the same column. Scanning the column reads inconsistently. Suggest keeping only the literal value in Default and moving the dynamic / UI-prefill behavior into either the Purpose column or into the per-parameter H5 subsection below.

P2 — replication_group_id UI prefill contradicts the stated recommendation

The doc currently lists both:

  • Default in UI: Current cluster name
  • Recommended to keep the default: **No.**
  • do **not** reuse the cluster name even when the UI prefills it

A reader will reasonably ask: if the UI prefills a value that the doc says not to use, why is the UI doing that? Spell it out instead of leaving the contradiction implicit:

The UI prefills the current cluster name as a convenience for non-DR / single-cluster experimentation. Do not accept that default in any production or DR setup; replace it with the opaque identifier you generated above.
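
For readers who want to see what an opaque identifier of this kind could look like, the following is a minimal sketch. It is not the one-liner from the page under review: it reads 16 bytes from `/dev/urandom` with `od` and formats them in the 8-4-4-4-12 UUID layout. The function names are hypothetical, and `global` is only a placeholder for a prefilled cluster name.

```bash
#!/usr/bin/env bash
# Sketch only: function names are hypothetical; this is not the command from the PR.

# Print 16 random bytes from the OS CSPRNG as a UUID-shaped string (8-4-4-4-12 hex).
# The RFC 4122 version/variant bits are not set; the value is just an opaque identifier.
new_replication_group_id() {
  local hex
  hex="$(od -An -v -N16 -tx1 /dev/urandom | tr -d ' \n')"
  printf '%s-%s-%s-%s-%s\n' \
    "${hex:0:8}" "${hex:8:4}" "${hex:12:4}" "${hex:16:4}" "${hex:20:12}"
}

# Fail if the chosen ID is just the cluster name that the UI prefilled.
check_not_cluster_name() {
  local id="$1" cluster_name="$2"
  [ "$id" != "$cluster_name" ]
}

id="$(new_replication_group_id)"
check_not_cluster_name "$id" "global" && echo "replication_group_id: $id"
```

Generate the value once and use it on both clusters; running the generator separately on each side would produce two different IDs.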

P2 — Plugin Parameters overview table duplicates the per-parameter property tables

The 4-column overview table (Plugin parameter | Required | Default | Purpose) and the per-parameter 2-column | Property | Value | tables that follow restate Default and Purpose for each field. Not strictly wrong, but readers feel they read the same data twice.

Two consolidation options, author's pick:

A. Drop the overview table; keep only the per-parameter H5 + property tables.
B. Keep the overview table as the only tabular form; replace the per-parameter property tables with short prose paragraphs that elaborate beyond the overview cells.

P3 — Advanced section mentions a plugin-form field name despite saying the section is not exposed in the form

The Advanced block header is chart values, **not exposed in the plugin form**. Yet one row says:

The plugin form maps this as approval_delay_minutes (integer minutes) internally.

If approval_delay_minutes is genuinely not exposed in the plugin form, the sentence is contradictory and can be deleted. If a related field is exposed (e.g., through a YAML override path), it belongs in Plugin Parameters, not Advanced. Worth clarifying with the chart owner.

P3 — How it Works deterministic mode bullets read better as a narrative

The three bullets currently mix a meta point (role detection) with two role-specific points. Suggest one paragraph instead:

On install, the controller automatically detects whether it is running on the Active or the Standby cluster — no manual role configuration is required. On the Active side, the controller generates the SeedBundle and relies on the etcd Synchronizer to replicate it to the Standby. On the Standby side, it derives the same per-revision keys from the replicated SeedBundle and the local root Secret, so a failover produces identical encryption keys without manual key copy.
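
The property this paragraph relies on is that derivation is a pure function of the root key, the replicated seed, and the revision. The following toy illustration uses HMAC-SHA256 via `openssl dgst` as a stand-in; the manager's actual derivation algorithm, the SeedBundle format, and every name in the sketch are assumptions, not details taken from the PR.

```bash
#!/usr/bin/env bash
# Illustration only: HMAC-SHA256 stands in for the manager's real derivation,
# which this page does not specify. All names are hypothetical.

# Hex-encode a file without line breaks.
hex_of_file() { od -An -v -tx1 "$1" | tr -d ' \n'; }

# derive_key ROOT_KEY_FILE SEED REVISION -> prints a 32-byte key as hex.
derive_key() {
  local root_hex
  root_hex="$(hex_of_file "$1")"
  # Passing the key on the command line is acceptable only in a throwaway demo.
  printf '%s|%s' "$2" "$3" |
    openssl dgst -sha256 -mac HMAC -macopt "hexkey:${root_hex}" |
    awk '{print $NF}'
}

tmp="$(mktemp -d)"
head -c 32 /dev/urandom > "$tmp/active-root.bin"
cp "$tmp/active-root.bin" "$tmp/standby-root.bin"   # same root key on both sides
seed="example-seed-bundle"                           # stands in for the replicated SeedBundle

active="$(derive_key "$tmp/active-root.bin" "$seed" 7)"
standby="$(derive_key "$tmp/standby-root.bin" "$seed" 7)"
[ "$active" = "$standby" ] && echo "revision 7: keys match"
rm -rf "$tmp"
```

Changing any one input (root key, seed, or revision) yields a different key, which is why mismatched root key material on the two clusters would break decryption after failover.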

Non-blocking; the bullet form is also readable.

Cross-doc follow-up (the PR description explicitly marks this out of scope)

Two propagation targets worth opening follow-up issues for:

  1. docs/en/install/global_dr.mdx — the canonical Global DR install procedure. After this PR, that page should mention installing the etcd Encryption Manager plugin in deterministic mode as part of the DR pair setup, with a link to #global-deterministic in this file. A reader following the install runbook end-to-end currently has no signal that deterministic mode exists or is required.
  2. docs/en/upgrade/upgrade_global_cluster.mdx — the etcd Synchronizer v4.3.7+ prerequisite for deterministic mode currently lives only in this page. Customers on Sync ≤ v4.3.6 who want to enable deterministic encryption need that constraint visible in the upgrade path too.
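
The v4.3.7+ baseline in item 2 is the kind of constraint an upgrade runbook could gate on with a preflight check. The following is a minimal sketch, assuming the installed Synchronizer version is already known as a string; how it is obtained is left open, `version_ge` is a hypothetical helper, and `sort -V` requires GNU coreutils or a recent BSD/macOS `sort`.

```bash
#!/usr/bin/env bash
# Sketch only: how the installed Synchronizer version is obtained is left to the operator.

# Succeed when version $1 is greater than or equal to version $2.
# A leading "v" is ignored on both arguments.
version_ge() {
  local have="${1#v}" want="${2#v}"
  [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}

installed="v4.3.6"   # placeholder input for the demonstration
if version_ge "$installed" "v4.3.7"; then
  echo "etcd Synchronizer $installed meets the deterministic-mode baseline"
else
  echo "etcd Synchronizer $installed is older than v4.3.7; upgrade first" >&2
fi
```

Version sort is used instead of string comparison so that a release such as v4.10.0 is correctly treated as newer than v4.3.7.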

Confirmed-good notes

  • <Directive type="warning"> (JSX form) is used throughout rather than :::warning — matches the Doom MDX rule that ::: admonitions break inside JSX components ✅
  • Four <Directive> callouts are placed at the exact decision points (DR pair install, Synchronizer version, replication_group_id reuse, root key reuse) rather than batched at the top ✅
  • Entropy guidance is unambiguous (OS CSPRNG explicitly required; $RANDOM, rand(), time-seeded generators explicitly forbidden) ✅
  • <details> collapses the Advanced chart fields so the primary path stays uncluttered ✅
  • \{#global-deterministic} anchor matches existing repo conventions ✅
  • {/* cspell:ignore urandom */} declared up front rather than relying on a separate cspell config edit ✅
  • Concrete UUID generation one-liner (openssl rand -hex 16 | sed -E ...) — gives the reader a directly copyable command instead of "generate a UUID somehow" ✅
