How to deploy the operator into a cluster. Assumes you have admin on the target cluster (the operator installs cluster-scoped CRDs and ClusterRoles) and a registry you can push to.
For the operator's runtime behaviour see concepts; for day-2 operations see operations.
| Requirement | Note |
|---|---|
| Kubernetes | 1.29+ recommended: CEL CRD validation went GA in 1.29 and the quantity() CEL extension (used by two of the operator's validation rules) was added in 1.28. 1.28 may work in practice because the CEL gate was beta-on-by-default from 1.25, but is not covered by CI. |
Default StorageClass |
Each per-member PVC uses the namespace's default StorageClass. Override per-cluster via spec.storage.storageClassName (a string naming a specific StorageClass, or "" to disable dynamic provisioning entirely). Immutable post-create — Kubernetes PVCs cannot have their StorageClass swapped in place. |
| Go (build-from-source only) | 1.25+, matches go.mod's toolchain directive. |
| Docker / buildx (build-from-source only) | For producing the operator image. The Dockerfile uses golang:1.25.10 for the builder and gcr.io/distroless/static:nonroot for runtime. |
Workload-side: every etcd Pod runs as UID 65532 with runAsNonRoot=true, allowPrivilegeEscalation=false, all capabilities dropped, and seccompProfile=RuntimeDefault. The Pods comply with the restricted PodSecurity profile. If your cluster enforces a stricter policy, see controllers/etcdmember_controller.go's buildPod for the exact security context the operator emits and adjust accordingly.
Tagged releases publish a signed multi-arch operator image to GHCR and attach ready-to-apply install manifests to the GitHub release — no checkout, no build, no registry of your own. This is the recommended path for consuming a release.
# Pick a released version (see https://github.com/cozystack/etcd-operator/releases).
VERSION=v0.5.0
# Everything (CRDs + namespace + RBAC + controller Deployment + Service):
kubectl apply -f https://github.com/cozystack/etcd-operator/releases/download/$VERSION/etcd-operator.yamlThe manifest already points the manager (and its OPERATOR_IMAGE, used for
snapshot/restore Pods) at ghcr.io/cozystack/etcd-operator:$VERSION — the same
tag whose image the release published, so there is nothing to substitute.
If you split CRDs from the rest (e.g. CRDs are applied by a separate cluster-admin step, or server-side-applied to dodge the annotation size limit):
kubectl apply --server-side -f https://github.com/cozystack/etcd-operator/releases/download/$VERSION/etcd-operator.crds.yaml
kubectl apply -f https://github.com/cozystack/etcd-operator/releases/download/$VERSION/etcd-operator.non-crds.yamlThe image is cosign-signed (keyless). To verify before deploying:
cosign verify ghcr.io/cozystack/etcd-operator:$VERSION \
--certificate-identity-regexp 'https://github.com/cozystack/etcd-operator/.github/workflows/.+' \
--certificate-oidc-issuer https://token.actions.githubusercontent.comTo pull a prebuilt image without the release manifests (e.g. to feed your own
overlay), the image ref is ghcr.io/cozystack/etcd-operator:<tag>.
Helm is the primary install path: the chart is the single source of truth for
the CRDs, RBAC, and the manager Deployment (the release manifests below are just
helm template of this same chart). Tagged releases publish it as an OCI Helm
chart to GHCR (ghcr.io/cozystack/charts/etcd-operator), versioned from the
same tag — the chart version is the tag without the leading v, and
appVersion keeps the v. The CRDs are generated straight into the chart and
templated into the release.
# Chart version == release tag without the leading 'v'
# (see https://github.com/cozystack/etcd-operator/releases).
VERSION=0.5.0
helm install etcd-operator oci://ghcr.io/cozystack/charts/etcd-operator \
--version "$VERSION" \
--namespace etcd-operator-system --create-namespaceBy default the chart pulls ghcr.io/cozystack/etcd-operator:<appVersion> — the
image that same release published — so a stock install has nothing to
substitute. The chart wires that ref into both the manager's image: and
its OPERATOR_IMAGE env var (the image launched for snapshot/restore Pods); the
two must be identical, and the chart keeps them equal for you. Override the
image via image.repository / image.tag and both follow.
The CRDs are templated into the release (not in Helm's install-only crds/
directory), so helm upgrade keeps them current with the chart — no separate
CRD-apply step on upgrade. They carry helm.sh/resource-policy: keep, so helm uninstall leaves the CRDs (and therefore your EtcdClusters and their data) in
place; deleting the CRDs is a deliberate, manual step. Set crds.enabled=false
to manage CRDs out-of-band, or crds.keep=false to let uninstall remove them.
Common values (--set key=value, or a -f my-values.yaml):
| Value | Default | Purpose |
|---|---|---|
image.repository |
ghcr.io/cozystack/etcd-operator |
Operator image repo. Override to mirror or fork. |
image.tag |
chart appVersion |
Operator image tag; also becomes OPERATOR_IMAGE (see above). |
replicaCount |
1 |
Operator replicas (leader election picks the active one). |
kubeRbacProxy.enabled |
true |
Front /metrics with the kube-rbac-proxy SubjectAccessReview sidecar. Set false to bind metrics on :8080 directly with no proxy. |
metrics.serviceMonitor.enabled |
false |
Create a prometheus-operator ServiceMonitor for the metrics endpoint (needs the monitoring.coreos.com CRDs and kubeRbacProxy.enabled). |
crds.enabled / crds.keep |
true / true |
Render the CRDs with the release / annotate them so uninstall keeps them. |
manager.resources |
10m/64Mi → 500m/128Mi | Manager container requests/limits. |
imagePullSecrets |
[] |
Pull secrets for a private registry mirror. |
See charts/etcd-operator/values.yaml for the complete, annotated list. Verify
the install:
kubectl -n etcd-operator-system get deployWith release name etcd-operator the Deployment is named etcd-operator. The
release manifests (the kubectl-apply path above) render from this same chart, so
they produce the same name. The Deployment carries the label
control-plane=controller-manager, a name-agnostic handle for scripts.
The repo's Makefile drives a complete install via Helm. From a checkout (needs
helm v3.16+ on PATH):
# 1. Build the operator image (or skip to a prebuilt registry tag).
make docker-build docker-push IMG=<your-registry>/etcd-operator:<tag>
# 2. Install/upgrade the operator (CRDs + RBAC + manager) with Helm.
# `make deploy` runs `helm upgrade --install` and wires image == OPERATOR_IMAGE.
make deploy IMG=<your-registry>/etcd-operator:<tag>The cluster must be able to pull from <your-registry>. For local clusters (kind / minikube / k3d), either sideload the image (kind load docker-image ...) or push to an ephemeral registry the cluster can reach (e.g. ttl.sh/<random>:1h); otherwise the operator Deployment goes ImagePullBackOff with no clear hint from the operator side.
make deploy installs the release etcd-operator into the etcd-operator-system
namespace (override with HELM_RELEASE= / NAMESPACE=). The Deployment is named
after the release. Verify:
kubectl get pod -n etcd-operator-system
kubectl -n etcd-operator-system logs deploy/etcd-operator -c manager --tail=20You should see the manager start lines and an empty work-queue (no EtcdCluster resources yet). Tear down with make undeploy (see Teardown).
For GitOps flows that apply plain YAML, render the chart with helm template
instead of installing it — this is exactly what make build-dist-manifests (and
the release pipeline) does to produce the release's etcd-operator.yaml:
helm template etcd-operator charts/etcd-operator \
--namespace etcd-operator-system \
--set image.repository=<your-image-repo> --set image.tag=<tag> \
--set namespace.create=true | kubectl apply --server-side -f ---server-side avoids the client-side last-applied-config annotation size limit (the consolidated manifest embeds the full CRD schemas). namespace.create=true emits the Namespace so the output is self-contained.
The chart keeps
image:andOPERATOR_IMAGEequal for you.OPERATOR_IMAGEis the image the operator launches for snapshot Jobs and restore init containers; it must match the manager image. The chart renders both fromimage.repository/image.tag(theetcd-operator.imagehelper in_helpers.tpl), so setting the image once covers both. If you hand-craft manifests and leaveOPERATOR_IMAGEat the placeholdercontroller:latest, the operator refuses to start and exits with a clear error (rather than letting snapshot/restore PodsImagePullBackOfflater).
cat <<'EOF' | kubectl apply -f -
apiVersion: etcd-operator.cozystack.io/v1alpha2
kind: EtcdCluster
metadata:
name: my-etcd
namespace: default
spec:
replicas: 3
version: 3.6.11
storage:
size: 1Gi
EOFWatch it form:
kubectl get etcdcluster.etcd-operator.cozystack.io my-etcd -wThe operator bootstraps a single seed first, latches clusterID, then adds the remaining members one at a time as learners (with promotion). A 3-member cluster typically reaches READY=3 in well under a minute on a healthy cluster.
To open a shell to one of the members:
POD=$(kubectl get pod -l etcd-operator.cozystack.io/cluster=my-etcd \
-o jsonpath='{.items[0].metadata.name}')
kubectl exec -it "$POD" -- etcdctl --endpoints=http://localhost:2379 \
member list -w tableDon't hard-code Pod names — they carry a random suffix from GenerateName (e.g. my-etcd-7xq2k). The label selector is the stable handle.
For reconstructable workloads (e.g. a Kubernetes-in-Kubernetes apiserver whose state is GitOps-managed) you can opt the cluster onto a tmpfs emptyDir instead of a PVC:
apiVersion: etcd-operator.cozystack.io/v1alpha2
kind: EtcdCluster
metadata:
name: my-mem-etcd
namespace: default
spec:
replicas: 3
version: 3.6.11
storage:
size: 256Mi # tmpfs SizeLimit per member
medium: MemoryThis trades durability for speed: a Pod that loses its tmpfs (eviction, node failure) loses its data and the member is automatically replaced via MemberRemove + scale-up. Don't use it as a general-purpose etcd backend — see docs/concepts.md and docs/operations.md for the full trade-off. For production, set spec.affinity (pod anti-affinity) and spec.resources.limits.memory explicitly — neither is defaulted (tracked in #16); see the production checklist. The apiserver rejects replicas: 0 on memory clusters via the CEL validation rules, and every cluster gets an auto-emitted PodDisruptionBudget.
You can opt the client API (2379), the peer API (2380), or both onto TLS. Two sources per subtree, mutually exclusive:
- BYO Secrets — you create
kubernetes.io/tls-shaped Secrets out-of-band (e.g. via your own cert pipeline) and reference them viaspec.tls.client.serverSecretRef/operatorClientSecretRefandspec.tls.peer.secretRef. See operations: TLS-enabled clusters for Secret-creation commands. - cert-manager — point at an
IssuerorClusterIssuerviaspec.tls.client.certManager.{serverIssuerRef,operatorClientIssuerRef}andspec.tls.peer.certManager.issuerRef. The operator emitscert-manager.io/v1 Certificateresources, cert-manager produces the Secrets, the rest of the wiring is identical.
See concepts: TLS for the trade-offs (EKU clientAuth requirement on the server cert, CA-bundle topology, etc.) and concepts: cert-manager-driven TLS for the operator-emitted-Certificate path.
- cert-manager (any v1.x release) installed on the cluster before the operator starts. The operator probes the discovery API for
cert-manager.io/v1at startup; if absent, clusters withcertManagerset are parked atAvailable=False / CertManagerNotInstalledand the operator never touches the GVK. Recovery is install-cert-manager + operator restart. - An
Issuer(namespaced, in the EtcdCluster's namespace) orClusterIssuerfor each role. Most commonly a single Issuer per plane signs both the server and operator-client certs. - The operator's cluster DNS suffix must match the cluster's actual suffix so the emitted SANs match what kube-dns returns for peer reverse-DNS. The operator auto-discovers it from
/etc/resolv.conf'ssearchline at startup (coverscluster.local,cozy.local, and any other kubelet-injected suffix for normal cluster-pod deployments); falls back tocluster.localwhen auto-discovery yields nothing (hostNetwork pods, customdnsPolicy). Override explicitly with--cluster-domain=<suffix>when neither path finds the right value.
Required SANs on the per-cluster server cert (BYO must cover all of these because the cert is shared by every member; cert-manager mode emits them automatically based on --cluster-domain):
*.<cluster>.<ns>.svc(DNS SAN, wildcard — etcd ≥3.4 supports wildcard DNS SANs)*.<cluster>.<ns>.svc.<cluster-domain>(also wildcard) — required by etcd's peer-mTLS verification, which reverse-DNS-looks-up the connecting peer's IP. Kubernetes' DNS returns the fully-qualified<pod>.<svc>.<ns>.svc.<cluster-domain>form, and the cert SAN has to cover it.<cluster-domain>iscluster.localon most clusters; Cozystack usescozy.local. Checkkubectl exec -n kube-system <coredns-pod> -- cat /etc/coredns/Corefileor your cluster's resolved DNS suffix if unsure.<cluster>.<ns>.svc(the headless Service)<cluster>-client.<ns>.svc(the client Service)localhost(DNS SAN, for thekubectl exec ... etcdctl --endpoints=https://localhost:2379flow documented under operations)127.0.0.1(IP SAN — same)
Required SANs on the per-cluster peer cert: both *.<cluster>.<ns>.svc AND *.<cluster>.<ns>.svc.<cluster-domain> — the second is load-bearing for peer-mTLS, as above.
Server cert EKU must include serverAuth AND clientAuth (the etcd grpc-gateway loopback presents the server cert as a client cert when self-dialing; the server's --trusted-ca-file then verifies it with ExtKeyUsageClientAuth). Peer cert EKU must include both because peer is symmetric. Operator-client cert needs only clientAuth.
The spec.tls subtree is immutable post-create — flipping TLS on or off on an existing cluster is delete-and-recreate.
spec.version in an EtcdCluster becomes quay.io/coreos/etcd:v<version>. The image repository is hard-coded in controllers/helpers.go:EtcdImage. Override it by patching the operator image with your own registry/repo if you mirror etcd internally.
The spec.version examples throughout these docs use 3.6.x, to match the etcdutl bundled in the operator image: spec.bootstrap.restore requires spec.version and that etcdutl to share a minor (see the restore runbook). The operator's etcd client is v3.6.x and is wire-compatible with 3.5.x servers, so a cluster you never restore into can still run 3.5.x — but to back up and restore on the same version, run 3.6.x.
Operator's own toolchain (relevant when building from source):
| Component | Version |
|---|---|
| Go | 1.25.10 |
| controller-runtime | v0.21 |
| k8s.io/api, k8s.io/client-go | v0.33 |
| controller-gen | v0.18.0 |
| Helm (install/render the chart) | v3.16+ |
etcd client (go.etcd.io/etcd/client/v3) |
v3.6.11 |
| Kubebuilder layout | v4 |
All pinned in go.mod, Dockerfile, and Makefile.
The operator runs as a ClusterRole — it needs to watch EtcdCluster and EtcdMember across all namespaces, plus create/delete the per-member Pods, PVCs, and Services in each user namespace. The rules are generated from the +kubebuilder:rbac markers (by make manifests) into charts/etcd-operator/files/manager-role-rules.yaml and pulled into the chart's templated ClusterRole — don't hand-edit; edit the markers and regenerate.
Single-namespace scoping is not currently exposed: main.go does not wire a namespace flag into the manager's Cache.DefaultNamespaces, so the manager always watches all namespaces. Limiting RBAC alone (ClusterRole → Role) is not sufficient — the manager will still attempt list/watch across the cluster and the API server will deny it. Scoped deployment is a follow-up.
The operator creates two Services per EtcdCluster:
<cluster>— headless (clusterIP: None),publishNotReadyAddresses: true, selectoretcd-operator.cozystack.io/cluster=<cluster>, exposes both 2379 (client) and 2380 (peer). Used by etcd for peer discovery and by the operator's own etcd client (which dials per-pod DNS<member>.<cluster>.<ns>.svc:2379resolved through this service).publishNotReadyAddressesis required for bootstrap: members during the initial join window aren'tReadyyet but still need DNS entries to find each other.<cluster>-client—ClusterIP, exposes 2379 only. Intended for end-user client traffic (load-balanced across all pods backing the selector).
External access (NodePort / LoadBalancer / Ingress) isn't created automatically. If you need it, layer a separate Service or Ingress on top of <cluster>-client's selector.
A specific routing pitfall: kube-apiserver pointed at the headless Service or the client ClusterIP will round-robin its etcd client across all reachable backends, including any current learner. Learners reject MemberList etc. with "rpc not supported for learner". The operator's own etcd client filters learners out (issue #12 fix in memberEndpoints / discoverMemberID), but you can't make kube-apiserver do the same. The pragmatic options for apiserver→etcd:
- Point at a single voter Pod's per-pod DNS name. Simple, fragile (Pod rescheduling).
- Point at a leader-aware proxy (etcd's own gRPC-proxy, or a sidecar). Robust, extra moving part.
- Accept occasional "rpc not supported for learner" errors during scale-up windows; kube-apiserver's own retry layer absorbs them.
This is outside the operator's scope but documented because operators ask.
# Remove individual clusters first — their finalizers will clean up etcd state.
kubectl delete etcdcluster.etcd-operator.cozystack.io --all -A
# Remove the operator (helm uninstall). The CRDs carry
# helm.sh/resource-policy: keep, so they (and any surviving EtcdClusters) are
# intentionally left in place.
make undeploy
# Remove the CRDs too (only after all EtcdClusters are gone) — deleting them
# cascade-deletes every remaining EtcdCluster:
kubectl delete crd etcdclusters.etcd-operator.cozystack.io \
etcdmembers.etcd-operator.cozystack.io \
etcdsnapshots.etcd-operator.cozystack.ioDeleting an EtcdCluster while it's running cascades through every owned resource: the operator's finalizer on each EtcdMember calls MemberRemove (when the cluster itself is also being deleted, the operator detects this and skips MemberRemove to avoid a deadlock — see handleDeletion in controllers/etcdmember_controller.go). Pods and PVCs are then GC'd via owner-refs.
If the operator is uninstalled while EtcdCluster resources still exist, they're stranded — the finalizers won't run because no controller is reading the queue. Recovery is to either re-install the operator, or kubectl patch ... --type=merge -p '{"metadata":{"finalizers":null}}' on each EtcdMember (manual, leaves PVCs and Pods in place — clean them up by label).
For now, in-place operator upgrades work via kubectl set image on the operator Deployment, but in-place etcd version upgrades do not — changing spec.version on an existing EtcdCluster only affects newly-created members. See What's not supported. The current recommended path for an etcd-version bump is:
- Scale up by one to introduce a new-version member as a learner.
- Scale down by one to evict an old-version member.
- Repeat for each member.
- Once all members are on the new version, edit
spec.versionso future scale-ups use it directly.
This is manual and slow. A native rolling upgrade is a tracked follow-up.
kubectl-etcd is an optional client-side kubectl plugin for day-2 operations on operator-managed clusters (member list, status, defrag, compact, alarms, snapshot, member add/remove). It runs on your workstation against your kubeconfig — it is not part of the operator image.
Each release attaches kubectl-etcd-<os>-<arch> binaries (with cli-SHA256SUMS.txt). Install it onto your PATH named kubectl-etcd, and kubectl picks it up as kubectl etcd:
VERSION=v0.5.0; OS=$(uname -s | tr A-Z a-z); ARCH=$(uname -m | sed 's/x86_64/amd64/;s/aarch64/arm64/')
curl -sSLo kubectl-etcd "https://github.com/cozystack/etcd-operator/releases/download/$VERSION/kubectl-etcd-$OS-$ARCH"
chmod +x kubectl-etcd && sudo mv kubectl-etcd /usr/local/bin/ # any dir on $PATH works
kubectl etcd --version
kubectl etcd members --helpOr build from a checkout with make kubectl-etcd (lands in bin/kubectl-etcd). There is no krew package yet.
Out-of-cluster development run (against the current $KUBECONFIG):
make runThis builds and runs the operator binary on your laptop. It can reconcile EtcdCluster resources but cannot dial etcd via in-cluster DNS — MemberList/MemberAdd/MemberRemove will fail. Useful for testing reconcile loop logic against the apiserver, not for end-to-end testing. For e2e use the deploy flow above.