feat(kubernetes): support HA gateway rebalancing #1868

Open
TaylorMutch wants to merge 2 commits into main from 1021-ha-gateway-rebalancing/tm

TaylorMutch (Collaborator) commented Jun 11, 2026

Summary

Adds HA gateway rebalancing support for Kubernetes deployments so client and supervisor traffic can survive gateway replica scale-up, scale-down, and pod rotation.

This PR targets main directly and currently includes the reconciler lease commit from #1577. Review focus should be ad9f04d7 unless #1577 lands first.

Related Issue

Closes #1021

Related: #1012, #1429, #1577, #1731, #1488

Changes

  • Adds gateway peer authentication and peer routing for HA supervisor relay handoff.
  • Adds Kubernetes compute lease/reconciler ownership behavior for multi-replica gateways.
  • Adds Helm peer Service/RBAC rendering and Skaffold HA/Envoy dev profile support.
  • Adds Kubernetes HA rebalancing e2e coverage and removes the noisy readyz e2e smoke test.
  • Updates architecture and local cluster/debug skills for HA gateway development.
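
For readers unfamiliar with lease-based ownership, here is a minimal sketch of the general idea behind "compute lease/reconciler ownership" for multi-replica gateways: at most one replica reconciles at a time, and another can take over once the holder stops renewing. This is not the PR's implementation. The `Lease` type, the `try_acquire` helper, the replica names, and the 15-second TTL are all hypothetical, and the in-memory record stands in for whatever shared store the gateway actually uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    """Hypothetical in-memory stand-in for a shared lease record."""
    holder: Optional[str] = None
    expires_at: float = 0.0


def try_acquire(lease: Lease, replica: str, now: float, ttl: float = 15.0) -> bool:
    """Acquire or renew the lease; return True if `replica` owns it afterwards."""
    expired = now >= lease.expires_at
    if lease.holder is None or lease.holder == replica or expired:
        # Free, already ours, or stale: (re)claim it and extend the deadline.
        lease.holder = replica
        lease.expires_at = now + ttl
        return True
    # Another replica holds a live lease; this replica must not reconcile.
    return False


lease = Lease()
first = try_acquire(lease, "gateway-0", now=0.0)      # gateway-0 becomes owner
blocked = try_acquire(lease, "gateway-1", now=5.0)    # lease still live, refused
takeover = try_acquire(lease, "gateway-1", now=20.0)  # lease expired at t=15
```

In a real cluster the check-and-write would need to be atomic against the shared store (for example, an optimistic-concurrency update on a Kubernetes Lease object); the sketch above skips that for brevity.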

Testing

  • mise run pre-commit passes
  • Local Kubernetes HA validation with Envoy Gateway, external PostgreSQL, and gateway scale/rotation was exercised during development
  • GitHub test:e2e-kubernetes label should run the Kubernetes HA E2E job

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)

TaylorMutch requested review from a team, derekwaynecarr and mrunalp as code owners June 11, 2026 04:47
TaylorMutch added the test:e2e-kubernetes (Requires Kubernetes end-to-end coverage) label Jun 11, 2026
@github-actions

Label test:e2e-kubernetes applied for ad9f04d. Open the existing run and click Re-run all jobs to execute with the label set. The run will execute Kubernetes HA E2E after building the required gateway and supervisor images once. This is an optional proof-of-life suite; failures are visible in the workflow run but do not publish a required CI gate status.

Signed-off-by: Taylor Mutch <taylormutch@gmail.com>
Signed-off-by: Taylor Mutch <taylormutch@gmail.com>
TaylorMutch force-pushed the 1021-ha-gateway-rebalancing/tm branch from 24c1003 to 3e590e6 June 11, 2026 17:35

Labels

test:e2e-kubernetes Requires Kubernetes end-to-end coverage

Development

Successfully merging this pull request may close these issues.

feat(k8s, helm): Enable running OpenShell Gateway with multiple replicas

1 participant