Skip to content

Include WCI connectivity health in worker deployment API #10778

Merged: smuneebahmad merged 9 commits into main from muneeb/wci-validation on Jun 25, 2026

@smuneebahmad smuneebahmad commented Jun 19, 2026

Contributor

What changed?

DescribeWorkerDeployment now returns compute status per version, checking whether Temporal can successfully interact with the version's compute resource. ListWorkerDeployments also returns compute status on the current, ramping, and latest version summaries, fetched in parallel.
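
The parallel fetch for the current, ramping, and latest summaries could look roughly like the sketch below. This is not the server code: `ComputeStatus`, `ErrUnavailable`, and `FetchStatuses` are hypothetical names, and dropping a version whose lookup fails (rather than failing the whole list call) is an assumption made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ComputeStatus is a hypothetical stand-in for the API's compute status message.
type ComputeStatus struct {
	ErrorMessage string // empty means the most recent validation passed
}

// ErrUnavailable is a hypothetical error for a failed status lookup.
var ErrUnavailable = errors.New("compute status unavailable")

// FetchStatuses looks up every non-empty version concurrently.
// Assumption: a failed lookup omits that version instead of failing the call.
func FetchStatuses(versions []string, fetch func(string) (*ComputeStatus, error)) map[string]*ComputeStatus {
	var (
		mu sync.Mutex
		wg sync.WaitGroup
	)
	out := make(map[string]*ComputeStatus, len(versions))
	for _, v := range versions {
		if v == "" {
			continue // slot not set, e.g. no ramping version
		}
		wg.Add(1)
		go func(version string) {
			defer wg.Done()
			status, err := fetch(version)
			if err != nil {
				return
			}
			mu.Lock()
			defer mu.Unlock()
			out[version] = status
		}(v)
	}
	wg.Wait()
	return out
}

func main() {
	fetch := func(version string) (*ComputeStatus, error) {
		if version == "v3" {
			return nil, ErrUnavailable
		}
		return &ComputeStatus{}, nil
	}
	// current = v1, ramping = unset, latest = v3
	statuses := FetchStatuses([]string{"v1", "", "v3"}, fetch)
	fmt.Println(len(statuses))
}
```

The mutex guards the shared map; the wait group makes the call return only after all lookups finish.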

When connectivity changes (e.g. Lambda becomes unreachable or is restored), WCI signals the version workflow, which propagates the update to the deployment workflow memo. The list view reads from the memo: versions that have been validated since deployment show their status immediately, and the others appear once their first validation runs.
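
As a rough model of that flow, without the Temporal SDK: a version keeps its last known status and writes to the deployment memo only when a signal carries a change. All type and method names here (`VersionState`, `DeploymentMemo`, `OnValidationSignal`) are hypothetical, and the change-only sync is an assumption based on the "when connectivity changes" wording above.

```go
package main

import "fmt"

// ComputeStatus is a hypothetical stand-in for the API's compute status message.
type ComputeStatus struct {
	ErrorMessage string // empty means the most recent validation passed
}

// DeploymentMemo models the deployment workflow memo that the list view reads.
type DeploymentMemo struct {
	Statuses map[string]ComputeStatus // keyed by version; absent until first validation
}

// VersionState models one version workflow's local state.
type VersionState struct {
	Version string
	Status  *ComputeStatus // nil until the first validation signal arrives
}

// OnValidationSignal stores the signalled status and syncs it to the memo.
// It reports whether the memo was updated; an unchanged status is a no-op.
func (v *VersionState) OnValidationSignal(s ComputeStatus, memo *DeploymentMemo) bool {
	if v.Status != nil && *v.Status == s {
		return false
	}
	v.Status = &s
	if memo.Statuses == nil {
		memo.Statuses = make(map[string]ComputeStatus)
	}
	memo.Statuses[v.Version] = s
	return true
}

func main() {
	memo := &DeploymentMemo{}
	version := &VersionState{Version: "v1"}

	_, seen := memo.Statuses["v1"]
	fmt.Println("listed before first validation:", seen)

	fmt.Println("synced:", version.OnValidationSignal(ComputeStatus{}, memo))
	fmt.Println("synced again:", version.OnValidationSignal(ComputeStatus{}, memo))
	fmt.Println("synced on failure:", version.OnValidationSignal(ComputeStatus{ErrorMessage: "Lambda unreachable"}, memo))
}
```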

Why?

Allows customers to see whether Temporal can successfully interact with their compute resource directly from the Worker Deployments list and detail views, without navigating into each individual version.

How did you test it?

  • built
  • run locally and tested manually
  • covered by existing tests
  • added new unit test(s)
  • added new functional test(s)

@smuneebahmad smuneebahmad changed the title Include WCI connectivity health in worker deployment API Include WCI connectivity health in worker deployment API Jun 19, 2026
@smuneebahmad smuneebahmad force-pushed the muneeb/wci-validation branch from 32f2623 to bb3d98a Compare June 19, 2026 15:57
@smuneebahmad smuneebahmad force-pushed the muneeb/wci-validation branch from bb3d98a to e442af1 Compare June 19, 2026 16:52
smuneebahmad added a commit to temporalio/api that referenced this pull request Jun 22, 2026
…810)

**What changed?**
Added `ComputeStatus` to the Worker Deployment API with a
`ProviderValidationStatus` that tracks the result of the most recent
connectivity check between Temporal and a customer's compute resource.
An empty error message means validation passed; a non-empty message
describes what failed.
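
A minimal sketch of how a client might interpret that convention. The struct and field names below are hypothetical simplifications, not the generated API types, and treating a missing status as "not yet validated" is an assumption.

```go
package main

import "fmt"

// ProviderValidationStatus is a hypothetical simplification of the API message.
type ProviderValidationStatus struct {
	ErrorMessage string // empty: validation passed; non-empty: what failed
}

// ComputeStatus is a hypothetical simplification of the API message.
type ComputeStatus struct {
	ProviderValidation *ProviderValidationStatus
}

// Health is a display-level interpretation of a ComputeStatus.
type Health int

const (
	HealthUnknown Health = iota // no validation result available yet (assumption)
	HealthPassing
	HealthFailing
)

// Classify maps a possibly-nil status to a Health value.
func Classify(cs *ComputeStatus) Health {
	if cs == nil || cs.ProviderValidation == nil {
		return HealthUnknown
	}
	if cs.ProviderValidation.ErrorMessage == "" {
		return HealthPassing
	}
	return HealthFailing
}

func main() {
	failing := &ComputeStatus{
		ProviderValidation: &ProviderValidationStatus{ErrorMessage: "Lambda unreachable"},
	}
	fmt.Println(Classify(nil) == HealthUnknown)
	fmt.Println(Classify(failing) == HealthFailing)
}
```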

**Why?**
Enables surfacing the connectivity health between Temporal and a
customer's compute resource (e.g. Lambda) through the Worker Deployment
API, so the UI can show the ongoing validation status of compute configs
without additional queries.

**Breaking changes**
None. All additions are new optional fields; existing clients are
unaffected.

**Server PR**
temporalio/temporal#10778
temporal-cicd Bot pushed a commit to temporalio/api-go that referenced this pull request Jun 22, 2026
…(#810)

@smuneebahmad smuneebahmad force-pushed the muneeb/wci-validation branch from 8f0d9fe to 15c5335 Compare June 22, 2026 22:53

Copilot AI left a comment

Pull request overview

Expose Worker Controller Instance (WCI) connectivity/validation health as ComputeStatus on worker deployment version summaries, so list/describe APIs can surface compute reachability/validation results per version.

Changes:

  • Add compute_status to internal deployment/version workflow state protos and regenerate Go protos.
  • Update version workflow to ingest WCI validation status signals and propagate ComputeStatus into deployment summaries/memo.
  • Plumb ComputeStatus through worker deployment API response mapping; bump related module versions.

Reviewed changes

Copilot reviewed 6 out of 9 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| service/worker/workerdeployment/workflow.go | Include ComputeStatus when mapping internal version summaries to API version summaries. |
| service/worker/workerdeployment/version_workflow.go | Listen for WCI validation status signals and sync updated compute status to the deployment workflow. |
| service/worker/workerdeployment/compute_util.go | Add helper converting WCI validation status to public deploymentpb.ComputeStatus. |
| service/worker/workerdeployment/client.go | Include ComputeStatus in DescribeWorkerDeployment response mapping for each version summary. |
| proto/internal/temporal/server/api/deployment/v1/message.proto | Add compute_status fields to internal workflow state protos. |
| api/deployment/v1/message.pb.go | Regenerated output for internal proto changes. |
| cmd/tools/getproto/files.go | Generated proto import map update (formatting issues introduced). |
| go.mod | Bump go.temporal.io/api and go.temporal.io/auto-scaled-workers versions. |
| go.sum | Update sums for bumped module versions. |
Files not reviewed (2)
  • api/deployment/v1/message.pb.go: Generated file
  • cmd/tools/getproto/files.go: Generated file

Comment thread service/worker/workerdeployment/version_workflow.go Outdated
Comment thread service/worker/workerdeployment/compute_util.go
Comment thread proto/internal/temporal/server/api/deployment/v1/message.proto Outdated
Comment thread proto/internal/temporal/server/api/deployment/v1/message.proto Outdated
Comment thread cmd/tools/getproto/files.go Outdated
Comment thread cmd/tools/getproto/files.go Outdated
Comment thread service/worker/workerdeployment/client.go

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 10 changed files in this pull request and generated 4 comments.

Files not reviewed (1)
  • api/deployment/v1/message.pb.go: Generated file

Comment thread service/worker/workerdeployment/version_workflow.go
Comment thread service/worker/workerdeployment/version_workflow.go
Comment thread api/deployment/v1/message.pb.go Outdated
Comment thread api/deployment/v1/message.pb.go Outdated
@smuneebahmad smuneebahmad force-pushed the muneeb/wci-validation branch from 6e171dd to 0dc9e8c Compare June 24, 2026 16:20
@smuneebahmad smuneebahmad force-pushed the muneeb/wci-validation branch from 0dc9e8c to 2d15ad9 Compare June 24, 2026 16:24
@smuneebahmad smuneebahmad marked this pull request as ready for review June 24, 2026 17:17
@smuneebahmad smuneebahmad requested review from a team as code owners June 24, 2026 17:17
}

// Version gate for sync-validation-status signal to prevent NDEs during rollback
if workflow.GetVersion(ctx, "sync-validation-status-signal", workflow.DefaultVersion, 0) >= 0 {

Contributor

It seems we've been doing this patching for other signal handlers, but I don't think patching (GetVersion) really helps with NDEs related to the new signal, because the handler registration itself does not create history events and is safe to hit during replay of a workflow that ran on the previous version without the signal handler.
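
To illustrate the point with a toy model (not the Temporal SDK, and much simpler than its real determinism check): replay compares the commands the code produces against recorded history, and in this model registering a signal handler produces no command. `Recorder`, `Replay`, the workflow functions, and the signal name are all hypothetical.

```go
package main

import "fmt"

// Recorder is a toy stand-in for a workflow context. It records only the
// operations that would produce commands, and therefore history events.
type Recorder struct {
	Commands []string
	handlers map[string]bool
}

// RegisterSignal notes interest in a signal. In this toy model, registration
// records no command, mirroring the claim in the comment above.
func (r *Recorder) RegisterSignal(name string) {
	if r.handlers == nil {
		r.handlers = make(map[string]bool)
	}
	r.handlers[name] = true
}

// Schedule records a command-producing operation such as a timer or activity.
func (r *Recorder) Schedule(cmd string) {
	r.Commands = append(r.Commands, cmd)
}

// Replay runs wf and reports whether its commands match the recorded history.
func Replay(history []string, wf func(*Recorder)) bool {
	r := &Recorder{}
	wf(r)
	if len(r.Commands) != len(history) {
		return false
	}
	for i := range history {
		if r.Commands[i] != history[i] {
			return false
		}
	}
	return true
}

// OldWorkflow is the previous version: no signal handler.
func OldWorkflow(r *Recorder) { r.Schedule("StartTimer") }

// NewWorkflow adds a signal handler without any version gate.
func NewWorkflow(r *Recorder) {
	r.RegisterSignal("sync-validation-status") // hypothetical signal name
	r.Schedule("StartTimer")
}

// ChangedWorkflow adds a command-producing step, which does need a gate.
func ChangedWorkflow(r *Recorder) {
	r.Schedule("ScheduleActivity")
	r.Schedule("StartTimer")
}

func main() {
	history := []string{"StartTimer"} // recorded by OldWorkflow
	fmt.Println("new handler replays cleanly:", Replay(history, NewWorkflow))
	fmt.Println("new command replays cleanly:", Replay(history, ChangedWorkflow))
}
```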

Contributor Author

Good point. Will clean this up in a follow-up PR

@rkannan82 rkannan82 self-requested a review June 25, 2026 00:00

@rkannan82 rkannan82 left a comment

Contributor

Do we need to update existing e2e test to verify this?

@smuneebahmad

Contributor Author

> Do we need to update existing e2e test to verify this?

Good question. The unit tests in compute_util_test.go and version_workflow_test.go cover the new ComputeStatus conversion and signal propagation. E2E tests have been validated as part of temporalio/temporal-auto-scaled-workers#60.

@smuneebahmad smuneebahmad merged commit d4cab6b into main Jun 25, 2026
50 checks passed
@smuneebahmad smuneebahmad deleted the muneeb/wci-validation branch June 25, 2026 05:39
4 participants