[Bug] Worker Deployment Read API rate limit still exceeded in v1.4.0 #278

@gabriel-yahav

Description

v1.4.0 release notes mention:

"Omit DescribeVersion calls for drained versions to avoid hitting RPS limits"

However, I am still hitting Worker Deployment Read API rate limit errors in v1.4.0.
The error is coming from DescribeWorkerDeployment (the top-level deployment describe),
not from DescribeVersion on individual versions.

Error

"error":"unable to get Temporal worker deployment state: unable to describe worker
deployment X/y: Worker Deployment Read API rate limit exceeded for namespace "ABCD""

The stacktrace points to the standard reconciler loop in controller-runtime.

Root Cause (suspected)

The v1.4.0 optimization skips DescribeVersion for drained versions, which reduces
per-version calls. However, DescribeWorkerDeployment is called unconditionally on
every reconcile iteration regardless of the deployment's state. With multiple
TemporalWorkerDeployment CRs in the same Temporal namespace and a short reconcile
interval, these calls aggregate and exceed the APS (actions-per-second) limit on
Temporal Cloud.
