Add logs for stall detection #2155
MarcusSorealheis
left a comment
High‑volume logs on hot paths can worsen stalls. Please ensure new logs are rate‑limited or include stable identifiers (action_id, worker_id, invocation_id, queue name) so stalls can be correlated across scheduler/worker/store. Basically, log state transitions.
That's why I set the log level to trace. We won't be using the trace log level in production. These logs are only there in case we experience stalling in a normal deployment and want to find the source of the stall.
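As an illustration of the reviewer's suggestion (rate-limited logs that carry stable identifiers and record state transitions), here is a minimal sketch using only the Rust standard library. It is not code from this PR: `StallKey`, `RateLimiter`, and `format_transition` are hypothetical names, and the choice of `action_id` plus `worker_id` as the key is an assumption made for the example.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical stable identifiers used to correlate one stall across
/// components. The field choice is an assumption for this sketch.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct StallKey {
    pub action_id: String,
    pub worker_id: String,
}

/// Allows at most one log line per key within `min_interval`.
pub struct RateLimiter {
    min_interval: Duration,
    last_emit: HashMap<StallKey, Instant>,
}

impl RateLimiter {
    pub fn new(min_interval: Duration) -> Self {
        Self {
            min_interval,
            last_emit: HashMap::new(),
        }
    }

    /// Returns true if a line for `key` may be emitted at `now`,
    /// and records `now` as the last emission time for that key.
    pub fn should_log(&mut self, key: &StallKey, now: Instant) -> bool {
        match self.last_emit.get(key) {
            // Too soon since the last line for this key: suppress it.
            Some(&last) if now.saturating_duration_since(last) < self.min_interval => false,
            _ => {
                self.last_emit.insert(key.clone(), now);
                true
            }
        }
    }
}

/// Formats a state transition together with its identifiers so the same
/// action can be followed across scheduler, worker, and store logs.
pub fn format_transition(key: &StallKey, from: &str, to: &str) -> String {
    format!(
        "action_id={} worker_id={} transition={}->{}",
        key.action_id, key.worker_id, from, to
    )
}
```

A caller on a hot path would check `should_log` before emitting the string from `format_transition` through whatever logging facility the project uses. The map in this sketch grows without bound, so a real version would need to evict entries for finished actions.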
Description
Added logs to detect stalls, if any, and to identify what caused them.
Fixes # (issue)
Type of change
Please delete options that aren't relevant.
How Has This Been Tested?
Please also list any relevant details for your test configuration
Checklist
bazel test //... passes locally
git amend (see some docs)