Health metrics (Part 2) #2796

Merged
un-def merged 8 commits into dstackai:master from
Nadine-H:nadine/2736_add-custom-health-metrics
Jun 26, 2025
@Nadine-H
Contributor

Part of #2736

Adding two custom metrics:

  • dstack_submit_to_provision_duration_seconds: Time from when a run is submitted to when its first job starts provisioning
  • dstack_pending_runs_total: Total number of pending runs

We can add more metrics later, but for now I think these two are useful for spotting runs stuck in the SUBMITTED or PENDING state, which could point to an issue with dstack or the underlying infrastructure.
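
For illustration only, here is a rough sketch of how the two metrics could be derived from run records and rendered in the Prometheus text exposition format. It is not the code from this PR: the `Run` dataclass, the `render_metrics` helper, the bucket boundaries, and the choice of a histogram for the duration and a gauge for the pending count are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# Hypothetical bucket boundaries in seconds (assumption, not taken from the PR).
BUCKETS = [15.0, 60.0, 300.0, 900.0]


@dataclass
class Run:
    # Hypothetical record for this sketch, not dstack's real run model.
    status: str
    submitted_at: datetime
    provisioning_started_at: Optional[datetime] = None


def render_metrics(runs: List[Run]) -> str:
    """Render both metrics as Prometheus text exposition lines."""
    # Only runs whose first job has started provisioning contribute a duration.
    durations = [
        (r.provisioning_started_at - r.submitted_at).total_seconds()
        for r in runs
        if r.provisioning_started_at is not None
    ]
    pending = sum(1 for r in runs if r.status == "pending")

    name = "dstack_submit_to_provision_duration_seconds"
    lines = [f"# TYPE {name} histogram"]
    # Histogram buckets are cumulative: each counts observations <= its bound.
    for le in BUCKETS:
        count = sum(1 for d in durations if d <= le)
        lines.append(f'{name}_bucket{{le="{le}"}} {count}')
    lines.append(f'{name}_bucket{{le="+Inf"}} {len(durations)}')
    lines.append(f"{name}_sum {sum(durations)}")
    lines.append(f"{name}_count {len(durations)}")

    lines.append("# TYPE dstack_pending_runs_total gauge")
    lines.append(f"dstack_pending_runs_total {pending}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    t0 = datetime(2025, 6, 16, 12, 0, 0)
    sample = [
        Run("pending", t0),
        Run("running", t0, t0 + timedelta(seconds=40)),
    ]
    print(render_metrics(sample))
```

With something like this in place, a growing pending count or a duration distribution shifting into the higher buckets would be the signal that runs are getting stuck before provisioning.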

@peterschmidt85 peterschmidt85 requested a review from un-def June 16, 2025 11:08
@Nadine-H Nadine-H force-pushed the nadine/2736_add-custom-health-metrics branch from 8941494 to 7227265 June 18, 2025 18:22
@Nadine-H Nadine-H requested a review from un-def June 18, 2025 18:23
@un-def un-def merged commit 40ee802 into dstackai:master Jun 26, 2025
25 checks passed