|
25 | 25 |
|
26 | 26 |
|
27 | 27 | <link rel="icon" href="../../../assets/images/dstack-fav-32.ico"> |
28 | | - <meta name="generator" content="mkdocs-1.6.1, mkdocs-material-9.6.10+insiders-4.53.16"> |
| 28 | + <meta name="generator" content="mkdocs-1.6.1, mkdocs-material-9.6.11+insiders-4.53.16"> |
29 | 29 |
|
30 | 30 |
|
31 | 31 |
|
|
3037 | 3037 | </label> |
3038 | 3038 | <ul class="md-nav__list" data-md-component="toc" data-md-scrollfix> |
3039 | 3039 |
|
| 3040 | + <li class="md-nav__item"> |
| 3041 | + <a href="#exporting-gpu-cost-and-other-metrics-to-prometheus" class="md-nav__link"> |
| 3042 | + <span class="md-ellipsis"> |
| 3043 | + |
| 3044 | + Exporting GPU, cost, and other metrics to Prometheus |
| 3045 | + |
| 3046 | + </span> |
| 3047 | + </a> |
| 3048 | + |
| 3049 | +</li> |
| 3050 | + |
3040 | 3051 | <li class="md-nav__item"> |
3041 | 3052 | <a href="#accessing-dev-environments-with-cursor" class="md-nav__link"> |
3042 | 3053 | <span class="md-ellipsis"> |
|
3617 | 3628 | </label> |
3618 | 3629 | <ul class="md-nav__list" data-md-component="toc" data-md-scrollfix> |
3619 | 3630 |
|
| 3631 | + <li class="md-nav__item"> |
| 3632 | + <a href="#exporting-gpu-cost-and-other-metrics-to-prometheus" class="md-nav__link"> |
| 3633 | + <span class="md-ellipsis"> |
| 3634 | + |
| 3635 | + Exporting GPU, cost, and other metrics to Prometheus |
| 3636 | + |
| 3637 | + </span> |
| 3638 | + </a> |
| 3639 | + |
| 3640 | +</li> |
| 3641 | + |
3620 | 3642 | <li class="md-nav__item"> |
3621 | 3643 | <a href="#accessing-dev-environments-with-cursor" class="md-nav__link"> |
3622 | 3644 | <span class="md-ellipsis"> |
@@ -3723,6 +3745,55 @@ <h1 id="2025">2025<a class="headerlink" href="#2025" title="Permanent link">&par |
3723 | 3745 | <article class="md-post md-post--excerpt"> |
3724 | 3746 | <header class="md-post__header"> |
3725 | 3747 |
|
| 3748 | + <div class="md-post__meta md-meta"> |
| 3749 | + <ul class="md-meta__list"> |
| 3750 | + <li class="md-meta__item"> |
| 3751 | + <time datetime="2025-04-01 00:00:00+00:00">April 1, 2025</time></li> |
| 3752 | + |
| 3753 | + <li class="md-meta__item"> |
| 3754 | + in |
| 3755 | + |
| 3756 | + <a href="../../category/monitoring/" class="md-meta__link">Monitoring</a>, |
| 3757 | + <a href="../../category/nvidia/" class="md-meta__link">NVIDIA</a></li> |
| 3758 | + |
| 3759 | + |
| 3760 | + |
| 3761 | + <li class="md-meta__item"> |
| 3762 | + |
| 3763 | + 2 min read |
| 3764 | + |
| 3765 | + </li> |
| 3766 | + |
| 3767 | + |
| 3768 | + </ul> |
| 3769 | + |
| 3770 | + </div> |
| 3771 | + </header> |
| 3772 | + <div class="md-post__content md-typeset"> |
| 3773 | + <h2 id="exporting-gpu-cost-and-other-metrics-to-prometheus"><a class="toclink" href="../../prometheus/">Exporting GPU, cost, and other metrics to Prometheus</a></h2> |
| 3774 | +<h3 id="why-prometheus" style="display:none"><a class="toclink" href="../../prometheus/#why-prometheus">Why Prometheus</a></h3> |
| 3775 | +<p>Effective AI infrastructure management requires full visibility into compute performance and costs. AI researchers need |
| 3776 | +detailed insights into container- and GPU-level performance, while managers rely on cost metrics to track resource usage |
| 3777 | +across projects.</p> |
| 3778 | +<p>While <code>dstack</code> provides key metrics through its UI and <a href="../../dstack-stats/"><code>dstack stats</code></a> CLI, teams often need more granular data and prefer |
| 3779 | +using their own monitoring tools. To support this, we’ve introduced a new endpoint that exports all collected |
| 3780 | +metrics, covering fleets and runs, directly to Prometheus in real time.</p> |
| 3781 | +<p><img src="https://github.com/dstackai/static-assets/blob/main/static-assets/images/dstack-prometheus-v3.png?raw=true" width="630"/></p> |
| 3782 | + |
| 3783 | + |
| 3784 | + <nav class="md-post__action"> |
| 3785 | + <a href="../../prometheus/"> |
| 3786 | + Continue reading |
| 3787 | + </a> |
| 3788 | + </nav> |
| 3789 | + |
| 3790 | + |
| 3791 | + </div> |
| 3792 | +</article> |
| 3793 | + |
| 3794 | + <article class="md-post md-post--excerpt"> |
| 3795 | + <header class="md-post__header"> |
| 3796 | + |
3726 | 3797 | <div class="md-post__meta md-meta"> |
3727 | 3798 | <ul class="md-meta__list"> |
3728 | 3799 | <li class="md-meta__item"> |
|