Bake Node Exporter and DCGM Exporter into Azure HPC Images under /opt/azurehpc/monitoring #421

@Daramfon10

Background: As part of our observability pipeline for Azure HPC VMs, we use the Azure Monitor Agent (AMA) to scrape and publish telemetry. We rely on exporters (e.g., Node Exporter, DCGM Exporter) to collect this telemetry, but AMA does not currently bundle them.

To simplify the startup process and ensure consistency across nodes, we propose embedding the exporters into the image itself. This ensures that exporters are always present at a known path (/opt/azurehpc/monitoring/) and AMA can reliably start them via docker run or by executing local binaries.
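As a rough illustration of the startup logic described above, the sketch below prefers a baked-in binary under /opt/azurehpc/monitoring and falls back to `docker run` otherwise. The directory layout (`<root>/<name>/<name>`), the function names, and the image argument are assumptions made for illustration only; they are not part of AMA or of the HPC image today.

```python
import os
from pathlib import Path
from typing import List, Optional

# Known path proposed in this issue.
MONITORING_ROOT = Path("/opt/azurehpc/monitoring")


def find_local_binary(name: str, root: Path = MONITORING_ROOT) -> Optional[Path]:
    """Return the exporter binary if it exists and is executable, else None.

    Assumed (hypothetical) layout: <root>/<name>/<name>.
    """
    candidate = root / name / name
    if candidate.is_file() and os.access(candidate, os.X_OK):
        return candidate
    return None


def launch_command(name: str, image: str, root: Path = MONITORING_ROOT) -> List[str]:
    """Build the command to start an exporter.

    Prefers the baked-in local binary; otherwise falls back to `docker run`
    with the caller-supplied image name (an assumption, not a fixed value).
    """
    binary = find_local_binary(name, root)
    if binary is not None:
        return [str(binary)]
    return ["docker", "run", "--rm", image]
```

With both exporters baked into the image, a caller would always take the first branch, so startup would not depend on a registry pull at scale-out time.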

This approach reduces runtime dependencies, speeds up telemetry availability, and lowers the risk of failures during scale-out caused by image pull delays, network issues, or inconsistent exporter versions and file paths across nodes.

Ask: We would like to request that both the Node Exporter and DCGM Exporter directories be pre-baked into the Azure HPC images under /opt/azurehpc/monitoring. Here are the links to both repositories:
