feat(telemetry): implement app-extended-heartbeat event #301
khanayan123 wants to merge 6 commits into main
Add support for the app-extended-heartbeat telemetry event per the telemetry v2 API spec. The event fires periodically (default 24h) and includes the full configuration payload, matching app-started. The interval is configurable via DD_TELEMETRY_EXTENDED_HEARTBEAT_INTERVAL (integer seconds) to enable system testing with shorter intervals. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
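The interval handling described in this commit message could look roughly like the sketch below. It is not the tracer's actual parser: `parse_interval_seconds` and `extended_heartbeat_interval` are hypothetical names, and silently falling back to the 24h default on invalid input is an assumption (the real validation may report an error instead).

```cpp
#include <chrono>
#include <cstdlib>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical helper: reads an integer number of seconds from the raw
// environment-variable value. Returns std::nullopt when the value is
// missing, not an integer, or not strictly positive.
std::optional<std::chrono::seconds> parse_interval_seconds(const char* raw) {
  if (raw == nullptr || *raw == '\0') return std::nullopt;
  const std::string text(raw);
  try {
    std::size_t consumed = 0;
    const long long value = std::stoll(text, &consumed);
    // Reject trailing garbage such as "10s" and non-positive intervals.
    if (consumed != text.size() || value <= 0) return std::nullopt;
    return std::chrono::seconds(value);
  } catch (const std::exception&) {
    // std::stoll throws on non-numeric or out-of-range input.
    return std::nullopt;
  }
}

// Assumption: invalid or unset values fall back to the 24h default.
std::chrono::seconds extended_heartbeat_interval() {
  constexpr std::chrono::seconds kDefault{86400};
  const char* raw = std::getenv("DD_TELEMETRY_EXTENDED_HEARTBEAT_INTERVAL");
  return parse_interval_seconds(raw).value_or(kDefault);
}
```

A system test would set the variable to a small value such as `5` so the event fires within the test's lifetime, while an unset variable keeps the 24h interval.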
Benchmarks
Benchmark execution time: 2026-03-31 05:04:20
Comparing candidate commit 6766649 in PR branch
Found 0 performance improvements and 1 performance regressions! Performance is the same for 0 metrics, 0 unstable metrics.
…eartbeat default Move extended heartbeat scheduling after metrics to preserve the positional task order expected by FakeEventScheduler in tests (heartbeat=0, metrics=1). Add default value check for extended_heartbeat_interval in test_configuration. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The FakeEventScheduler used positional indexing to identify callbacks, which broke when the extended heartbeat task was added. Use interval duration to distinguish metrics (<=60s) from extended heartbeat (>60s) callbacks instead. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
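The change described above, classifying callbacks by interval instead of by position, can be sketched as follows. `FakeScheduler` and `schedule_recurring` are hypothetical stand-ins; the real FakeEventScheduler's interface may differ. The sketch also assumes the regular heartbeat is always the first task scheduled.

```cpp
#include <chrono>
#include <functional>
#include <utility>

// Hypothetical test double, not the project's FakeEventScheduler.
struct FakeScheduler {
  using Callback = std::function<void()>;

  Callback heartbeat;
  Callback metrics;
  Callback extended_heartbeat;

  // Assumption: the first scheduled task is the regular heartbeat.
  // Later tasks are told apart by interval rather than by position:
  // <= 60s is treated as metrics, > 60s as the extended heartbeat.
  void schedule_recurring(std::chrono::seconds interval, Callback cb) {
    if (!heartbeat) {
      heartbeat = std::move(cb);
    } else if (interval <= std::chrono::seconds(60)) {
      metrics = std::move(cb);
    } else {
      extended_heartbeat = std::move(cb);
    }
  }
};
```

With this approach the order in which metrics and the extended heartbeat are scheduled no longer matters. One consequence of the sketch as written: a test that configures an extended heartbeat interval of 60s or less would have that callback classified as metrics.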
…load Add test that creates a telemetry instance with configuration, triggers the extended heartbeat, and verifies the payload contains the expected configuration entries. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
  MACRO(DD_VERSION, STRING, "") \
  MACRO(DD_TRACE_128_BIT_TRACEID_GENERATION_ENABLED, BOOLEAN, true) \
  MACRO(DD_TELEMETRY_HEARTBEAT_INTERVAL, DECIMAL, 10) \
  MACRO(DD_TELEMETRY_EXTENDED_HEARTBEAT_INTERVAL, INT, 86400) \
Maybe this could be a DECIMAL, so that later on you don't have to static_cast(*maybe_value) and can just deal with a double.
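One reading of this suggestion is that a value parsed as a double number of seconds can be turned into a clock duration directly through `std::chrono`. The sketch below assumes that reading; `to_interval` is a hypothetical helper, not a function from the tracer.

```cpp
#include <chrono>

// Hypothetical conversion: takes seconds as a double (as a DECIMAL
// env var would yield) and returns a steady_clock duration.
std::chrono::steady_clock::duration to_interval(double seconds) {
  const std::chrono::duration<double> fractional(seconds);
  return std::chrono::duration_cast<std::chrono::steady_clock::duration>(
      fractional);
}
```

A side effect of accepting a double is that sub-second intervals such as `0.5` become expressible, which may or may not be desirable for this setting.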
std::string Telemetry::extended_heartbeat_payload() {
  auto configuration_json = nlohmann::json::array();

  for (const auto& product : config_.products) {
config_.products would not reflect runtime configuration changes (e.g., via remote config) in the extended heartbeat, because this field is never updated. I think you'd need to introduce a new field to track config changes, like so: https://github.com/DataDog/dd-trace-cpp/pull/289/changes#diff-8e4b8c344253799b7a41954c017a79f7b026dca44849dc2dec9460120dc57a53R807
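To illustrate the concern, here is a minimal sketch of a field that tracks the latest value of each configuration entry so that a periodic snapshot reflects runtime changes. `ConfigEntry` and `ConfigTracker` are hypothetical, simplified types, the origin strings are illustrative, and this is not the approach taken in the linked PR.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the tracer's config metadata.
struct ConfigEntry {
  std::string name;
  std::string value;
  std::string origin;  // illustrative, e.g. "env_var" or "remote_config"
};

class ConfigTracker {
 public:
  // Seed with the configuration known at startup.
  explicit ConfigTracker(const std::vector<ConfigEntry>& initial) {
    for (const auto& entry : initial) latest_[entry.name] = entry;
  }

  // Record a runtime change so that later snapshots reflect it.
  void on_change(const ConfigEntry& entry) { latest_[entry.name] = entry; }

  // Full current state, as an extended heartbeat would report it.
  std::vector<ConfigEntry> snapshot() const {
    std::vector<ConfigEntry> out;
    out.reserve(latest_.size());
    for (const auto& [name, entry] : latest_) out.push_back(entry);
    return out;
  }

 private:
  std::map<std::string, ConfigEntry> latest_;
};
```

Building the extended heartbeat from `snapshot()` rather than from the startup configuration means a value overridden at runtime is reported with its current value and origin.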
    for (const auto& [_, config_metadatas] : product.configurations) {
      for (const auto& config_metadata : config_metadatas) {
        configuration_json.emplace_back(
            generate_configuration_field(config_metadata));
Not sure you want to call this function here, as it will increment the seq-id as if a new configuration had been added. Maybe you want to split the function in two: one part that encodes the field, and one that increments the seq-id for new configs, like so: https://github.com/DataDog/dd-trace-cpp/pull/289/changes#diff-8e4b8c344253799b7a41954c017a79f7b026dca44849dc2dec9460120dc57a53R800
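The split suggested here can be sketched as a read-only encoding step plus a separate step that bumps the sequence id only for new or changed configuration. All names below (`ConfigValue`, `EncodedField`, `ConfigEncoder`) are hypothetical, and keeping one seq-id per configuration name is an assumption about how the tracer counts.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical input and output types for illustration only.
struct ConfigValue {
  std::string name;
  std::string value;
};

struct EncodedField {
  std::string name;
  std::string value;
  std::uint64_t seq_id;
};

class ConfigEncoder {
 public:
  // Read-only encoding: reports the entry with its current seq_id.
  // Safe to call repeatedly, e.g. from a periodic extended heartbeat.
  EncodedField encode(const ConfigValue& entry) const {
    const auto it = seq_ids_.find(entry.name);
    const std::uint64_t seq = (it == seq_ids_.end()) ? 0 : it->second;
    return EncodedField{entry.name, entry.value, seq};
  }

  // For new or changed configuration only: bump seq_id, then encode.
  EncodedField record_change(const ConfigValue& entry) {
    ++seq_ids_[entry.name];
    return encode(entry);
  }

 private:
  // Assumption: one sequence counter per configuration name.
  std::unordered_map<std::string, std::uint64_t> seq_ids_;
};
```

With this separation, building the extended heartbeat payload would call only `encode`, so re-sending unchanged configuration leaves the sequence ids untouched.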
      }
    }

  auto extended_hb_msg = nlohmann::json{
Should integrations be sent too? E.g. https://github.com/DataDog/dd-trace-cpp/pull/289/changes#diff-8e4b8c344253799b7a41954c017a79f7b026dca44849dc2dec9460120dc57a53R745
    REQUIRE(find_payload(message_batch["payload"], "app-heartbeat"));
  }

  SECTION("generates an extended heartbeat with configuration") {
If you change the above to also capture configuration changes, maybe you want to test it later with remote config changes, like so: https://github.com/DataDog/dd-trace-cpp/pull/289/changes#diff-6a69962f102d55319c4c00418c82707a9c13a11fe0f75195e6b60fd50da7627aR914
Summary
Implement the app-extended-heartbeat telemetry event for the C++ tracer.

Motivation
Long-running services (24h+) currently only report their configuration state via the initial app-started event. If the backend misses or loses that event, there's no way to recover visibility into the SDK's configuration. The app-extended-heartbeat event solves this by re-sending the full configuration payload every 24h, ensuring reliable state reporting for long-running instances.

Implementation
The event fires periodically (default 24h) and includes the full configuration payload, matching app-started. The interval is configurable via DD_TELEMETRY_EXTENDED_HEARTBEAT_INTERVAL (integer seconds) for system test parity validation; production always uses the 24h default.

Changes
- include/datadog/environment.h: Declare DD_TELEMETRY_EXTENDED_HEARTBEAT_INTERVAL env var (INT, default 86400)
- include/datadog/telemetry/configuration.h: Add extended_heartbeat_interval_seconds to Configuration and extended_heartbeat_interval to FinalizedConfiguration
- src/datadog/telemetry/configuration.cpp: Parse env var and finalize interval with validation
- src/datadog/telemetry/telemetry_impl.h: Declare extended_heartbeat_payload() method
- src/datadog/telemetry/telemetry_impl.cpp: Schedule recurring extended heartbeat task; build payload with full configuration
- supported-configurations.json: Updated supported configurations manifest
- test/telemetry/test_configuration.cpp: Test for env var parsing and default value
- test/telemetry/test_telemetry.cpp: Test that extended heartbeat includes configuration payload

Related