fix(telemetry): concurrency issue on endpoint unreachable#213
Closed
pablomartinezbernardo wants to merge 1 commit into main from
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #213 +/- ##
==========================================
- Coverage 87.02% 86.96% -0.07%
==========================================
Files 80 80
Lines 5140 5154 +14
==========================================
+ Hits 4473 4482 +9
- Misses 667 672 +5
Benchmarks
Benchmark execution time: 2025-05-13 13:04:43
Comparing candidate commit 8743842 in PR branch
Found 0 performance improvements and 0 performance regressions! Performance is the same for 1 metrics, 0 unstable metrics.
39ce617 to
1b46b3c
Compare
Datadog Summary: ✅ Code Quality ✅ Code Security ✅ Dependencies
0e3853f to
8743842
Compare
Description
This became apparent when upgrading dd-trace-cpp for the RUM injection IIS e2e tests. The situation arises when the request to telemetry takes longer than the drain time in the constructor.
The issue is sort of hiding in this part of the code
A first Telemetry instance is created in make_telemetry. When constructing TelemetryProxy, that instance is not passed by reference; instead the Telemetry(Telemetry&&) constructor is invoked, which moves members from the first instance into the second. The initial Telemetry instance, which attempted to send the app-started event, receives its error callback a bit over 2 seconds later (on my Windows machine it takes ~2200 ms to error) and accesses a variable (counters_ in this case) that was moved from and is now in an invalid state.

If the request takes less time than the drain time, this is not an issue, because when drain() exits all callbacks are already complete and no more callbacks will be made on the original Telemetry object.

This is not a great solution, because the update to counters_ will be ignored, but I created the PR for illustration purposes. A better solution might be to not call send_telemetry("app-started", app_started()); in the constructor and instead do it in the init function, which would avoid receiving callbacks on a moved-from instance.

Motivation
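To make the hazard described above concrete, here is a minimal, self-contained sketch. It is not the dd-trace-cpp code: FakeClient, Reporter, errors_, and both helper functions are hypothetical names invented for illustration, and errors_ merely plays the role of counters_. The sketch assumes the callback captures this, so a request sent from the constructor stays bound to the original instance even after that instance has been moved from. A plain int is used so the demonstration stays well defined; with a non-trivial member, or if the original instance has already been destroyed, the same pattern would touch invalid state.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for an HTTP client whose error callbacks fire later.
struct FakeClient {
  std::vector<std::function<void()>> pending;

  void post(std::function<void()> on_error) {
    pending.push_back(std::move(on_error));
  }

  // Simulates an unreachable endpoint: every pending request errors out.
  void fail_all() {
    for (auto& callback : pending) callback();
    pending.clear();
  }
};

// Hypothetical reporter; errors_ plays the role of counters_.
class Reporter {
 public:
  Reporter(FakeClient& client, bool send_in_constructor) : client_(&client) {
    if (send_in_constructor) send();  // the pattern described in this PR
  }

  // Moves state to the new instance; callbacks already registered still
  // point at the old one.
  Reporter(Reporter&& other) noexcept
      : client_(other.client_), errors_(other.errors_) {}

  // Alternative: send only once the object is at its final address.
  void init() { send(); }

  int errors() const { return errors_; }

 private:
  void send() {
    client_->post([this] { ++errors_; });  // captures the current instance
  }

  FakeClient* client_;
  int errors_ = 0;
};

// Request sent in the constructor: the error callback updates the
// moved-from instance, so the live instance never sees the error.
int errors_seen_when_sent_in_constructor() {
  FakeClient client;
  Reporter first(client, /*send_in_constructor=*/true);
  Reporter second(std::move(first));
  client.fail_all();
  return second.errors();
}

// Request sent from init() after the move: the callback targets the
// live instance and the error is recorded.
int errors_seen_when_sent_in_init() {
  FakeClient client;
  Reporter first(client, /*send_in_constructor=*/false);
  Reporter second(std::move(first));
  second.init();
  client.fail_all();
  return second.errors();
}
```

Calling errors_seen_when_sent_in_constructor() shows the update landing on the wrong object, while errors_seen_when_sent_in_init() shows why deferring the send until after the move avoids the problem.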
Additional Notes
Jira ticket: [PROJ-IDENT]