Add _dd.p.ksr propagated tag for Knuth sampling rate#8287
Conversation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Knuth sampling rate tag (_dd.p.ksr) was using TryAddTag which has first-write semantics, preserving any existing upstream value. This is inconsistent with all other tracers which unconditionally overwrite ksr with the locally-computed value. Switch to SetTag (replaceIfExists: true) so the local sampling rate always takes precedence. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
BenchmarksBenchmark execution time: 2026-03-24 21:30:40 Comparing candidate commit b760abc in PR branch Found 4 performance improvements and 13 performance regressions! Performance is the same for 260 metrics, 11 unstable metrics.
|
Execution-Time Benchmarks Report ⏱️Execution-time results for samples comparing This PR (8287) and master. ✅ No regressions detected - check the details below Full Metrics ComparisonFakeDbCommand
HttpMessageHandler
Comparison explanationExecution-time benchmarks measure the whole time it takes to execute a program, and are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are highlighted in **red**. The following thresholds were used for comparing the execution times:
Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard. Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph). Duration chartsFakeDbCommand (.NET Framework 4.8)gantt
title Execution time (ms) FakeDbCommand (.NET Framework 4.8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8287) - mean (69ms) : 67, 71
master - mean (69ms) : 66, 71
section Bailout
This PR (8287) - mean (72ms) : 71, 74
master - mean (73ms) : 71, 74
section CallTarget+Inlining+NGEN
This PR (8287) - mean (1,049ms) : 995, 1104
master - mean (1,049ms) : 999, 1098
FakeDbCommand (.NET Core 3.1)gantt
title Execution time (ms) FakeDbCommand (.NET Core 3.1)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8287) - mean (107ms) : 104, 110
master - mean (107ms) : 104, 111
section Bailout
This PR (8287) - mean (108ms) : 106, 109
master - mean (108ms) : 106, 110
section CallTarget+Inlining+NGEN
This PR (8287) - mean (773ms) : 754, 792
master - mean (772ms) : 757, 788
FakeDbCommand (.NET 6)gantt
title Execution time (ms) FakeDbCommand (.NET 6)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8287) - mean (95ms) : 93, 97
master - mean (95ms) : 92, 98
section Bailout
This PR (8287) - mean (96ms) : 94, 98
master - mean (95ms) : 94, 97
section CallTarget+Inlining+NGEN
This PR (8287) - mean (769ms) : 747, 791
master - mean (758ms) : 735, 781
FakeDbCommand (.NET 8)gantt
title Execution time (ms) FakeDbCommand (.NET 8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8287) - mean (94ms) : 91, 96
master - mean (94ms) : 91, 96
section Bailout
This PR (8287) - mean (95ms) : 93, 97
master - mean (95ms) : 93, 96
section CallTarget+Inlining+NGEN
This PR (8287) - mean (669ms) : 642, 695
master - mean (661ms) : 639, 684
HttpMessageHandler (.NET Framework 4.8)gantt
title Execution time (ms) HttpMessageHandler (.NET Framework 4.8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8287) - mean (209ms) : 203, 214
master - mean (215ms) : 205, 224
section Bailout
This PR (8287) - mean (213ms) : 206, 221
master - mean (220ms) : 209, 230
section CallTarget+Inlining+NGEN
This PR (8287) - mean (1,228ms) : 1165, 1290
master - mean (1,252ms) : 1193, 1311
HttpMessageHandler (.NET Core 3.1)gantt
title Execution time (ms) HttpMessageHandler (.NET Core 3.1)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8287) - mean (302ms) : 291, 313
master - mean (311ms) : 299, 324
section Bailout
This PR (8287) - mean (305ms) : 292, 317
master - mean (312ms) : 300, 325
section CallTarget+Inlining+NGEN
This PR (8287) - mean (1,006ms) : 980, 1032
master - mean (1,029ms) : 998, 1060
HttpMessageHandler (.NET 6)gantt
title Execution time (ms) HttpMessageHandler (.NET 6)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8287) - mean (296ms) : 284, 309
master - mean (305ms) : 294, 317
section Bailout
This PR (8287) - mean (295ms) : 286, 304
master - mean (305ms) : 294, 316
section CallTarget+Inlining+NGEN
This PR (8287) - mean (1,036ms) : 1007, 1066
master - mean (1,057ms) : 1018, 1096
HttpMessageHandler (.NET 8)gantt
title Execution time (ms) HttpMessageHandler (.NET 8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8287) - mean (295ms) : 286, 305
master - mean (302ms) : 291, 313
section Bailout
This PR (8287) - mean (294ms) : 286, 303
master - mean (303ms) : 290, 316
section CallTarget+Inlining+NGEN
This PR (8287) - mean (942ms) : 845, 1039
master - mean (1,021ms) : 920, 1122
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The new _dd.p.ksr propagated tag added in the ksr PR was not listed as an optional tag in DefaultTagAssertions, causing ValidateIntegrationSpans to fail with "Expected to have no remaining tags" in AspNetMvc4Tests and other Windows IIS integration tests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 16eaa8a188
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
# What does this PR do? Adds `_dd.p.ksr` (Knuth Sampling Rate) as a propagated tag set when agent-based or rule-based sampling decisions are made. The tag is stored in span `meta` (string type) with up to 6 significant digits and no trailing zeros. `format_sampling_rate` now returns `Option<String>` and guards against invalid inputs (negative, >1.0, NaN, infinity), returning `None` instead of producing garbage output. # Motivation To enable consistent sampling across tracers and backend retention filters, the backend needs to know the sampling rate applied by the tracer. Without transmitting the tracer's rate via `_dd.p.ksr`, backend resampling cannot correctly compute effective rates in multi-stage sampling scenarios. See RFC: "Transmit Knuth sampling rate to backend" # Additional Notes Key files changed: - `datadog-opentelemetry/src/core/constants.rs` — Added `SAMPLING_KNUTH_RATE_TAG_KEY` constant - `datadog-opentelemetry/src/sampling/datadog_sampler.rs` — Added `format_sampling_rate()` helper (returns `Option<String>`, defensive against invalid rates) and set ksr in agent/rule sampling paths - Updated 2 snapshot JSON files Related PRs across tracers: - Java: DataDog/dd-trace-java#10802 - .NET: DataDog/dd-trace-dotnet#8287 - Ruby: DataDog/dd-trace-rb#5436 - Node.js: DataDog/dd-trace-js#7741 - PHP: DataDog/dd-trace-php#3701 - C++: DataDog/dd-trace-cpp#288 - System tests: DataDog/system-tests#6466 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
### What does this PR do? Fixes `_dd.p.ksr` (Knuth Sampling Rate) to only be set on spans when the agent has provided sampling rates via `readRatesJSON()`. Previously, ksr was unconditionally set in `prioritySampler.apply()`, including when the rate was the initial client-side default (1.0) before any agent response arrived. Also refactors `prioritySampler` to consolidate lock acquisitions: extracts `getRateLocked()` so `apply()` acquires `ps.mu.RLock` only once to read both the rate and `agentRatesLoaded`. ### Motivation Cross-language consistency: Python, Java, PHP, and other tracers only set ksr when actual agent rates or sampling rules are applied, not for the default fallback. This aligns Go with that behavior. See RFC: "Transmit Knuth sampling rate to backend" ### Additional Notes - Added `agentRatesLoaded` bool field to `prioritySampler`, set to `true` in `readRatesJSON()` - `apply()` now gates ksr behind `agentRatesLoaded` check - Extracted `getRateLocked()` to avoid double lock acquisition in `apply()` - Rule-based sampling path (`applyTraceRuleSampling` in span.go) unchanged — correctly always sets ksr - Tests added: `ksr-not-set-without-agent-rates` and `ksr-set-after-agent-rates-received` Related PRs across tracers: - Java: DataDog/dd-trace-java#10802 - .NET: DataDog/dd-trace-dotnet#8287 - Ruby: DataDog/dd-trace-rb#5436 - Node.js: DataDog/dd-trace-js#7741 - PHP: DataDog/dd-trace-php#3701 - Rust: DataDog/dd-trace-rs#180 - C++: DataDog/dd-trace-cpp#288 - System tests: DataDog/system-tests#6466 ### Reviewer's Checklist - [x] Changed code has unit tests for its functionality at or near 100% coverage. - [x] [System-Tests](https://github.com/DataDog/system-tests/) covering this feature have been added and enabled with the va.b.c-dev version tag. - [ ] There is a benchmark for any new code, or changes to existing code. - [x] If this interacts with the agent in a new way, a system test has been added. - [x] New code is free of linting errors. You can check this by running `make lint` locally. - [x] New code doesn't break existing tests. You can check this by running `make test` locally. - [ ] Add an appropriate team label so this PR gets put in the right place for the release notes. - [ ] All generated files are up to date. You can check this by running `make generate` locally. - [ ] Non-trivial go.mod changes, e.g. adding new modules, are reviewed by @DataDog/dd-trace-go-guild. Make sure all nested modules are up to date by running `make fix-modules` locally. Unsure? Have a question? Request a review! 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Dario Castañé <dario.castane@datadoghq.com> Co-authored-by: Mikayla Toffler <46911781+mtoffl01@users.noreply.github.com>
Address Codex review feedback: change SetTag to TryAddTag for the _dd.p.ksr propagated tag so it follows the same first-write-wins semantics as AppliedSamplingRate and SamplingMechanism. This prevents inconsistency where _dd.p.ksr could reflect a newer rate while _dd.agent_psr/_dd.rule_psr still come from the original rate. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
tracer/test/Datadog.Trace.Tests/TraceContextTests_KnuthSamplingRate.cs
Outdated
Show resolved
Hide resolved
Change format specifier from "G6" (6 significant digits, allows scientific notation) to "0.######" (up to 6 decimal digits) per RFC. This fixes 0.00001 being formatted as "1E-05" instead of "0.00001". Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| or Sampling.SamplingMechanism.RemoteUserSamplingRule) | ||
| { | ||
| // format with up to 6 decimal digits, no trailing zeros (per RFC) | ||
| Tags.TryAddTag(Trace.Tags.Propagated.KnuthSamplingRate, samplingRate.ToString("0.######", CultureInfo.InvariantCulture)); |
There was a problem hiding this comment.
I wonder if it's worth hard-coding the string for common sampling rates like "1" to avoid these heap allocations for every trace. Or caching the string in the rule itself (which allows caching the string for different rates, for example if one rules has "1" and another has "0.5").
Maybe something we can do in a follow-up PR. Thoughts, @andrewlock?
There was a problem hiding this comment.
Yeah, nice idea 🤔 Although I wonder if we should even be adding this here, and shouldn't be just adding it at serialization time. We should avoid adding tags like this wherever possible 😬
Summary of changes
Add
_dd.p.ksr(Knuth Sampling Rate) propagated tag to spans when sampling is applied via agent rates or trace sampling rules, per the Transmit Knuth Sampling Rate to Backend RFC.Reason for change
The backend needs to know the exact sampling rate applied by the tracer to correctly compute effective rates during resampling (e.g., tracer 0.5 × backend 0.5 = effective 0.25). This tag enables that by propagating the rate via
x-datadog-tagsand W3Ctracestate.Implementation details
_dd.p.ksrinTraceContext.SetSamplingPriority()forAgentRate,LocalTraceSamplingRule,RemoteAdaptiveSamplingRule, andRemoteUserSamplingRulemechanismsTryAddTagto preserve the original rate (consistent withAppliedSamplingRate ??= ratesemantics)"0.######"(up to 6 decimal digits, no trailing zeros, no scientific notation) per RFC spec.IsOptional("_dd.p.ksr")toSpanTagAssertion.csso integration test tag validators accept the new tagTest coverage
TraceContextTests_KnuthSamplingRate.cs:Other details
Related PRs across tracers:
_dd.p.ksrformatting to use 6 significant digits without trailing zeros dd-trace-cpp#288🤖 Generated with Claude Code