fix(tracer): only set _dd.p.ksr after agent rates are received#4523
fix(tracer): only set _dd.p.ksr after agent rates are received#4523
Conversation
…ceived The priority sampler was unconditionally setting the _dd.p.ksr tag on every span, even when using the initial client-side default rate (1.0) before any agent response was received. This is inconsistent with other Datadog tracers, which only propagate ksr when actual agent rates or sampling rules are applied. Add an agentRatesLoaded flag to prioritySampler that is set to true the first time readRatesJSON processes an agent response. The apply method now only sets keyKnuthSamplingRate when this flag is true. Rule-based sampling (applyTraceRuleSampling) is unaffected and continues to set ksr unconditionally, which is the correct behavior. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files
🚀 New features to boost your workflow:
|
|
✅ Tests 🎉 All green!❄️ No new flaky tests detected 🎯 Code Coverage (details) 🔗 Commit SHA: cfbe029 | Docs | Datadog PR Page | Was this helpful? React with 👍/👎 or give us feedback! |
BenchmarksBenchmark execution time: 2026-03-19 20:28:20 Comparing candidate commit cfbe029 in PR branch Found 10 performance improvements and 0 performance regressions! Performance is the same for 146 metrics, 8 unstable metrics.
|
|
Can we resolve the conflicts with main? There's been some changes to the sampling rate functions since you last pulled. |
Resolve conflicts in ddtrace/tracer/sampler.go: keep both agentRatesLoaded field (from this branch) and lastCapped field with rate capping logic (from main). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
genesor
left a comment
There was a problem hiding this comment.
Looks great, one comment about lock
ddtrace/tracer/sampler.go
Outdated
| // received. The initial default rate (1.0) is a client-side fallback that | ||
| // does not represent an agent-configured rate, so it must not propagate | ||
| // as _dd.p.ksr to stay consistent with other tracers. | ||
| ps.mu.RLock() |
There was a problem hiding this comment.
We're now getting the locking twice in this PR.
We could acquire the lock in this function get the rate and the loaded state in the same lock and release it.
Adding a function getRateLocked would help us do that (leveraging the /internal/locking/assert package)
func (ps *prioritySampler) getRate(spn *Span) float64 {
ps.mu.RLock()
defer ps.mu.RUnlock()
return ps.getRateLocked()
}
func (ps *prioritySampler) getRateLocked(spn *Span) float64 {
assert.RWMutexLocked(&ps.mu)
key := serviceEnvKey{service: spn.service, env: spn.meta[ext.Environment]}
if rate, ok := ps.rates[key]; ok {
return rate
}
return ps.defaultRate
}
func (ps *prioritySampler) apply(spn *Span) {
ps.mu.RLock()
rate := ps.getRateLocked(spn)
loaded := ps.agentRatesLoaded
ps.mu.RUnlock()
...…pply Extract getRateLocked() so apply() acquires ps.mu.RLock only once to read both the rate and agentRatesLoaded, addressing review feedback. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Dario Castañé <dario.castane@datadoghq.com>
mtoffl01
left a comment
There was a problem hiding this comment.
Left some improvements otherwise LGTM
Co-authored-by: Mikayla Toffler <46911781+mtoffl01@users.noreply.github.com>
Co-authored-by: Mikayla Toffler <46911781+mtoffl01@users.noreply.github.com>
What does this PR do?
Fixes
_dd.p.ksr(Knuth Sampling Rate) to only be set on spans when the agent has provided sampling rates viareadRatesJSON(). Previously, ksr was unconditionally set inprioritySampler.apply(), including when the rate was the initial client-side default (1.0) before any agent response arrived.Also refactors
prioritySamplerto consolidate lock acquisitions: extractsgetRateLocked()soapply()acquiresps.mu.RLockonly once to read both the rate andagentRatesLoaded.Motivation
Cross-language consistency: Python, Java, PHP, and other tracers only set ksr when actual agent rates or sampling rules are applied, not for the default fallback. This aligns Go with that behavior.
See RFC: "Transmit Knuth sampling rate to backend"
Additional Notes
agentRatesLoadedbool field toprioritySampler, set totrueinreadRatesJSON()apply()now gates ksr behindagentRatesLoadedcheckgetRateLocked()to avoid double lock acquisition inapply()applyTraceRuleSamplingin span.go) unchanged — correctly always sets ksrksr-not-set-without-agent-ratesandksr-set-after-agent-rates-receivedRelated PRs across tracers:
_dd.p.ksrformatting to use 6 significant digits without trailing zeros dd-trace-cpp#288Reviewer's Checklist
make lintlocally.make testlocally.make generatelocally.make fix-moduleslocally.Unsure? Have a question? Request a review!
🤖 Generated with Claude Code