Fix _dd.p.ksr scientific notation for very small sampling rates #5497

Merged
bm1549 merged 1 commit into master from brian.marks/fix-ksr-scientific-notation
Mar 24, 2026

@bm1549 bm1549 commented Mar 24, 2026

What does this PR do?

Fix _dd.p.ksr span tag formatting for very small sampling rates. Rates below 0.001 were formatted using Ruby's %.6g format, which outputs scientific notation (e.g. 0.000001 → "1e-06"). This PR switches to explicit integer-level rounding and %.6f formatting so the tag always uses decimal notation with up to 6 decimal digits, trailing zeros stripped.
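
For context, the old behaviour is easy to reproduce in plain Ruby. The sketch below uses only `Kernel#format`; the variable names are illustrative and are not taken from the tracer source:

```ruby
# %g switches to scientific notation once the value is small enough,
# so a one-in-a-million rate is not rendered as a plain decimal.
old_tiny = format('%.6g', 0.000001)  # => "1e-06"
old_half = format('%.6g', 0.5)       # => "0.5"

# %f never uses an exponent, but it pads to the requested precision,
# which is why trailing zeros still need to be stripped afterwards.
new_tiny = format('%.6f', 0.000001)  # => "0.000001"
new_half = format('%.6f', 0.5)       # => "0.500000"

puts old_tiny, old_half, new_tiny, new_half
```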

Motivation:

Fixes APMAPI-1869. System tests expect decimal notation for _dd.p.ksr values (see DataDog/system-tests#6466).

Related PRs:

Change log entry

Yes. Fix _dd.p.ksr span tag formatting for very small sampling rates to use decimal notation instead of scientific notation.

Additional Notes:

Uses (rate * 1e6).round / 1e6.to_f for consistent rounding at the integer level, avoiding IEEE 754 precision issues with %.6f alone.
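
A minimal sketch of the rounding-then-formatting approach described above. The helper name `format_ksr` and the regex-based zero stripping are assumptions made for illustration; the actual implementation in the tracer may differ:

```ruby
# Hypothetical helper (name is an assumption, not the tracer's API).
def format_ksr(rate)
  # Round at the integer level: scale to millionths, round, scale back.
  rounded = (rate * 1e6).round / 1e6
  # %.6f always yields decimal notation with exactly 6 fractional digits.
  formatted = format('%.6f', rounded)
  # Strip trailing zeros, then a dangling decimal point ("1." -> "1").
  formatted.sub(/0+\z/, '').sub(/\.\z/, '')
end

puts format_ksr(0.000001)  # => "0.000001"
puts format_ksr(0.0000001) # => "0"
puts format_ksr(1.0)       # => "1"
```

Rounding before formatting means the decision about the sixth decimal digit is made once, on an integer count of millionths, rather than being left to the formatter.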

How to test the change?

Updated unit tests in spec/datadog/tracing/transport/trace_formatter_spec.rb covering:

  • Rate 0.000001 → "0.000001" (was "1e-06")
  • Rate 0.0000001 → "0" (rounds to zero, new case)
  • Rate 0.0000005 → "0.000001" (rounds up, new case)

Very small sampling rates (e.g. 0.000001) were formatted using Ruby's
%.6g format which outputs scientific notation like "1e-06". This changes
to explicit rounding at the integer level and %.6f formatting to always
produce decimal notation with up to 6 decimal digits, trailing zeros
stripped (e.g. "0.000001").

Fixes APMAPI-1869

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

datadog-prod-us1-3 bot commented Mar 24, 2026

✅ Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

🎯 Code Coverage (details)
Patch Coverage: 100.00%
Overall Coverage: 95.14% (-0.01%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 515477f

pr-commenter bot commented Mar 24, 2026

Benchmarks

Benchmark execution time: 2026-03-24 01:15:37

Comparing candidate commit 515477f in PR branch brian.marks/fix-ksr-scientific-notation with baseline commit eadefc7 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 45 metrics, with 1 unstable metric.

Explanation

This is an A/B test comparing a candidate commit's performance against that of a baseline commit. Performance changes are noted in the tables below as:

  • 🟩 = significantly better candidate vs. baseline
  • 🟥 = significantly worse candidate vs. baseline

We compute a confidence interval (CI) over the relative difference of means between metrics from the candidate and baseline commits, considering the baseline as the reference.

If the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD), the change is considered significant.

Feel free to reach out to #apm-benchmarking-platform on Slack if you have any questions.

More details about the CI and significant changes

You can imagine this CI as a range of values that is likely to contain the true difference of means between the candidate and baseline commits.

CIs of the difference of means are often centered around 0%, because often changes are not that big:

---------------------------------(------|---^--------)-------------------------------->
                              -0.6%    0%  0.3%     +1.2%
                                 |          |        |
         lower bound of the CI --'          |        |
sample mean (center of the CI) -------------'        |
         upper bound of the CI ----------------------'

As described above, a change is considered significant if the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD).

For instance, for an execution time metric, this confidence interval indicates a significantly worse performance:

----------------------------------------|---------|---(---------^---------)---------->
                                       0%        1%  1.3%      2.2%      3.1%
                                                  |   |         |         |
       significant impact threshold --------------'   |         |         |
                      lower bound of CI --------------'         |         |
       sample mean (center of the CI) --------------------------'         |
                      upper bound of CI ----------------------------------'

@bm1549 bm1549 marked this pull request as ready for review March 24, 2026 01:43
@bm1549 bm1549 requested review from a team as code owners March 24, 2026 01:43
@bm1549 bm1549 requested a review from vpellan March 24, 2026 01:43
@bm1549 bm1549 merged commit 539b1ba into master Mar 24, 2026
632 of 634 checks passed
@bm1549 bm1549 deleted the brian.marks/fix-ksr-scientific-notation branch March 24, 2026 15:23
gh-worker-dd-mergequeue-cf854d bot pushed a commit to DataDog/dd-trace-py that referenced this pull request Mar 25, 2026
…tific notation (#17086)

## Description

Fix `_dd.p.ksr` span tag formatting for very small sampling rates. Previously, rates below 0.001 were formatted using Python's `:.6g` format which outputs scientific notation (e.g. `0.000001` → `"1e-06"`). This changes to explicit integer-level rounding and `:.6f` formatting to always produce decimal notation with up to 6 decimal digits, trailing zeros stripped.

**Related PRs:**
- dd-trace-rb: DataDog/dd-trace-rb#5497
- dd-trace-js: DataDog/dd-trace-js#7846
- system-tests: DataDog/system-tests#6466

Fixes APMAPI-1869

## Testing

- Added parametrized unit tests in `tests/tracer/test_sampler.py::test_ksr_formatting` covering:
  - Rate 1.0 → `"1"` (trailing zeros stripped)
  - Rate 0.000001 → `"0.000001"` (6 decimal precision boundary)
  - Rate 0.0000001 → `"0"` (below precision, rounds to zero)
  - Rate 0.0000005 → `"0.000001"` (rounds up to one millionth)
  - Rate 0.5 → `"0.5"` (simple case)
  - Rate 0.7654321 → `"0.765432"` (rounded to 6 decimal places)
- These match the system test cases in DataDog/system-tests#6466

## Risks

None — the formatting change only affects `_dd.p.ksr` string values for very small rates. Values ≥ 0.001 produce identical output to the previous `:.6g` format.

## Additional Notes

Uses `math.floor(rate * 1e6 + 0.5) / 1e6` instead of `round(rate * 1e6) / 1e6` to avoid Python's banker's rounding which would round `0.0000005` down to `0` instead of up to `0.000001`.

Co-authored-by: brian.marks <brian.marks@datadoghq.com>
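
The rounding note in the commit message above concerns Python, but the same distinction can be sketched in Ruby, which may help explain why the Ruby change can call `round` directly. Ruby's `Float#round` rounds ties away from zero by default, while `half: :even` opts into banker's rounding; the values here are chosen only for illustration:

```ruby
# Default Float#round rounds ties away from zero.
default_tie = 0.5.round            # => 1
# Banker's rounding sends a tie to the nearest even integer.
even_tie = 0.5.round(half: :even)  # => 0
# The floor(x + 0.5) idiom also rounds the tie up.
floor_tie = (0.5 + 0.5).floor      # => 1

puts default_tie, even_tie, floor_tie
```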

Labels

AI Generated (largely based on code generated by an AI or LLM; this label is the same across all dd-trace-* repos), tracing