Minor improvements to benchmarks by yebai · Pull Request #1386 · TuringLang/DynamicPPL.jl

yebai · 2026-05-05T17:31:34Z

No description provided.

Bring DifferentiationInterface into the benchmarks env and adopt the flatter markdown layout (no <details> wrapper, no "Gist:" prefix). Released AbstractPPL/Bijectors are used instead of the fork-branch sources from the source branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Pairs with the prior commit's benchmarks.jl markdown changes: the new workflow benches PR head and main side-by-side and wraps main's table in <details> on the CI side.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Replace the PrettyTables benchmark report with a manual text formatter modeled on posteriordb-bench: top/bottom `=` rules, centered `eval` and `gradient` banners, dashed subgroup underlines, and a stub of Model/dim/linked columns. Keep the current pivoted data shape, with a shared `primal` column and backend ratio columns labelled FwdDiff, RvsDiff, Mooncake, and Enzyme. While there, simplify the renderer by formatting rows once up front and using a single backend key/label table as the source of truth. Update the PR comment caption to explain that `primal` is shared `t(logdensity)` and the backend columns are `t(grad)/t(logdensity)`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
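The layout described in this commit message can be illustrated with a small sketch. This is not the code from `benchmarks.jl` (which is Julia); it is a hypothetical Python rendering of the same idea, and the function name `render_table`, the `BACKENDS` table, and all column widths are assumptions chosen for illustration. Rows are assumed to be formatted to strings up front, as the message describes.

```python
# Hypothetical sketch; names and column widths are illustrative, not from benchmarks.jl.
BACKENDS = [  # single key/label table used for both lookups and column headers
    ("forwarddiff", "FwdDiff"),
    ("reversediff", "RvsDiff"),
    ("mooncake", "Mooncake"),
    ("enzyme", "Enzyme"),
]
STUB_W, EVAL_W, COL_W = 41, 12, 11  # stub = Model (24) + dim (7) + linked (10)


def render_table(rows):
    """Render pre-formatted rows as fixed-width text with eval/gradient banners."""
    grad_w = COL_W * len(BACKENDS)
    width = STUB_W + EVAL_W + grad_w
    pad = " " * STUB_W
    lines = [
        "=" * width,  # top rule
        pad + "eval".center(EVAL_W) + "gradient".center(grad_w),  # centered banners
        pad + "  " + "-" * (EVAL_W - 2) + "  " + "-" * (grad_w - 2),  # subgroup underlines
        f"{'Model':<24}{'dim':>7}{'linked':>10}{'primal':>{EVAL_W}}"
        + "".join(f"{label:>{COL_W}}" for _, label in BACKENDS),
        "-" * width,
    ]
    for row in rows:
        # A backend with no entry is shown as "err".
        cells = "".join(f"{row['ratios'].get(key, 'err'):>{COL_W}}" for key, _ in BACKENDS)
        lines.append(
            f"{row['model']:<24}{row['dim']:>7}{row['linked']:>10}{row['primal']:>{EVAL_W}}" + cells
        )
    lines.append("=" * width)  # bottom rule
    return "\n".join(lines)
```

Because every cell is already a string when it reaches `render_table`, the renderer only pads and joins; number formatting and unit selection would live in a separate step.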

github-actions · 2026-05-05T17:36:31Z

DynamicPPL.jl documentation for PR #1386 is available at:
https://TuringLang.github.io/DynamicPPL.jl/previews/PR1386/

codecov · 2026-05-05T17:44:16Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.26%. Comparing base (2691e7c) to head (8f48885).

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #1386   +/-   ##
=======================================
  Coverage   82.26%   82.26%           
=======================================
  Files          50       50           
  Lines        3535     3535           
=======================================
  Hits         2908     2908           
  Misses        627      627


Restructure the comment so the table comes first, followed by a single paragraph explaining what each column means and how to read the AD backend ratios. Update the surrounding workflow text:

- "## Benchmark Report" + separate PR head/Main lines collapsed into a single "## Benchmarks @ <sha>" heading.
- Foldout summaries shortened to "Main @ <sha>" and "Environment".
- Comparison hint ("compare against `main`") only appears when the baseline foldout is actually available.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
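As a rough illustration of the comment structure this commit describes, the sketch below assembles the heading, table, caption, and optional foldouts. The real logic lives in the CI workflow; the Python function `build_comment`, the helper `details`, and their parameters are hypothetical names introduced here, and the exact markup the workflow emits is an assumption.

```python
# Hypothetical sketch of the PR comment assembly; not the actual workflow code.
FENCE = "`" * 3  # built indirectly so this example can sit inside a fenced block


def details(summary, body):
    """Wrap preformatted text in a collapsible foldout."""
    return f"<details><summary>{summary}</summary>\n\n{FENCE}\n{body}\n{FENCE}\n\n</details>"


def build_comment(head_sha, head_table, caption, main_sha=None, main_table=None, env_info=None):
    """Table first, then one caption paragraph, then optional foldouts."""
    baseline_ok = main_sha is not None and main_table is not None
    if baseline_ok:
        # The comparison hint is only added when the baseline foldout exists.
        caption += " Compare against `main` below to spot regressions."
    parts = [
        f"## Benchmarks @ `{head_sha}`",
        f"{FENCE}\n{head_table}\n{FENCE}",
        caption,
    ]
    if baseline_ok:
        parts.append(details(f"Main @ `{main_sha}`", main_table))
    if env_info is not None:
        parts.append(details("Environment", env_info))
    return "\n\n".join(parts)
```

Keeping the hint conditional on `baseline_ok` avoids pointing readers at a foldout that is missing when the `main` benchmark job fails.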

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

github-actions · 2026-05-05T20:58:42Z

Benchmarks @ `8f48885`

==================================================================================================
                                              eval                       gradient                 
                                           ----------  -------------------------------------------
Model                       dim    linked      primal     FwdDiff    RvsDiff    Mooncake    Enzyme
--------------------------------------------------------------------------------------------------
Simple assume observe         1     false     5.19 ns       10.72    1106.88       27.94     12.00
Simple assume observe         1      true     18.7 ns        5.94     469.84       16.41      3.37
Smorgasbord                 201     false     8.94 μs       55.89      89.44        7.79      6.21
Smorgasbord                 201      true     19.2 μs       29.44      55.16        3.72      2.86
Loop univariate 1k         1000     false     50.6 μs      367.10     103.98        3.54      2.72
Loop univariate 1k         1000      true     50.9 μs      513.91     110.49        3.35      2.61
Multivariate 1k            1000     false     42.5 μs      239.07      39.29        4.83      1.75
Multivariate 1k            1000      true     42.8 μs      243.80      39.35        5.41      1.85
Loop univariate 10k       10000     false    215.0 μs    11496.26     257.59        7.03      6.18
Loop univariate 10k       10000      true    226.0 μs    11508.06     258.79        6.27      5.81
Multivariate 10k          10000     false    247.0 μs     7057.37      70.30        9.65      1.83
Multivariate 10k          10000      true    244.0 μs     7019.44      72.24        9.42      1.93
Dynamic                      15     false     2.42 μs         err      30.93       11.94      8.83
Dynamic                      10      true     3.34 μs        2.07      41.36       10.51     17.50
Submodel                      1     false     5.49 ns       19.66    1415.59       53.79     11.66
Submodel                      1      true      5.2 ns       21.85    1496.48       46.69     12.49
LDA                          12      true     28.6 μs        0.62       2.03       23.36       err
==================================================================================================

Each row times one of DynamicPPL's reference models on this PR's head. `dim` is the parameter count; `linked` is true when parameters have been mapped to unconstrained space. `primal` is t(logdensity), the wall-clock time for one log-density evaluation, shared across backends. The AD (automatic differentiation) backend columns express gradient time as a multiple of t(logdensity), i.e. t(grad)/t(logdensity): a value of 10 means computing the gradient takes 10× as long as the log-density. Lower is better throughout; err means the backend errored on that model. Compare against main below to spot regressions.
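Since the backend columns are ratios, recovering an absolute gradient time means multiplying by the log-density time. The following is a minimal sketch of that arithmetic; the helper names `parse_time` and `gradient_time` are hypothetical, and the unit strings are assumed to match the ones printed in the table.

```python
# Hypothetical helpers for reading the benchmark table; not part of DynamicPPL.
UNIT_SECONDS = {
    "ns": 1e-9,
    "\u03bcs": 1e-6,  # Greek small letter mu
    "\u00b5s": 1e-6,  # micro sign, in case the text was re-encoded
    "ms": 1e-3,
    "s": 1.0,
}


def parse_time(text):
    """Convert a cell such as '8.94 <unit>' into seconds."""
    value, unit = text.split()
    return float(value) * UNIT_SECONDS[unit]


def gradient_time(primal, ratio):
    """Absolute gradient time in seconds, or None when the backend errored."""
    if ratio == "err":
        return None
    return parse_time(primal) * float(ratio)
```

For the unlinked Smorgasbord row (8.94 μs, Enzyme ratio 6.21), this gives about 55.5 μs per gradient. The same arithmetic shows why ratios alone can mislead when the log-density time itself moves: for unlinked Loop univariate 1k under ForwardDiff, the head works out to roughly 50.6 μs × 367.10 ≈ 18.6 ms, while main's table below gives 18.8 μs × 1078.72 ≈ 20.3 ms, so the absolute gradient times are close even though the ratios differ by about 3×.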

Main @ 2691e7c

Gist: Smorgasbord

┌─────────────┬─────┬─────────────┬────────┬───────────────┬───────────────────────┐
│       Model │ Dim │  AD Backend │ Linked │ t(logdensity) │ t(grad)/t(logdensity) │
├─────────────┼─────┼─────────────┼────────┼───────────────┼───────────────────────┤
│ Smorgasbord │ 201 │ forwarddiff │  false │       6.21 μs │                 75.72 │
│ Smorgasbord │ 201 │ reversediff │  false │        6.3 μs │                127.41 │
│ Smorgasbord │ 201 │    mooncake │  false │       6.25 μs │                  6.40 │
│ Smorgasbord │ 201 │      enzyme │  false │       6.34 μs │                  6.92 │
│ Smorgasbord │ 201 │ forwarddiff │   true │        8.9 μs │                 65.07 │
│ Smorgasbord │ 201 │ reversediff │   true │       8.67 μs │                123.23 │
│ Smorgasbord │ 201 │    mooncake │   true │       8.92 μs │                  5.20 │
│ Smorgasbord │ 201 │      enzyme │   true │       8.89 μs │                  4.65 │
└─────────────┴─────┴─────────────┴────────┴───────────────┴───────────────────────┘

Full table (68 rows)

┌───────────────────────┬───────┬─────────────┬────────┬───────────────┬───────────────────────┐
│                 Model │   Dim │  AD Backend │ Linked │ t(logdensity) │ t(grad)/t(logdensity) │
├───────────────────────┼───────┼─────────────┼────────┼───────────────┼───────────────────────┤
│ Simple assume observe │     1 │ forwarddiff │  false │       6.28 ns │                 10.28 │
│ Simple assume observe │     1 │ reversediff │  false │       6.28 ns │               1058.79 │
│ Simple assume observe │     1 │    mooncake │  false │       6.28 ns │                 30.61 │
│ Simple assume observe │     1 │      enzyme │  false │       6.28 ns │                  6.20 │
│ Simple assume observe │     1 │ forwarddiff │   true │       21.5 ns │                  2.99 │
│ Simple assume observe │     1 │ reversediff │   true │       21.5 ns │                336.25 │
│ Simple assume observe │     1 │    mooncake │   true │       21.5 ns │                  9.21 │
│ Simple assume observe │     1 │      enzyme │   true │       21.4 ns │                  1.85 │
│           Smorgasbord │   201 │ forwarddiff │  false │       6.21 μs │                 75.72 │
│           Smorgasbord │   201 │ reversediff │  false │        6.3 μs │                127.41 │
│           Smorgasbord │   201 │    mooncake │  false │       6.25 μs │                  6.40 │
│           Smorgasbord │   201 │      enzyme │  false │       6.34 μs │                  6.92 │
│           Smorgasbord │   201 │ forwarddiff │   true │        8.9 μs │                 65.07 │
│           Smorgasbord │   201 │ reversediff │   true │       8.67 μs │                123.23 │
│           Smorgasbord │   201 │    mooncake │   true │       8.92 μs │                  5.20 │
│           Smorgasbord │   201 │      enzyme │   true │       8.89 μs │                  4.65 │
│    Loop univariate 1k │  1000 │ forwarddiff │  false │       18.8 μs │               1078.72 │
│    Loop univariate 1k │  1000 │ reversediff │  false │       18.9 μs │                284.18 │
│    Loop univariate 1k │  1000 │    mooncake │  false │       19.4 μs │                  8.40 │
│    Loop univariate 1k │  1000 │      enzyme │  false │       18.7 μs │                  6.90 │
│    Loop univariate 1k │  1000 │ forwarddiff │   true │       20.6 μs │               1445.64 │
│    Loop univariate 1k │  1000 │ reversediff │   true │       21.1 μs │                261.48 │
│    Loop univariate 1k │  1000 │    mooncake │   true │       20.7 μs │                  7.87 │
│    Loop univariate 1k │  1000 │      enzyme │   true │       20.3 μs │                  6.45 │
│       Multivariate 1k │  1000 │ forwarddiff │  false │       26.0 μs │                307.79 │
│       Multivariate 1k │  1000 │ reversediff │  false │       26.7 μs │                 63.65 │
│       Multivariate 1k │  1000 │    mooncake │  false │       29.3 μs │                  7.63 │
│       Multivariate 1k │  1000 │      enzyme │  false │       25.2 μs │                  1.98 │
│       Multivariate 1k │  1000 │ forwarddiff │   true │       25.3 μs │                268.24 │
│       Multivariate 1k │  1000 │ reversediff │   true │       25.6 μs │                 66.95 │
│       Multivariate 1k │  1000 │    mooncake │   true │       24.5 μs │                  9.06 │
│       Multivariate 1k │  1000 │      enzyme │   true │       23.7 μs │                  2.04 │
│   Loop univariate 10k │ 10000 │ forwarddiff │  false │      176.0 μs │              14179.52 │
│   Loop univariate 10k │ 10000 │ reversediff │  false │      178.0 μs │                331.97 │
│   Loop univariate 10k │ 10000 │    mooncake │  false │      178.0 μs │                  9.25 │
│   Loop univariate 10k │ 10000 │      enzyme │  false │      178.0 μs │                  7.06 │
│   Loop univariate 10k │ 10000 │ forwarddiff │   true │      199.0 μs │              13193.49 │
│   Loop univariate 10k │ 10000 │ reversediff │   true │      220.0 μs │                266.32 │
│   Loop univariate 10k │ 10000 │    mooncake │   true │      198.0 μs │                  8.25 │
│   Loop univariate 10k │ 10000 │      enzyme │   true │      198.0 μs │                  6.36 │
│      Multivariate 10k │ 10000 │ forwarddiff │  false │      220.0 μs │               4257.93 │
│      Multivariate 10k │ 10000 │ reversediff │  false │      221.0 μs │                 80.71 │
│      Multivariate 10k │ 10000 │    mooncake │  false │      220.0 μs │                  9.82 │
│      Multivariate 10k │ 10000 │      enzyme │  false │      220.0 μs │                  1.82 │
│      Multivariate 10k │ 10000 │ forwarddiff │   true │      217.0 μs │               4347.01 │
│      Multivariate 10k │ 10000 │ reversediff │   true │      217.0 μs │                 80.54 │
│      Multivariate 10k │ 10000 │    mooncake │   true │      218.0 μs │                  9.93 │
│      Multivariate 10k │ 10000 │      enzyme │   true │      218.0 μs │                  1.83 │
│               Dynamic │    15 │ forwarddiff │  false │           err │                   err │
│               Dynamic │    15 │ reversediff │  false │       1.43 μs │                 43.30 │
│               Dynamic │    15 │    mooncake │  false │       1.47 μs │                 11.70 │
│               Dynamic │    15 │      enzyme │  false │       1.42 μs │                 10.73 │
│               Dynamic │    10 │ forwarddiff │   true │       1.93 μs │                  1.94 │
│               Dynamic │    10 │ reversediff │   true │       1.97 μs │                 54.83 │
│               Dynamic │    10 │    mooncake │   true │       1.98 μs │                  9.89 │
│               Dynamic │    10 │      enzyme │   true │       1.99 μs │                 17.24 │
│              Submodel │     1 │ forwarddiff │  false │       6.29 ns │                 10.34 │
│              Submodel │     1 │ reversediff │  false │       6.29 ns │               1233.34 │
│              Submodel │     1 │    mooncake │  false │       6.28 ns │                 30.73 │
│              Submodel │     1 │      enzyme │  false │       6.29 ns │                  6.26 │
│              Submodel │     1 │ forwarddiff │   true │       5.98 ns │                 10.77 │
│              Submodel │     1 │ reversediff │   true │        6.3 ns │               1327.82 │
│              Submodel │     1 │    mooncake │   true │       6.28 ns │                 30.80 │
│              Submodel │     1 │      enzyme │   true │       6.29 ns │                  6.21 │
│                   LDA │    12 │ forwarddiff │   true │       22.5 μs │                  0.49 │
│                   LDA │    12 │ reversediff │   true │       23.8 μs │                  1.85 │
│                   LDA │    12 │    mooncake │   true │       23.4 μs │                 29.87 │
│                   LDA │    12 │      enzyme │   true │           err │                   err │
└───────────────────────┴───────┴─────────────┴────────┴───────────────┴───────────────────────┘
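The table above is in long form (one row per model, backend, and linked flag), whereas the head table earlier in this comment is pivoted to one row per model and linked flag. A minimal sketch of that pivot follows. It is hypothetical Python, not the Julia code in `benchmarks.jl`; the field names are invented for illustration, and taking the first non-`err` log-density time as the shared `primal` is an assumption about how a shared value could be chosen.

```python
# Hypothetical sketch of a long-to-wide pivot; field names are illustrative.
LABELS = {
    "forwarddiff": "FwdDiff",
    "reversediff": "RvsDiff",
    "mooncake": "Mooncake",
    "enzyme": "Enzyme",
}


def pivot(long_rows):
    """Group long-form rows by (model, dim, linked) and spread backends into columns."""
    wide = {}
    for row in long_rows:
        key = (row["model"], row["dim"], row["linked"])
        entry = wide.setdefault(key, {"primal": None, "ratios": {}})
        # Assumption: the shared primal is the first log-density time that did not error.
        if entry["primal"] is None and row["t_logdensity"] != "err":
            entry["primal"] = row["t_logdensity"]
        entry["ratios"][LABELS[row["backend"]]] = row["ratio"]
    return wide
```

Applied to the two leading unlinked `Dynamic` rows above (ForwardDiff errored, ReverseDiff at 1.43 μs with ratio 43.30), this would yield a single entry whose `primal` comes from the ReverseDiff row and whose `FwdDiff` cell stays `err`.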

Environment

Julia Version 1.11.9
Commit 53a02c0720c (2026-02-06 00:27 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 4 × Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, icelake-server)
Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores)

seabbs · 2026-05-06T09:43:27Z

+            benchmarks/results.md
+            benchmarks/version_info.txt
+
+  benchmark-main:


This seems to kind of revert the previous PR, but now the comparison ratios have been removed? Quite confused about this churn.

Removing [this](https://github.com/TuringLang/DynamicPPL.jl/pull/1386/changes#diff-387b58af3053f1318400f94ed565d16f15808d793d09e989377000c949cc3310L55) in PR #1386 caused it. Closes #1389

yebai and others added 3 commits May 5, 2026 11:32

github-actions Bot assigned yebai May 5, 2026

format

70cf315

yebai marked this pull request as ready for review May 5, 2026 17:34

yebai and others added 3 commits May 5, 2026 21:03

Use plain text for benchmark main-job failure note

1971c91

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Print only the full benchmark table in markdown mode

8f48885

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

TuringLang deleted a comment from github-actions Bot May 5, 2026

yebai added this pull request to the merge queue May 5, 2026

yebai removed this pull request from the merge queue due to a manual request May 5, 2026

yebai merged commit 4dfc048 into main May 5, 2026
22 checks passed

yebai deleted the benchmarks branch May 5, 2026 21:03

seabbs reviewed May 6, 2026

yebai mentioned this pull request May 6, 2026

Improve benchmarks 1374 #1385

Merged

shravanngoswamii mentioned this pull request May 8, 2026

Fix duplicate benchmark comments #1391

Merged

yebai merged 7 commits into main from benchmarks
