[SIP] Correct totals/subtotals for non-additive metrics (Table + Pivot) by rusackas · Pull Request #41184 · apache/superset

rusackas · 2026-06-18T06:09:14Z

SUMMARY

Draft SIP + prototype development zone for fixing one of Superset's
longest-standing correctness gaps: totals and subtotals are wrong for
non-additive metrics (ratios, COUNT_DISTINCT, AVG, percentiles, and
"Percentage metrics") in both the Table and Pivot Table charts.

The proposal lives in SIP.md. This PR is intentionally a draft so
the SIP, a failing-test matrix of the known reported cases, and the POC
implementation can be rounded out together (TDD: the POC drives the matrix
green).

Core principle: totals must be computed by the DB at the grouping
granularity, never by re-aggregating already-aggregated cells. A SQL aggregate
metric is correct at any grouping level if the same expression is evaluated
grouped at that level, so no formula language is needed for the common cases.

Approach: detect additivity (additive-only = current cheap path, unchanged)
→ for non-additive, compute rollup levels via native GROUPING SETS in a
single query → fall back to multi-query (the #34592 approach) on engines lacking
a new supports_grouping_sets capability → recompute (don't sum) post-processing
% columns → one shared mechanism for Table + Pivot.

Consolidates: #25747 (canonical), #32260, #38674, #36165, #37627,
#34350/#34425/#34426. Design thread: #29297. Builds on prior art in #34592.

Out of scope: #39223, SIP-179 (#34245), #35089, SIP-205 (#38586).

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

N/A (proposal/scaffold).

TESTING INSTRUCTIONS

The TDD acceptance matrix is in SIP.md (§ Validation / TDD test matrix).
Failing tests and POC to follow in this PR.

ADDITIONAL INFORMATION

Has associated issue: relates to TOTALS incorrect for non-additive metric in pivot table #25747, Totals in table charts #29297
Required feature flags: TBD (possible opt-in flag during rollout)
Changes UI
Includes DB Migration (please mention transaction, lock, and revert plan)
Introduces new feature or API
Removes existing feature or API

Draft — not for review yet. Opening early as a shared SIP + prototype space.

Draft proposal + prototype dev zone for correcting totals and subtotals for non-additive metrics across Table and Pivot Table charts. Consolidates issues apache#25747, apache#32260, apache#38674, apache#36165, apache#37627, apache#34350/apache#34425/apache#34426 and the design discussion apache#29297. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…lvage Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Unit-layer acceptance tests encoding the known reported cases from SIP.md: additive fast-path (green guard), the contribution zero-trap behind apache#37627 (characterized), and the ratio/apache#25747 + distinct-count/apache#36165 totals as strict xfails the POC will flip green. CI stays green via xfail(strict). Refs apache#25747, apache#29297 Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

netlify · 2026-06-18T16:44:32Z

✅ Deploy Preview for superset-docs-preview ready!

Name	Link
🔨 Latest commit	`b685383`
🔍 Latest deploy log	https://app.netlify.com/projects/superset-docs-preview/deploys/6a342007f001a300086c1d2c
😎 Deploy Preview	https://deploy-preview-41184--superset-docs-preview.netlify.app
📱 Preview on mobile	Toggle QR Code... Use your smartphone camera to open QR code link.
🤖 Make changes	Run an agent on this branch

To edit notification comments on pull requests, go to your Netlify project configuration.

codecov · 2026-06-18T16:51:00Z

Codecov Report

❌ Patch coverage is 78.64078% with 22 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.31%. Comparing base (ae0b1f0) to head (c246967).
⚠️ Report is 24 commits behind head on master.

Files with missing lines	Patch %	Lines
...s/plugin-chart-pivot-table/src/PivotTableChart.tsx	0.00%	14 Missing ⚠️
...hart-pivot-table/src/react-pivottable/utilities.ts	90.19%	5 Missing ⚠️
...gin-chart-pivot-table/src/plugin/transformProps.ts	77.77%	2 Missing ⚠️
...ivot-table/src/react-pivottable/TableRenderers.tsx	50.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #41184      +/-   ##
==========================================
- Coverage   64.31%   64.31%   -0.01%     
==========================================
  Files        2651     2652       +1     
  Lines      144773   144842      +69     
  Branches    33407    33439      +32     
==========================================
+ Hits        93111    93155      +44     
- Misses      49992    50011      +19     
- Partials     1670     1676       +6

Flag	Coverage Δ
hive	`39.32% <ø> (-0.01%)`	⬇️
javascript	`68.44% <78.64%> (-0.01%)`	⬇️
mysql	`58.07% <ø> (+<0.01%)`	⬆️
postgres	`58.14% <ø> (+0.01%)`	⬆️
presto	`40.90% <ø> (-0.01%)`	⬇️
python	`59.58% <ø> (+<0.01%)`	⬆️
sqlite	`57.79% <ø> (+<0.01%)`	⬆️
unit	`100.00% <ø> (ø)`

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

End-to-end tests on the api/v1/chart/data path against birth_names: - Bucket A regression guards: ratio + COUNT_DISTINCT grand totals are DB-computed at no-GROUP-BY, so they're already correct for the table summary (not the sum of per-group cells). - Bucket B xfail(strict): percent/contribution column is absent from the summary row because the totals query is built with post_processing=[] (apache#37627, apache#34350). Uses get_query_result (not get_payload) to avoid the cache layer. Refs apache#25747, apache#29297 Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

CI confirmed the Bucket A hypothesis: the ratio grand total is DB-computed (passed) and Bucket B percent column is broken (xfailed). The distinct-count guard failed on a too-strong assertion (assert 619 < 619) because names don't recur across states in the birth_names fixture. Switch to COUNT_DISTINCT(gender), which recurs across every state, so summing per-group counts genuinely double-counts and the grand total (=2) is strictly less. Refs apache#25747, apache#29297 Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

POC (phase 1, Table chart) for the non-additive totals SIP. CI confirmed the Table grand total is already DB-correct (Bucket A); the only Table defect was Bucket B: the show_totals summary query was built with post_processing=[], dropping the contribution op behind percent metrics, so the summary % column came back empty (apache#37627, apache#34350). Retain post_processing on that query -- with no GROUP BY the total's contribution to itself is 100%. - buildQuery.ts: inherit post_processing on the show_totals totals query - buildQuery.test.ts: assert the summary query keeps the contribution op - integration: reframe the Bucket B case as a backend-capability guard - SIP.md: record the empirical finding and phase-1 progress Refs apache#25747, apache#37627, apache#29297 Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Phase 2 of the non-additive totals SIP (pivot table). Adopt the rollup-level combination generator from apache#34592 plus its 10 unit tests, and add the Groupby type. Each combination is a prefix of row dims x prefix of column dims; the empty combination is the grand total. buildQuery will issue one query per level so the DB computes each total at its own granularity (multi-query fallback; GROUPING SETS optimization to follow). Pure addition (not yet wired into buildQuery/transformProps, which must change together). SIP.md updated with phase-2 plan + the additivity gate refinement. Refs apache#25747, apache#32260, apache#36165, apache#29297 Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Break down the remaining pivot work (QueryData type, gated multi-query buildQuery, transformProps assembly, and the ~400-line react-pivottable PivotData rewrite + TableRenderers) in dependency order, and note that the rendering rewrite needs app-level visual verification and is best landed after the apache#29297 design call confirms the multi-query approach. Refs apache#25747, apache#29297 Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Adds issue-driven pivot test cases (apache#25747/apache#32260): the production react- pivottable Sum aggregator totals the per-group completion ratios to 0.75 instead of the correct SUM(actual)/SUM(target)=0.4. A test.failing case documents that no client-side aggregator can recover the right value, which is exactly why phase 2 moves totals to DB-computed per-level rollup queries. Refs apache#25747, apache#32260, apache#29297 Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Make pivot subtotals/grand totals correct for non-additive metrics by computing each rollup level in the database instead of re-aggregating already-aggregated cells client-side (adapted from apache#34592, on current TS code). - buildQuery: emit one query per rollup level (prefix of rows x prefix of cols) - transformProps: zip each result with its level into QueryData[]; pick the longest-colnames result as the detail "mainQuery" for metadata/formatters - PivotTableChart: tag every record with the level (rows/columns) that produced it; drop the client-side aggregatorsFactory - react-pivottable/PivotData: replace re-aggregation with placement -- a `cellValue` passthrough stores the DB-computed value verbatim and processRecord drops it into the single matching slot (allTotal/rowTotals/colTotals/tree) - remove the now-meaningless aggregateFunction control; TableRenderers label "Total" instead of "Total (<agg>)" - update buildQuery/transformProps/tableRenders tests to the new contract; 60/60 pivot tests pass Correctness-first: multi-query runs for all metrics (correct for additive too); the additivity gate + GROUPING SETS single-query are perf optimizations tracked as phase 3 in SIP.md. Still needs in-app visual verification. Refs apache#25747, apache#32260, apache#36165, apache#38674, apache#29297 Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

… work Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

pull-request-size Bot added the size/L label Jun 18, 2026

Superset Dev and others added 2 commits June 17, 2026 23:10

docs(sip): note apache#34592 is frontend-only multi-query; clarify sa…

919bc67

…lvage Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

pull-request-size Bot added size/XL and removed size/L labels Jun 18, 2026

Superset Dev and others added 2 commits June 18, 2026 11:27

github-actions Bot added the plugins label Jun 18, 2026

Superset Dev and others added 3 commits June 18, 2026 11:54

pull-request-size Bot added size/XXL and removed size/XL labels Jun 18, 2026

Superset Dev and others added 2 commits June 18, 2026 16:09

docs(sip): mark pivot phase-2 engineering complete; note phase-3 perf…

c246967

… work Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[SIP] Correct totals/subtotals for non-additive metrics (Table + Pivot)#41184

[SIP] Correct totals/subtotals for non-additive metrics (Table + Pivot)#41184
rusackas wants to merge 11 commits into
apache:masterfrom
rusackas:sip-non-additive-totals

rusackas commented Jun 18, 2026

Uh oh!

netlify Bot commented Jun 18, 2026

Uh oh!

codecov Bot commented Jun 18, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

rusackas commented Jun 18, 2026

SUMMARY

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

Uh oh!

netlify Bot commented Jun 18, 2026

✅ Deploy Preview for superset-docs-preview ready!

Uh oh!

codecov Bot commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

codecov Bot commented Jun 18, 2026 •

edited

Loading