[fix](nereids) Gate aggregate parent shuffle reuse by NDV stats by foxtail463 · Pull Request #64892 · apache/doris

foxtail463 · 2026-06-26T09:15:32Z

Problem Summary:

PhysicalHashAggregate could still enumerate a parent hash key that is only a strict subset of the group by keys when child statistics were missing or unknown. That allowed CBO to choose a narrower shuffle distribution without evidence that the parent key had enough NDV, which can concentrate data and lead to OOM.
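To make the skew risk concrete, here is a small standalone sketch (not Doris code; the class, method names, and the 64-instance setup are all hypothetical) that hash-partitions rows either by a low-NDV subset key or by the full group-by key and counts how many instances receive any data.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Doris.
final class ShuffleSkewSketch {
    /** Counts how many of `instances` hash buckets receive at least one row. */
    static int busyInstances(int[][] rows, boolean subsetKeyOnly, int instances) {
        Set<Integer> busy = new HashSet<>();
        for (int[] row : rows) {
            // row[0] plays the parent (subset) key, row[1] the remaining group-by key.
            int hash = subsetKeyOnly
                    ? Integer.hashCode(row[0])
                    : Objects.hash(row[0], row[1]);
            busy.add(Math.floorMod(hash, instances));
        }
        return busy.size();
    }

    /** Builds rows whose first column has only `lowNdv` distinct values. */
    static int[][] sampleRows(int rowCount, int lowNdv) {
        int[][] rows = new int[rowCount][];
        for (int i = 0; i < rowCount; i++) {
            rows[i] = new int[] {i % lowNdv, i / lowNdv};
        }
        return rows;
    }

    public static void main(String[] args) {
        int[][] rows = sampleRows(10_000, 4);
        System.out.println("busy instances, subset key: " + busyInstances(rows, true, 64));
        System.out.println("busy instances, full key:   " + busyInstances(rows, false, 64));
    }
}
```

With only 4 distinct values in the subset key, at most 4 of the 64 buckets can ever receive rows, no matter how many rows there are. That is the concentration the problem summary describes.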

Solution:

Require agg_shuffle_use_parent_key to pass a real stats gate before adding the parent subset distribution: child stats and parent key stats must be known, and the estimated parent-key group count must be greater than LOW_NDV_THRESHOLD. Keep the full group key distribution as the conservative fallback.
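The gate described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code in this PR: the class name, the map-based stats representation, and the way the group count is estimated (product of per-key NDVs capped by the child row count) are all hypothetical. Only the threshold value 1024 and the strict `>` comparison come from the review discussion.

```java
import java.util.List;
import java.util.Map;
import java.util.OptionalDouble;

// Hypothetical sketch; Doris uses its own Statistics and expression types.
final class ParentKeyGateSketch {
    // Mirrors the value 1024 quoted in the review for AggregateUtils.LOW_NDV_THRESHOLD.
    static final double LOW_NDV_THRESHOLD = 1024;

    /**
     * @param parentKeys    parent hash keys, a strict subset of the group-by keys
     * @param childNdv      per-column NDV of the aggregate's child, or null if stats are missing
     * @param childRowCount estimated row count of the child
     * @return true only when every input is known and the estimate clears the threshold
     */
    static boolean shouldUseParent(List<String> parentKeys,
                                   Map<String, OptionalDouble> childNdv,
                                   double childRowCount) {
        if (parentKeys.isEmpty() || childNdv == null) {
            return false; // no evidence at all: keep the full group-by distribution
        }
        double estimatedGroups = 1;
        for (String key : parentKeys) {
            OptionalDouble ndv = childNdv.get(key);
            if (ndv == null || !ndv.isPresent()) {
                return false; // unknown stats for a parent key: do not gamble
            }
            estimatedGroups *= ndv.getAsDouble();
        }
        // Assumption: the group count cannot exceed the number of input rows.
        estimatedGroups = Math.min(estimatedGroups, childRowCount);
        return estimatedGroups > LOW_NDV_THRESHOLD; // strictly greater, so exactly 1024 fails
    }

    public static void main(String[] args) {
        Map<String, OptionalDouble> stats = Map.of("k1", OptionalDouble.of(5000));
        System.out.println(shouldUseParent(List.of("k1"), stats, 1_000_000));
        System.out.println(shouldUseParent(List.of("k1"), null, 1_000_000));
    }
}
```

Every early return yields `false`, so any missing input leaves only the full group key distribution as a candidate.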

hello-stephen · 2026-06-26T09:15:37Z

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

What problem was fixed (it's best to include specific error reporting information). How it was fixed.
Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
What features were added. Why was this function added?
Which code was refactored and why was this part of the code refactored?
Which functions were optimized and what is the difference before and after the optimization?

foxtail463 · 2026-06-26T09:15:42Z

run buildall

hello-stephen · 2026-06-26T09:40:44Z

TPC-H: Total hot run time: 29420 ms

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 019cd1b84d0d7217be65a54295d8f6a5a34d4939, data reload: false

------ Round 1 ----------------------------------
============================================
q1	17618	3993	3975	3975
q2	2067	322	203	203
q3	10258	1447	820	820
q4	4683	466	333	333
q5	7571	848	571	571
q6	184	170	139	139
q7	790	835	623	623
q8	9337	1604	1602	1602
q9	5539	4524	4498	4498
q10	6781	1764	1537	1537
q11	444	276	248	248
q12	629	426	290	290
q13	18120	3362	2782	2782
q14	267	262	232	232
q15	q16	789	771	708	708
q17	1042	1029	1020	1020
q18	6830	5772	5438	5438
q19	1326	1311	1093	1093
q20	490	422	257	257
q21	6459	2827	2738	2738
q22	470	378	313	313
Total cold run time: 101694 ms
Total hot run time: 29420 ms

----- Round 2, with runtime_filter_mode=off -----
============================================
q1	5234	4741	4714	4714
q2	331	385	222	222
q3	4896	5291	4637	4637
q4	2089	2168	1378	1378
q5	4747	4863	4649	4649
q6	238	172	127	127
q7	1943	1725	1560	1560
q8	2400	2081	2160	2081
q9	7983	7645	7531	7531
q10	4727	4643	4180	4180
q11	528	374	347	347
q12	722	749	521	521
q13	2987	3301	2799	2799
q14	270	278	247	247
q15	q16	674	704	616	616
q17	1284	1245	1253	1245
q18	7333	6884	6731	6731
q19	1090	1134	1095	1095
q20	2234	2241	1951	1951
q21	5276	4574	4449	4449
q22	542	455	410	410
Total cold run time: 57528 ms
Total hot run time: 51490 ms

hello-stephen · 2026-06-26T09:51:41Z

TPC-DS: Total hot run time: 171785 ms

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 019cd1b84d0d7217be65a54295d8f6a5a34d4939, data reload: false

query5	4337	649	489	489
query6	451	189	173	173
query7	4983	560	309	309
query8	345	186	170	170
query9	8815	4107	4093	4093
query10	428	310	257	257
query11	5894	2339	2145	2145
query12	156	101	95	95
query13	1241	565	401	401
query14	6278	5279	4991	4991
query14_1	4263	4314	4282	4282
query15	217	202	181	181
query16	986	457	419	419
query17	931	724	584	584
query18	2438	472	355	355
query19	208	191	142	142
query20	113	107	107	107
query21	220	151	120	120
query22	13669	13623	13440	13440
query23	17282	16493	16127	16127
query23_1	16310	16295	16265	16265
query24	7632	1802	1319	1319
query24_1	1324	1316	1308	1308
query25	572	462	392	392
query26	1303	313	172	172
query27	2705	551	344	344
query28	4504	2066	2017	2017
query29	1118	659	494	494
query30	313	236	203	203
query31	1110	1085	980	980
query32	111	64	62	62
query33	528	322	262	262
query34	1217	1108	634	634
query35	760	773	679	679
query36	1419	1410	1276	1276
query37	159	114	99	99
query38	1920	1717	1742	1717
query39	921	917	888	888
query39_1	891	863	878	863
query40	248	129	107	107
query41	71	71	67	67
query42	90	89	90	89
query43	337	327	288	288
query44	1437	799	794	794
query45	206	192	184	184
query46	1064	1184	760	760
query47	2351	2376	2209	2209
query48	405	416	299	299
query49	594	434	323	323
query50	981	347	276	276
query51	4437	4406	4251	4251
query52	84	84	72	72
query53	260	266	191	191
query54	287	245	209	209
query55	74	72	68	68
query56	242	232	244	232
query57	1430	1438	1333	1333
query58	272	226	257	226
query59	1579	1655	1472	1472
query60	291	237	234	234
query61	147	142	151	142
query62	708	642	586	586
query63	230	191	195	191
query64	2540	794	616	616
query65	4871	4797	4752	4752
query66	1787	454	343	343
query67	28997	28833	28661	28661
query68	3176	1615	1007	1007
query69	421	309	255	255
query70	1068	932	998	932
query71	292	234	213	213
query72	2998	2654	2317	2317
query73	871	768	443	443
query74	5125	4968	4767	4767
query75	2582	2544	2156	2156
query76	2328	1221	808	808
query77	361	388	287	287
query78	12518	12629	11877	11877
query79	1427	1229	754	754
query80	1293	478	395	395
query81	520	277	238	238
query82	652	160	118	118
query83	353	273	246	246
query84	313	143	119	119
query85	923	567	408	408
query86	468	304	286	286
query87	1839	1818	1778	1778
query88	3735	2806	2775	2775
query89	433	381	339	339
query90	1995	185	183	183
query91	172	158	128	128
query92	62	58	57	57
query93	1558	1496	927	927
query94	814	366	274	274
query95	675	372	455	372
query96	1070	795	360	360
query97	2709	2739	2566	2566
query98	218	213	206	206
query99	1188	1116	1036	1036
Total cold run time: 258722 ms
Total hot run time: 171785 ms

hello-stephen · 2026-06-26T09:56:38Z

ClickBench: Total hot run time: 25.24 s

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 019cd1b84d0d7217be65a54295d8f6a5a34d4939, data reload: false

query1	0.00	0.00	0.00
query2	0.10	0.05	0.05
query3	0.25	0.14	0.13
query4	1.60	0.13	0.14
query5	0.23	0.25	0.22
query6	1.24	1.12	1.03
query7	0.04	0.00	0.00
query8	0.06	0.04	0.04
query9	0.39	0.32	0.31
query10	0.59	0.56	0.56
query11	0.20	0.15	0.14
query12	0.19	0.15	0.14
query13	0.48	0.47	0.49
query14	1.02	1.02	1.01
query15	0.64	0.60	0.60
query16	0.32	0.32	0.33
query17	1.14	1.08	1.16
query18	0.23	0.22	0.21
query19	2.03	1.95	1.99
query20	0.02	0.01	0.02
query21	15.45	0.19	0.14
query22	4.94	0.05	0.06
query23	16.14	0.31	0.12
query24	3.02	0.41	0.34
query25	0.11	0.04	0.05
query26	0.77	0.20	0.15
query27	0.03	0.04	0.03
query28	3.49	0.93	0.53
query29	12.47	4.32	3.45
query30	0.29	0.15	0.16
query31	2.78	0.63	0.31
query32	3.23	0.59	0.49
query33	3.21	3.24	3.18
query34	15.64	4.21	3.54
query35	3.53	3.53	3.55
query36	0.56	0.45	0.44
query37	0.09	0.07	0.06
query38	0.05	0.03	0.04
query39	0.04	0.03	0.02
query40	0.18	0.16	0.14
query41	0.09	0.03	0.04
query42	0.05	0.03	0.03
query43	0.05	0.04	0.04
Total cold run time: 96.98 s
Total hot run time: 25.24 s

foxtail463 · 2026-06-26T09:57:34Z

run buildall

hello-stephen · 2026-06-26T10:53:40Z

TPC-H: Total hot run time: 28727 ms

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 2de423e6fa25661ce9af7e98830e7c9f94268d0f, data reload: false

------ Round 1 ----------------------------------
============================================
q1	17681	3985	3934	3934
q2	2007	311	187	187
q3	10294	1368	783	783
q4	4677	462	332	332
q5	7545	832	583	583
q6	175	168	135	135
q7	740	846	626	626
q8	9649	1492	1653	1492
q9	6158	4481	4487	4481
q10	6824	1767	1516	1516
q11	446	279	238	238
q12	649	429	289	289
q13	18121	3406	2766	2766
q14	273	254	233	233
q15	q16	778	780	706	706
q17	1122	1049	907	907
q18	6867	5866	5510	5510
q19	1271	1218	1036	1036
q20	476	399	260	260
q21	5571	2593	2409	2409
q22	438	354	304	304
Total cold run time: 101762 ms
Total hot run time: 28727 ms

----- Round 2, with runtime_filter_mode=off -----
============================================
q1	4347	4249	4261	4249
q2	308	353	218	218
q3	4595	4935	4403	4403
q4	2050	2135	1345	1345
q5	4411	4511	4251	4251
q6	228	180	129	129
q7	1704	1625	1718	1625
q8	2552	2137	2150	2137
q9	8161	8069	8071	8069
q10	4780	4725	4282	4282
q11	579	402	378	378
q12	755	749	544	544
q13	3367	3654	2952	2952
q14	297	322	288	288
q15	q16	741	756	642	642
q17	1338	1300	1312	1300
q18	7998	7313	7180	7180
q19	1107	1092	1092	1092
q20	2228	2214	1934	1934
q21	5201	4535	4439	4439
q22	504	455	433	433
Total cold run time: 57251 ms
Total hot run time: 51890 ms

hello-stephen · 2026-06-26T11:04:30Z

TPC-DS: Total hot run time: 171183 ms

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 2de423e6fa25661ce9af7e98830e7c9f94268d0f, data reload: false

query5	4311	624	483	483
query6	434	183	166	166
query7	4902	532	301	301
query8	328	177	159	159
query9	8756	4039	4045	4039
query10	470	311	250	250
query11	5904	2395	2124	2124
query12	158	105	100	100
query13	1280	593	436	436
query14	6289	5282	4964	4964
query14_1	4320	4269	4298	4269
query15	223	202	181	181
query16	1007	474	439	439
query17	1129	744	618	618
query18	2459	468	353	353
query19	202	187	149	149
query20	123	107	108	107
query21	217	136	118	118
query22	13557	13588	13517	13517
query23	17450	16633	16182	16182
query23_1	16292	16376	16160	16160
query24	7511	1758	1331	1331
query24_1	1300	1304	1279	1279
query25	581	478	347	347
query26	1306	297	167	167
query27	2705	540	331	331
query28	4417	2004	1995	1995
query29	1045	592	469	469
query30	306	224	210	210
query31	1146	1074	953	953
query32	107	62	57	57
query33	506	309	242	242
query34	1174	1119	655	655
query35	752	772	657	657
query36	1362	1394	1199	1199
query37	153	105	93	93
query38	1885	1702	1672	1672
query39	918	915	934	915
query39_1	896	881	895	881
query40	229	121	99	99
query41	63	64	65	64
query42	89	87	88	87
query43	317	322	274	274
query44	1405	780	789	780
query45	200	186	176	176
query46	1059	1219	758	758
query47	2365	2316	2260	2260
query48	412	412	300	300
query49	562	416	313	313
query50	1021	352	256	256
query51	4478	4427	4342	4342
query52	82	81	69	69
query53	244	276	186	186
query54	259	212	186	186
query55	72	72	64	64
query56	241	224	224	224
query57	1437	1412	1335	1335
query58	236	206	198	198
query59	1556	1634	1482	1482
query60	282	238	224	224
query61	172	148	147	147
query62	704	643	575	575
query63	222	187	204	187
query64	2497	786	597	597
query65	4883	4784	4732	4732
query66	1792	455	330	330
query67	28181	28844	28658	28658
query68	3203	1509	926	926
query69	419	305	263	263
query70	1078	970	947	947
query71	284	242	219	219
query72	2847	2632	2271	2271
query73	812	792	432	432
query74	5113	4930	4776	4776
query75	2556	2568	2171	2171
query76	2317	1181	782	782
query77	350	388	306	306
query78	12223	12339	11849	11849
query79	1390	1137	724	724
query80	584	477	370	370
query81	439	275	238	238
query82	564	155	119	119
query83	356	273	239	239
query84	270	144	109	109
query85	839	516	401	401
query86	367	293	300	293
query87	1835	1861	1778	1778
query88	3681	2766	2757	2757
query89	437	379	336	336
query90	1866	182	166	166
query91	172	155	132	132
query92	60	59	55	55
query93	1476	1451	910	910
query94	573	363	318	318
query95	665	376	445	376
query96	1068	788	362	362
query97	2717	2682	2583	2583
query98	218	201	196	196
query99	1166	1150	1033	1033
Total cold run time: 255095 ms
Total hot run time: 171183 ms

hello-stephen · 2026-06-26T11:09:20Z

ClickBench: Total hot run time: 25.15 s

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 2de423e6fa25661ce9af7e98830e7c9f94268d0f, data reload: false

query1	0.01	0.01	0.01
query2	0.09	0.06	0.09
query3	0.26	0.13	0.13
query4	1.60	0.14	0.13
query5	0.24	0.23	0.23
query6	1.21	1.08	1.12
query7	0.04	0.00	0.00
query8	0.05	0.04	0.03
query9	0.39	0.32	0.33
query10	0.58	0.54	0.54
query11	0.19	0.13	0.14
query12	0.18	0.14	0.14
query13	0.48	0.47	0.48
query14	1.01	1.01	1.00
query15	0.61	0.59	0.60
query16	0.32	0.32	0.31
query17	1.11	1.13	1.09
query18	0.22	0.21	0.22
query19	2.10	1.96	1.92
query20	0.02	0.01	0.02
query21	15.44	0.22	0.13
query22	4.81	0.05	0.06
query23	16.13	0.30	0.12
query24	2.97	0.39	0.34
query25	0.12	0.06	0.04
query26	0.77	0.20	0.16
query27	0.05	0.04	0.03
query28	3.54	0.87	0.51
query29	12.50	4.30	3.45
query30	0.27	0.16	0.16
query31	2.77	0.60	0.32
query32	3.23	0.60	0.49
query33	3.26	3.26	3.19
query34	15.52	4.20	3.50
query35	3.55	3.51	3.53
query36	0.56	0.42	0.41
query37	0.09	0.06	0.06
query38	0.05	0.05	0.03
query39	0.04	0.03	0.03
query40	0.18	0.15	0.14
query41	0.08	0.04	0.03
query42	0.04	0.03	0.02
query43	0.04	0.04	0.03
Total cold run time: 96.72 s
Total hot run time: 25.15 s

hello-stephen · 2026-06-26T12:43:42Z

FE Regression Coverage Report

Increment line coverage 92.00% (23/25) 🎉
Increment coverage report
Complete coverage report

morrySnow

Thanks for this fix! The change from default-true to default-false in shouldUseParent when stats are missing is the right safety trade-off — preventing OOM is much more important than a potentially narrower shuffle. The approach of pre-resolving parent hash expressions against group-by expressions in visitPhysicalHashAggregate also simplifies shouldUseParent nicely.

I left a few inline comments for your consideration.

morrySnow · 2026-06-29T04:13:51Z

            PlanContext context) {
        if (!context.getConnectContext().getSessionVariable().aggShuffleUseParentKey) {
            return false;
        }


Good call — if there is no group expression at all, we cannot derive stats and should not gamble on the parent subset key.

morrySnow · 2026-06-29T04:13:51Z

        }
        if (agg.hasSourceRepeat()) {
            return false;
        }


This is the core fix — previously returning true here meant the optimizer would use the narrower parent hash key even with no stats at all, which could lead to severe data skew and OOM. Returning false (fall through to the full group-by key) is the conservative and correct choice.

Consider adding a brief comment here explaining the rationale, e.g.:

        // Without stats we cannot assess whether the parent subset key has enough
        // NDV to avoid skew; fall back to the safe full group-by distribution.

morrySnow · 2026-06-29T04:13:51Z

-            if (exprIdSlotMap.containsKey(exprId)) {
-                parentHashExprs.add(exprIdSlotMap.get(exprId));
-            }
+            return false;


Same pattern as above — hasUnknownStatistics returning true now correctly causes us to skip the parent subset optimization instead of blindly trying it.

morrySnow · 2026-06-29T04:13:51Z

        }
        if (AggregateUtils.hasUnknownStatistics(parentHashExprs, aggChildStats)) {
-            return true;
+            return false;


Note: NDV exactly equal to LOW_NDV_THRESHOLD (1024) is treated as insufficient — this is consistent with how SplitAggMultiPhase also uses > (strictly greater), so the threshold boundary is uniform across callers. 👍

morrySnow · 2026-06-29T04:13:51Z

+        expected.add(Lists.newArrayList(PhysicalProperties.createHash(
+                Lists.newArrayList(key1.getExprId(), key2.getExprId()), ShuffleType.REQUIRE)));
+        Assertions.assertEquals(expected, actual);
+    }


The childStatistics override returns null here, which exercises the aggChildStats == null → return false path. The test now correctly expects only the full group-by key distribution. Consider updating the comment above to reflect the new behavior (e.g., // When stats are null, parent subset should NOT be used).
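The stubbing pattern this comment refers to can be illustrated with a self-contained sketch. The types below are hypothetical stand-ins, not the Doris `GroupExpression` or `Statistics` classes, and the order of the returned candidates is an assumption. The point is only that overriding the stats accessor to return null should leave the full group-by key set as the sole candidate.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; these are not Doris classes.
final class NullStatsFallbackSketch {
    /** Minimal stats provider that a test can override with an anonymous subclass. */
    static class StatsSource {
        Double parentKeyNdv() {
            return 5000.0;
        }
    }

    /** Returns candidate shuffle key sets; the full group-by key set is always included. */
    static List<List<String>> candidateShuffleKeys(List<String> groupByKeys,
                                                   List<String> parentKeys,
                                                   StatsSource stats) {
        List<List<String>> candidates = new ArrayList<>();
        Double ndv = stats.parentKeyNdv();
        // Null means "stats missing": skip the parent subset entirely.
        if (ndv != null && ndv > 1024) {
            candidates.add(parentKeys);
        }
        candidates.add(groupByKeys); // conservative fallback, always present
        return candidates;
    }

    public static void main(String[] args) {
        StatsSource nullStats = new StatsSource() {
            @Override
            Double parentKeyNdv() {
                return null; // same idea as a childStatistics override returning null
            }
        };
        List<String> groupBy = List.of("key1", "key2");
        List<String> parent = List.of("key1");
        System.out.println(candidateShuffleKeys(groupBy, parent, nullStats));
        System.out.println(candidateShuffleKeys(groupBy, parent, new StatsSource()));
    }
}
```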

morrySnow · 2026-06-29T04:13:51Z

+        GroupExpression groupExpression = new GroupExpression(aggregate) {
+            @Override
+            public Statistics childStatistics(int idx) {
+                return childStats;


Nice boundary test — setNdv(AggregateUtils.LOW_NDV_THRESHOLD) (1024) and correctly expecting the parent key NOT to be used, since combinedNdv > LOW_NDV_THRESHOLD is false when NDV is exactly at the threshold.

morrySnow · 2026-06-29T04:13:51Z

-                                                                    physicalHashAggregate(
-                                                                            physicalDistribute(any())))),
+                                                                    physicalDistribute(
+                                                                            physicalHashAggregate(


The physicalDistribute wrappers are now expected in the plan shape because shouldUseParent no longer returns true when stats are unknown (which is the case in this unit test). Previously, the parent subset key was blindly adopted, which could eliminate the distribute node. This test change correctly reflects the stricter stats gate — the full group-by key distribution is used, and the distribute is preserved.

This is an intended side effect of the fix, but worth confirming: is the plan shape here what you would expect to see in production queries after this change?

[fix](nereids) Gate aggregate parent shuffle reuse by NDV stats

096bafd

foxtail463 requested review from 924060929, englefly, morrySnow and starocean999 as code owners June 26, 2026 09:15

Fix test

2de423e

foxtail463 force-pushed the fix/cbo-agg-parent-shuffle-distinct-oom branch from 019cd1b to 2de423e Compare June 26, 2026 09:55

morrySnow reviewed Jun 29, 2026

3 participants