Skip to content

[fix](topn) Fix topn distinct union output mapping#64900

Open
morrySnow wants to merge 1 commit into
apache:masterfrom
morrySnow:fix-push-topn-agg-union
Open

[fix](topn) Fix topn distinct union output mapping#64900
morrySnow wants to merge 1 commit into
apache:masterfrom
morrySnow:fix-push-topn-agg-union

Conversation

@morrySnow

Copy link
Copy Markdown
Contributor

Related PR: #27628

Problem Summary: PushDownTopNDistinctThroughUnion used the rewritten child plan output to map UNION output slots when pushing TopN distinct into UNION branches. For UNION DISTINCT queries with duplicate projection columns and window output, the child plan output can be shorter than the UNION regular child output, causing IndexOutOfBoundsException during rewrite. Use the UNION regular child output for slot remapping and add focused FE and regression coverage for duplicate output with window plus outer ORDER BY LIMIT.

Issue Number: close #xxx

Related PR: apache#27628

Problem Summary: PushDownTopNDistinctThroughUnion used the rewritten child plan output to map UNION output slots when pushing TopN distinct into UNION branches. For UNION DISTINCT queries with duplicate projection columns and window output, the child plan output can be shorter than the UNION regular child output, causing IndexOutOfBoundsException during rewrite. Use the UNION regular child output for slot remapping and add focused FE and regression coverage for duplicate output with window plus outer ORDER BY LIMIT.

None

- Test: Regression test / Unit Test
    - ./run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.PushDownTopNDistinctThroughUnionTest
    - ./run-regression-test.sh --run -d nereids_rules_p0/push_down_top_n -s push_down_top_n_distinct_through_union -forceGenOut
- Behavior changed: No
- Does this need documentation: No
@hello-stephen

Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@morrySnow

Copy link
Copy Markdown
Contributor Author

/review

@morrySnow

Copy link
Copy Markdown
Contributor Author

run buildall

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found one correctness gap that should be fixed before merging: the TopN path now uses the UNION regular child outputs, but the parallel LIMIT DISTINCT-through-UNION rewrite still uses post-rewrite child output positions and can hit the same duplicate-output/window failure for LIMIT-only queries.

Critical checkpoint conclusions:

  • Goal: the PR fixes the ORDER BY/LIMIT TopN variant, but does not fully accomplish the stated root fix across the parallel LIMIT path.
  • Scope/focus: the implementation is small and local; the missing parallel rule update is the main scope gap.
  • Concurrency/lifecycle/config/protocol/transactions/persistence: not involved in these changed FE rewrite/test files.
  • Parallel paths: PushDownLimitDistinctThroughUnion is a functionally parallel path and still has the old mapping.
  • Tests: the new FE and regression coverage exercise the TopN/ORDER BY case; a no-ORDER-BY LIMIT case is still missing.
  • Validation: I did not run FE or regression tests because thirdparty/installed/bin/protoc and thirdparty/installed/bin/thrift are missing in this runner. git diff --check FETCH_HEAD..HEAD on the changed files passed.

User focus: no additional user-provided review focus was supplied.

Subagent conclusions: optimizer-rewrite proposed OPT-1, which I independently verified and accepted as MAIN-1 for the inline comment. tests-session-config reported no TEST-* candidates. In convergence round 1, after MAIN-1 was accepted and the final comment set was recorded, both live subagents replied NO_NEW_VALUABLE_FINDINGS.

NamedExpression output = union.getOutputs().get(i);
replaceMap.put(output, child.getOutput().get(i));
for (int j = 0; j < union.getOutputs().size(); ++j) {
NamedExpression output = union.getOutputs().get(j);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixes the TopN variant, but the parallel LIMIT-only path still has the same output mapping bug. BuildAggForUnion rewrites UNION DISTINCT into a distinct LogicalAggregate over union.withAllQualifier(), and a query with ... LIMIT 10 but no outer ORDER BY reaches PushDownLimitDistinctThroughUnion instead of this TopN rule. That rule still loops over union.getOutputs().size() and reads child.getOutput().get(i), so the duplicate-column/window shape added in this PR can still index past the pruned child output on the LIMIT-only path. Please apply the same union.getRegularChildOutput(childIdx) mapping in PushDownLimitDistinctThroughUnion and add a no-ORDER-BY LIMIT case.

@hello-stephen

Copy link
Copy Markdown
Contributor
TPC-H: Total hot run time: 29453 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 8dbfdab4c5cfb2669421dc3b1378b6b492535a33, data reload: false

------ Round 1 ----------------------------------
============================================
q1	17709	4115	4051	4051
q2	2036	308	186	186
q3	10302	1468	855	855
q4	4683	475	344	344
q5	7518	854	582	582
q6	192	173	137	137
q7	782	837	647	647
q8	9372	1552	1669	1552
q9	5505	4465	4408	4408
q10	6805	1770	1534	1534
q11	442	278	247	247
q12	633	425	301	301
q13	18143	3444	2792	2792
q14	269	269	247	247
q15	q16	787	785	708	708
q17	1023	1029	1032	1029
q18	7030	5857	5602	5602
q19	1306	1245	1094	1094
q20	509	428	269	269
q21	5987	2656	2565	2565
q22	438	360	303	303
Total cold run time: 101471 ms
Total hot run time: 29453 ms

----- Round 2, with runtime_filter_mode=off -----
============================================
q1	4355	4332	4343	4332
q2	317	349	222	222
q3	4554	4976	4511	4511
q4	2087	2179	1413	1413
q5	4488	4313	4308	4308
q6	234	176	131	131
q7	1810	1906	1817	1817
q8	2646	2279	2188	2188
q9	8283	8505	8134	8134
q10	4824	4874	4426	4426
q11	574	427	377	377
q12	764	774	535	535
q13	3243	3666	2900	2900
q14	306	311	293	293
q15	q16	709	730	650	650
q17	1379	1333	1453	1333
q18	7786	7364	7360	7360
q19	1187	1104	1098	1098
q20	2246	2238	1953	1953
q21	5301	4623	4482	4482
q22	559	459	414	414
Total cold run time: 57652 ms
Total hot run time: 52877 ms

@hello-stephen

Copy link
Copy Markdown
Contributor
TPC-DS: Total hot run time: 172573 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 8dbfdab4c5cfb2669421dc3b1378b6b492535a33, data reload: false

query5	4331	628	487	487
query6	438	199	184	184
query7	4878	573	315	315
query8	349	176	161	161
query9	8776	4061	4061	4061
query10	448	329	270	270
query11	5883	2324	2143	2143
query12	151	99	99	99
query13	1244	597	434	434
query14	6247	5291	4994	4994
query14_1	4333	4337	4319	4319
query15	215	203	184	184
query16	999	428	432	428
query17	938	739	592	592
query18	2472	487	347	347
query19	209	202	151	151
query20	114	110	108	108
query21	225	144	122	122
query22	13636	13721	13490	13490
query23	17598	16633	16327	16327
query23_1	16317	16332	16373	16332
query24	7543	1797	1292	1292
query24_1	1330	1328	1311	1311
query25	566	466	398	398
query26	1300	340	177	177
query27	2701	532	345	345
query28	4439	2057	2042	2042
query29	1092	615	495	495
query30	313	239	200	200
query31	1126	1103	959	959
query32	123	63	66	63
query33	537	334	257	257
query34	1164	1119	659	659
query35	794	796	681	681
query36	1394	1400	1190	1190
query37	159	110	95	95
query38	1896	1696	1682	1682
query39	925	922	884	884
query39_1	906	862	900	862
query40	223	135	117	117
query41	71	68	68	68
query42	92	89	91	89
query43	327	326	287	287
query44	1437	806	788	788
query45	207	185	180	180
query46	1076	1202	759	759
query47	2360	2350	2227	2227
query48	405	403	299	299
query49	591	439	329	329
query50	1039	368	267	267
query51	4484	4481	4366	4366
query52	124	82	69	69
query53	243	262	191	191
query54	258	220	211	211
query55	73	69	68	68
query56	237	225	221	221
query57	1451	1398	1322	1322
query58	255	203	212	203
query59	1570	1627	1424	1424
query60	295	251	224	224
query61	153	147	154	147
query62	699	656	594	594
query63	230	195	182	182
query64	2553	777	610	610
query65	4932	4755	4794	4755
query66	1787	489	338	338
query67	29003	28941	28916	28916
query68	3185	1717	987	987
query69	422	313	266	266
query70	1087	968	947	947
query71	293	237	210	210
query72	2946	2687	2209	2209
query73	862	803	444	444
query74	5133	4956	4784	4784
query75	2568	2590	2188	2188
query76	2326	1175	809	809
query77	353	383	277	277
query78	12488	12614	11946	11946
query79	1396	1150	750	750
query80	1285	482	406	406
query81	513	277	239	239
query82	987	160	124	124
query83	356	281	256	256
query84	314	149	115	115
query85	927	525	428	428
query86	483	302	265	265
query87	1851	1849	1780	1780
query88	3764	2829	2798	2798
query89	440	389	337	337
query90	1917	182	186	182
query91	174	158	132	132
query92	66	61	55	55
query93	1649	1568	869	869
query94	762	338	326	326
query95	707	480	356	356
query96	1095	800	351	351
query97	2695	2689	2609	2609
query98	228	208	199	199
query99	1177	1157	1028	1028
Total cold run time: 259051 ms
Total hot run time: 172573 ms

@hello-stephen

Copy link
Copy Markdown
Contributor
ClickBench: Total hot run time: 25.13 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 8dbfdab4c5cfb2669421dc3b1378b6b492535a33, data reload: false

query1	0.01	0.00	0.00
query2	0.12	0.05	0.04
query3	0.26	0.14	0.13
query4	1.62	0.14	0.14
query5	0.25	0.26	0.22
query6	1.21	1.05	1.09
query7	0.03	0.01	0.01
query8	0.06	0.04	0.04
query9	0.37	0.32	0.31
query10	0.55	0.55	0.55
query11	0.21	0.15	0.14
query12	0.18	0.15	0.16
query13	0.48	0.47	0.48
query14	1.03	1.03	1.02
query15	0.63	0.59	0.61
query16	0.30	0.31	0.31
query17	1.09	1.07	1.07
query18	0.22	0.20	0.20
query19	2.08	1.91	1.95
query20	0.02	0.01	0.01
query21	15.48	0.18	0.13
query22	5.00	0.05	0.04
query23	16.19	0.31	0.12
query24	2.99	0.42	0.32
query25	0.12	0.06	0.04
query26	0.73	0.21	0.16
query27	0.04	0.04	0.05
query28	3.55	0.87	0.52
query29	12.49	4.20	3.44
query30	0.26	0.16	0.15
query31	2.77	0.61	0.32
query32	3.24	0.60	0.49
query33	3.27	3.17	3.26
query34	15.65	4.25	3.56
query35	3.55	3.51	3.56
query36	0.56	0.41	0.42
query37	0.09	0.07	0.06
query38	0.05	0.04	0.03
query39	0.04	0.02	0.02
query40	0.19	0.17	0.15
query41	0.08	0.03	0.03
query42	0.04	0.03	0.03
query43	0.05	0.04	0.03
Total cold run time: 97.15 s
Total hot run time: 25.13 s

@hello-stephen

Copy link
Copy Markdown
Contributor

FE UT Coverage Report

Increment line coverage 100.00% (6/6) 🎉
Increment coverage report
Complete coverage report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants