Skip to content

[fix](fe) Merge one row union outputs by slots#64906

Open
morrySnow wants to merge 1 commit into
apache:masterfrom
morrySnow:fix-compress-materialize
Open

[fix](fe) Merge one row union outputs by slots#64906
morrySnow wants to merge 1 commit into
apache:masterfrom
morrySnow:fix-compress-materialize

Conversation

@morrySnow

Copy link
Copy Markdown
Contributor

What problem does this PR solve?

Issue Number: None

Related PR: #24060

Problem Summary: MergeOneRowRelationIntoUnion previously built constant rows by matching one-row child projects to UNION outputs by list index. When a UNION child output used a different position or fewer columns than the one-row child projects, the rewrite could read past the UNION output list or place the wrong constant into the UNION constant expression list. The fix builds a slot-to-expression map from the one-row child and rewrites each regular child output expression through that map, preserving the UNION child output order and cardinality.

Release note

Fix incorrect results or failures for UNION ALL constant queries whose one-row child output order differs from the UNION output order.

Check List (For Author)

  • Test: Regression test / Unit Test
    • ./run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.MergeOneRowRelationIntoUnionTest
    • ./run-regression-test.sh --run -d query_p0/compress_materialize -s compress_materialize -forceGenOut
  • Behavior changed: Yes (fixes UNION constant-row rewrite to follow child output mapping instead of project indexes)
  • Does this need documentation: No

### What problem does this PR solve?

Issue Number: None

Related PR: None

Problem Summary: MergeOneRowRelationIntoUnion previously built constant rows by matching one-row child projects to UNION outputs by list index. When a UNION child output used a different position or fewer columns than the one-row child projects, the rewrite could read past the UNION output list or place the wrong constant into the UNION constant expression list. The fix builds a slot-to-expression map from the one-row child and rewrites each regular child output expression through that map, preserving the UNION child output order and cardinality.

### Release note

Fix incorrect results or failures for UNION ALL constant queries whose one-row child output order differs from the UNION output order.

### Check List (For Author)

- Test: Regression test / Unit Test
    - ./run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.MergeOneRowRelationIntoUnionTest
    - ./run-regression-test.sh --run -d query_p0/compress_materialize -s compress_materialize -forceGenOut
- Behavior changed: Yes (fixes UNION constant-row rewrite to follow child output mapping instead of project indexes)
- Does this need documentation: No
@hello-stephen

Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@morrySnow

Copy link
Copy Markdown
Contributor Author

/review

@morrySnow

Copy link
Copy Markdown
Contributor Author

run buildall

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review summary: I did not find a blocking correctness issue in the PR-scoped diff.

Critical checkpoint conclusions:

  • Goal and proof: The change addresses the stated UNION constant-row ordering bug by deriving merged constant rows from regularChildrenOutputs instead of one-row project indexes. The new unit test covers a one-row child with extra projects and a selected second slot; the regression test covers compressed-materialize UNION ALL grouping/project-order cases.
  • Scope and clarity: The implementation is small and focused on MergeOneRowRelationIntoUnion; test changes are limited to the affected Nereids rule and regression suite.
  • Optimizer semantics and parallel paths: I checked BindExpression.bindSetOperation, PushProjectThroughUnion, ProjectMergeable / PlanUtils.mergeProjections, PushDownEncodeSlot, and physical UNION translation. The child-output slots remain ExprId-compatible with merged one-row aliases, so the replacement map preserves output order and cast/project aliases.
  • Concurrency, lifecycle, persistence, compatibility, FE-BE protocol, and configuration: Not materially involved; no new config, protocol field, storage format, transaction path, or threaded lifecycle is introduced.
  • Tests and expected results: The new order_qt_const_union_* labels match the updated .out sections, use ordered output capture, and the removed explain_sort_agg / exec_sort_agg blocks are stale relative to the current suite. git diff --check passes for the four PR files.
  • Performance and observability: The added per-child map is bounded by the one-row project count and is not a hot runtime path; no extra observability is needed for this planner rewrite.

User focus: No additional user-provided review focus was present.

Subagent conclusions: optimizer-rewrite and tests-session-config both reported no candidates in the initial pass. After the main ledger recorded dismissed points and an empty inline-comment set, both subagents replied NO_NEW_VALUABLE_FINDINGS in convergence round 1.

Validation limit: I did not rerun FE UT or regression tests in this runner because thirdparty/installed and thirdparty/installed/bin/protoc are absent; this matches the local validation limit recorded in the ledger.

No inline review comments were submitted.

@hello-stephen

Copy link
Copy Markdown
Contributor
TPC-H: Total hot run time: 28787 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit f90e86add167569ccee3f85938ad6969fedf7baf, data reload: false

------ Round 1 ----------------------------------
============================================
q1	17625	3976	3950	3950
q2	2026	305	189	189
q3	10310	1399	808	808
q4	4685	463	334	334
q5	7539	829	569	569
q6	176	162	135	135
q7	751	839	642	642
q8	9437	1523	1499	1499
q9	6091	4486	4461	4461
q10	6773	1785	1519	1519
q11	441	269	235	235
q12	624	419	286	286
q13	18139	3376	2727	2727
q14	264	261	244	244
q15	q16	787	768	702	702
q17	1025	1012	914	914
q18	6888	5703	5444	5444
q19	1305	1178	1148	1148
q20	465	389	262	262
q21	5910	2571	2427	2427
q22	413	352	292	292
Total cold run time: 101674 ms
Total hot run time: 28787 ms

----- Round 2, with runtime_filter_mode=off -----
============================================
q1	4288	4231	4238	4231
q2	313	339	227	227
q3	4529	4929	4447	4447
q4	2064	2124	1351	1351
q5	4414	4314	4294	4294
q6	232	177	128	128
q7	1724	1619	1449	1449
q8	2808	2207	2119	2119
q9	8034	8205	8012	8012
q10	4818	4762	4279	4279
q11	589	411	388	388
q12	752	743	534	534
q13	3222	3616	2994	2994
q14	291	314	297	297
q15	q16	722	755	662	662
q17	1370	1309	1294	1294
q18	8129	7229	7151	7151
q19	1068	1072	1090	1072
q20	2211	2219	1946	1946
q21	5217	4565	4409	4409
q22	524	445	407	407
Total cold run time: 57319 ms
Total hot run time: 51691 ms

@hello-stephen

Copy link
Copy Markdown
Contributor
TPC-DS: Total hot run time: 171684 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit f90e86add167569ccee3f85938ad6969fedf7baf, data reload: false

query5	4338	633	486	486
query6	436	192	170	170
query7	4878	527	303	303
query8	322	184	164	164
query9	8745	4021	4028	4021
query10	462	316	283	283
query11	5998	2352	2144	2144
query12	161	103	102	102
query13	1291	621	442	442
query14	6270	5284	4970	4970
query14_1	4324	4291	4280	4280
query15	214	209	183	183
query16	1007	471	435	435
query17	1127	749	585	585
query18	2460	490	365	365
query19	207	186	153	153
query20	119	112	108	108
query21	214	144	117	117
query22	13563	13501	13446	13446
query23	17454	16430	16201	16201
query23_1	16305	16195	16351	16195
query24	7584	1793	1312	1312
query24_1	1328	1322	1296	1296
query25	565	475	387	387
query26	1299	330	176	176
query27	2670	574	367	367
query28	4474	2057	2029	2029
query29	1121	653	503	503
query30	312	239	202	202
query31	1121	1080	974	974
query32	116	65	59	59
query33	526	352	272	272
query34	1183	1140	649	649
query35	785	790	675	675
query36	1418	1402	1291	1291
query37	170	106	98	98
query38	1885	1739	1658	1658
query39	928	913	931	913
query39_1	873	891	882	882
query40	263	127	101	101
query41	65	61	60	60
query42	87	86	86	86
query43	318	323	289	289
query44	1460	794	785	785
query45	195	179	178	178
query46	1085	1196	759	759
query47	2398	2380	2325	2325
query48	406	401	304	304
query49	640	433	318	318
query50	1034	360	257	257
query51	4490	4454	4393	4393
query52	82	81	72	72
query53	245	269	184	184
query54	273	245	246	245
query55	79	72	65	65
query56	259	229	213	213
query57	1505	1475	1377	1377
query58	246	219	211	211
query59	1611	1702	1475	1475
query60	280	252	220	220
query61	153	144	155	144
query62	720	664	592	592
query63	230	195	195	195
query64	2556	783	605	605
query65	4891	4769	4784	4769
query66	1807	448	336	336
query67	28776	28710	28692	28692
query68	3262	1532	990	990
query69	422	297	253	253
query70	1046	969	972	969
query71	302	273	213	213
query72	2949	2650	2355	2355
query73	843	747	451	451
query74	5116	4950	4743	4743
query75	2584	2539	2161	2161
query76	2365	1180	781	781
query77	354	389	286	286
query78	12290	12377	11814	11814
query79	1441	1138	767	767
query80	1274	472	388	388
query81	519	279	240	240
query82	623	156	126	126
query83	320	270	243	243
query84	263	146	112	112
query85	896	503	403	403
query86	430	302	285	285
query87	1844	1825	1773	1773
query88	3741	2807	2777	2777
query89	429	385	351	351
query90	1946	180	182	180
query91	171	160	130	130
query92	63	63	55	55
query93	1652	1497	965	965
query94	740	310	276	276
query95	663	460	350	350
query96	1118	803	380	380
query97	2727	2686	2565	2565
query98	219	202	198	198
query99	1182	1163	1017	1017
Total cold run time: 258565 ms
Total hot run time: 171684 ms

@hello-stephen

Copy link
Copy Markdown
Contributor
ClickBench: Total hot run time: 25.04 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit f90e86add167569ccee3f85938ad6969fedf7baf, data reload: false

query1	0.01	0.01	0.01
query2	0.10	0.05	0.04
query3	0.26	0.13	0.14
query4	1.61	0.14	0.14
query5	0.23	0.26	0.22
query6	1.24	1.05	1.10
query7	0.04	0.01	0.00
query8	0.06	0.04	0.04
query9	0.39	0.31	0.32
query10	0.58	0.54	0.54
query11	0.19	0.14	0.14
query12	0.18	0.14	0.16
query13	0.47	0.47	0.47
query14	1.03	1.01	0.98
query15	0.62	0.60	0.60
query16	0.32	0.32	0.34
query17	1.10	1.07	1.10
query18	0.23	0.21	0.21
query19	2.04	1.98	1.91
query20	0.02	0.01	0.01
query21	15.58	0.22	0.13
query22	4.70	0.05	0.04
query23	16.12	0.31	0.12
query24	3.11	0.41	0.33
query25	0.12	0.05	0.04
query26	0.72	0.21	0.16
query27	0.04	0.03	0.04
query28	3.53	0.91	0.51
query29	12.49	4.32	3.46
query30	0.27	0.14	0.16
query31	2.78	0.60	0.30
query32	3.22	0.59	0.49
query33	3.24	3.26	3.19
query34	15.54	4.26	3.52
query35	3.52	3.53	3.48
query36	0.56	0.43	0.42
query37	0.09	0.07	0.06
query38	0.05	0.04	0.03
query39	0.04	0.02	0.02
query40	0.17	0.16	0.15
query41	0.09	0.03	0.03
query42	0.04	0.03	0.03
query43	0.04	0.03	0.04
Total cold run time: 96.78 s
Total hot run time: 25.04 s

@hello-stephen

Copy link
Copy Markdown
Contributor

FE Regression Coverage Report

Increment line coverage 100.00% (6/6) 🎉
Increment coverage report
Complete coverage report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants