Skip to content

[fix](fe) Prevent pushing other join conditions to right child of null-aware anti join when hash conjuncts are empty#64898

Draft
starocean999 wants to merge 1 commit into
apache:masterfrom
starocean999:master_0626
Draft

[fix](fe) Prevent pushing other join conditions to right child of null-aware anti join when hash conjuncts are empty#64898
starocean999 wants to merge 1 commit into
apache:masterfrom
starocean999:master_0626

Conversation

@starocean999

Copy link
Copy Markdown
Contributor

When PushDownJoinOtherCondition processes a NULL_AWARE_LEFT_ANTI_JOIN
with empty hash join conjuncts, it incorrectly pushes the other join conditions
that reference only the right-side slots down to the right child. This changes
the evaluation semantics of the NOT IN subquery, because filtering the right
child before the null-aware anti join eliminates NULL values prematurely.

For a NULL_AWARE_LEFT_ANTI_JOIN, the hash join conjuncts represent the
correlation between the outer and inner tables. When there is no correlation
(i.e., hash conjuncts are empty), the other join conditions must not be pushed
to the right child, as doing so would filter rows from the subquery result
before the NULL check, potentially producing incorrect results.

Example affected query:
SELECT COUNT(*) FROM (SELECT 1 AS x) t
WHERE 1 NOT IN (SELECT CAST(NULL AS INT));

The subquery returns NULL, so NOT IN should evaluate to UNKNOWN (NULL),
meaning no rows match and COUNT returns 0. However, pushing the other condition
to the right child changes this behavior and produces wrong results.

Fix:
Add a guard in PushDownJoinOtherCondition to keep the other condition as
a remaining other conjunct (not pushed to the right child) when the join type
is NULL_AWARE_LEFT_ANTI_JOIN and hash join conjuncts are empty.

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

…l-aware anti join when hash conjuncts are empty
@hello-stephen

Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@starocean999

Copy link
Copy Markdown
Contributor Author

/review

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review summary for PR #64898:

I reviewed the three changed files and the surrounding Nereids null-aware anti join rewrite/translation path. I did not find a substantiated blocking issue, so I am leaving this as a summary-only review with no inline comments.

Critical checkpoint conclusions:

  • Goal and proof: the PR prevents PushDownJoinOtherCondition from pushing a right-child-only predicate below a NULL_AWARE_LEFT_ANTI_JOIN when hashJoinConjuncts is empty. The added qt_select65 covers the constant-vs-null subquery case and expects COUNT(*) = 0.
  • Scope: the code change is small and focused on the affected pushdown rule, with a matching regression output update.
  • Parallel paths: checked InApplyToJoin, FindHashConditionForJoin, JoinUtils.JoinSlotCoverageChecker, repeated RuleSet.PUSH_DOWN_FILTERS use, and physical hash/nested-loop translation. I did not find another path that would still push the same no-hash NAAJ right-only comparison into the build child.
  • Session/config behavior: no new configuration item is added. The regression suite does not set planner variables, but SessionVariable defaults keep Nereids enabled and fallback disabled, so the new case should exercise the Nereids rule path under default regression settings.
  • Concurrency, lifecycle, persistence, data writes, FE-BE protocol compatibility, and observability: not materially involved by this PR.
  • Test/output review: the new query returns a single aggregate row, so ordering is deterministic, and the .out label/result matches the suite case.
  • Validation limit: I did not run FE build or the regression suite locally because this runner lacks thirdparty/installed and thirdparty/installed/bin/protoc. I did run git diff --check on the three PR files, which passed.

User focus: no additional user-provided review focus was present.

Subagent conclusions:

  • optimizer-rewrite reported NO_NEW_VALUABLE_FINDINGS; no optimizer candidate became an inline comment.
  • tests-session-config reported NO_NEW_VALUABLE_FINDINGS; no test/session candidate became an inline comment.
  • Convergence round 1 ended with both live subagents reporting NO_NEW_VALUABLE_FINDINGS for the same current ledger and empty proposed final inline comment set.

@starocean999

Copy link
Copy Markdown
Contributor Author

run buildall

@hello-stephen

Copy link
Copy Markdown
Contributor
TPC-H: Total hot run time: 29036 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 09ffce185d033d49869f6519fe3e8c23c81f061e, data reload: false

------ Round 1 ----------------------------------
============================================
q1	17610	3931	4128	3931
q2	1988	307	193	193
q3	10317	1407	810	810
q4	4689	464	334	334
q5	7492	865	581	581
q6	182	172	140	140
q7	761	823	640	640
q8	9359	1691	1621	1621
q9	6301	4495	4516	4495
q10	6804	1789	1535	1535
q11	436	274	249	249
q12	644	420	291	291
q13	18131	3310	2694	2694
q14	264	259	244	244
q15	q16	790	765	710	710
q17	926	883	926	883
q18	6882	5807	5623	5623
q19	1481	1270	1104	1104
q20	507	396	284	284
q21	5768	2635	2375	2375
q22	420	350	299	299
Total cold run time: 101752 ms
Total hot run time: 29036 ms

----- Round 2, with runtime_filter_mode=off -----
============================================
q1	4304	4240	4271	4240
q2	321	347	225	225
q3	4589	5000	4416	4416
q4	2047	2157	1368	1368
q5	4422	4309	4297	4297
q6	227	173	133	133
q7	1706	1697	1907	1697
q8	2527	2091	2132	2091
q9	8085	8111	8181	8111
q10	4785	4755	4254	4254
q11	574	417	398	398
q12	748	745	532	532
q13	3421	3604	2967	2967
q14	322	311	292	292
q15	q16	734	722	686	686
q17	1366	1313	1317	1313
q18	7944	7331	6985	6985
q19	1125	1082	1116	1082
q20	2217	2248	1979	1979
q21	5232	4560	4386	4386
q22	510	470	412	412
Total cold run time: 57206 ms
Total hot run time: 51864 ms

@hello-stephen

Copy link
Copy Markdown
Contributor
TPC-DS: Total hot run time: 171437 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 09ffce185d033d49869f6519fe3e8c23c81f061e, data reload: false

query5	4318	641	479	479
query6	443	193	166	166
query7	4806	565	304	304
query8	329	180	165	165
query9	8779	4044	4005	4005
query10	433	300	284	284
query11	5922	2373	2129	2129
query12	164	102	99	99
query13	1254	624	429	429
query14	6266	5270	4951	4951
query14_1	4261	4253	4280	4253
query15	223	202	184	184
query16	982	451	460	451
query17	1145	761	581	581
query18	2476	473	346	346
query19	207	194	147	147
query20	113	110	107	107
query21	222	140	126	126
query22	13644	13560	13453	13453
query23	17381	16469	16284	16284
query23_1	16542	16296	16209	16209
query24	7469	1774	1300	1300
query24_1	1330	1290	1311	1290
query25	583	460	397	397
query26	1307	329	170	170
query27	2655	589	359	359
query28	4438	2028	2022	2022
query29	1065	613	495	495
query30	310	239	200	200
query31	1125	1079	955	955
query32	109	64	60	60
query33	514	323	259	259
query34	1192	1154	680	680
query35	753	790	680	680
query36	1427	1422	1251	1251
query37	156	110	94	94
query38	1879	1712	1668	1668
query39	917	919	898	898
query39_1	909	935	890	890
query40	216	117	97	97
query41	64	72	61	61
query42	94	84	88	84
query43	319	322	276	276
query44	1468	790	795	790
query45	207	190	179	179
query46	1103	1241	732	732
query47	2350	2364	2238	2238
query48	369	428	316	316
query49	583	417	304	304
query50	968	357	263	263
query51	4445	4398	4355	4355
query52	80	79	69	69
query53	252	258	195	195
query54	263	216	214	214
query55	73	77	66	66
query56	231	215	232	215
query57	1436	1408	1333	1333
query58	235	212	205	205
query59	1649	1714	1463	1463
query60	277	234	225	225
query61	152	144	151	144
query62	720	633	587	587
query63	229	184	195	184
query64	2493	764	583	583
query65	4852	4762	4749	4749
query66	1798	463	346	346
query67	28889	28731	28609	28609
query68	3145	1532	889	889
query69	450	309	260	260
query70	1018	959	945	945
query71	281	236	215	215
query72	2799	2612	2343	2343
query73	838	791	451	451
query74	5098	4959	4744	4744
query75	2584	2532	2161	2161
query76	2333	1198	763	763
query77	365	391	279	279
query78	12498	12361	11814	11814
query79	1405	1217	735	735
query80	1307	468	399	399
query81	513	279	235	235
query82	576	155	118	118
query83	360	274	244	244
query84	260	144	115	115
query85	899	504	409	409
query86	414	294	290	290
query87	1903	1826	1768	1768
query88	3760	2812	2777	2777
query89	432	384	333	333
query90	1987	180	175	175
query91	168	159	132	132
query92	64	63	55	55
query93	1544	1420	995	995
query94	713	352	311	311
query95	691	392	358	358
query96	1096	807	337	337
query97	2712	2700	2560	2560
query98	210	206	200	200
query99	1182	1154	1023	1023
Total cold run time: 257920 ms
Total hot run time: 171437 ms

@hello-stephen

Copy link
Copy Markdown
Contributor
ClickBench: Total hot run time: 25.04 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 09ffce185d033d49869f6519fe3e8c23c81f061e, data reload: false

query1	0.01	0.01	0.01
query2	0.09	0.05	0.05
query3	0.25	0.14	0.13
query4	1.61	0.14	0.12
query5	0.24	0.23	0.22
query6	1.24	1.08	1.09
query7	0.04	0.01	0.01
query8	0.06	0.04	0.04
query9	0.38	0.33	0.31
query10	0.54	0.56	0.54
query11	0.20	0.13	0.13
query12	0.18	0.15	0.15
query13	0.47	0.47	0.46
query14	1.01	1.00	1.01
query15	0.60	0.59	0.58
query16	0.34	0.33	0.32
query17	1.08	1.08	1.12
query18	0.23	0.21	0.22
query19	2.05	1.94	1.85
query20	0.02	0.01	0.02
query21	15.43	0.24	0.14
query22	4.85	0.05	0.05
query23	16.13	0.31	0.13
query24	2.85	0.41	0.34
query25	0.11	0.06	0.04
query26	0.72	0.21	0.14
query27	0.04	0.04	0.04
query28	3.55	0.89	0.53
query29	12.49	4.32	3.45
query30	0.27	0.15	0.15
query31	2.77	0.59	0.31
query32	3.22	0.59	0.48
query33	3.17	3.18	3.16
query34	15.55	4.20	3.51
query35	3.50	3.48	3.48
query36	0.54	0.43	0.43
query37	0.09	0.07	0.06
query38	0.05	0.04	0.04
query39	0.04	0.03	0.03
query40	0.17	0.15	0.15
query41	0.09	0.03	0.03
query42	0.04	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 96.35 s
Total hot run time: 25.04 s

@hello-stephen

Copy link
Copy Markdown
Contributor

FE Regression Coverage Report

Increment line coverage 66.67% (2/3) 🎉
Increment coverage report
Complete coverage report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants