@morrySnow
Contributor

What problem does this PR solve?

Related PR: #57204

Problem Summary:

This pull request refactors and improves the PushDownProject rule in the Nereids optimizer, focusing on the logic that pushes projections down through UNION operations. It also adds a unit test that verifies the new logic, and it makes the relevant methods easier to test.

Refactoring and Logic Improvements:

  • Refactored the pushThroughUnion logic by extracting it into a new static method, making it easier to test and use independently. The main logic now takes explicit arguments instead of relying on the context object.
  • Improved the handling of projections and child outputs when pushing down through UNION, ensuring correct mapping and replacement of slots. This includes using the regular outputs of the children and the constant expressions, and making the slot replacement logic static for better testability.
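
The slot mapping described in the bullets above can be illustrated with a minimal sketch. It is not the code from this PR: the class name UnionProjectionSketch, the methods replaceSlots and pushToChildren, and the use of plain strings as slots are all assumptions made for illustration, and the sketch only handles projections that are bare slot references. The idea it shows is that each child of the union gets its own copy of the projection, with every union output slot replaced by the child output slot at the same position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model for illustration only. None of these names
// or signatures come from the Doris code base.
final class UnionProjectionSketch {

    // Replace each projected slot using the map (union slot -> child slot).
    static List<String> replaceSlots(List<String> projection, Map<String, String> replaceMap) {
        List<String> rewritten = new ArrayList<>();
        for (String slot : projection) {
            String replacement = replaceMap.get(slot);
            if (replacement == null) {
                throw new IllegalArgumentException("slot not produced by the union: " + slot);
            }
            rewritten.add(replacement);
        }
        return rewritten;
    }

    // For every child, build the positional map from union output slots to
    // that child's output slots, then rewrite the projection against it.
    static List<List<String>> pushToChildren(List<String> projection,
                                             List<String> unionOutput,
                                             List<List<String>> childrenOutputs) {
        List<List<String>> perChild = new ArrayList<>();
        for (List<String> childOutput : childrenOutputs) {
            if (childOutput.size() != unionOutput.size()) {
                throw new IllegalArgumentException("child output arity differs from union output");
            }
            Map<String, String> replaceMap = new HashMap<>();
            for (int i = 0; i < unionOutput.size(); i++) {
                replaceMap.put(unionOutput.get(i), childOutput.get(i));
            }
            perChild.add(replaceSlots(projection, replaceMap));
        }
        return perChild;
    }

    public static void main(String[] args) {
        // The union outputs (a, b, c); the projection keeps b and a, in that order.
        List<List<String>> pushed = pushToChildren(
                Arrays.asList("b", "a"),
                Arrays.asList("a", "b", "c"),
                Arrays.asList(
                        Arrays.asList("x1", "x2", "x3"),
                        Arrays.asList("y1", "y2", "y3")));
        System.out.println(pushed); // prints [[x2, x1], [y2, y1]]
    }
}
```

In this sketch a union without children simply yields an empty list. The constant expressions mentioned above would need the same positional rewrite applied to each constant row, which the sketch leaves out.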

Testing Enhancements:

  • Added a new unit test class PushDownProjectTest to rigorously test the pushdown logic in various scenarios, including unions with and without children. The tests verify both the structure and the correctness of the rewritten plans.

Code Quality Improvements:

  • Added the @VisibleForTesting annotation and imported necessary dependencies to clarify method visibility and intent for testing.
  • Replaced some usages of Collection with List for better type safety and clarity in projection handling.

These changes make the projection pushdown logic more modular, testable, and robust, and provide strong test coverage for future maintenance.

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Dec 5, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@morrySnow
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34445 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 7ab0330bae284857ae182595fc0651aa513b6cd5, data reload: false

------ Round 1 ----------------------------------
q1	17617	4958	4898	4898
q2	2040	314	214	214
q3	10254	1309	732	732
q4	10220	833	317	317
q5	7473	2442	2173	2173
q6	186	177	136	136
q7	957	782	638	638
q8	9361	1479	1162	1162
q9	7071	5363	5359	5359
q10	6848	2188	1808	1808
q11	534	322	297	297
q12	379	378	220	220
q13	17768	3743	3074	3074
q14	239	241	216	216
q15	583	508	514	508
q16	909	874	825	825
q17	681	790	526	526
q18	7829	7164	7060	7060
q19	1092	979	614	614
q20	385	346	229	229
q21	3994	3253	2498	2498
q22	1023	991	941	941
Total cold run time: 107443 ms
Total hot run time: 34445 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4999	4973	4907	4907
q2	330	404	322	322
q3	2133	2727	2287	2287
q4	1341	1771	1285	1285
q5	4279	4501	4611	4501
q6	233	187	137	137
q7	2108	1975	1834	1834
q8	2733	2676	2686	2676
q9	7537	7524	7812	7524
q10	3088	3226	2887	2887
q11	604	519	497	497
q12	699	800	582	582
q13	3534	3981	3415	3415
q14	305	301	272	272
q15	540	510	517	510
q16	901	957	891	891
q17	1216	1361	1420	1361
q18	7930	7875	7562	7562
q19	928	882	909	882
q20	2058	2109	1930	1930
q21	5084	4453	4159	4159
q22	1081	1040	986	986
Total cold run time: 53661 ms
Total hot run time: 51407 ms

@doris-robot

TPC-DS: Total hot run time: 180274 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 7ab0330bae284857ae182595fc0651aa513b6cd5, data reload: false

query5	4412	658	491	491
query6	370	242	229	229
query7	4658	476	289	289
query8	316	257	250	250
query9	8730	2676	2649	2649
query10	515	334	272	272
query11	15290	14730	14535	14535
query12	190	121	118	118
query13	1709	475	363	363
query14	5483	3321	3018	3018
query14_1	2910	2847	2897	2847
query15	208	197	177	177
query16	7165	479	442	442
query17	1178	684	587	587
query18	1991	424	339	339
query19	207	182	156	156
query20	127	123	113	113
query21	220	137	118	118
query22	3880	3981	3889	3889
query23	16623	16088	16063	16063
query23_1	16183	16033	16078	16033
query24	7231	1601	1192	1192
query24_1	1214	1190	1212	1190
query25	639	508	455	455
query26	1268	301	182	182
query27	2897	473	309	309
query28	4417	2193	2179	2179
query29	879	585	468	468
query30	311	236	221	221
query31	814	721	628	628
query32	89	80	71	71
query33	689	364	320	320
query34	858	881	518	518
query35	803	863	747	747
query36	881	917	832	832
query37	122	88	76	76
query38	3847	3940	3804	3804
query39	758	734	715	715
query39_1	694	889	697	697
query40	233	145	122	122
query41	73	69	70	69
query42	136	101	103	101
query43	442	420	404	404
query44	1308	773	761	761
query45	197	195	184	184
query46	918	964	594	594
query47	1724	1734	1672	1672
query48	412	323	242	242
query49	823	446	383	383
query50	685	302	240	240
query51	3855	3831	3781	3781
query52	118	99	88	88
query53	239	238	181	181
query54	332	270	254	254
query55	96	81	79	79
query56	351	317	323	317
query57	1172	1142	1144	1142
query58	282	250	254	250
query59	2261	2353	2278	2278
query60	341	313	296	296
query61	160	157	151	151
query62	781	713	668	668
query63	233	175	179	175
query64	4580	1174	872	872
query65	4052	3942	3948	3942
query66	1259	435	332	332
query67	15513	15588	15201	15201
query68	8301	976	702	702
query69	533	311	267	267
query70	1148	1050	1004	1004
query71	464	309	283	283
query72	6235	4977	5123	4977
query73	724	611	304	304
query74	8598	8804	8794	8794
query75	3282	3051	2480	2480
query76	3408	1132	763	763
query77	771	396	291	291
query78	9562	9660	8808	8808
query79	1640	853	583	583
query80	703	552	462	462
query81	524	278	236	236
query82	205	127	101	101
query83	286	273	258	258
query84	276	112	96	96
query85	906	492	436	436
query86	379	282	304	282
query87	4095	4102	3978	3978
query88	3828	2146	2109	2109
query89	382	321	286	286
query90	2115	162	156	156
query91	170	165	142	142
query92	88	69	66	66
query93	2015	1069	684	684
query94	751	294	292	292
query95	575	387	335	335
query96	543	501	214	214
query97	2608	2701	2550	2550
query98	246	207	208	207
query99	1349	1305	1233	1233
Total cold run time: 270484 ms
Total hot run time: 180274 ms

@doris-robot

ClickBench: Total hot run time: 27.34 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 7ab0330bae284857ae182595fc0651aa513b6cd5, data reload: false

query1	0.06	0.05	0.04
query2	0.10	0.05	0.05
query3	0.26	0.09	0.09
query4	1.60	0.11	0.11
query5	0.25	0.25	0.25
query6	1.17	0.64	0.63
query7	0.04	0.03	0.03
query8	0.05	0.04	0.04
query9	0.57	0.50	0.51
query10	0.57	0.56	0.54
query11	0.16	0.11	0.12
query12	0.14	0.12	0.12
query13	0.61	0.60	0.61
query14	0.98	0.99	0.98
query15	0.82	0.80	0.83
query16	0.40	0.39	0.41
query17	1.05	1.05	1.05
query18	0.23	0.21	0.21
query19	1.96	1.78	1.72
query20	0.02	0.01	0.01
query21	15.43	0.29	0.14
query22	4.82	0.06	0.05
query23	16.04	0.28	0.10
query24	1.22	0.26	0.76
query25	0.09	0.06	0.04
query26	0.13	0.12	0.12
query27	0.04	0.06	0.05
query28	4.92	1.21	1.02
query29	12.61	4.10	3.34
query30	0.30	0.13	0.14
query31	2.80	0.62	0.41
query32	3.24	0.55	0.46
query33	3.03	3.05	3.07
query34	16.90	5.17	4.54
query35	4.62	4.56	4.54
query36	0.68	0.50	0.49
query37	0.10	0.06	0.06
query38	0.07	0.04	0.03
query39	0.04	0.03	0.03
query40	0.17	0.13	0.13
query41	0.08	0.03	0.03
query42	0.04	0.03	0.02
query43	0.04	0.04	0.03
Total cold run time: 98.45 s
Total hot run time: 27.34 s

@morrySnow
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34604 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 7ab0330bae284857ae182595fc0651aa513b6cd5, data reload: false

------ Round 1 ----------------------------------
q1	17611	5104	4943	4943
q2	2069	314	197	197
q3	10237	1318	768	768
q4	10221	877	320	320
q5	7523	2396	2193	2193
q6	183	173	138	138
q7	1001	771	645	645
q8	9352	1426	1155	1155
q9	7047	5341	5396	5341
q10	6793	2202	1826	1826
q11	525	312	316	312
q12	337	364	231	231
q13	17779	3671	3017	3017
q14	233	244	213	213
q15	569	520	523	520
q16	915	883	818	818
q17	678	782	538	538
q18	7542	7160	7066	7066
q19	1093	970	623	623
q20	381	355	230	230
q21	4053	3421	2547	2547
q22	1057	1023	963	963
Total cold run time: 107199 ms
Total hot run time: 34604 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5033	4986	5001	4986
q2	330	404	333	333
q3	2183	2727	2279	2279
q4	1334	1740	1334	1334
q5	4208	4513	4638	4513
q6	225	175	129	129
q7	2084	2020	1861	1861
q8	2741	2561	2631	2561
q9	7612	7589	7547	7547
q10	3170	3275	2880	2880
q11	605	510	496	496
q12	679	769	610	610
q13	3510	3855	3308	3308
q14	291	292	295	292
q15	553	532	511	511
q16	940	955	888	888
q17	1175	1530	1511	1511
q18	7902	7653	7641	7641
q19	931	905	869	869
q20	2004	2133	1946	1946
q21	5122	4408	4128	4128
q22	1109	999	982	982
Total cold run time: 53741 ms
Total hot run time: 51605 ms

@doris-robot

TPC-DS: Total hot run time: 179917 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 7ab0330bae284857ae182595fc0651aa513b6cd5, data reload: false

query5	5108	663	487	487
query6	346	247	212	212
query7	4661	485	280	280
query8	316	246	242	242
query9	8744	2645	2668	2645
query10	580	307	268	268
query11	15781	14830	14716	14716
query12	175	119	120	119
query13	1682	494	381	381
query14	6243	3267	3035	3035
query14_1	2906	2919	2867	2867
query15	208	200	182	182
query16	7706	491	463	463
query17	1236	725	613	613
query18	2055	440	348	348
query19	217	192	168	168
query20	136	133	123	123
query21	219	145	115	115
query22	3906	4230	3867	3867
query23	16827	16325	16108	16108
query23_1	16129	16036	16151	16036
query24	7242	1631	1195	1195
query24_1	1216	1193	1250	1193
query25	641	497	463	463
query26	1270	288	177	177
query27	2883	485	316	316
query28	4372	2197	2186	2186
query29	842	590	497	497
query30	321	249	220	220
query31	812	710	615	615
query32	91	73	74	73
query33	690	362	352	352
query34	874	884	535	535
query35	797	810	749	749
query36	889	924	821	821
query37	136	99	75	75
query38	3866	3884	3729	3729
query39	767	738	702	702
query39_1	719	702	694	694
query40	226	128	118	118
query41	69	60	64	60
query42	127	101	97	97
query43	435	427	395	395
query44	1310	760	765	760
query45	194	192	185	185
query46	914	963	601	601
query47	1691	1739	1632	1632
query48	400	317	242	242
query49	774	439	357	357
query50	677	306	225	225
query51	3875	3888	3808	3808
query52	114	95	87	87
query53	236	231	178	178
query54	322	254	236	236
query55	97	84	76	76
query56	343	307	286	286
query57	1166	1163	1091	1091
query58	285	281	257	257
query59	2292	2382	2440	2382
query60	377	322	301	301
query61	158	161	160	160
query62	765	703	629	629
query63	230	178	178	178
query64	4550	1187	898	898
query65	4048	3960	3956	3956
query66	1186	435	338	338
query67	15462	15047	14874	14874
query68	8461	929	668	668
query69	518	298	266	266
query70	1109	1016	982	982
query71	471	296	275	275
query72	6004	4929	4997	4929
query73	675	566	301	301
query74	8681	8815	8593	8593
query75	3535	3053	2526	2526
query76	3656	1127	732	732
query77	813	416	298	298
query78	9518	9514	8840	8840
query79	1896	867	583	583
query80	742	565	475	475
query81	497	270	238	238
query82	439	127	103	103
query83	311	266	246	246
query84	301	111	96	96
query85	908	499	448	448
query86	339	313	282	282
query87	4002	4085	3914	3914
query88	3945	2108	2117	2108
query89	386	333	276	276
query90	2101	162	161	161
query91	169	166	139	139
query92	88	69	62	62
query93	1566	1046	680	680
query94	771	303	294	294
query95	573	376	320	320
query96	540	498	214	214
query97	2616	2663	2559	2559
query98	243	200	203	200
query99	1393	1320	1255	1255
Total cold run time: 273135 ms
Total hot run time: 179917 ms

@doris-robot

ClickBench: Total hot run time: 27.34 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 7ab0330bae284857ae182595fc0651aa513b6cd5, data reload: false

query1	0.06	0.05	0.05
query2	0.09	0.05	0.05
query3	0.25	0.09	0.08
query4	1.61	0.11	0.11
query5	0.26	0.26	0.27
query6	1.17	0.65	0.62
query7	0.03	0.02	0.03
query8	0.06	0.04	0.04
query9	0.59	0.51	0.51
query10	0.57	0.55	0.56
query11	0.15	0.11	0.12
query12	0.15	0.11	0.12
query13	0.62	0.60	0.60
query14	0.99	0.98	1.00
query15	0.82	0.79	0.80
query16	0.40	0.41	0.42
query17	1.00	0.99	1.01
query18	0.23	0.21	0.21
query19	1.94	1.80	1.89
query20	0.02	0.01	0.01
query21	15.44	0.31	0.15
query22	4.85	0.05	0.04
query23	16.12	0.28	0.10
query24	1.22	0.31	0.27
query25	0.07	0.08	0.07
query26	0.14	0.12	0.14
query27	0.08	0.04	0.06
query28	3.37	1.26	1.01
query29	12.60	4.01	3.24
query30	0.27	0.14	0.12
query31	2.81	0.64	0.39
query32	3.26	0.55	0.46
query33	3.11	3.14	3.06
query34	16.67	5.15	4.56
query35	4.54	4.58	4.54
query36	0.67	0.51	0.50
query37	0.11	0.07	0.07
query38	0.08	0.05	0.04
query39	0.04	0.03	0.03
query40	0.18	0.14	0.14
query41	0.08	0.03	0.04
query42	0.05	0.03	0.03
query43	0.04	0.04	0.03
Total cold run time: 96.81 s
Total hot run time: 27.34 s

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 92.00% (46/50) 🎉
Increment coverage report
Complete coverage report

@github-actions bot added the "approved" label (Indicates a PR has been approved by one committer.) on Dec 8, 2025
@github-actions
Contributor

github-actions bot commented Dec 8, 2025

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

github-actions bot commented Dec 8, 2025

PR approved by anyone and no changes requested.

@morrySnow merged commit 7db879b into apache:master on Dec 8, 2025
28 of 29 checks passed
@morrySnow deleted the fix_project_pushdown branch on December 8, 2025 02:27
nagisa-kunhah pushed a commit to nagisa-kunhah/doris that referenced this pull request Dec 14, 2025
apache#58765)

