@924060929
Contributor

@924060929 924060929 commented Nov 25, 2025

What problem does this PR solve?

Optimize project push-down: pushing nested-column access below the join lets the scan prune unused sub-columns, which reduces both scan bytes and shuffle bytes. Related to #57204.

the sql:

select coalesce(struct_element(t1.s, 'city'), 'beijing')
from t1 join t2
on t1.id = t2.id

original plan:

Project(coalesce(struct_element(t1.s, 'city'), 'beijing'))
                             |
                    Join(t1.id=t2.id)
                    /               \
            Project(t1.id, t1.s)    Project(t2.id)
                 |                    |
            Scan(t1)                Scan(t2)

optimized plan:


                       Project(coalesce(slot#3, 'beijing'))
                                      |
                               Join(t1.id=t2.id)
                    /                                       \
Project(t1.id, struct_element(t1.s, 'city')#3)              Project(t2.id)
              |                                                |
            Scan(t1)                                       Scan(t2)
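
To make the rewrite above concrete, here is a toy Python model of it. This is only a sketch and not the Doris implementation: expressions are plain tuples, `push_down` and the `slot#3` alias are hypothetical names chosen to mirror the plan diagrams, and it assumes the struct column has no other consumer above the child projection.

```python
def push_down(expr, left_cols, alias="slot#3"):
    """Move struct_element(col, field) from the top projection into the
    child projection that produces col, and reference it by alias above."""
    if isinstance(expr, tuple) and expr[0] == "struct_element":
        _, col, field = expr
        if col in left_cols:
            # The child now emits only the sub-field, not the whole struct.
            # Assumption: nothing else above the child needs the full column.
            new_left = [c for c in left_cols if c != col] + [(alias, expr)]
            return alias, new_left
        return expr, left_cols
    if isinstance(expr, tuple):
        # Recurse into function arguments, threading the child columns through.
        new_args = []
        for arg in expr[1:]:
            arg, left_cols = push_down(arg, left_cols, alias)
            new_args.append(arg)
        return (expr[0], *new_args), left_cols
    # Leaves (column names, literals) are left untouched.
    return expr, left_cols


top = ("coalesce", ("struct_element", "t1.s", "city"), "beijing")
new_top, new_left = push_down(top, ["t1.id", "t1.s"])
print(new_top)   # top projection after the rewrite
print(new_left)  # left child projection after the rewrite
```

With the example query, the top projection ends up referencing only the alias, and the left child carries `t1.id` plus the aliased `struct_element` call, so the full struct `t1.s` never has to cross the join.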

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Nov 25, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@924060929
Contributor Author

run buildall

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 96.97% (32/33) 🎉
Increment coverage report
Complete coverage report

@doris-robot

TPC-H: Total hot run time: 34187 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 7ccf613459c1e6d13cc20be0241d6711f142babd, data reload: false

------ Round 1 ----------------------------------
q1	17613	5038	4904	4904
q2	2068	320	216	216
q3	10226	1295	704	704
q4	10244	875	371	371
q5	7531	2277	2365	2277
q6	184	164	134	134
q7	901	787	645	645
q8	9376	1383	1093	1093
q9	7038	5261	5277	5261
q10	6819	2246	1850	1850
q11	488	305	277	277
q12	328	373	223	223
q13	17800	3640	3069	3069
q14	259	240	222	222
q15	567	515	515	515
q16	1020	1019	980	980
q17	597	877	363	363
q18	7443	7145	7026	7026
q19	1095	965	565	565
q20	363	354	225	225
q21	3775	2586	2300	2300
q22	1073	1027	967	967
Total cold run time: 106808 ms
Total hot run time: 34187 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4975	4956	4979	4956
q2	318	392	322	322
q3	2194	2647	2352	2352
q4	1360	1818	1347	1347
q5	4247	4368	4568	4368
q6	216	178	135	135
q7	2126	1991	1868	1868
q8	2640	2606	2576	2576
q9	7569	7577	7494	7494
q10	3060	3303	2843	2843
q11	586	529	509	509
q12	690	778	630	630
q13	3559	3902	3267	3267
q14	310	309	289	289
q15	555	517	523	517
q16	1124	1127	1120	1120
q17	1170	1625	1392	1392
q18	7914	7889	7613	7613
q19	778	799	985	799
q20	2058	2073	1996	1996
q21	4956	4356	4294	4294
q22	1106	1069	978	978
Total cold run time: 53511 ms
Total hot run time: 51665 ms
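
The columns in the TPC-H tables above are unlabeled. Judging from the numbers alone (not from the tpch-tools scripts), each row appears to be: query, cold run, two hot runs, and the faster of the two hot runs, all in milliseconds; the totals then sum the cold column and the faster hot run. A small Python check of that reading against the Round 1 rows, with `summarize` as a hypothetical helper name:

```python
# Round 1 rows copied from the table: (query, cold_ms, hot1_ms, hot2_ms).
ROUND1 = [
    ("q1", 17613, 5038, 4904), ("q2", 2068, 320, 216),
    ("q3", 10226, 1295, 704), ("q4", 10244, 875, 371),
    ("q5", 7531, 2277, 2365), ("q6", 184, 164, 134),
    ("q7", 901, 787, 645), ("q8", 9376, 1383, 1093),
    ("q9", 7038, 5261, 5277), ("q10", 6819, 2246, 1850),
    ("q11", 488, 305, 277), ("q12", 328, 373, 223),
    ("q13", 17800, 3640, 3069), ("q14", 259, 240, 222),
    ("q15", 567, 515, 515), ("q16", 1020, 1019, 980),
    ("q17", 597, 877, 363), ("q18", 7443, 7145, 7026),
    ("q19", 1095, 965, 565), ("q20", 363, 354, 225),
    ("q21", 3775, 2586, 2300), ("q22", 1073, 1027, 967),
]


def summarize(rows):
    """Return (total_cold_ms, total_hot_ms) under the assumed column meaning."""
    cold = sum(cold_ms for _, cold_ms, _, _ in rows)
    # Assumed: the reported hot time per query is the faster of the two hot runs.
    hot = sum(min(hot1, hot2) for _, _, hot1, hot2 in rows)
    return cold, hot


print(summarize(ROUND1))
```

Under this reading the Round 1 sums match the reported totals of 106808 ms cold and 34187 ms hot.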

@doris-robot

TPC-DS: Total hot run time: 185010 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 7ccf613459c1e6d13cc20be0241d6711f142babd, data reload: false

query1	1061	408	398	398
query2	6562	1601	1606	1601
query3	6769	225	225	225
query4	25709	23029	22366	22366
query5	4650	637	465	465
query6	339	248	231	231
query7	4648	503	297	297
query8	299	266	248	248
query9	8706	2621	2621	2621
query10	547	362	315	315
query11	15299	14820	14590	14590
query12	199	122	113	113
query13	1695	596	451	451
query14	11431	8957	8925	8925
query15	224	206	190	190
query16	7714	704	548	548
query17	1647	815	649	649
query18	2060	440	353	353
query19	243	208	198	198
query20	132	129	133	129
query21	223	141	119	119
query22	4025	3972	3974	3972
query23	33039	32239	32274	32239
query24	8410	2420	2465	2420
query25	611	561	494	494
query26	1237	278	168	168
query27	2695	507	362	362
query28	4327	2168	2144	2144
query29	857	644	546	546
query30	314	249	216	216
query31	887	710	630	630
query32	85	79	76	76
query33	620	449	342	342
query34	772	867	534	534
query35	791	826	747	747
query36	914	973	853	853
query37	123	108	86	86
query38	3350	3360	3295	3295
query39	1592	1412	1390	1390
query40	218	127	121	121
query41	64	61	64	61
query42	130	113	112	112
query43	454	477	445	445
query44	1229	756	768	756
query45	194	208	187	187
query46	888	989	650	650
query47	1667	1782	1633	1633
query48	389	417	323	323
query49	773	495	431	431
query50	672	681	410	410
query51	3993	3911	3943	3911
query52	113	113	104	104
query53	238	264	191	191
query54	307	290	302	290
query55	85	86	86	86
query56	318	336	316	316
query57	1195	1170	1107	1107
query58	296	278	275	275
query59	2441	2569	2448	2448
query60	359	339	338	338
query61	165	195	166	166
query62	787	724	647	647
query63	234	194	199	194
query64	4499	1192	897	897
query65	4045	3974	3979	3974
query66	1095	445	335	335
query67	15546	14980	14919	14919
query68	8588	960	627	627
query69	493	339	305	305
query70	1283	1247	1214	1214
query71	499	337	313	313
query72	5814	4945	4915	4915
query73	717	599	368	368
query74	8877	8517	8782	8517
query75	3957	3352	2796	2796
query76	3768	1139	718	718
query77	824	404	312	312
query78	9568	9757	8927	8927
query79	2172	835	608	608
query80	632	597	507	507
query81	513	275	246	246
query82	464	158	134	134
query83	275	267	255	255
query84	262	115	102	102
query85	941	501	450	450
query86	392	283	304	283
query87	3434	3559	3477	3477
query88	3526	2290	2286	2286
query89	387	331	298	298
query90	1889	230	229	229
query91	178	178	144	144
query92	89	68	64	64
query93	1617	984	663	663
query94	727	443	327	327
query95	507	409	413	409
query96	510	572	293	293
query97	2949	2943	2885	2885
query98	250	221	204	204
query99	1329	1405	1261	1261
Total cold run time: 274631 ms
Total hot run time: 185010 ms

@doris-robot

ClickBench: Total hot run time: 27.99 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 7ccf613459c1e6d13cc20be0241d6711f142babd, data reload: false

query1	0.06	0.06	0.05
query2	0.10	0.06	0.05
query3	0.26	0.09	0.08
query4	1.62	0.11	0.12
query5	0.27	0.27	0.26
query6	1.17	0.65	0.63
query7	0.03	0.03	0.03
query8	0.05	0.04	0.04
query9	0.59	0.53	0.51
query10	0.58	0.57	0.58
query11	0.17	0.11	0.12
query12	0.16	0.13	0.12
query13	0.62	0.61	0.61
query14	1.00	1.01	1.00
query15	0.85	0.84	0.85
query16	0.38	0.39	0.40
query17	1.01	1.02	1.04
query18	0.22	0.20	0.20
query19	1.98	1.76	1.84
query20	0.02	0.01	0.02
query21	15.45	0.21	0.13
query22	4.89	0.07	0.05
query23	15.70	0.27	0.10
query24	2.72	0.92	1.03
query25	0.07	0.07	0.06
query26	0.15	0.13	0.12
query27	0.07	0.06	0.06
query28	4.82	1.17	0.93
query29	12.54	3.87	3.22
query30	0.28	0.15	0.14
query31	2.82	0.59	0.40
query32	3.22	0.55	0.47
query33	3.05	3.05	3.06
query34	15.80	5.17	4.55
query35	4.61	4.55	4.60
query36	0.67	0.51	0.49
query37	0.10	0.08	0.06
query38	0.07	0.05	0.04
query39	0.04	0.03	0.03
query40	0.18	0.15	0.14
query41	0.09	0.04	0.03
query42	0.05	0.03	0.03
query43	0.04	0.03	0.04
Total cold run time: 98.57 s
Total hot run time: 27.99 s

@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved label (Indicates a PR has been approved by one committer.) Nov 26, 2025
@github-actions
Contributor

PR approved by anyone and no changes requested.

@924060929 924060929 merged commit c30c0ff into apache:master Nov 26, 2025
31 of 33 checks passed
@924060929 924060929 deleted the opt_push_project branch November 26, 2025 03:44
nagisa-kunhah pushed a commit to nagisa-kunhah/doris that referenced this pull request Dec 14, 2025
924060929 added a commit that referenced this pull request Dec 23, 2025
(cherry picked from commit c30c0ff)
yiguolei pushed a commit that referenced this pull request Dec 24, 2025
…rning (#59286)

### What problem does this PR solve?

Problem Summary:

### Release note

Cherry-pick #58370 #58354 #59043 #58851 #58485 #58682 #58614 #58373
#57204 #58719 #58471 #58573 #58657

### Check List (For Author)

- Test <!-- At least one of them must be included. -->
    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason?  -->

- Behavior changed:
    - [ ] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [ ] No.
    - [ ] Yes. <!-- Add document PR link here. eg:
apache/doris-website#1214 -->

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR
should merge into -->

---------

Co-authored-by: 924060929 <lanhuajian@selectdb.com>
Co-authored-by: Jerry Hu <mrhhsg@gmail.com>
Co-authored-by: Jerry Hu <hushenggang@selectdb.com>
Co-authored-by: lihangyu <lihangyu@selectdb.com>

Labels

approved (Indicates a PR has been approved by one committer.), dev/4.0.3-merged, reviewed
