@englefly
Contributor

What problem does this PR solve?

The following rules apply only to the pattern:
topn->outerJoin
If they are used only as RBO rules, we miss the opportunity to optimize the plan when the initial plan pattern is topn->innerJoin.
To take advantage of join reorder, they are moved to CBO rules, so that when a bottom outer join is reordered to become the root of the join cluster, these rules can be applied.

PushDownTopNThroughJoin
PushDownLimitDistinctThroughJoin
PushDownTopNDistinctThroughJoin
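
Why these rules match only topn->outerJoin can be illustrated with a minimal, self-contained sketch. It is not Doris code: the class `TopNPushDownSketch`, its methods, and the row layout are hypothetical, and it assumes the sort key comes from the left (preserved) side of the join and has no ties across left rows. Under those assumptions, pushing a copy of the TopN below a left outer join keeps the result unchanged, while doing the same below an inner join can drop rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical toy model, not the Nereids implementation.
// Rows are integer lists: left = [id, sortKey], right = [leftId, payload],
// joined = [id, sortKey, payload] (payload is null for unmatched left rows).
final class TopNPushDownSketch {

    // Nested-loop join on left.id = right.leftId.
    // outer == true behaves like LEFT OUTER JOIN, outer == false like INNER JOIN.
    static List<List<Integer>> join(List<List<Integer>> left,
                                    List<List<Integer>> right,
                                    boolean outer) {
        List<List<Integer>> out = new ArrayList<>();
        for (List<Integer> l : left) {
            boolean matched = false;
            for (List<Integer> r : right) {
                if (r.get(0).equals(l.get(0))) {
                    out.add(Arrays.asList(l.get(0), l.get(1), r.get(1)));
                    matched = true;
                }
            }
            if (!matched && outer) {
                out.add(Arrays.asList(l.get(0), l.get(1), null));
            }
        }
        return out;
    }

    // ORDER BY sortKey (column 1) LIMIT n, using a stable sort.
    static List<List<Integer>> topN(List<List<Integer>> rows, int n) {
        return rows.stream()
                .sorted(Comparator.comparing((List<Integer> row) -> row.get(1)))
                .limit(n)
                .collect(Collectors.toList());
    }

    // Plan before the rewrite: topN -> join.
    static List<List<Integer>> original(List<List<Integer>> left,
                                        List<List<Integer>> right,
                                        int n, boolean outer) {
        return topN(join(left, right, outer), n);
    }

    // Plan after the rewrite: topN -> join(topN(left), right).
    static List<List<Integer>> pushed(List<List<Integer>> left,
                                      List<List<Integer>> right,
                                      int n, boolean outer) {
        return topN(join(topN(left, n), right, outer), n);
    }

    public static void main(String[] args) {
        List<List<Integer>> left = Arrays.asList(
                Arrays.asList(1, 10), Arrays.asList(2, 20),
                Arrays.asList(3, 30), Arrays.asList(4, 40));
        // Only left ids 3 and 4 have a match on the right side.
        List<List<Integer>> right = Arrays.asList(
                Arrays.asList(3, 100), Arrays.asList(4, 200));

        System.out.println("outer join, same result: "
                + original(left, right, 2, true).equals(pushed(left, right, 2, true)));
        System.out.println("inner join, same result: "
                + original(left, right, 2, false).equals(pushed(left, right, 2, false)));
    }
}
```

With this sample data the outer-join comparison prints `true` and the inner-join comparison prints `false`: the two smallest left rows have no match, so the pushed-down inner-join plan returns nothing. This is the reason a plan that starts as topn->innerJoin can benefit only after join reorder places an outer join directly under the TopN.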

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@englefly
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 32679 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 989d8c8efc27ed944f164daf32b98e1475cc2091, data reload: false

------ Round 1 ----------------------------------
q1	17579	6193	6046	6046
q2	2049	303	167	167
q3	10455	1243	759	759
q4	10308	863	437	437
q5	9122	2202	1967	1967
q6	217	178	145	145
q7	892	763	588	588
q8	9228	1390	1191	1191
q9	5299	4834	4932	4834
q10	6739	2276	1871	1871
q11	473	287	262	262
q12	345	359	215	215
q13	18292	3618	3062	3062
q14	249	251	216	216
q15	545	504	509	504
q16	644	614	592	592
q17	591	848	340	340
q18	6810	6489	6332	6332
q19	2181	953	561	561
q20	315	334	201	201
q21	2968	2216	2080	2080
q22	371	345	309	309
Total cold run time: 105672 ms
Total hot run time: 32679 ms

----- Round 2, with runtime_filter_mode=off -----
q1	6337	6255	6278	6255
q2	234	325	249	249
q3	2243	2638	2289	2289
q4	1496	1852	1461	1461
q5	4303	4756	4847	4756
q6	185	172	142	142
q7	2051	1964	1840	1840
q8	2662	2860	2710	2710
q9	7222	7230	7257	7230
q10	3062	3249	2751	2751
q11	587	541	514	514
q12	731	755	658	658
q13	3581	3816	3325	3325
q14	294	308	289	289
q15	578	519	507	507
q16	649	683	652	652
q17	1205	1747	1264	1264
q18	7835	7444	7336	7336
q19	859	1164	1088	1088
q20	1995	2002	1905	1905
q21	5838	5147	5209	5147
q22	619	646	601	601
Total cold run time: 54566 ms
Total hot run time: 52969 ms

@doris-robot

TPC-DS: Total hot run time: 195740 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 989d8c8efc27ed944f164daf32b98e1475cc2091, data reload: false

query1	1305	976	938	938
query2	6395	2331	2362	2331
query3	11083	4758	4779	4758
query4	32535	23631	23281	23281
query5	3499	595	449	449
query6	271	188	177	177
query7	3972	489	304	304
query8	289	247	240	240
query9	9212	2744	2747	2744
query10	481	317	256	256
query11	17708	15448	14999	14999
query12	167	106	108	106
query13	1549	518	391	391
query14	9764	7289	7585	7289
query15	270	218	192	192
query16	7420	622	438	438
query17	1567	778	626	626
query18	2018	416	317	317
query19	207	191	194	191
query20	126	117	113	113
query21	203	126	104	104
query22	4486	4490	4403	4403
query23	34036	33811	33494	33494
query24	6449	2350	2366	2350
query25	508	463	404	404
query26	857	286	162	162
query27	2037	480	351	351
query28	5664	2494	2491	2491
query29	633	562	424	424
query30	214	209	164	164
query31	950	875	821	821
query32	91	58	55	55
query33	500	343	298	298
query34	794	864	525	525
query35	772	821	738	738
query36	1022	1051	999	999
query37	132	100	79	79
query38	4083	4336	4090	4090
query39	1509	1480	1430	1430
query40	207	127	101	101
query41	55	49	47	47
query42	127	106	106	106
query43	530	542	510	510
query44	1398	843	846	843
query45	183	178	170	170
query46	876	1053	664	664
query47	1864	1882	1842	1842
query48	392	415	320	320
query49	729	486	389	389
query50	675	673	399	399
query51	7044	7024	6951	6951
query52	101	100	89	89
query53	236	268	187	187
query54	493	512	430	430
query55	89	90	86	86
query56	248	251	248	248
query57	1266	1192	1180	1180
query58	250	235	250	235
query59	3163	3428	3143	3143
query60	269	263	257	257
query61	138	115	112	112
query62	820	792	709	709
query63	225	192	190	190
query64	3734	1035	671	671
query65	3333	3248	3195	3195
query66	898	412	302	302
query67	16398	15654	15413	15413
query68	9296	709	523	523
query69	475	285	253	253
query70	1214	1138	1094	1094
query71	428	351	246	246
query72	6463	3903	3813	3813
query73	672	753	369	369
query74	9876	9105	8921	8921
query75	3927	3157	2682	2682
query76	3586	1175	767	767
query77	761	376	363	363
query78	10036	10008	9323	9323
query79	3265	823	593	593
query80	752	516	432	432
query81	479	287	238	238
query82	600	149	132	132
query83	186	182	153	153
query84	283	89	77	77
query85	838	342	297	297
query86	361	307	282	282
query87	4491	4323	4436	4323
query88	3352	2229	2178	2178
query89	418	321	285	285
query90	1849	192	191	191
query91	134	148	110	110
query92	62	54	54	54
query93	1847	866	528	528
query94	652	383	291	291
query95	336	261	259	259
query96	501	612	277	277
query97	2818	2947	2835	2835
query98	214	203	193	193
query99	1639	1478	1347	1347
Total cold run time: 292330 ms
Total hot run time: 195740 ms

@doris-robot

ClickBench: Total hot run time: 31.34 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 989d8c8efc27ed944f164daf32b98e1475cc2091, data reload: false

query1	0.03	0.03	0.03
query2	0.08	0.03	0.03
query3	0.24	0.07	0.07
query4	1.62	0.11	0.12
query5	0.41	0.43	0.43
query6	1.14	0.65	0.64
query7	0.02	0.02	0.02
query8	0.04	0.03	0.03
query9	0.59	0.50	0.50
query10	0.56	0.57	0.54
query11	0.14	0.10	0.10
query12	0.15	0.11	0.10
query13	0.60	0.60	0.60
query14	2.74	2.87	2.75
query15	0.89	0.83	0.81
query16	0.38	0.38	0.38
query17	1.01	1.07	1.06
query18	0.22	0.21	0.20
query19	1.95	1.82	2.00
query20	0.01	0.01	0.01
query21	15.36	0.93	0.61
query22	0.75	0.93	0.68
query23	15.13	1.46	0.53
query24	3.04	1.10	0.98
query25	0.12	0.17	0.12
query26	0.40	0.15	0.13
query27	0.06	0.05	0.04
query28	13.63	1.61	1.05
query29	12.58	3.95	3.27
query30	0.26	0.10	0.06
query31	2.80	0.60	0.38
query32	3.23	0.55	0.45
query33	3.13	3.06	3.04
query34	16.67	5.11	4.48
query35	4.54	4.56	4.50
query36	0.64	0.49	0.48
query37	0.10	0.06	0.06
query38	0.04	0.04	0.03
query39	0.03	0.03	0.02
query40	0.17	0.13	0.13
query41	0.08	0.03	0.03
query42	0.04	0.02	0.03
query43	0.04	0.03	0.02
Total cold run time: 105.66 s
Total hot run time: 31.34 s

@wm1581066 requested a review from morrySnow January 13, 2025 02:07
@wm1581066 added the usercase label (Important user case type label) Jan 13, 2025
@englefly force-pushed the rewrite-rule-to-cbo branch from 989d8c8 to 41d9205 January 13, 2025 03:47
@englefly
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 32791 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 41d9205099a69381261b5b6b031774d0d13402b8, data reload: false

------ Round 1 ----------------------------------
q1	17585	6169	6029	6029
q2	2054	306	183	183
q3	10409	1235	752	752
q4	10227	870	435	435
q5	8175	2247	2020	2020
q6	210	178	153	153
q7	892	761	602	602
q8	9237	1363	1202	1202
q9	5159	4962	4980	4962
q10	6856	2305	1841	1841
q11	486	284	252	252
q12	348	369	223	223
q13	17765	3619	3045	3045
q14	237	228	203	203
q15	568	509	497	497
q16	625	603	583	583
q17	587	867	325	325
q18	6827	6527	6428	6428
q19	2296	951	554	554
q20	311	328	192	192
q21	2860	2241	2013	2013
q22	363	344	297	297
Total cold run time: 104077 ms
Total hot run time: 32791 ms

----- Round 2, with runtime_filter_mode=off -----
q1	6360	6441	6225	6225
q2	241	330	233	233
q3	2250	2678	2296	2296
q4	1422	1800	1398	1398
q5	4384	4708	4872	4708
q6	187	176	140	140
q7	2032	1959	1783	1783
q8	2648	2868	2719	2719
q9	7262	7334	7304	7304
q10	3070	3317	2737	2737
q11	597	547	508	508
q12	663	799	619	619
q13	3486	3844	3246	3246
q14	287	326	277	277
q15	562	526	513	513
q16	670	700	642	642
q17	1228	1753	1264	1264
q18	7707	7400	7415	7400
q19	832	1187	1081	1081
q20	1921	2048	1869	1869
q21	5597	4979	5106	4979
q22	588	617	636	617
Total cold run time: 53994 ms
Total hot run time: 52558 ms

@doris-robot

TPC-DS: Total hot run time: 193287 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 41d9205099a69381261b5b6b031774d0d13402b8, data reload: false

query1	1296	954	947	947
query2	6325	2355	2345	2345
query3	11087	4824	4569	4569
query4	32926	23894	23211	23211
query5	4253	604	454	454
query6	286	203	184	184
query7	3976	488	307	307
query8	296	238	226	226
query9	9393	2682	2666	2666
query10	478	305	253	253
query11	17910	15018	15205	15018
query12	167	105	105	105
query13	1641	503	386	386
query14	11662	6420	7216	6420
query15	259	191	177	177
query16	7933	634	503	503
query17	1557	770	558	558
query18	2128	421	317	317
query19	216	178	158	158
query20	130	119	114	114
query21	204	120	106	106
query22	4674	4524	4400	4400
query23	35037	33274	33324	33274
query24	6075	2254	2407	2254
query25	472	489	416	416
query26	709	253	172	172
query27	2007	475	356	356
query28	5170	2440	2436	2436
query29	532	525	424	424
query30	213	198	154	154
query31	949	856	808	808
query32	74	61	60	60
query33	473	358	281	281
query34	758	856	489	489
query35	816	806	721	721
query36	1044	1021	976	976
query37	134	94	74	74
query38	3976	4316	4046	4046
query39	1507	1458	1439	1439
query40	197	113	99	99
query41	55	48	50	48
query42	125	99	101	99
query43	512	544	497	497
query44	1344	819	823	819
query45	176	188	163	163
query46	873	1044	642	642
query47	1915	1953	1828	1828
query48	393	408	324	324
query49	725	482	390	390
query50	619	660	383	383
query51	7106	7099	6945	6945
query52	102	97	94	94
query53	225	256	187	187
query54	476	502	403	403
query55	80	74	79	74
query56	253	248	251	248
query57	1220	1176	1146	1146
query58	252	245	229	229
query59	3165	3291	3047	3047
query60	282	278	259	259
query61	117	115	124	115
query62	856	820	755	755
query63	242	193	188	188
query64	3013	1021	654	654
query65	3312	3180	3262	3180
query66	775	409	304	304
query67	16331	15703	15386	15386
query68	7669	687	506	506
query69	485	285	253	253
query70	1201	1118	1117	1117
query71	407	279	241	241
query72	6459	3915	3818	3818
query73	633	738	357	357
query74	10401	9069	8767	8767
query75	3588	3203	2652	2652
query76	3288	1164	789	789
query77	649	412	286	286
query78	9973	9816	9379	9379
query79	3712	798	576	576
query80	609	512	436	436
query81	494	269	233	233
query82	671	150	115	115
query83	165	172	149	149
query84	239	84	73	73
query85	732	337	294	294
query86	390	303	292	292
query87	4460	4305	4216	4216
query88	5034	2124	2103	2103
query89	413	317	285	285
query90	1796	187	184	184
query91	133	140	105	105
query92	71	58	56	56
query93	2380	855	525	525
query94	675	400	305	305
query95	342	271	271	271
query96	480	609	281	281
query97	2831	2879	2765	2765
query98	235	201	198	198
query99	1439	1469	1350	1350
Total cold run time: 295538 ms
Total hot run time: 193287 ms

@doris-robot

ClickBench: Total hot run time: 31.41 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 41d9205099a69381261b5b6b031774d0d13402b8, data reload: false

query1	0.04	0.03	0.03
query2	0.08	0.03	0.04
query3	0.24	0.06	0.07
query4	1.62	0.10	0.10
query5	0.42	0.42	0.42
query6	1.16	0.65	0.65
query7	0.03	0.02	0.02
query8	0.04	0.03	0.02
query9	0.58	0.50	0.50
query10	0.54	0.55	0.53
query11	0.14	0.10	0.11
query12	0.14	0.11	0.10
query13	0.60	0.60	0.60
query14	2.75	2.76	2.87
query15	0.88	0.81	0.81
query16	0.39	0.39	0.37
query17	0.98	1.03	1.02
query18	0.23	0.21	0.20
query19	1.86	1.87	1.99
query20	0.01	0.01	0.00
query21	15.36	1.01	0.59
query22	0.76	0.81	0.67
query23	15.29	1.44	0.58
query24	2.85	1.11	1.78
query25	0.25	0.13	0.06
query26	0.19	0.13	0.14
query27	0.07	0.05	0.05
query28	14.25	1.50	1.04
query29	12.55	3.93	3.24
query30	0.24	0.10	0.06
query31	2.83	0.57	0.39
query32	3.22	0.54	0.45
query33	3.11	3.08	3.02
query34	16.99	5.11	4.53
query35	4.47	4.51	4.54
query36	0.62	0.51	0.47
query37	0.09	0.06	0.06
query38	0.05	0.03	0.03
query39	0.03	0.03	0.03
query40	0.16	0.14	0.13
query41	0.08	0.03	0.02
query42	0.03	0.02	0.02
query43	0.03	0.04	0.03
Total cold run time: 106.25 s
Total hot run time: 31.41 s

@englefly
Contributor Author

run p0

.add(PushDownProjectThroughInnerOuterJoin.INSTANCE)
.add(PushDownProjectThroughSemiJoin.INSTANCE)
.add(TransposeAggSemiJoinProject.INSTANCE)
.addAll(new PushDownTopNThroughJoin().buildRules())
Contributor

  1. It would be better to add a regression case to track this.
  2. Apply RQG-like random plan testing.

@englefly changed the title from "[opt](nereids) move some topn-join rules from rbo to cbo" to "[opt](nereids) some topn-join rules are both used in rbo and cbo" Jan 14, 2025
@github-actions bot added the approved label (Indicates a PR has been approved by one committer.) Jan 15, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@englefly merged commit 9c3b634 into apache:master Jan 15, 2025
28 of 31 checks passed
@englefly deleted the rewrite-rule-to-cbo branch January 15, 2025 10:02
lzyy2024 pushed a commit to lzyy2024/doris that referenced this pull request Feb 21, 2025
…che#46773)

### What problem does this PR solve?
The following rules apply only to the pattern:
topn->outerJoin
If they are used only as RBO rules, we miss the opportunity to optimize the
plan when the initial plan pattern is topn->innerJoin.
To take advantage of join reorder, they are moved to CBO rules, so that when
a bottom outer join is reordered to become the root of the join cluster,
these rules can be applied.

PushDownTopNThroughJoin
PushDownLimitDistinctThroughJoin
PushDownTopNDistinctThroughJoin
@morrySnow added the need more test label (Add more test) Feb 28, 2025

Labels

  • approved: Indicates a PR has been approved by one committer.
  • need more test: Add more test
  • reviewed
  • usercase: Important user case type label

8 participants