[feature](nereids) extend infer predicates #40878

Open
wants to merge 28 commits into
base: master

feiniaofeiafei
Contributor

@feiniaofeiafei feiniaofeiafei commented Sep 14, 2024

This PR refactors the PredicatePropagation module and extends predicate inference:

  1. Supports inferring like, not in, and != predicates.
  2. Supports inferring abs(b)=1 from a=b and abs(a)=1.
  3. Supports transitive inference over non-equality relations; for example, a>b and b>1 yield a>1.
  4. Removes useless predicates.

Remaining work in predicate inference:

  1. Support expressions in inferred predicates, e.g. abs(t1.c1)>abs(t2.c2) and abs(t1.c1)<1.
  2. Add qualifier information to expressions, to determine whether abs(t1.c1) and abs(t2.c2) come from the same table.
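
The inference rules listed above can be illustrated with a minimal Python sketch. It is not the Nereids implementation: the tuple representations, the `{}` column placeholder, and the function names `substitute_equal` and `infer_transitive` are all assumptions made for illustration.

```python
from itertools import product

def substitute_equal(equal_pairs, predicates):
    """Given a=b, copy every single-column predicate on a over to b (and b to a).

    A predicate is (template, column); the template has one {} placeholder
    for the column name. This representation is hypothetical.
    """
    inferred = set()
    for left, right in equal_pairs:
        for src, dst in ((left, right), (right, left)):
            for template, column in predicates:
                if column == src:
                    inferred.add((template, dst))
    return inferred - set(predicates)

def infer_transitive(predicates):
    """Close (left, op, right) predicates with op in {'>', '>='} under transitivity.

    x op1 y and y op2 z give x op z, where op is strict if either input is strict.
    Returns only the newly inferred predicates.
    """
    known = set(predicates)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so that `known` can grow inside the loop.
        for (l1, op1, r1), (l2, op2, r2) in product(list(known), repeat=2):
            if r1 != l2 or l1 == r2:
                continue  # not chainable, or would produce x op x
            derived = (l1, ">" if ">" in (op1, op2) else ">=", r2)
            if derived not in known:
                known.add(derived)
                changed = True
    return known - set(predicates)

# a=b with abs(a)=1, a like 'x%', a != 3
new_preds = substitute_equal(
    [("a", "b")],
    [("abs({}) = 1", "a"), ("{} like 'x%'", "a"), ("{} != 3", "a")],
)
for template, column in sorted(new_preds):
    print(template.format(column))

# a>b and b>1
print(infer_transitive([("a", ">", "b"), ("b", ">", 1)]))
```

Note that != can be carried across an equality (a=b and a!=3 give b!=3) but is not itself transitive, which is why it appears only in the substitution step here.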

TPC-DS 1000 total time:

before PR:
689804 | 554230 | 550065 | 546197
after PR:
670477 | 551991 | 544594 | 542985

No performance degradation was observed.
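
To make the before/after comparison concrete, the relative change per column can be computed as below. The numbers are copied from the description; the PR does not state the unit or what each column represents, so pairing the columns positionally is an assumption.

```python
# Totals copied from the PR description; unit and column meaning are not stated there.
before = [689804, 554230, 550065, 546197]
after = [670477, 551991, 544594, 542985]

def relative_change_percent(old, new):
    """Signed change from old to new, as a percentage of old (negative = faster)."""
    return (new - old) / old * 100.0

# Assumption: the i-th "before" value corresponds to the i-th "after" value.
changes = [relative_change_percent(b, a) for b, a in zip(before, after)]

for b, a, c in zip(before, after, changes):
    print(f"{b} -> {a}: {c:+.2f}%")
```

Every column comes out negative, which is consistent with the statement that no degradation was observed.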

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@feiniaofeiafei
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 41977 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit f1aa1b38d131f953975a4a6ee88080a185771c32, data reload: false

------ Round 1 ----------------------------------
q1	17574	7278	7170	7170
q2	2056	172	152	152
q3	10790	1162	1282	1162
q4	10430	815	850	815
q5	7757	3191	3140	3140
q6	230	151	144	144
q7	1009	624	586	586
q8	9436	1980	2033	1980
q9	6760	6805	6415	6415
q10	7071	2333	2333	2333
q11	430	243	250	243
q12	387	215	211	211
q13	17787	2986	2961	2961
q14	254	210	217	210
q15	557	523	504	504
q16	479	425	418	418
q17	960	906	927	906
q18	7392	6751	6746	6746
q19	1385	1244	1221	1221
q20	591	295	277	277
q21	3951	3470	3393	3393
q22	1119	991	990	990
Total cold run time: 108405 ms
Total hot run time: 41977 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7201	7100	7112	7100
q2	326	230	228	228
q3	3031	3031	3088	3031
q4	2048	2061	1954	1954
q5	5591	5584	5610	5584
q6	238	145	147	145
q7	2219	1777	1812	1777
q8	3303	3367	3399	3367
q9	8766	8938	8794	8794
q10	3520	3546	3539	3539
q11	591	480	495	480
q12	805	596	599	596
q13	10240	3151	3186	3151
q14	317	272	279	272
q15	564	528	518	518
q16	527	478	488	478
q17	1789	1760	1723	1723
q18	8300	7934	8093	7934
q19	1743	1779	1741	1741
q20	2115	1892	1907	1892
q21	5346	5470	5398	5398
q22	1166	1026	1047	1026
Total cold run time: 69746 ms
Total hot run time: 60728 ms

@doris-robot

TPC-DS: Total hot run time: 199707 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit f1aa1b38d131f953975a4a6ee88080a185771c32, data reload: false

query1	1269	876	862	862
query2	6370	1829	1865	1829
query3	10795	3826	3871	3826
query4	54586	25497	23964	23964
query5	5279	517	512	512
query6	448	181	182	181
query7	5758	307	302	302
query8	290	223	218	218
query9	6822	2587	2586	2586
query10	430	298	287	287
query11	16749	15612	15583	15583
query12	156	102	97	97
query13	1518	408	397	397
query14	10382	7439	7215	7215
query15	213	182	178	178
query16	6965	468	477	468
query17	1195	613	594	594
query18	1753	318	317	317
query19	199	154	153	153
query20	124	121	121	121
query21	210	104	104	104
query22	4629	4628	4424	4424
query23	34851	33836	33861	33836
query24	6120	3117	3076	3076
query25	515	400	397	397
query26	622	160	166	160
query27	1586	286	288	286
query28	2841	2088	2062	2062
query29	682	435	431	431
query30	226	153	154	153
query31	961	772	788	772
query32	74	56	57	56
query33	458	305	300	300
query34	883	463	480	463
query35	854	727	716	716
query36	1049	905	916	905
query37	137	84	79	79
query38	4026	3965	3953	3953
query39	1448	1408	1379	1379
query40	203	96	98	96
query41	51	48	48	48
query42	115	97	97	97
query43	484	446	463	446
query44	1163	788	769	769
query45	195	167	165	165
query46	1104	820	867	820
query47	1930	1786	1824	1786
query48	443	347	343	343
query49	693	399	404	399
query50	928	433	432	432
query51	7083	6822	6928	6822
query52	96	84	84	84
query53	243	176	176	176
query54	549	454	446	446
query55	77	74	79	74
query56	269	259	259	259
query57	1207	1103	1103	1103
query58	222	233	265	233
query59	2752	2697	2556	2556
query60	290	265	257	257
query61	102	99	97	97
query62	774	665	677	665
query63	219	186	181	181
query64	1449	653	617	617
query65	3209	3180	3155	3155
query66	686	285	293	285
query67	16126	15482	15528	15482
query68	2021	609	593	593
query69	427	301	299	299
query70	1179	1138	1096	1096
query71	331	269	270	269
query72	5962	4041	4023	4023
query73	761	335	335	335
query74	9429	9043	9116	9043
query75	3451	2713	2806	2713
query76	1387	1301	1296	1296
query77	562	297	307	297
query78	10037	9567	10052	9567
query79	913	879	852	852
query80	785	567	575	567
query81	511	240	237	237
query82	427	195	201	195
query83	225	156	152	152
query84	294	108	104	104
query85	731	362	360	360
query86	327	312	322	312
query87	4389	4536	4548	4536
query88	4101	4026	4014	4014
query89	369	360	352	352
query90	1830	304	293	293
query91	157	161	153	153
query92	74	76	72	72
query93	1039	987	989	987
query94	707	338	398	338
query95	471	413	432	413
query96	469	469	463	463
query97	3107	3107	3135	3107
query98	225	234	228	228
query99	1724	1306	1329	1306
Total cold run time: 294557 ms
Total hot run time: 199707 ms

@doris-robot

ClickBench: Total hot run time: 31.11 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit f1aa1b38d131f953975a4a6ee88080a185771c32, data reload: false

query1	0.04	0.05	0.04
query2	0.06	0.03	0.02
query3	0.22	0.06	0.06
query4	1.66	0.11	0.10
query5	0.51	0.52	0.51
query6	1.15	0.73	0.73
query7	0.01	0.01	0.02
query8	0.04	0.03	0.02
query9	0.57	0.51	0.50
query10	0.57	0.58	0.57
query11	0.13	0.10	0.10
query12	0.13	0.10	0.10
query13	0.60	0.59	0.59
query14	1.42	1.48	1.43
query15	0.89	0.86	0.85
query16	0.37	0.38	0.37
query17	1.08	1.08	1.06
query18	0.18	0.18	0.18
query19	1.88	1.82	1.79
query20	0.01	0.02	0.01
query21	15.38	0.58	0.57
query22	3.25	4.07	2.69
query23	17.71	1.06	0.96
query24	2.34	0.57	0.21
query25	0.32	0.11	0.04
query26	0.16	0.13	0.14
query27	0.04	0.04	0.03
query28	11.99	1.02	0.98
query29	12.56	3.31	3.23
query30	0.24	0.06	0.06
query31	2.89	0.37	0.38
query32	3.28	0.46	0.46
query33	2.97	3.00	3.02
query34	15.43	4.27	4.27
query35	4.34	4.31	4.30
query36	0.70	0.49	0.47
query37	0.08	0.06	0.06
query38	0.05	0.03	0.04
query39	0.03	0.02	0.02
query40	0.16	0.12	0.13
query41	0.07	0.02	0.02
query42	0.03	0.03	0.02
query43	0.03	0.03	0.03
Total cold run time: 105.57 s
Total hot run time: 31.11 s

@feiniaofeiafei
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 41685 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 5903fb1a841a4eea91883305bf0897121429c683, data reload: false

------ Round 1 ----------------------------------
q1	17928	8532	7360	7360
q2	2814	187	194	187
q3	11735	1162	1175	1162
q4	10390	797	726	726
q5	7770	3115	3121	3115
q6	236	148	145	145
q7	1019	640	623	623
q8	9425	2011	2133	2011
q9	6850	6377	6397	6377
q10	7002	2266	2273	2266
q11	429	251	251	251
q12	405	209	211	209
q13	17788	2994	2989	2989
q14	239	217	223	217
q15	573	541	522	522
q16	702	625	625	625
q17	958	822	782	782
q18	7151	6814	6707	6707
q19	1406	1053	1020	1020
q20	602	285	273	273
q21	3964	3116	3180	3116
q22	1119	1002	1016	1002
Total cold run time: 110505 ms
Total hot run time: 41685 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7213	7248	7249	7248
q2	336	232	219	219
q3	2837	2777	2781	2777
q4	1914	1696	1670	1670
q5	5373	5364	5367	5364
q6	225	143	136	136
q7	2056	1725	1700	1700
q8	3167	3300	3316	3300
q9	8377	8394	8386	8386
q10	3369	3348	3360	3348
q11	568	467	468	467
q12	798	588	575	575
q13	4988	2992	2984	2984
q14	311	278	264	264
q15	572	516	514	514
q16	697	675	675	675
q17	1751	1550	1549	1549
q18	7858	7425	7473	7425
q19	1656	1555	1433	1433
q20	2015	1805	1818	1805
q21	5364	5154	5096	5096
q22	1131	1030	1024	1024
Total cold run time: 62576 ms
Total hot run time: 57959 ms

@doris-robot

TPC-DS: Total hot run time: 195312 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 5903fb1a841a4eea91883305bf0897121429c683, data reload: false

query1	951	385	374	374
query2	6535	2073	2020	2020
query3	6734	214	222	214
query4	34345	23380	23448	23380
query5	4268	477	480	477
query6	265	175	174	174
query7	4618	311	305	305
query8	306	231	221	221
query9	9890	2681	2669	2669
query10	471	283	292	283
query11	18227	15285	15085	15085
query12	164	102	105	102
query13	1623	417	418	417
query14	10913	8196	7503	7503
query15	325	166	172	166
query16	7697	439	477	439
query17	1754	572	555	555
query18	1574	296	299	296
query19	359	140	148	140
query20	116	108	109	108
query21	215	103	101	101
query22	4471	4234	4204	4204
query23	34791	34085	34067	34067
query24	11227	2969	2924	2924
query25	604	380	387	380
query26	1052	155	158	155
query27	2349	277	283	277
query28	7833	2458	2453	2453
query29	715	422	427	422
query30	328	158	153	153
query31	1007	781	792	781
query32	100	55	58	55
query33	755	294	285	285
query34	983	475	486	475
query35	860	734	719	719
query36	1079	890	927	890
query37	146	81	84	81
query38	4090	3856	3889	3856
query39	1466	1409	1384	1384
query40	204	93	95	93
query41	50	47	80	47
query42	118	99	95	95
query43	520	476	488	476
query44	1341	822	806	806
query45	194	162	165	162
query46	1135	732	778	732
query47	1867	1778	1806	1778
query48	461	363	372	363
query49	1139	418	405	405
query50	817	405	422	405
query51	7087	7047	7045	7045
query52	98	87	85	85
query53	253	185	183	183
query54	1293	464	482	464
query55	83	76	78	76
query56	283	249	280	249
query57	1184	1098	1095	1095
query58	247	232	238	232
query59	3129	2993	2904	2904
query60	300	273	273	273
query61	106	100	101	100
query62	835	658	660	658
query63	218	192	193	192
query64	4125	631	636	631
query65	3230	3171	3179	3171
query66	822	314	331	314
query67	16032	15602	15423	15423
query68	3058	866	854	854
query69	486	350	352	350
query70	1207	1188	1168	1168
query71	342	344	339	339
query72	6112	3568	3578	3568
query73	590	581	592	581
query74	9298	9045	9029	9029
query75	3123	2858	2861	2858
query76	1888	882	880	880
query77	452	378	372	372
query78	9519	9233	9258	9233
query79	934	894	879	879
query80	654	602	602	602
query81	472	257	260	257
query82	232	230	241	230
query83	170	191	165	165
query84	247	118	106	106
query85	738	428	419	419
query86	316	316	324	316
query87	4442	4343	4519	4343
query88	4336	4066	4043	4043
query89	375	365	354	354
query90	1487	315	321	315
query91	163	164	194	164
query92	82	75	71	71
query93	897	883	884	883
query94	576	388	381	381
query95	430	408	405	405
query96	482	483	489	483
query97	3148	3111	3137	3111
query98	230	222	216	216
query99	1455	1320	1290	1290
Total cold run time: 290318 ms
Total hot run time: 195312 ms

@doris-robot

ClickBench: Total hot run time: 32.44 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 5903fb1a841a4eea91883305bf0897121429c683, data reload: false

query1	0.05	0.05	0.04
query2	0.06	0.03	0.03
query3	0.23	0.06	0.06
query4	1.65	0.10	0.10
query5	0.52	0.49	0.52
query6	1.13	0.73	0.73
query7	0.02	0.01	0.01
query8	0.04	0.03	0.04
query9	0.56	0.50	0.50
query10	0.57	0.58	0.54
query11	0.14	0.10	0.10
query12	0.14	0.11	0.11
query13	0.60	0.59	0.59
query14	3.09	2.97	3.02
query15	0.90	0.82	0.81
query16	0.38	0.38	0.39
query17	1.02	1.01	1.02
query18	0.22	0.20	0.21
query19	1.88	1.96	1.87
query20	0.00	0.01	0.01
query21	15.36	0.61	0.60
query22	2.28	3.86	1.44
query23	17.07	0.90	0.79
query24	2.49	1.06	0.99
query25	0.24	0.19	0.10
query26	0.47	0.14	0.14
query27	0.03	0.04	0.03
query28	10.92	1.10	1.07
query29	12.55	3.27	3.25
query30	0.25	0.06	0.06
query31	2.88	0.39	0.38
query32	3.27	0.48	0.46
query33	2.97	2.99	3.04
query34	17.01	4.42	4.40
query35	4.43	4.44	4.39
query36	0.65	0.47	0.47
query37	0.09	0.06	0.06
query38	0.04	0.04	0.03
query39	0.03	0.02	0.02
query40	0.15	0.12	0.12
query41	0.07	0.02	0.02
query42	0.03	0.02	0.02
query43	0.03	0.03	0.03
Total cold run time: 106.51 s
Total hot run time: 32.44 s

@feiniaofeiafei
Contributor Author

run buildall

@feiniaofeiafei
Contributor Author

run buildall

@feiniaofeiafei
Contributor Author

run buildall

@feiniaofeiafei changed the title from "[feature](nereids)add unequal predicates infer" to "[feature](nereids) extend infer predicates" on Sep 20, 2024
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
suite("extend_infer_equal_predicate") {
Contributor

  1. Add unit tests for non-inner join cases.
  2. Add unit tests that contain subqueries, e.g. combined with constant propagation.
  3. Add unit tests where outer-to-inner join conversion may happen.

Contributor

Case: select * from t1 where t1.c1 in (select t2.c2 from t2, t3 where t1.c2 = t2.c3 and t2.c3 = t3.c1) and t1.c2 = 1;
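
As a rough illustration of what a test for this case could check, the sketch below groups the equated columns with a union-find structure and propagates the constant from t1.c2 = 1 to the other members of its equal set. The class `EqualSets` and the function `propagate_constants` are hypothetical names, and whether Nereids actually infers these predicates across the subquery boundary is exactly what such a test would have to establish.

```python
class EqualSets:
    """Union-find over column names (a generic structure, not the Nereids class)."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            # Path halving: point x at its grandparent while walking up.
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra

    def members(self, x):
        root = self.find(x)
        return sorted(m for m in list(self.parent) if self.find(m) == root)


def propagate_constants(equalities, constants):
    """Return column -> literal for columns that inherit a literal via equality."""
    sets = EqualSets()
    for a, b in equalities:
        sets.union(a, b)
    inferred = {}
    for column, value in constants.items():
        for other in sets.members(column):
            if other != column and other not in constants:
                inferred[other] = value
    return inferred


# Predicates taken from the review comment's query.
equalities = [("t1.c2", "t2.c3"), ("t2.c3", "t3.c1")]
constants = {"t1.c2": 1}
result = propagate_constants(equalities, constants)
print(result)  # {'t2.c3': 1, 't3.c1': 1}
```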

@feiniaofeiafei
Contributor Author

run buildall

@feiniaofeiafei force-pushed the non_equal_predicate_infer branch 2 times, most recently from 6a35e96 to 895d95b on September 23, 2024 13:06
@feiniaofeiafei
Contributor Author

run buildall

@feiniaofeiafei
Contributor Author

run buildall

@feiniaofeiafei
Contributor Author

run buildall

@feiniaofeiafei
Contributor Author

run buildall

@feiniaofeiafei
Contributor Author

run buildall

@feiniaofeiafei
Contributor Author

run buildall

[Feature](nereids) extend derivation of equivalence predicate

[Feature](nereids) extend derivation of equivalence predicate

literal should not be targetExpr and produce new predicate

delete useless code

delete useless code

[Feature](nereids) extend derivation of equivalence predicate

fix a=1 b=1 a=2 deduce to a=b

[Feature](nereids) extend derivation of equivalence predicate

[Feature](nereids) extend derivation of equivalence predicate

[Feature](nereids) extend derivation of equivalence predicate

[Feature](nereids) extend derivation of equivalence predicate

add feut

only support slot equalTo infer, remove the expr equalTo infer support

fix feut

change the logic of pull up predicate from join, and support a=1 b=1 infer a=b

not infer same table predicates

revert 'change the logic of pull up predicate from join, and support a=1 b=1 infer a=b'

sort the equal set, and infer a=b a=c to b=c in order
…ualTo in unequal infer and change pull up join predicates
… decimalv3, and add feut for unequal predicate infer
…literal in infer, add try catch in generate fitlers
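
One of the commits above mentions sorting the equal set so that a=b and a=c yield b=c in a fixed order. A minimal sketch of that idea follows; the function name `infer_in_order` and the use of lexical ordering on column names are assumptions, since the actual comparator used in the PR is not shown here.

```python
from itertools import combinations

def infer_in_order(equal_set, given_pairs):
    """Enumerate equalities implied by one equal set, in a deterministic order.

    equal_set:   columns known to be mutually equal
    given_pairs: equalities already present in the query (any operand order)
    Returns the pairs that are implied but not already given.
    """
    # Normalise given pairs so (c, a) and (a, c) compare equal.
    given = {tuple(sorted(pair)) for pair in given_pairs}
    # Sorting first makes the output independent of set iteration order.
    return [pair for pair in combinations(sorted(equal_set), 2) if pair not in given]

# a=b and a=c are given; b=c is the only new equality.
print(infer_in_order({"c", "a", "b"}, [("a", "b"), ("c", "a")]))  # [('b', 'c')]
```

Sorting before enumerating pairs keeps the generated predicates stable across runs, which tends to matter for plan-shape tests that compare expected output textually.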
@feiniaofeiafei
Contributor Author

run buildall

@feiniaofeiafei
Contributor Author

run buildall


3 participants