Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[nereids](mtmv)Support rewrite by mv nested materialized view #33362

Merged
merged 10 commits into from
Apr 20, 2024

Conversation

seawinde
Copy link
Contributor

@seawinde seawinde commented Apr 8, 2024

Proposed changes

Support query rewritting by nested materialized view.
Such as inner_mv def is as following

        select
        l_linenumber,
        o_custkey,
        o_orderkey,
        o_orderstatus,
        l_partkey,
        l_suppkey,
        l_orderkey
        from lineitem
        inner join orders on lineitem.l_orderkey = orders.o_orderkey;

the mv1_0 def is as following:

        select
        l_linenumber,
        o_custkey,
        o_orderkey,
        o_orderstatus,
        l_partkey,
        l_suppkey,
        l_orderkey,
        ps_availqty
        from inner_mv
        inner join partsupp on l_partkey = ps_partkey AND l_suppkey = ps_suppkey;

for the following query, both inner_mv and mv1_0 can be successful when query rewritting by materialized view,and cbo will chose mv1_0 finally.

       select lineitem.l_linenumber
        from lineitem
        inner join orders on l_orderkey = o_orderkey
        inner join partsupp on  l_partkey = ps_partkey AND l_suppkey = ps_suppkey
        where o_orderstatus = 'o' AND l_linenumber in (1, 2, 3, 4, 5)

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@doris-robot
Copy link

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@seawinde
Copy link
Contributor Author

seawinde commented Apr 8, 2024

run buildall

@doris-robot
Copy link

TPC-H: Total hot run time: 38821 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit dee85bfb26c14175919cb287d24f19ad36be8357, data reload: false

------ Round 1 ----------------------------------
q1	17602	4302	4233	4233
q2	2022	189	182	182
q3	10727	1173	1194	1173
q4	10502	869	753	753
q5	7701	2783	2743	2743
q6	221	136	134	134
q7	1068	615	617	615
q8	9334	2085	2079	2079
q9	8010	6736	6641	6641
q10	9040	3527	3529	3527
q11	462	233	228	228
q12	442	208	204	204
q13	18855	2892	2953	2892
q14	267	238	244	238
q15	515	465	462	462
q16	498	399	381	381
q17	946	719	745	719
q18	7454	6744	6676	6676
q19	1690	1518	1497	1497
q20	683	324	309	309
q21	3385	2828	2930	2828
q22	350	307	325	307
Total cold run time: 111774 ms
Total hot run time: 38821 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4259	4199	4208	4199
q2	363	265	263	263
q3	2973	2772	2756	2756
q4	1855	1601	1585	1585
q5	5252	5298	5273	5273
q6	207	122	121	121
q7	2279	1837	1880	1837
q8	3196	3337	3332	3332
q9	8635	8582	8515	8515
q10	3900	3719	3683	3683
q11	581	478	488	478
q12	751	573	574	573
q13	15341	2903	2956	2903
q14	307	266	271	266
q15	498	466	473	466
q16	466	428	429	428
q17	1754	1468	1456	1456
q18	7546	7504	7424	7424
q19	1621	1539	1490	1490
q20	1938	1755	1714	1714
q21	5026	4697	4886	4697
q22	535	460	492	460
Total cold run time: 69283 ms
Total hot run time: 53919 ms

@doris-robot
Copy link

TPC-DS: Total hot run time: 181859 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit dee85bfb26c14175919cb287d24f19ad36be8357, data reload: false

query1	883	1121	1123	1121
query2	7216	1959	1959	1959
query3	6656	204	206	204
query4	23581	21218	21234	21218
query5	4129	389	387	387
query6	269	182	184	182
query7	4584	288	283	283
query8	223	170	169	169
query9	9067	2277	2277	2277
query10	551	234	249	234
query11	14776	14171	14175	14171
query12	145	88	89	88
query13	1637	367	370	367
query14	9542	6785	6823	6785
query15	222	174	178	174
query16	7676	262	253	253
query17	1521	604	561	561
query18	1953	280	289	280
query19	203	154	153	153
query20	93	87	85	85
query21	199	125	127	125
query22	4987	4841	4792	4792
query23	33745	32902	32980	32902
query24	12589	2918	2926	2918
query25	679	383	387	383
query26	1901	152	154	152
query27	3116	316	308	308
query28	7840	1910	1885	1885
query29	1251	598	607	598
query30	318	166	162	162
query31	956	735	706	706
query32	95	58	57	57
query33	730	245	250	245
query34	1091	470	475	470
query35	831	693	692	692
query36	1045	923	880	880
query37	279	71	69	69
query38	3650	3418	3387	3387
query39	1572	1555	1524	1524
query40	278	128	130	128
query41	51	48	48	48
query42	105	103	96	96
query43	495	456	448	448
query44	1345	716	706	706
query45	272	258	260	258
query46	1065	703	731	703
query47	1940	1825	1855	1825
query48	360	289	288	288
query49	1147	375	368	368
query50	758	378	380	378
query51	6725	6613	6693	6613
query52	113	90	97	90
query53	351	285	283	283
query54	330	235	249	235
query55	79	74	74	74
query56	242	223	223	223
query57	1205	1153	1164	1153
query58	230	205	203	203
query59	2879	2733	2850	2733
query60	264	252	248	248
query61	111	109	107	107
query62	655	445	445	445
query63	311	284	285	284
query64	6302	3982	3941	3941
query65	3118	3043	3057	3043
query66	1396	325	313	313
query67	15512	14984	15029	14984
query68	8804	532	555	532
query69	549	321	311	311
query70	1253	1129	1154	1129
query71	504	273	268	268
query72	6515	2584	2412	2412
query73	752	316	315	315
query74	6852	6410	6517	6410
query75	3520	2328	2225	2225
query76	4935	1106	1114	1106
query77	660	245	251	245
query78	10990	10350	10193	10193
query79	9068	525	511	511
query80	1655	434	418	418
query81	498	224	228	224
query82	753	88	94	88
query83	204	161	160	160
query84	265	82	81	81
query85	1440	261	256	256
query86	490	305	315	305
query87	3761	3563	3553	3553
query88	6197	2268	2288	2268
query89	527	378	378	378
query90	1988	176	181	176
query91	122	92	93	92
query92	58	46	46	46
query93	7165	510	497	497
query94	1121	174	176	174
query95	405	306	320	306
query96	601	262	255	255
query97	2681	2480	2474	2474
query98	237	216	208	208
query99	1256	819	842	819
Total cold run time: 305266 ms
Total hot run time: 181859 ms

@doris-robot
Copy link

ClickBench: Total hot run time: 30.51 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit dee85bfb26c14175919cb287d24f19ad36be8357, data reload: false

query1	0.04	0.04	0.04
query2	0.08	0.04	0.04
query3	0.22	0.05	0.04
query4	1.68	0.07	0.07
query5	0.49	0.49	0.49
query6	1.13	0.65	0.65
query7	0.02	0.01	0.02
query8	0.05	0.04	0.05
query9	0.54	0.49	0.48
query10	0.54	0.55	0.53
query11	0.17	0.12	0.12
query12	0.15	0.12	0.12
query13	0.59	0.58	0.58
query14	0.77	0.76	0.78
query15	0.83	0.80	0.78
query16	0.35	0.38	0.35
query17	0.94	1.00	1.00
query18	0.20	0.23	0.23
query19	1.76	1.67	1.65
query20	0.02	0.01	0.01
query21	15.43	0.65	0.65
query22	4.10	6.32	2.29
query23	18.15	1.30	1.22
query24	1.68	0.30	0.22
query25	0.14	0.08	0.07
query26	0.26	0.17	0.15
query27	0.09	0.08	0.08
query28	13.30	0.98	0.98
query29	12.60	3.29	3.32
query30	0.26	0.07	0.06
query31	2.85	0.38	0.37
query32	3.30	0.46	0.45
query33	2.82	2.79	2.79
query34	17.12	4.43	4.46
query35	4.53	4.47	4.48
query36	0.64	0.45	0.46
query37	0.20	0.14	0.14
query38	0.15	0.14	0.13
query39	0.05	0.04	0.03
query40	0.16	0.13	0.14
query41	0.09	0.05	0.04
query42	0.06	0.04	0.04
query43	0.04	0.04	0.04
Total cold run time: 108.59 s
Total hot run time: 30.51 s

@doris-robot
Copy link

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit dee85bfb26c14175919cb287d24f19ad36be8357 with default session variables
Stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
Stream load orc:          58 seconds loaded 1101869774 Bytes, about 18 MB/s
Stream load parquet:      33 seconds loaded 861443392 Bytes, about 24 MB/s
Insert into select:       13.5 seconds inserted 10000000 Rows, about 740K ops/s

@seawinde
Copy link
Contributor Author

seawinde commented Apr 9, 2024

run buildall

3 similar comments
@seawinde
Copy link
Contributor Author

seawinde commented Apr 9, 2024

run buildall

@seawinde
Copy link
Contributor Author

run buildall

@seawinde
Copy link
Contributor Author

run buildall

@seawinde seawinde force-pushed the support_rewrite_by_mv_nested branch from 8f4b675 to c58ae47 Compare April 11, 2024 11:50
@seawinde
Copy link
Contributor Author

run buildall

Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

Copy link
Contributor

PR approved by anyone and no changes requested.

@seawinde
Copy link
Contributor Author

run buildall

@doris-robot
Copy link

TPC-H: Total hot run time: 38741 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit c038951807db5beed38082d3bdd12e4d1fbc3630, data reload: false

------ Round 1 ----------------------------------
q1	5956	4509	4311	4311
q2	1628	195	188	188
q3	2871	1215	1180	1180
q4	6174	823	764	764
q5	2654	2717	2887	2717
q6	275	155	143	143
q7	1019	604	604	604
q8	1981	2067	2058	2058
q9	6707	6636	6576	6576
q10	8459	3560	3532	3532
q11	444	241	234	234
q12	378	230	214	214
q13	16838	2956	2928	2928
q14	271	232	229	229
q15	529	479	481	479
q16	488	397	389	389
q17	959	607	649	607
q18	7366	6818	6648	6648
q19	1578	1546	1518	1518
q20	697	311	298	298
q21	3391	2812	2833	2812
q22	363	312	318	312
Total cold run time: 71026 ms
Total hot run time: 38741 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4294	4196	4219	4196
q2	395	286	273	273
q3	2990	2724	2741	2724
q4	1907	1632	1590	1590
q5	5350	5397	5316	5316
q6	210	122	122	122
q7	2233	1882	1863	1863
q8	3228	3367	3340	3340
q9	8679	8658	8656	8656
q10	3931	3691	3666	3666
q11	595	482	500	482
q12	773	592	602	592
q13	17079	2914	2965	2914
q14	299	278	284	278
q15	506	491	468	468
q16	492	467	454	454
q17	1822	1518	1522	1518
q18	7966	8031	7799	7799
q19	1711	1567	1642	1567
q20	2041	1819	1837	1819
q21	5151	4941	5034	4941
q22	560	492	489	489
Total cold run time: 72212 ms
Total hot run time: 55067 ms

@doris-robot
Copy link

TPC-DS: Total hot run time: 186622 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit c038951807db5beed38082d3bdd12e4d1fbc3630, data reload: false

query1	896	373	358	358
query2	6468	2680	2435	2435
query3	6661	206	207	206
query4	23570	21360	21448	21360
query5	4119	427	430	427
query6	296	198	186	186
query7	4588	294	294	294
query8	241	191	184	184
query9	8737	2317	2285	2285
query10	416	245	245	245
query11	14738	14365	14174	14174
query12	140	88	86	86
query13	1637	359	361	359
query14	10067	7693	7899	7693
query15	295	189	193	189
query16	8161	270	268	268
query17	1927	586	582	582
query18	2115	315	276	276
query19	251	153	154	153
query20	92	87	87	87
query21	196	122	123	122
query22	5086	4907	4859	4859
query23	33866	33303	33273	33273
query24	10866	2998	3022	2998
query25	583	387	385	385
query26	665	167	162	162
query27	2284	361	388	361
query28	5811	2099	2072	2072
query29	876	607	623	607
query30	307	191	184	184
query31	979	797	755	755
query32	96	52	55	52
query33	733	256	244	244
query34	1170	512	507	507
query35	866	737	714	714
query36	1089	954	943	943
query37	120	73	75	73
query38	3539	3309	3527	3309
query39	1645	1619	1601	1601
query40	185	134	128	128
query41	49	45	44	44
query42	109	103	101	101
query43	601	562	551	551
query44	1226	765	748	748
query45	287	293	264	264
query46	1166	770	785	770
query47	2012	1941	1958	1941
query48	377	298	301	298
query49	814	410	420	410
query50	780	406	417	406
query51	6843	6783	6855	6783
query52	103	93	98	93
query53	349	286	287	286
query54	307	251	245	245
query55	81	81	75	75
query56	249	233	228	228
query57	1302	1189	1200	1189
query58	231	217	218	217
query59	3609	3321	3044	3044
query60	265	233	245	233
query61	107	103	118	103
query62	611	464	455	455
query63	307	292	284	284
query64	4781	4000	4075	4000
query65	3104	3030	3084	3030
query66	743	329	325	325
query67	15407	15168	14955	14955
query68	5290	570	535	535
query69	498	311	299	299
query70	1251	1196	1211	1196
query71	1406	1272	1269	1269
query72	6492	2617	2497	2497
query73	707	321	324	321
query74	6893	6479	6535	6479
query75	3359	2652	2619	2619
query76	3371	1039	1015	1015
query77	597	269	268	268
query78	11078	10212	10186	10186
query79	3490	533	545	533
query80	2020	434	438	434
query81	511	245	236	236
query82	783	97	100	97
query83	323	163	165	163
query84	263	92	81	81
query85	1722	270	324	270
query86	470	307	298	298
query87	3479	3273	3276	3273
query88	4526	2331	2331	2331
query89	481	377	378	377
query90	1945	190	183	183
query91	125	97	99	97
query92	64	47	50	47
query93	4999	522	514	514
query94	1083	180	185	180
query95	394	307	303	303
query96	634	260	263	260
query97	3112	2999	2927	2927
query98	244	246	216	216
query99	1186	856	844	844
Total cold run time: 284993 ms
Total hot run time: 186622 ms

@doris-robot
Copy link

ClickBench: Total hot run time: 30.12 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit c038951807db5beed38082d3bdd12e4d1fbc3630, data reload: false

query1	0.04	0.03	0.03
query2	0.08	0.04	0.04
query3	0.23	0.05	0.05
query4	1.68	0.07	0.06
query5	0.49	0.49	0.49
query6	1.48	0.73	0.73
query7	0.02	0.01	0.01
query8	0.04	0.04	0.04
query9	0.55	0.52	0.49
query10	0.55	0.55	0.55
query11	0.16	0.11	0.12
query12	0.14	0.11	0.12
query13	0.60	0.59	0.59
query14	0.76	0.79	0.78
query15	0.82	0.80	0.82
query16	0.37	0.37	0.38
query17	1.00	0.95	0.96
query18	0.23	0.22	0.24
query19	1.73	1.71	1.68
query20	0.01	0.01	0.01
query21	15.40	0.66	0.65
query22	4.35	7.34	1.83
query23	18.27	1.39	1.27
query24	1.86	0.22	0.24
query25	0.14	0.09	0.08
query26	0.28	0.16	0.17
query27	0.08	0.09	0.08
query28	13.30	0.99	0.97
query29	12.58	3.23	3.23
query30	0.26	0.06	0.06
query31	2.86	0.38	0.39
query32	3.27	0.46	0.45
query33	2.84	2.77	2.82
query34	17.36	4.40	4.39
query35	4.47	4.44	4.50
query36	0.66	0.46	0.46
query37	0.18	0.16	0.15
query38	0.16	0.15	0.15
query39	0.04	0.04	0.03
query40	0.17	0.14	0.13
query41	0.09	0.04	0.05
query42	0.06	0.05	0.04
query43	0.04	0.04	0.03
Total cold run time: 109.7 s
Total hot run time: 30.12 s

@doris-robot
Copy link

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit c038951807db5beed38082d3bdd12e4d1fbc3630 with default session variables
Stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
Stream load orc:          58 seconds loaded 1101869774 Bytes, about 18 MB/s
Stream load parquet:      32 seconds loaded 861443392 Bytes, about 25 MB/s
Insert into select:       13.3 seconds inserted 10000000 Rows, about 751K ops/s

Copy link
Contributor

@morrySnow morrySnow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update all stream api to for loop

protected List<StructInfo> getValidQueryStructInfos(Plan queryPlan, CascadesContext cascadesContext,
BitSet materializedViewTableSet) {
return MaterializedViewUtils.extractStructInfo(queryPlan, cascadesContext, materializedViewTableSet)
.stream()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use for loop

Comment on lines +151 to +156
return queryTableSets.stream()
// Just construct the struct info which mv table set contains all the query table set
.filter(queryTableSet -> materializedViewTableSet.isEmpty()
|| StructInfo.containsAll(materializedViewTableSet, queryTableSet))
.map(tableMap -> structInfoMap.getStructInfo(tableMap, tableMap, ownerGroup, plan))
.collect(Collectors.toList());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use for loop

Comment on lines +175 to +176
Lists.newArrayList(),
Lists.newArrayList(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use ImmutalbeList.of()

@@ -43,12 +45,12 @@
* Get from rewrite plan and can also get from plan struct info, if from plan struct info it depends on
* the nodes from graph.
*/
public class ExpressionLineageReplacer extends DefaultPlanVisitor<Expression, ExpressionReplaceContext> {
public class ExpressionLineageReplacer extends DefaultPlanVisitor<Void, ExpressionReplaceContext> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replacer return Void is weird

Copy link
Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Apr 20, 2024
@morrySnow morrySnow merged commit 6d6f46e into apache:master Apr 20, 2024
28 of 30 checks passed
yiguolei pushed a commit that referenced this pull request Apr 21, 2024
Support query rewritting by nested materialized view.
Such as `inner_mv` def is as following

>             select
>             l_linenumber,
>             o_custkey,
>             o_orderkey,
>             o_orderstatus,
>             l_partkey,
>             l_suppkey,
>             l_orderkey
>             from lineitem
>             inner join orders on lineitem.l_orderkey = orders.o_orderkey;

the mv1_0 def is as following:

>             select
>             l_linenumber,
>             o_custkey,
>             o_orderkey,
>             o_orderstatus,
>             l_partkey,
>             l_suppkey,
>             l_orderkey,
>             ps_availqty
>             from inner_mv
>             inner join partsupp on l_partkey = ps_partkey AND l_suppkey = ps_suppkey;


for the following query, both inner_mv and mv1_0 can be successful when query rewritting by materialized view,and cbo will chose `mv1_0` finally.

>            select lineitem.l_linenumber
>             from lineitem
>             inner join orders on l_orderkey = o_orderkey
>             inner join partsupp on  l_partkey = ps_partkey AND l_suppkey = ps_suppkey
>             where o_orderstatus = 'o' AND l_linenumber in (1, 2, 3, 4, 5)
morrySnow pushed a commit that referenced this pull request Apr 21, 2024
…is not enough to provide all the data for the query (#33800)

When the materialized view is not enough to provide all the data for the query, if the materialized view is increment update by partition. we can union materialized view and origin query to reponse the query.

this depends on #33362

such as materialized view def is as following:

>         CREATE MATERIALIZED VIEW mv_10086
>         BUILD IMMEDIATE REFRESH AUTO ON MANUAL
>         partition by(l_shipdate)
>         DISTRIBUTED BY RANDOM BUCKETS 2
>         PROPERTIES ('replication_num' = '1') 
>         AS 
>     select l_shipdate, o_orderdate, l_partkey, l_suppkey, sum(o_totalprice) as sum_total
>     from lineitem
>     left join orders on lineitem.l_orderkey = orders.o_orderkey and l_shipdate = o_orderdate
>     group by
>     l_shipdate,
>     o_orderdate,
>     l_partkey,
>     l_suppkey;

the materialized view data is as following:
+------------+-------------+-----------+-----------+-----------+
| l_shipdate | o_orderdate | l_partkey | l_suppkey | sum_total |
+------------+-------------+-----------+-----------+-----------+
| 2023-10-18 | 2023-10-18  |         2 |         3 |    109.20 |
| 2023-10-17 | 2023-10-17  |         2 |         3 |     99.50 |
| 2023-10-19 | 2023-10-19  |         2 |         3 |     99.50 |
+------------+-------------+-----------+-----------+-----------+

when we insert data to partition `2023-10-17`,  if we run query as following
```
    select l_shipdate, o_orderdate, l_partkey, l_suppkey, sum(o_totalprice) as sum_total
    from lineitem
    left join orders on lineitem.l_orderkey = orders.o_orderkey and l_shipdate = o_orderdate
    group by
    l_shipdate,
    o_orderdate,
    l_partkey,
    l_suppkey;
```
query rewrite by materialzied view will fail with message   `Check partition query used validation fail`
if we turn on the switch `SET enable_materialized_view_union_rewrite = true;` default true
we run the query above again, it will success and will use union all  materialized view and origin query to response the query correctly. the plan is as following:


```
| Explain String(Nereids Planner)                                                                                                                                                                    |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0                                                                                                                                                                                    |
|   OUTPUT EXPRS:                                                                                                                                                                                    |
|     l_shipdate[#52]                                                                                                                                                                                |
|     o_orderdate[#53]                                                                                                                                                                               |
|     l_partkey[#54]                                                                                                                                                                                 |
|     l_suppkey[#55]                                                                                                                                                                                 |
|     sum_total[#56]                                                                                                                                                                                 |
|   PARTITION: UNPARTITIONED                                                                                                                                                                         |
|                                                                                                                                                                                                    |
|   HAS_COLO_PLAN_NODE: false                                                                                                                                                                        |
|                                                                                                                                                                                                    |
|   VRESULT SINK                                                                                                                                                                                     |
|      MYSQL_PROTOCAL                                                                                                                                                                                |
|                                                                                                                                                                                                    |
|   11:VEXCHANGE                                                                                                                                                                                     |
|      offset: 0                                                                                                                                                                                     |
|      distribute expr lists:                                                                                                                                                                        |
|                                                                                                                                                                                                    |
| PLAN FRAGMENT 1                                                                                                                                                                                    |
|                                                                                                                                                                                                    |
|   PARTITION: HASH_PARTITIONED: l_shipdate[#42], o_orderdate[#43], l_partkey[#44], l_suppkey[#45]                                                                                                   |
|                                                                                                                                                                                                    |
|   HAS_COLO_PLAN_NODE: false                                                                                                                                                                        |
|                                                                                                                                                                                                    |
|   STREAM DATA SINK                                                                                                                                                                                 |
|     EXCHANGE ID: 11                                                                                                                                                                                |
|     UNPARTITIONED                                                                                                                                                                                  |
|                                                                                                                                                                                                    |
|   10:VUNION(756)                                                                                                                                                                                   |
|   |                                                                                                                                                                                                |
|   |----9:VAGGREGATE (merge finalize)(753)                                                                                                                                                          |
|   |    |  output: sum(partial_sum(o_totalprice)[#46])[#51]                                                                                                                                         |
|   |    |  group by: l_shipdate[#42], o_orderdate[#43], l_partkey[#44], l_suppkey[#45]                                                                                                              |
|   |    |  cardinality=2                                                                                                                                                                            |
|   |    |  distribute expr lists: l_shipdate[#42], o_orderdate[#43], l_partkey[#44], l_suppkey[#45]                                                                                                 |
|   |    |                                                                                                                                                                                           |
|   |    8:VEXCHANGE                                                                                                                                                                                 |
|   |       offset: 0                                                                                                                                                                                |
|   |       distribute expr lists: l_shipdate[#42]                                                                                                                                                   |
|   |                                                                                                                                                                                                |
|   1:VEXCHANGE                                                                                                                                                                                      |
|      offset: 0                                                                                                                                                                                     |
|      distribute expr lists:                                                                                                                                                                        |
|                                                                                                                                                                                                    |
| PLAN FRAGMENT 2                                                                                                                                                                                    |
|                                                                                                                                                                                                    |
|   PARTITION: HASH_PARTITIONED: o_orderkey[#21], o_orderdate[#25]                                                                                                                                   |
|                                                                                                                                                                                                    |
|   HAS_COLO_PLAN_NODE: false                                                                                                                                                                        |
|                                                                                                                                                                                                    |
|   STREAM DATA SINK                                                                                                                                                                                 |
|     EXCHANGE ID: 08                                                                                                                                                                                |
|     HASH_PARTITIONED: l_shipdate[#42], o_orderdate[#43], l_partkey[#44], l_suppkey[#45]                                                                                                            |
|                                                                                                                                                                                                    |
|   7:VAGGREGATE (update serialize)(747)                                                                                                                                                             |
|   |  STREAMING                                                                                                                                                                                     |
|   |  output: partial_sum(o_totalprice[#41])[#46]                                                                                                                                                   |
|   |  group by: l_shipdate[#37], o_orderdate[#38], l_partkey[#39], l_suppkey[#40]                                                                                                                   |
|   |  cardinality=2                                                                                                                                                                                 |
|   |  distribute expr lists: l_shipdate[#37]                                                                                                                                                        |
|   |                                                                                                                                                                                                |
|   6:VHASH JOIN(741)                                                                                                                                                                                |
|   |  join op: RIGHT OUTER JOIN(PARTITIONED)[]                                                                                                                                                      |
|   |  equal join conjunct: (o_orderkey[#21] = l_orderkey[#5])                                                                                                                                       |
|   |  equal join conjunct: (o_orderdate[#25] = l_shipdate[#15])                                                                                                                                     |
|   |  runtime filters: RF000[min_max] <- l_orderkey[#5](2/2/2048), RF001[bloom] <- l_orderkey[#5](2/2/2048), RF002[min_max] <- l_shipdate[#15](1/1/2048), RF003[bloom] <- l_shipdate[#15](1/1/2048) |
|   |  cardinality=2                                                                                                                                                                                 |
|   |  vec output tuple id: 4                                                                                                                                                                        |
|   |  output tuple id: 4                                                                                                                                                                            |
|   |  vIntermediate tuple ids: 3                                                                                                                                                                    |
|   |  hash output slot ids: 6 7 24 25 15                                                                                                                                                            |
|   |  final projections: l_shipdate[#36], o_orderdate[#32], l_partkey[#34], l_suppkey[#35], o_totalprice[#31]                                                                                       |
|   |  final project output tuple id: 4                                                                                                                                                              |
|   |  distribute expr lists: o_orderkey[#21], o_orderdate[#25]                                                                                                                                      |
|   |  distribute expr lists: l_orderkey[#5], l_shipdate[#15]                                                                                                                                        |
|   |                                                                                                                                                                                                |
|   |----3:VEXCHANGE                                                                                                                                                                                 |
|   |       offset: 0                                                                                                                                                                                |
|   |       distribute expr lists: l_orderkey[#5]                                                                                                                                                    |
|   |                                                                                                                                                                                                |
|   5:VEXCHANGE                                                                                                                                                                                      |
|      offset: 0                                                                                                                                                                                     |
|      distribute expr lists:                                                                                                                                                                        |
|                                                                                                                                                                                                    |
| PLAN FRAGMENT 3                                                                                                                                                                                    |
|                                                                                                                                                                                                    |
|   PARTITION: RANDOM                                                                                                                                                                                |
|                                                                                                                                                                                                    |
|   HAS_COLO_PLAN_NODE: false                                                                                                                                                                        |
|                                                                                                                                                                                                    |
|   STREAM DATA SINK                                                                                                                                                                                 |
|     EXCHANGE ID: 05                                                                                                                                                                                |
|     HASH_PARTITIONED: o_orderkey[#21], o_orderdate[#25]                                                                                                                                            |
|                                                                                                                                                                                                    |
|   4:VOlapScanNode(722)                                                                                                                                                                             |
|      TABLE: union_db.orders(orders), PREAGGREGATION: ON                                                                                                                                            |
|      runtime filters: RF000[min_max] -> o_orderkey[#21], RF001[bloom] -> o_orderkey[#21], RF002[min_max] -> o_orderdate[#25], RF003[bloom] -> o_orderdate[#25]                                     |
|      partitions=3/3 (p_20231017,p_20231018,p_20231019), tablets=9/9, tabletList=161188,161190,161192 ...                                                                                           |
|      cardinality=3, avgRowSize=0.0, numNodes=1                                                                                                                                                     |
|      pushAggOp=NONE                                                                                                                                                                                |
|                                                                                                                                                                                                    |
| PLAN FRAGMENT 4                                                                                                                                                                                    |
|                                                                                                                                                                                                    |
|   PARTITION: HASH_PARTITIONED: l_orderkey[#5]                                                                                                                                                      |
|                                                                                                                                                                                                    |
|   HAS_COLO_PLAN_NODE: false                                                                                                                                                                        |
|                                                                                                                                                                                                    |
|   STREAM DATA SINK                                                                                                                                                                                 |
|     EXCHANGE ID: 03                                                                                                                                                                                |
|     HASH_PARTITIONED: l_orderkey[#5], l_shipdate[#15]                                                                                                                                              |
|                                                                                                                                                                                                    |
|   2:VOlapScanNode(729)                                                                                                                                                                             |
|      TABLE: union_db.lineitem(lineitem), PREAGGREGATION: ON                                                                                                                                        |
|      PREDICATES: (l_shipdate[#15] >= '2023-10-17') AND (l_shipdate[#15] < '2023-10-18')                                                                                                            |
|      partitions=1/3 (p_20231017), tablets=3/3, tabletList=161223,161225,161227                                                                                                                     |
|      cardinality=2, avgRowSize=0.0, numNodes=1                                                                                                                                                     |
|      pushAggOp=NONE                                                                                                                                                                                |
|                                                                                                                                                                                                    |
| PLAN FRAGMENT 5                                                                                                                                                                                    |
|                                                                                                                                                                                                    |
|   PARTITION: RANDOM                                                                                                                                                                                |
|                                                                                                                                                                                                    |
|   HAS_COLO_PLAN_NODE: false                                                                                                                                                                        |
|                                                                                                                                                                                                    |
|   STREAM DATA SINK                                                                                                                                                                                 |
|     EXCHANGE ID: 01                                                                                                                                                                                |
|     RANDOM                                                                                                                                                                                         |
|                                                                                                                                                                                                    |
|   0:VOlapScanNode(718)                                                                                                                                                                             |
|      TABLE: union_db.mv_10086(mv_10086), PREAGGREGATION: ON                                                                                                                                        |
|      partitions=2/3 (p_20231018_20231019,p_20231019_20231020), tablets=4/4, tabletList=161251,161253,161265 ...                                                                                    |
|      cardinality=2, avgRowSize=0.0, numNodes=1                                                                                                                                                     |
|      pushAggOp=NONE                                                                                                                                                                                |
|                                                                                                                                                                                                    |
| MaterializedView                                                                                                                                                                                   |
| MaterializedViewRewriteSuccessAndChose:                                                                                                                                                            |
|   Names: mv_10086                                                                                                                                                                                  |
| MaterializedViewRewriteSuccessButNotChose:                                                                                                                                                         |
|                                                                                                                                                                                                    |
| MaterializedViewRewriteFail:                                                                                                                                                                       |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
```
yiguolei pushed a commit that referenced this pull request Apr 21, 2024
…is not enough to provide all the data for the query (#33800)

When the materialized view is not enough to provide all the data for the query, if the materialized view is increment update by partition. we can union materialized view and origin query to reponse the query.

this depends on #33362

such as materialized view def is as following:

>         CREATE MATERIALIZED VIEW mv_10086
>         BUILD IMMEDIATE REFRESH AUTO ON MANUAL
>         partition by(l_shipdate)
>         DISTRIBUTED BY RANDOM BUCKETS 2
>         PROPERTIES ('replication_num' = '1') 
>         AS 
>     select l_shipdate, o_orderdate, l_partkey, l_suppkey, sum(o_totalprice) as sum_total
>     from lineitem
>     left join orders on lineitem.l_orderkey = orders.o_orderkey and l_shipdate = o_orderdate
>     group by
>     l_shipdate,
>     o_orderdate,
>     l_partkey,
>     l_suppkey;

the materialized view data is as following:
+------------+-------------+-----------+-----------+-----------+
| l_shipdate | o_orderdate | l_partkey | l_suppkey | sum_total |
+------------+-------------+-----------+-----------+-----------+
| 2023-10-18 | 2023-10-18  |         2 |         3 |    109.20 |
| 2023-10-17 | 2023-10-17  |         2 |         3 |     99.50 |
| 2023-10-19 | 2023-10-19  |         2 |         3 |     99.50 |
+------------+-------------+-----------+-----------+-----------+

when we insert data to partition `2023-10-17`,  if we run query as following
```
    select l_shipdate, o_orderdate, l_partkey, l_suppkey, sum(o_totalprice) as sum_total
    from lineitem
    left join orders on lineitem.l_orderkey = orders.o_orderkey and l_shipdate = o_orderdate
    group by
    l_shipdate,
    o_orderdate,
    l_partkey,
    l_suppkey;
```
query rewrite by materialzied view will fail with message   `Check partition query used validation fail`
if we turn on the switch `SET enable_materialized_view_union_rewrite = true;` default true
we run the query above again, it will success and will use union all  materialized view and origin query to response the query correctly. the plan is as following:


```
| Explain String(Nereids Planner)                                                                                                                                                                    |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0                                                                                                                                                                                    |
|   OUTPUT EXPRS:                                                                                                                                                                                    |
|     l_shipdate[#52]                                                                                                                                                                                |
|     o_orderdate[#53]                                                                                                                                                                               |
|     l_partkey[#54]                                                                                                                                                                                 |
|     l_suppkey[#55]                                                                                                                                                                                 |
|     sum_total[#56]                                                                                                                                                                                 |
|   PARTITION: UNPARTITIONED                                                                                                                                                                         |
|                                                                                                                                                                                                    |
|   HAS_COLO_PLAN_NODE: false                                                                                                                                                                        |
|                                                                                                                                                                                                    |
|   VRESULT SINK                                                                                                                                                                                     |
|      MYSQL_PROTOCAL                                                                                                                                                                                |
|                                                                                                                                                                                                    |
|   11:VEXCHANGE                                                                                                                                                                                     |
|      offset: 0                                                                                                                                                                                     |
|      distribute expr lists:                                                                                                                                                                        |
|                                                                                                                                                                                                    |
| PLAN FRAGMENT 1                                                                                                                                                                                    |
|                                                                                                                                                                                                    |
|   PARTITION: HASH_PARTITIONED: l_shipdate[#42], o_orderdate[#43], l_partkey[#44], l_suppkey[#45]                                                                                                   |
|                                                                                                                                                                                                    |
|   HAS_COLO_PLAN_NODE: false                                                                                                                                                                        |
|                                                                                                                                                                                                    |
|   STREAM DATA SINK                                                                                                                                                                                 |
|     EXCHANGE ID: 11                                                                                                                                                                                |
|     UNPARTITIONED                                                                                                                                                                                  |
|                                                                                                                                                                                                    |
|   10:VUNION(756)                                                                                                                                                                                   |
|   |                                                                                                                                                                                                |
|   |----9:VAGGREGATE (merge finalize)(753)                                                                                                                                                          |
|   |    |  output: sum(partial_sum(o_totalprice)[#46])[#51]                                                                                                                                         |
|   |    |  group by: l_shipdate[#42], o_orderdate[#43], l_partkey[#44], l_suppkey[#45]                                                                                                              |
|   |    |  cardinality=2                                                                                                                                                                            |
|   |    |  distribute expr lists: l_shipdate[#42], o_orderdate[#43], l_partkey[#44], l_suppkey[#45]                                                                                                 |
|   |    |                                                                                                                                                                                           |
|   |    8:VEXCHANGE                                                                                                                                                                                 |
|   |       offset: 0                                                                                                                                                                                |
|   |       distribute expr lists: l_shipdate[#42]                                                                                                                                                   |
|   |                                                                                                                                                                                                |
|   1:VEXCHANGE                                                                                                                                                                                      |
|      offset: 0                                                                                                                                                                                     |
|      distribute expr lists:                                                                                                                                                                        |
|                                                                                                                                                                                                    |
| PLAN FRAGMENT 2                                                                                                                                                                                    |
|                                                                                                                                                                                                    |
|   PARTITION: HASH_PARTITIONED: o_orderkey[#21], o_orderdate[#25]                                                                                                                                   |
|                                                                                                                                                                                                    |
|   HAS_COLO_PLAN_NODE: false                                                                                                                                                                        |
|                                                                                                                                                                                                    |
|   STREAM DATA SINK                                                                                                                                                                                 |
|     EXCHANGE ID: 08                                                                                                                                                                                |
|     HASH_PARTITIONED: l_shipdate[#42], o_orderdate[#43], l_partkey[#44], l_suppkey[#45]                                                                                                            |
|                                                                                                                                                                                                    |
|   7:VAGGREGATE (update serialize)(747)                                                                                                                                                             |
|   |  STREAMING                                                                                                                                                                                     |
|   |  output: partial_sum(o_totalprice[#41])[#46]                                                                                                                                                   |
|   |  group by: l_shipdate[#37], o_orderdate[#38], l_partkey[#39], l_suppkey[#40]                                                                                                                   |
|   |  cardinality=2                                                                                                                                                                                 |
|   |  distribute expr lists: l_shipdate[#37]                                                                                                                                                        |
|   |                                                                                                                                                                                                |
|   6:VHASH JOIN(741)                                                                                                                                                                                |
|   |  join op: RIGHT OUTER JOIN(PARTITIONED)[]                                                                                                                                                      |
|   |  equal join conjunct: (o_orderkey[#21] = l_orderkey[#5])                                                                                                                                       |
|   |  equal join conjunct: (o_orderdate[#25] = l_shipdate[#15])                                                                                                                                     |
|   |  runtime filters: RF000[min_max] <- l_orderkey[#5](2/2/2048), RF001[bloom] <- l_orderkey[#5](2/2/2048), RF002[min_max] <- l_shipdate[#15](1/1/2048), RF003[bloom] <- l_shipdate[#15](1/1/2048) |
|   |  cardinality=2                                                                                                                                                                                 |
|   |  vec output tuple id: 4                                                                                                                                                                        |
|   |  output tuple id: 4                                                                                                                                                                            |
|   |  vIntermediate tuple ids: 3                                                                                                                                                                    |
|   |  hash output slot ids: 6 7 24 25 15                                                                                                                                                            |
|   |  final projections: l_shipdate[#36], o_orderdate[#32], l_partkey[#34], l_suppkey[#35], o_totalprice[#31]                                                                                       |
|   |  final project output tuple id: 4                                                                                                                                                              |
|   |  distribute expr lists: o_orderkey[#21], o_orderdate[#25]                                                                                                                                      |
|   |  distribute expr lists: l_orderkey[#5], l_shipdate[#15]                                                                                                                                        |
|   |                                                                                                                                                                                                |
|   |----3:VEXCHANGE                                                                                                                                                                                 |
|   |       offset: 0                                                                                                                                                                                |
|   |       distribute expr lists: l_orderkey[#5]                                                                                                                                                    |
|   |                                                                                                                                                                                                |
|   5:VEXCHANGE                                                                                                                                                                                      |
|      offset: 0                                                                                                                                                                                     |
|      distribute expr lists:                                                                                                                                                                        |
|                                                                                                                                                                                                    |
| PLAN FRAGMENT 3                                                                                                                                                                                    |
|                                                                                                                                                                                                    |
|   PARTITION: RANDOM                                                                                                                                                                                |
|                                                                                                                                                                                                    |
|   HAS_COLO_PLAN_NODE: false                                                                                                                                                                        |
|                                                                                                                                                                                                    |
|   STREAM DATA SINK                                                                                                                                                                                 |
|     EXCHANGE ID: 05                                                                                                                                                                                |
|     HASH_PARTITIONED: o_orderkey[#21], o_orderdate[#25]                                                                                                                                            |
|                                                                                                                                                                                                    |
|   4:VOlapScanNode(722)                                                                                                                                                                             |
|      TABLE: union_db.orders(orders), PREAGGREGATION: ON                                                                                                                                            |
|      runtime filters: RF000[min_max] -> o_orderkey[#21], RF001[bloom] -> o_orderkey[#21], RF002[min_max] -> o_orderdate[#25], RF003[bloom] -> o_orderdate[#25]                                     |
|      partitions=3/3 (p_20231017,p_20231018,p_20231019), tablets=9/9, tabletList=161188,161190,161192 ...                                                                                           |
|      cardinality=3, avgRowSize=0.0, numNodes=1                                                                                                                                                     |
|      pushAggOp=NONE                                                                                                                                                                                |
|                                                                                                                                                                                                    |
| PLAN FRAGMENT 4                                                                                                                                                                                    |
|                                                                                                                                                                                                    |
|   PARTITION: HASH_PARTITIONED: l_orderkey[#5]                                                                                                                                                      |
|                                                                                                                                                                                                    |
|   HAS_COLO_PLAN_NODE: false                                                                                                                                                                        |
|                                                                                                                                                                                                    |
|   STREAM DATA SINK                                                                                                                                                                                 |
|     EXCHANGE ID: 03                                                                                                                                                                                |
|     HASH_PARTITIONED: l_orderkey[#5], l_shipdate[#15]                                                                                                                                              |
|                                                                                                                                                                                                    |
|   2:VOlapScanNode(729)                                                                                                                                                                             |
|      TABLE: union_db.lineitem(lineitem), PREAGGREGATION: ON                                                                                                                                        |
|      PREDICATES: (l_shipdate[#15] >= '2023-10-17') AND (l_shipdate[#15] < '2023-10-18')                                                                                                            |
|      partitions=1/3 (p_20231017), tablets=3/3, tabletList=161223,161225,161227                                                                                                                     |
|      cardinality=2, avgRowSize=0.0, numNodes=1                                                                                                                                                     |
|      pushAggOp=NONE                                                                                                                                                                                |
|                                                                                                                                                                                                    |
| PLAN FRAGMENT 5                                                                                                                                                                                    |
|                                                                                                                                                                                                    |
|   PARTITION: RANDOM                                                                                                                                                                                |
|                                                                                                                                                                                                    |
|   HAS_COLO_PLAN_NODE: false                                                                                                                                                                        |
|                                                                                                                                                                                                    |
|   STREAM DATA SINK                                                                                                                                                                                 |
|     EXCHANGE ID: 01                                                                                                                                                                                |
|     RANDOM                                                                                                                                                                                         |
|                                                                                                                                                                                                    |
|   0:VOlapScanNode(718)                                                                                                                                                                             |
|      TABLE: union_db.mv_10086(mv_10086), PREAGGREGATION: ON                                                                                                                                        |
|      partitions=2/3 (p_20231018_20231019,p_20231019_20231020), tablets=4/4, tabletList=161251,161253,161265 ...                                                                                    |
|      cardinality=2, avgRowSize=0.0, numNodes=1                                                                                                                                                     |
|      pushAggOp=NONE                                                                                                                                                                                |
|                                                                                                                                                                                                    |
| MaterializedView                                                                                                                                                                                   |
| MaterializedViewRewriteSuccessAndChose:                                                                                                                                                            |
|   Names: mv_10086                                                                                                                                                                                  |
| MaterializedViewRewriteSuccessButNotChose:                                                                                                                                                         |
|                                                                                                                                                                                                    |
| MaterializedViewRewriteFail:                                                                                                                                                                       |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
```
morrySnow pushed a commit that referenced this pull request Apr 24, 2024
…formance (#34050)

Optimize the nested materialized view rewrite performance when exists many join
This is brought by #33362
seawinde added a commit to seawinde/doris that referenced this pull request Apr 24, 2024
…formance (apache#34050)

Optimize the nested materialized view rewrite performance when exists many join
This is brought by apache#33362
yiguolei pushed a commit that referenced this pull request Apr 24, 2024
…formance (#34050) (#34078)

Optimize the nested materialized view rewrite performance when exists many join
This is brought by #33362
yiguolei pushed a commit that referenced this pull request Apr 24, 2024
…formance (#34050)

Optimize the nested materialized view rewrite performance when exists many join
This is brought by #33362
yiguolei pushed a commit that referenced this pull request Apr 24, 2024
…formance (#34050)

Optimize the nested materialized view rewrite performance when exists many join
This is brought by #33362
yiguolei pushed a commit that referenced this pull request Apr 25, 2024
…formance (#34050)

Optimize the nested materialized view rewrite performance when exists many join
This is brought by #33362
@yiguolei yiguolei mentioned this pull request Apr 26, 2024
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by one committer. reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants