branch-3.1: [enhance](multi-catalog) Runtime Filter Partition Pruning for Data Lake Tables (#53399) #55040
Conversation
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
run buildall
Cloud UT Coverage Report
BE UT Coverage Report
run buildall
Cloud UT Coverage Report
TPC-H: Total hot run time: 32564 ms
BE UT Coverage Report
TPC-DS: Total hot run time: 192354 ms
ClickBench: Total hot run time: 29.06 s
BE Regression && UT Coverage Report
run external
run buildall
Cloud UT Coverage Report
FE UT Coverage Report
TPC-H: Total hot run time: 32639 ms
TPC-DS: Total hot run time: 192391 ms
ClickBench: Total hot run time: 29.22 s
BE UT Coverage Report
BE Regression && UT Coverage Report
…ke Tables (apache#53399)

follow: apache#47025

This PR implements dynamic partition pruning based on runtime filters for Iceberg, Paimon, and Hudi data lake tables, extending and enhancing the previous PR [apache#47025](apache#47025).

In PR [apache#47025](apache#47025), we implemented runtime filter-based dynamic partition pruning for Hive tables. However, due to significant differences in partition metadata formats between Iceberg, Paimon, Hudi, and traditional Hive tables, specialized adaptation and implementation are required for these data lake formats.

- During split generation in scan nodes, when `enable_runtime_filter_partition_prune` is enabled, call the corresponding partition value extraction functions
- Pass the extracted partition values to the backend through the `TFileRangeDesc.data_lake_partition_values` field
- Store partition values in `Map<String, String>` format, with keys as partition column names and values as serialized partition values (see the sketch after this description)
- Process partition column information in `FileScanner::_generate_data_lake_partition_columns()`
- Runtime filters can then perform partition pruning based on this partition value information, avoiding scans of non-matching partition files

Dynamic partition pruning supports the following types of queries:

```sql
-- Equality queries
SELECT count(*) FROM iceberg_table
WHERE partition_col = (
    SELECT partition_col FROM iceberg_table
    GROUP BY partition_col HAVING count(*) > 0
    ORDER BY partition_col DESC LIMIT 1
);

-- IN queries
SELECT count(*) FROM paimon_table
WHERE partition_col IN (
    SELECT partition_col FROM paimon_table
    GROUP BY partition_col HAVING count(*) > 0
    ORDER BY partition_col DESC LIMIT 2
);

-- Function expression queries
SELECT count(*) FROM hudi_table
WHERE abs(partition_col) = (
    SELECT partition_col FROM hudi_table
    GROUP BY partition_col HAVING count(*) > 0
    ORDER BY partition_col DESC LIMIT 1
);
```

Partition data types supported by each format:

**Common Support**:
- **Numeric types**: INT, BIGINT, DECIMAL, FLOAT, DOUBLE, TINYINT, SMALLINT
- **String types**: STRING, VARCHAR, CHAR
- **Date/time types**: DATE, TIMESTAMP
- **Boolean type**: BOOLEAN
- **Binary types**: BINARY (except for Paimon)

**Format-specific Support**:
- **Iceberg**: Additionally supports TIMESTAMP_NTZ for timezone-free timestamps
- **Paimon**: Does not support BINARY as a partition key (using BINARY as a partition key currently causes issues in Spark)
- **Hudi**: Based on the Hive partition format; supports all Hive-compatible types

**Notes**:
- TIME and UUID types are supported at the code level, but since Spark does not support these types as partition keys, the test cases do not include related scenarios
- If these types are used in a production environment, dynamic partition pruning still works normally
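To make the `Map<String, String>` hand-off concrete, here is a minimal, illustrative Java sketch of the idea described above: the frontend serializes each split's partition column values by name, and a runtime filter's value set can then decide whether the split is skipped. The class and method names (`PartitionPruneSketch`, `extractPartitionValues`, `prunableByRuntimeFilter`) are hypothetical and are not the actual Doris classes; the real logic lives in scan node split generation and `FileScanner::_generate_data_lake_partition_columns()`.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/**
 * Illustrative-only sketch of runtime-filter-based partition pruning for
 * data lake splits. Not the actual Doris implementation.
 */
public class PartitionPruneSketch {

    /** Serialize partition column names and values for one split. */
    static Map<String, String> extractPartitionValues(List<String> partitionColumns,
                                                      List<Object> partitionValues) {
        Map<String, String> serialized = new HashMap<>();
        for (int i = 0; i < partitionColumns.size(); i++) {
            Object value = partitionValues.get(i);
            // Values are carried as strings; the scanner re-parses them by column type.
            serialized.put(partitionColumns.get(i), value == null ? null : value.toString());
        }
        return serialized;
    }

    /** Return true if the split can be skipped given the runtime filter's value set. */
    static boolean prunableByRuntimeFilter(Map<String, String> splitPartitionValues,
                                           String filterColumn,
                                           Set<String> allowedValues) {
        String value = splitPartitionValues.get(filterColumn);
        // Only prune when the split actually carries a value for the filtered column.
        return value != null && !allowedValues.contains(value);
    }

    public static void main(String[] args) {
        Map<String, String> split = extractPartitionValues(
                List.of("dt", "region"), List.of("2024-01-01", "us"));
        // Runtime filter produced from the join side: only dt = '2024-01-02' qualifies.
        boolean skip = prunableByRuntimeFilter(split, "dt", Set.of("2024-01-02"));
        System.out.println("skip split: " + skip); // prints: skip split: true
    }
}
```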
force-pushed from c2f1383 to a17058c
run buildall
Cloud UT Coverage Report
TPC-H: Total hot run time: 32630 ms
This reverts commit df64ec4.
… Data Lake Tables (apache#53399)" This reverts commit 8d98908.
…ke Tables (apache#53399)
run buildall
Cloud UT Coverage Report
This reverts commit 58c50fa.
TPC-H: Total hot run time: 33026 ms
TPC-DS: Total hot run time: 193568 ms
ClickBench: Total hot run time: 29.18 s
BE UT Coverage Report
BE Regression && UT Coverage Report
run external
run external
run buildall
Cloud UT Coverage Report
TPC-H: Total hot run time: 32583 ms
BE UT Coverage Report
TPC-DS: Total hot run time: 191428 ms
ClickBench: Total hot run time: 29.29 s
BE Regression && UT Coverage Report
run p0
BE Regression && UT Coverage Report
BE Regression && UT Coverage Report
bp: #53399