@suxiaogang223
Contributor

@suxiaogang223 suxiaogang223 commented Jul 16, 2025

What problem does this PR solve?

Follow-up to #47025

PR Overview

This PR implements dynamic partition pruning based on runtime filters for Iceberg, Paimon, and Hudi data lake tables, extending and enhancing the previous PR #47025.

Background

In PR #47025, we implemented runtime filter-based dynamic partition pruning for Hive tables. Iceberg, Paimon, and Hudi represent partition metadata differently from traditional Hive tables, so each of these data lake formats needs its own adaptation.

Main Features

1. Core Implementation

Frontend (FE) Changes

  • During split generation in scan nodes, when enable_runtime_filter_partition_prune is enabled, the FE calls the partition value extraction function for the table format being scanned
  • The extracted partition values are passed to the backend through the TFileRangeDesc.data_lake_partition_values field
  • Partition values are stored as a Map<String, String>, keyed by partition column name, with the serialized partition value as the map value
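
To make the frontend steps above concrete, here is a minimal sketch in Java of building the per-split partition value map. It is not the Doris implementation: the class PartitionValueMapSketch, the method extract, and the boolean flag standing in for enable_runtime_filter_partition_prune are all hypothetical, and String.valueOf is only a placeholder for the format-specific serialization.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; these are not the actual Doris FE classes.
final class PartitionValueMapSketch {

    // Builds "partition column name -> serialized value" for one split.
    // Returns an empty map when the session variable (modeled as a boolean) is off.
    static Map<String, String> extract(List<String> columns, List<?> values, boolean pruneEnabled) {
        Map<String, String> result = new LinkedHashMap<>();
        if (!pruneEnabled) {
            return result;
        }
        if (columns.size() != values.size()) {
            throw new IllegalArgumentException("column/value count mismatch");
        }
        for (int i = 0; i < columns.size(); i++) {
            // Assumption: String.valueOf stands in for the format-specific serialization.
            result.put(columns.get(i), String.valueOf(values.get(i)));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> m = extract(
                List.of("dt", "region"),
                List.of(java.time.LocalDate.of(2025, 7, 16), "eu"),
                true);
        System.out.println(m); // prints {dt=2025-07-16, region=eu}
    }
}
```

The insertion-ordered map keeps partition columns in declaration order, which is an illustrative choice rather than something the PR states.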

Backend (BE) Changes

  • Process partition column information in FileScanner::_generate_data_lake_partition_columns()
  • Runtime filters then use these partition values to prune partitions, so files in non-matching partitions are not scanned
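
The pruning idea can be sketched as follows. The real backend logic is C++ inside FileScanner; this Java illustration only models the decision, under the simplifying assumption that the runtime filter can be reduced to a set of allowed serialized values for one partition column. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the pruning decision; not the Doris BE code.
final class RuntimeFilterPruneSketch {

    // splits:  file path -> partition values of that file (column name -> serialized value)
    // allowed: serialized values that pass the runtime filter for `column`
    static List<String> keepMatching(Map<String, Map<String, String>> splits,
                                     String column, Set<String> allowed) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> split : splits.entrySet()) {
            String value = split.getValue().get(column);
            // Without a value for the column, pruning would be unsafe, so the file is kept.
            if (value == null || allowed.contains(value)) {
                kept.add(split.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> splits = new LinkedHashMap<>();
        splits.put("f1.parquet", Map.of("dt", "2025-07-15"));
        splits.put("f2.parquet", Map.of("dt", "2025-07-16"));
        System.out.println(keepMatching(splits, "dt", Set.of("2025-07-16"))); // prints [f2.parquet]
    }
}
```

Files whose partition value is unknown are kept, because skipping them could silently drop rows; pruning is only ever an optimization.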

2. Supported Query Types

Dynamic partition pruning supports the following types of queries:

-- Equality queries
SELECT count(*) FROM iceberg_table 
WHERE partition_col = (
    SELECT partition_col FROM iceberg_table 
    GROUP BY partition_col 
    HAVING count(*) > 0 
    ORDER BY partition_col DESC 
    LIMIT 1
);

-- IN queries
SELECT count(*) FROM paimon_table 
WHERE partition_col IN (
    SELECT partition_col FROM paimon_table 
    GROUP BY partition_col 
    HAVING count(*) > 0 
    ORDER BY partition_col DESC 
    LIMIT 2
);

-- Function expression queries
SELECT count(*) FROM hudi_table 
WHERE abs(partition_col) = (
    SELECT partition_col FROM hudi_table 
    GROUP BY partition_col 
    HAVING count(*) > 0 
    ORDER BY partition_col DESC 
    LIMIT 1
);

3. Supported Data Types

Partition data types supported by each format:

Common Support:

  • Numeric types: INT, BIGINT, DECIMAL, FLOAT, DOUBLE, TINYINT, SMALLINT
  • String types: STRING, VARCHAR, CHAR
  • Date/time types: DATE, TIMESTAMP
  • Boolean type: BOOLEAN
  • Binary types: BINARY (except for Paimon)

Format-specific Support:

  • Iceberg: Additionally supports TIMESTAMP_NTZ type for timezone-free timestamps
  • Paimon: BINARY is not supported as a partition key (a binary partition key currently causes issues in Spark)
  • Hudi: Based on Hive partition format, supports all Hive-compatible types

Notes:

  • TIME and UUID types are handled in the code, but because Spark does not support them as partition keys, the test cases do not cover them
  • If these types are used as partition keys in production, dynamic partition pruning is expected to work, although this path is not covered by tests

Release note

Implement runtime filter partition pruning for Iceberg, Paimon, and Hudi tables

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@suxiaogang223 suxiaogang223 marked this pull request as draft July 16, 2025 12:35
@suxiaogang223 suxiaogang223 force-pushed the rf_partition_pruning_data_lake branch from 31cfbd2 to abeecff July 18, 2025 09:36
@suxiaogang223 suxiaogang223 marked this pull request as ready for review July 22, 2025 14:38
@suxiaogang223
Contributor Author

run buildall

@suxiaogang223 suxiaogang223 force-pushed the rf_partition_pruning_data_lake branch from a28e5d9 to 28966e7 July 22, 2025 15:25
@suxiaogang223
Contributor Author

run buildall

@doris-robot

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 80.42% (1302/1619)
Line Coverage 65.77% (21797/33140)
Region Coverage 67.09% (10955/16328)
Branch Coverage 56.61% (5761/10176)

@suxiaogang223 suxiaogang223 changed the title from "[enhance](multi-catalog) impl runtime filter partition pruning for Iceberg" to "[enhance](multi-catalog) Runtime Filter Partition Pruning for Data Lake Tables" Jul 22, 2025
@suxiaogang223
Contributor Author

run buildall

@doris-robot

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 80.42% (1302/1619)
Line Coverage 65.81% (21810/33140)
Region Coverage 67.12% (10960/16328)
Branch Coverage 56.69% (5769/10176)

@doris-robot

TPC-H: Total hot run time: 34111 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit c04b77a2d1aed59b4f164ef452f8deb7cc06adc2, data reload: false

------ Round 1 ----------------------------------
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	17562	5238	5136	5136
q2	1959	278	187	187
q3	10565	1281	704	704
q4	10302	1012	504	504
q5	8609	2494	2332	2332
q6	184	161	129	129
q7	903	744	599	599
q8	9298	1325	1122	1122
q9	7080	5156	5156	5156
q10	6960	2361	1969	1969
q11	478	285	263	263
q12	354	352	219	219
q13	17791	3743	3087	3087
q14	232	230	213	213
q15	558	491	475	475
q16	431	432	377	377
q17	618	878	356	356
q18	7431	7214	7194	7194
q19	1295	943	568	568
q20	344	342	226	226
q21	3965	3216	2338	2338
q22	1068	1045	957	957
Total cold run time: 107987 ms
Total hot run time: 34111 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	5225	5101	5095	5095
q2	246	333	216	216
q3	2195	2665	2331	2331
q4	1362	1862	1354	1354
q5	4463	4622	4602	4602
q6	218	171	123	123
q7	2037	1984	1780	1780
q8	2692	2527	2708	2527
q9	7371	7360	7256	7256
q10	3074	3305	2822	2822
q11	570	575	583	575
q12	691	774	631	631
q13	3491	4013	3400	3400
q14	325	309	287	287
q15	529	463	473	463
q16	470	509	440	440
q17	1200	1635	1404	1404
q18	7893	7688	7686	7686
q19	787	784	780	780
q20	1891	1966	1788	1788
q21	4880	4353	4359	4353
q22	1082	1091	980	980
Total cold run time: 52692 ms
Total hot run time: 50893 ms

@doris-robot

TPC-DS: Total hot run time: 187424 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit c04b77a2d1aed59b4f164ef452f8deb7cc06adc2, data reload: false

query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
query1	1038	383	396	383
query2	6513	1735	1729	1729
query3	6748	220	220	220
query4	26340	23693	23025	23025
query5	4365	600	488	488
query6	320	219	205	205
query7	4628	505	300	300
query8	269	245	215	215
query9	8554	2857	2826	2826
query10	463	327	298	298
query11	15953	14931	14924	14924
query12	153	111	107	107
query13	1657	507	397	397
query14	8786	5867	5873	5867
query15	203	188	166	166
query16	7148	648	415	415
query17	1195	698	578	578
query18	1989	394	305	305
query19	187	181	165	165
query20	118	124	117	117
query21	215	118	101	101
query22	4116	4136	4061	4061
query23	34101	33266	33142	33142
query24	8120	2336	2358	2336
query25	553	509	441	441
query26	1243	273	158	158
query27	2753	511	355	355
query28	4388	2210	2168	2168
query29	797	602	486	486
query30	288	220	188	188
query31	930	842	779	779
query32	85	77	76	76
query33	551	388	337	337
query34	788	839	572	572
query35	808	807	726	726
query36	973	983	901	901
query37	126	99	83	83
query38	4105	4147	4079	4079
query39	1484	1419	1435	1419
query40	226	124	109	109
query41	61	59	53	53
query42	119	108	111	108
query43	511	496	465	465
query44	1314	851	836	836
query45	175	173	166	166
query46	839	1007	637	637
query47	1816	1819	1725	1725
query48	377	429	297	297
query49	746	490	381	381
query50	702	679	408	408
query51	5469	5687	5589	5589
query52	112	108	103	103
query53	228	254	188	188
query54	598	589	523	523
query55	86	85	86	85
query56	323	324	307	307
query57	1202	1204	1122	1122
query58	283	269	268	268
query59	2664	2765	2514	2514
query60	350	342	320	320
query61	128	124	151	124
query62	801	721	622	622
query63	225	187	194	187
query64	4353	1007	726	726
query65	4235	4185	4176	4176
query66	1177	412	332	332
query67	15848	15620	15718	15620
query68	7896	902	553	553
query69	470	324	280	280
query70	1179	1106	1147	1106
query71	458	343	311	311
query72	5579	4866	5041	4866
query73	728	693	353	353
query74	8911	9137	8665	8665
query75	3772	3113	2703	2703
query76	3475	1172	722	722
query77	795	381	325	325
query78	10069	10059	9296	9296
query79	2616	785	607	607
query80	592	524	477	477
query81	503	254	219	219
query82	486	137	106	106
query83	250	253	253	253
query84	264	100	91	91
query85	852	376	325	325
query86	397	312	323	312
query87	4418	4441	4235	4235
query88	3667	2220	2199	2199
query89	419	312	277	277
query90	1872	215	218	215
query91	136	137	109	109
query92	88	71	65	65
query93	2004	966	627	627
query94	647	396	296	296
query95	400	309	303	303
query96	477	561	273	273
query97	2736	2815	2605	2605
query98	247	215	210	210
query99	1314	1411	1311	1311
Total cold run time: 275736 ms
Total hot run time: 187424 ms

@doris-robot

ClickBench: Total hot run time: 33.38 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit c04b77a2d1aed59b4f164ef452f8deb7cc06adc2, data reload: false

query	cold run (s)	hot run 1 (s)	hot run 2 (s)
query1	0.03	0.03	0.04
query2	0.11	0.06	0.06
query3	0.29	0.06	0.08
query4	1.59	0.08	0.08
query5	0.42	0.40	0.40
query6	1.18	0.66	0.65
query7	0.03	0.01	0.01
query8	0.06	0.05	0.05
query9	0.65	0.53	0.53
query10	0.58	0.59	0.58
query11	0.25	0.13	0.13
query12	0.25	0.13	0.13
query13	0.64	0.63	0.63
query14	0.81	0.83	0.88
query15	0.99	0.91	0.88
query16	0.38	0.38	0.39
query17	1.09	1.08	1.08
query18	0.22	0.21	0.22
query19	2.04	1.84	1.91
query20	0.01	0.02	0.02
query21	15.35	1.01	0.67
query22	0.94	1.14	0.99
query23	14.69	1.47	0.77
query24	5.46	0.57	0.30
query25	0.17	0.10	0.08
query26	0.56	0.22	0.18
query27	0.09	0.08	0.09
query28	10.97	1.19	0.57
query29	12.60	4.08	3.38
query30	3.12	3.07	2.96
query31	2.80	0.62	0.43
query32	3.24	0.60	0.50
query33	3.15	3.12	3.20
query34	16.35	5.44	4.71
query35	4.82	4.89	4.86
query36	0.66	0.52	0.51
query37	0.21	0.18	0.18
query38	0.17	0.16	0.16
query39	0.05	0.04	0.04
query40	0.20	0.17	0.17
query41	0.10	0.06	0.05
query42	0.06	0.05	0.05
query43	0.05	0.04	0.04
Total cold run time: 107.43 s
Total hot run time: 33.38 s

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 7.22% (7/97) 🎉
Increment coverage report
Complete coverage report

new String[0], partition.getPartitionValues());
hudiSplit.setTableFormatType(TableFormatType.HUDI);
if (sessionVariable.isEnableRuntimeFilterPartitionPrune()) {
hudiSplit.setHudiPartitionValues(HudiUtils.getPartitionInfoMap(hmsTable, partition));
Contributor

HudiUtils.getPartitionInfoMap(hmsTable, partition) should be called only once per partition.

Contributor

Is this still unresolved?
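
One way to address the review comment above is to hoist the extraction out of the per-split loop, so that it runs once per partition and every split of that partition reuses the result. The sketch below is an assumption about the shape of such a fix, not the change that was merged; PartitionInfoOnceSketch, attach, and the extractor function (a stand-in for HudiUtils.getPartitionInfoMap) are hypothetical.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch; the extractor stands in for HudiUtils.getPartitionInfoMap.
final class PartitionInfoOnceSketch {

    // filesByPartition: partition name -> files (one split per file in this sketch)
    // Returns file -> partition info map, sharing one map instance per partition.
    static Map<String, Map<String, String>> attach(
            Map<String, List<String>> filesByPartition,
            Function<String, Map<String, String>> extractor) {
        Map<String, Map<String, String>> infoBySplit = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> partition : filesByPartition.entrySet()) {
            // Extract once per partition, outside the per-file loop.
            Map<String, String> info =
                    Collections.unmodifiableMap(extractor.apply(partition.getKey()));
            for (String file : partition.getValue()) {
                infoBySplit.put(file, info);
            }
        }
        return infoBySplit;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        Map<String, Map<String, String>> out = attach(
                Map.of("dt=2025-07-16", List.of("a.parquet", "b.parquet", "c.parquet")),
                name -> {
                    calls[0]++;
                    return Map.of("dt", name.substring(name.indexOf('=') + 1));
                });
        // prints "3 splits, 1 extraction call(s)"
        System.out.println(out.size() + " splits, " + calls[0] + " extraction call(s)");
    }
}
```

Wrapping the shared map as unmodifiable guards against one split accidentally mutating the values seen by the others.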

@suxiaogang223
Contributor Author

run buildall

@hello-stephen
Contributor

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 80.42% (1302/1619)
Line Coverage 65.78% (21799/33140)
Region Coverage 67.03% (10944/16328)
Branch Coverage 56.60% (5760/10176)

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 33.33% (16/48) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 57.55% (15936/27690)
Line Coverage 46.35% (143271/309140)
Region Coverage 35.77% (107943/301733)
Branch Coverage 38.30% (47643/124395)

@suxiaogang223 suxiaogang223 force-pushed the rf_partition_pruning_data_lake branch from ff87bdb to 511964b July 24, 2025 14:19
@suxiaogang223
Contributor Author

run buildall

@doris-robot

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 80.42% (1302/1619)
Line Coverage 65.78% (21801/33140)
Region Coverage 67.11% (10958/16328)
Branch Coverage 56.65% (5765/10176)

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 9.43% (10/106) 🎉
Increment coverage report
Complete coverage report

@doris-robot

TPC-H: Total hot run time: 33900 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 511964bf10ea517185f81405d7b06bafb3857acd, data reload: false

------ Round 1 ----------------------------------
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	17654	5272	5100	5100
q2	1922	279	180	180
q3	10628	1313	677	677
q4	10305	1005	509	509
q5	8460	2426	2366	2366
q6	182	158	128	128
q7	909	754	588	588
q8	9315	1334	1058	1058
q9	7087	5168	5109	5109
q10	6887	2393	1962	1962
q11	477	279	282	279
q12	341	350	217	217
q13	17770	3714	3062	3062
q14	230	239	210	210
q15	550	486	477	477
q16	428	422	378	378
q17	596	872	382	382
q18	7687	7179	7123	7123
q19	1214	946	549	549
q20	354	353	229	229
q21	3922	3215	2333	2333
q22	1070	1049	984	984
Total cold run time: 107988 ms
Total hot run time: 33900 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	5146	5108	5098	5098
q2	236	315	217	217
q3	2164	2663	2315	2315
q4	1411	1835	1342	1342
q5	4363	4487	4495	4487
q6	213	170	125	125
q7	2014	1960	1798	1798
q8	2679	2474	2508	2474
q9	7339	7241	7221	7221
q10	3108	3281	2882	2882
q11	598	503	507	503
q12	688	792	606	606
q13	3648	4008	3467	3467
q14	276	303	272	272
q15	528	476	484	476
q16	443	464	430	430
q17	1213	1637	1604	1604
q18	7792	7838	7517	7517
q19	783	780	854	780
q20	1949	1941	1818	1818
q21	4776	4264	4365	4264
q22	1085	1051	975	975
Total cold run time: 52452 ms
Total hot run time: 50671 ms

@doris-robot

TPC-DS: Total hot run time: 186951 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 511964bf10ea517185f81405d7b06bafb3857acd, data reload: false

query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
query1	1000	407	401	401
query2	6513	1685	1632	1632
query3	6736	226	223	223
query4	26558	23302	23092	23092
query5	4323	614	454	454
query6	299	210	209	209
query7	4616	496	285	285
query8	263	233	216	216
query9	8657	2893	2910	2893
query10	487	328	281	281
query11	15966	15070	14766	14766
query12	159	117	110	110
query13	1664	561	437	437
query14	9265	5848	5929	5848
query15	216	201	173	173
query16	7423	627	436	436
query17	1203	742	601	601
query18	2019	419	328	328
query19	198	195	163	163
query20	124	113	116	113
query21	213	130	111	111
query22	4187	4137	4042	4042
query23	33981	33038	32829	32829
query24	8130	2379	2403	2379
query25	536	459	392	392
query26	1236	313	155	155
query27	2718	507	358	358
query28	4360	2239	2207	2207
query29	759	557	438	438
query30	283	238	191	191
query31	903	835	753	753
query32	81	79	74	74
query33	559	366	329	329
query34	790	835	512	512
query35	798	832	752	752
query36	959	1018	921	921
query37	125	101	87	87
query38	4217	4117	4078	4078
query39	1459	1420	1405	1405
query40	241	127	115	115
query41	60	54	54	54
query42	123	115	110	110
query43	490	490	478	478
query44	1315	868	852	852
query45	177	164	164	164
query46	835	1006	630	630
query47	1769	1832	1741	1741
query48	381	427	314	314
query49	744	482	384	384
query50	620	703	398	398
query51	5403	5511	5511	5511
query52	115	110	101	101
query53	227	255	187	187
query54	603	610	548	548
query55	90	86	82	82
query56	319	314	303	303
query57	1178	1199	1104	1104
query58	272	262	272	262
query59	2612	2685	2508	2508
query60	348	324	321	321
query61	126	122	123	122
query62	804	703	660	660
query63	217	193	228	193
query64	4349	1023	663	663
query65	4266	4167	4149	4149
query66	1124	411	335	335
query67	15805	15783	15553	15553
query68	8651	906	564	564
query69	482	369	291	291
query70	1258	1104	1053	1053
query71	454	334	317	317
query72	5313	4697	4599	4599
query73	709	564	356	356
query74	8898	9214	8992	8992
query75	4023	3075	2605	2605
query76	3689	1136	718	718
query77	807	379	330	330
query78	9908	9951	9225	9225
query79	2810	830	607	607
query80	649	525	534	525
query81	473	256	223	223
query82	469	141	115	115
query83	290	251	233	233
query84	284	111	79	79
query85	801	367	330	330
query86	342	321	307	307
query87	4362	4509	4376	4376
query88	3083	2290	2268	2268
query89	402	310	282	282
query90	1948	217	217	217
query91	143	135	112	112
query92	89	69	62	62
query93	1583	959	627	627
query94	679	397	305	305
query95	450	316	322	316
query96	486	579	283	283
query97	2702	2697	2591	2591
query98	244	214	217	214
query99	1444	1400	1313	1313
Total cold run time: 276521 ms
Total hot run time: 186951 ms

@doris-robot

ClickBench: Total hot run time: 33.55 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 511964bf10ea517185f81405d7b06bafb3857acd, data reload: false

query	cold run (s)	hot run 1 (s)	hot run 2 (s)
query1	0.04	0.04	0.04
query2	0.12	0.05	0.05
query3	0.29	0.06	0.06
query4	1.60	0.09	0.09
query5	0.41	0.39	0.40
query6	1.15	0.65	0.64
query7	0.03	0.02	0.01
query8	0.07	0.05	0.05
query9	0.66	0.52	0.53
query10	0.59	0.57	0.57
query11	0.25	0.13	0.13
query12	0.25	0.14	0.14
query13	0.65	0.64	0.61
query14	0.81	0.84	0.84
query15	0.97	0.89	0.90
query16	0.39	0.38	0.38
query17	1.06	1.08	1.05
query18	0.24	0.22	0.23
query19	2.03	1.85	1.86
query20	0.02	0.01	0.01
query21	15.37	0.97	0.67
query22	0.93	1.00	0.93
query23	14.72	1.52	0.76
query24	5.16	0.56	0.32
query25	0.17	0.10	0.09
query26	0.55	0.22	0.18
query27	0.09	0.10	0.09
query28	11.06	1.18	0.58
query29	12.59	4.16	3.45
query30	3.05	3.04	3.05
query31	2.82	0.61	0.42
query32	3.25	0.61	0.50
query33	3.16	3.16	3.16
query34	16.52	5.45	4.76
query35	4.81	4.86	4.96
query36	0.64	0.53	0.50
query37	0.20	0.19	0.17
query38	0.18	0.16	0.17
query39	0.05	0.05	0.05
query40	0.20	0.17	0.17
query41	0.11	0.05	0.05
query42	0.07	0.06	0.06
query43	0.07	0.04	0.05
Total cold run time: 107.4 s
Total hot run time: 33.55 s

@suxiaogang223
Contributor Author

run buildall

@hello-stephen
Contributor

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 80.42% (1302/1619)
Line Coverage 65.77% (21798/33142)
Region Coverage 67.03% (10945/16328)
Branch Coverage 56.59% (5759/10176)

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 9.43% (10/106) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 18.87% (10/53) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 58.19% (16339/28080)
Line Coverage 47.09% (147476/313165)
Region Coverage 36.11% (110375/305638)
Branch Coverage 38.95% (48984/125773)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 18.87% (10/53) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 81.26% (22396/27562)
Line Coverage 74.10% (231731/312741)
Region Coverage 61.50% (192752/313405)
Branch Coverage 65.42% (83232/127221)

Contributor

@morningman morningman left a comment

LGTM

@github-actions github-actions bot added the "approved" label (Indicates a PR has been approved by one committer.) Aug 7, 2025
@github-actions
Contributor

github-actions bot commented Aug 7, 2025

PR approved by at least one committer and no changes requested.

@morningman morningman merged commit 33f96c4 into apache:master Aug 7, 2025
27 of 30 checks passed
suxiaogang223 added a commit to suxiaogang223/doris that referenced this pull request Aug 20, 2025
…ke Tables (apache#53399)
suxiaogang223 added a commit to suxiaogang223/doris that referenced this pull request Aug 22, 2025
…ke Tables (apache#53399)
suxiaogang223 added a commit to suxiaogang223/doris that referenced this pull request Aug 25, 2025
suxiaogang223 added a commit to suxiaogang223/doris that referenced this pull request Aug 25, 2025
…ke Tables (apache#53399)
morrySnow pushed a commit that referenced this pull request Aug 26, 2025
@suxiaogang223 suxiaogang223 deleted the rf_partition_pruning_data_lake branch September 23, 2025 03:18
dataroaring pushed a commit that referenced this pull request Nov 6, 2025
### What problem does this PR solve?

Issue Number: close #xxx

Related PR: ##53399
github-actions bot pushed a commit that referenced this pull request Nov 6, 2025
wyxxxcat pushed a commit to wyxxxcat/doris that referenced this pull request Nov 18, 2025
0AyanamiRei added a commit to 0AyanamiRei/doris that referenced this pull request Nov 25, 2025
zhangstar333 added a commit that referenced this pull request Jan 9, 2026
… type (#59564)

### What problem does this PR solve?
Related PR: #53399
Problem Summary:
The serializePartitionValue function returns a String value.
For the binary type, however, encoding the value as a UTF-8 String corrupts the data,
so it no longer matches the original data.
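
The corruption described above can be reproduced with plain JDK calls: bytes that are not valid UTF-8 are replaced during decoding, so the round trip through String is lossy. The sketch below demonstrates this; the Base64 variant is only one possible lossless text encoding, shown as an assumption and not necessarily what #59564 does. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical demonstration; not the Doris serializePartitionValue code.
final class BinaryPartitionValueSketch {

    // Decodes the bytes as UTF-8 text and encodes them back; malformed
    // sequences are replaced with U+FFFD, so arbitrary binary data is altered.
    static boolean survivesUtf8RoundTrip(byte[] raw) {
        String text = new String(raw, StandardCharsets.UTF_8);
        return Arrays.equals(raw, text.getBytes(StandardCharsets.UTF_8));
    }

    // Assumption: Base64 as one lossless way to carry bytes in a String.
    static boolean survivesBase64RoundTrip(byte[] raw) {
        String text = Base64.getEncoder().encodeToString(raw);
        return Arrays.equals(raw, Base64.getDecoder().decode(text));
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0xFF, (byte) 0xFE, 0x00, 0x41};
        System.out.println("UTF-8 round trip intact:  " + survivesUtf8RoundTrip(raw));   // false
        System.out.println("Base64 round trip intact: " + survivesBase64RoundTrip(raw)); // true
    }
}
```

Values that happen to be valid UTF-8 pass the first check, which is why this kind of bug can stay hidden until a partition key contains arbitrary bytes.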
zhangstar333 added a commit to zhangstar333/incubator-doris that referenced this pull request Jan 9, 2026
… type (apache#59564)
zzzxl1993 pushed a commit to zzzxl1993/doris that referenced this pull request Jan 13, 2026
… type (apache#59564)
Labels

approved (Indicates a PR has been approved by one committer.), dev/3.1.0-merged, reviewed

6 participants