@hubgeter
Contributor

hubgeter commented Nov 6, 2025

What problem does this PR solve?

Problem Summary:

  1. Previously, page index filtering could only handle SQL WHERE conditions composed solely of AND; this PR extends it to conditions that contain OR.
  2. Because the topn runtime filter (RF) is updated dynamically while the query runs, this PR defers applying the topn RF min-max filter until the row group reader is created.
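
To illustrate the second point, here is a minimal C++ sketch of why the check is deferred. It is not the Doris implementation: `TopnThreshold`, `RowGroupStat`, `can_skip_row_group`, and `scan_row_groups` are hypothetical names, and the tightening of the bound is a crude simulation of a top-N heap for `ORDER BY col ASC LIMIT n`.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical stand-in for a top-N runtime filter (ORDER BY col ASC LIMIT n):
// rows with col > upper_bound can no longer reach the result.
struct TopnThreshold {
    std::atomic<int64_t> upper_bound{std::numeric_limits<int64_t>::max()};
};

// Min-max statistics of one row group for the ORDER BY column.
struct RowGroupStat {
    int64_t min_value;
    int64_t max_value;
};

// Evaluated right before a row group reader would be created, so it sees the
// latest bound rather than the one known at scan initialization.
inline bool can_skip_row_group(const RowGroupStat& stat, const TopnThreshold& topn) {
    assert(stat.min_value <= stat.max_value);
    return stat.min_value > topn.upper_bound.load();
}

// Returns the indices of the row groups that are actually read.
inline std::vector<std::size_t> scan_row_groups(const std::vector<RowGroupStat>& groups,
                                                TopnThreshold& topn) {
    std::vector<std::size_t> read;
    for (std::size_t i = 0; i < groups.size(); ++i) {
        if (can_skip_row_group(groups[i], topn)) {
            continue;
        }
        read.push_back(i);
        // Simulation only: assume every row group holds at least n rows, so after
        // reading it the n-th smallest value is at most the group's max.
        int64_t current = topn.upper_bound.load();
        topn.upper_bound.store(std::min(current, groups[i].max_value));
    }
    return read;
}
```

With row groups whose (min, max) are (0, 10), (5, 20), (50, 60), and (11, 30), this sketch reads the first two and skips the last two. A check done once up front, while the bound is still unset, would skip none of them.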

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Nov 6, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@hubgeter
Contributor Author

hubgeter commented Nov 6, 2025

run buildall

@doris-robot

TPC-DS: Total hot run time: 190261 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 198881516cadd419f904b2cc9dd10ec8d00daf55, data reload: false

query1	1019	407	422	407
query2	6574	1733	1702	1702
query3	6766	237	225	225
query4	26611	23342	23432	23342
query5	4852	635	487	487
query6	336	241	229	229
query7	4654	497	305	305
query8	299	259	249	249
query9	8718	2617	2605	2605
query10	520	349	305	305
query11	15481	15154	15572	15154
query12	196	120	126	120
query13	1850	601	495	495
query14	11836	9942	9721	9721
query15	253	206	195	195
query16	7841	735	535	535
query17	2065	805	627	627
query18	2047	458	347	347
query19	254	236	192	192
query20	141	139	130	130
query21	216	140	113	113
query22	4334	4703	4389	4389
query23	34857	33962	33550	33550
query24	8495	2465	2416	2416
query25	567	499	467	467
query26	1228	288	167	167
query27	2690	496	359	359
query28	4358	2205	2223	2205
query29	829	646	559	559
query30	314	213	195	195
query31	925	826	710	710
query32	89	75	71	71
query33	602	386	339	339
query34	815	862	531	531
query35	824	831	760	760
query36	970	1010	897	897
query37	129	113	91	91
query38	3564	3576	3487	3487
query39	1496	1424	1421	1421
query40	226	140	124	124
query41	78	81	63	63
query42	125	109	110	109
query43	496	499	454	454
query44	1247	752	745	745
query45	190	178	175	175
query46	891	987	657	657
query47	1745	1761	1738	1738
query48	392	434	329	329
query49	803	538	436	436
query50	667	697	417	417
query51	3849	3918	3915	3915
query52	110	110	106	106
query53	250	276	211	211
query54	332	306	288	288
query55	88	88	85	85
query56	342	346	330	330
query57	1177	1175	1109	1109
query58	301	291	304	291
query59	2529	2672	2470	2470
query60	342	339	329	329
query61	158	155	154	154
query62	823	752	670	670
query63	236	201	196	196
query64	4454	1152	862	862
query65	4058	3977	3948	3948
query66	1089	435	346	346
query67	15419	15382	15055	15055
query68	8663	880	596	596
query69	493	323	289	289
query70	1364	1261	1282	1261
query71	492	344	323	323
query72	5833	4914	4888	4888
query73	701	591	356	356
query74	8841	9001	8886	8886
query75	4106	3374	2866	2866
query76	3852	1162	777	777
query77	823	427	337	337
query78	9440	9894	8977	8977
query79	1940	801	611	611
query80	683	565	494	494
query81	483	253	225	225
query82	426	154	129	129
query83	270	258	244	244
query84	242	108	97	97
query85	941	496	441	441
query86	358	318	287	287
query87	3697	3718	3652	3652
query88	3168	2222	2199	2199
query89	387	328	288	288
query90	1980	214	216	214
query91	165	166	133	133
query92	80	72	59	59
query93	1160	987	650	650
query94	685	433	329	329
query95	389	307	298	298
query96	484	582	279	279
query97	2940	2962	2869	2869
query98	243	213	208	208
query99	1689	1377	1288	1288
Total cold run time: 279050 ms
Total hot run time: 190261 ms

@doris-robot

ClickBench: Total hot run time: 27.44 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 198881516cadd419f904b2cc9dd10ec8d00daf55, data reload: false

query1	0.05	0.05	0.05
query2	0.10	0.05	0.05
query3	0.25	0.08	0.08
query4	1.61	0.11	0.12
query5	0.28	0.26	0.25
query6	1.17	0.63	0.66
query7	0.03	0.03	0.03
query8	0.05	0.04	0.04
query9	0.60	0.53	0.52
query10	0.58	0.57	0.58
query11	0.16	0.11	0.11
query12	0.15	0.11	0.12
query13	0.62	0.62	0.60
query14	1.00	0.99	0.99
query15	0.84	0.84	0.83
query16	0.39	0.40	0.38
query17	1.02	1.04	1.02
query18	0.22	0.20	0.20
query19	1.90	1.79	1.81
query20	0.02	0.01	0.02
query21	15.45	0.18	0.12
query22	5.09	0.07	0.05
query23	15.66	0.25	0.09
query24	2.22	0.50	1.46
query25	0.10	0.06	0.06
query26	0.14	0.14	0.13
query27	0.06	0.06	0.05
query28	5.86	1.15	0.93
query29	12.56	3.91	3.19
query30	0.29	0.14	0.11
query31	2.81	0.60	0.39
query32	3.23	0.56	0.47
query33	3.09	3.04	3.06
query34	15.91	5.16	4.57
query35	4.62	4.56	4.52
query36	0.66	0.50	0.50
query37	0.10	0.06	0.07
query38	0.06	0.04	0.04
query39	0.04	0.03	0.03
query40	0.17	0.14	0.14
query41	0.08	0.02	0.03
query42	0.03	0.04	0.03
query43	0.04	0.03	0.03
Total cold run time: 99.31 s
Total hot run time: 27.44 s

@hubgeter
Contributor Author

hubgeter commented Nov 7, 2025

run buildall

@doris-robot

TPC-DS: Total hot run time: 189821 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 430ea3c5ab158b515a310da2e7f0a24899feb573, data reload: false

query1	1044	402	395	395
query2	6599	1685	1661	1661
query3	6750	216	221	216
query4	26284	23510	23345	23345
query5	5577	603	463	463
query6	325	235	206	206
query7	4642	488	291	291
query8	292	247	244	244
query9	8698	2579	2578	2578
query10	518	335	281	281
query11	15852	14995	14969	14969
query12	179	116	112	112
query13	1688	542	428	428
query14	12438	9159	9146	9146
query15	199	184	172	172
query16	7651	707	509	509
query17	1584	751	625	625
query18	2048	521	363	363
query19	228	232	190	190
query20	141	135	130	130
query21	236	147	123	123
query22	4680	4632	4651	4632
query23	34527	33885	33760	33760
query24	8188	2524	2473	2473
query25	589	545	482	482
query26	1234	275	190	190
query27	3504	515	371	371
query28	4348	2246	2218	2218
query29	902	657	515	515
query30	340	240	203	203
query31	960	824	782	782
query32	101	78	78	78
query33	608	415	346	346
query34	799	915	561	561
query35	840	880	805	805
query36	996	1046	959	959
query37	134	120	94	94
query38	3656	3647	3559	3559
query39	1466	1434	1401	1401
query40	227	134	120	120
query41	65	62	65	62
query42	123	111	111	111
query43	495	487	480	480
query44	1206	754	747	747
query45	190	181	179	179
query46	940	990	633	633
query47	1760	1826	1758	1758
query48	386	432	325	325
query49	790	496	421	421
query50	640	685	403	403
query51	3844	3916	3939	3916
query52	109	107	101	101
query53	235	267	197	197
query54	315	297	280	280
query55	90	87	81	81
query56	324	329	321	321
query57	1190	1188	1143	1143
query58	301	281	293	281
query59	2582	2668	2504	2504
query60	350	356	331	331
query61	197	205	182	182
query62	788	747	656	656
query63	219	189	194	189
query64	4434	1161	849	849
query65	4066	3941	3937	3937
query66	1038	424	318	318
query67	15542	15195	15044	15044
query68	8562	934	599	599
query69	518	319	274	274
query70	1323	1230	1220	1220
query71	519	320	324	320
query72	6040	4978	5025	4978
query73	694	630	359	359
query74	8851	9080	8884	8884
query75	4134	3352	2818	2818
query76	3865	1147	729	729
query77	823	389	303	303
query78	9610	9666	8920	8920
query79	2022	863	590	590
query80	642	565	506	506
query81	484	258	223	223
query82	461	154	127	127
query83	264	256	255	255
query84	247	116	96	96
query85	881	490	430	430
query86	336	302	291	291
query87	3776	3756	3614	3614
query88	3754	2230	2238	2230
query89	391	332	297	297
query90	2046	224	223	223
query91	165	164	137	137
query92	88	67	63	63
query93	1629	1007	641	641
query94	704	465	349	349
query95	388	327	303	303
query96	489	585	278	278
query97	2934	2988	2837	2837
query98	244	209	205	205
query99	1447	1410	1277	1277
Total cold run time: 281622 ms
Total hot run time: 189821 ms

@doris-robot

ClickBench: Total hot run time: 27.69 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 430ea3c5ab158b515a310da2e7f0a24899feb573, data reload: false

query1	0.05	0.06	0.05
query2	0.09	0.04	0.04
query3	0.25	0.08	0.08
query4	1.60	0.12	0.12
query5	0.27	0.27	0.26
query6	1.21	0.66	0.64
query7	0.03	0.02	0.02
query8	0.06	0.04	0.04
query9	0.61	0.52	0.52
query10	0.58	0.57	0.56
query11	0.17	0.11	0.11
query12	0.14	0.11	0.13
query13	0.62	0.61	0.60
query14	1.01	1.00	1.00
query15	0.85	0.84	0.82
query16	0.38	0.38	0.40
query17	1.01	1.00	1.04
query18	0.21	0.20	0.19
query19	1.94	1.83	1.83
query20	0.02	0.01	0.01
query21	15.45	0.17	0.13
query22	5.11	0.07	0.04
query23	15.70	0.27	0.10
query24	3.01	1.21	0.72
query25	0.08	0.05	0.06
query26	0.14	0.14	0.13
query27	0.05	0.05	0.05
query28	4.34	1.13	0.94
query29	12.56	3.87	3.24
query30	0.28	0.13	0.12
query31	2.82	0.57	0.37
query32	3.24	0.55	0.47
query33	3.11	3.09	3.03
query34	15.77	5.18	4.49
query35	4.63	4.60	4.57
query36	0.68	0.50	0.50
query37	0.10	0.07	0.06
query38	0.07	0.04	0.04
query39	0.04	0.03	0.03
query40	0.18	0.15	0.15
query41	0.09	0.03	0.02
query42	0.04	0.03	0.03
query43	0.04	0.04	0.03
Total cold run time: 98.63 s
Total hot run time: 27.69 s

@hubgeter
Contributor Author

hubgeter commented Nov 8, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 34442 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit ff4fcbd21e33d5f5396840b3663fd2872ab23222, data reload: false

------ Round 1 ----------------------------------
q1	17628	5159	5048	5048
q2	2022	315	233	233
q3	10251	1308	734	734
q4	10240	954	374	374
q5	7502	2497	2308	2308
q6	181	171	139	139
q7	935	763	626	626
q8	9358	1349	1145	1145
q9	7049	5174	5191	5174
q10	6915	2244	1800	1800
q11	497	302	286	286
q12	373	387	231	231
q13	17768	3636	3043	3043
q14	236	237	210	210
q15	590	523	508	508
q16	1073	998	941	941
q17	599	847	407	407
q18	8035	7160	7118	7118
q19	1094	964	570	570
q20	356	356	239	239
q21	3863	2502	2332	2332
q22	1082	1043	976	976
Total cold run time: 107647 ms
Total hot run time: 34442 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5124	5153	5105	5105
q2	246	331	231	231
q3	2209	2705	2290	2290
q4	1359	1777	1399	1399
q5	4254	4524	4484	4484
q6	210	171	143	143
q7	2072	1971	1893	1893
q8	2630	2646	2559	2559
q9	7319	7431	7390	7390
q10	3090	3349	2815	2815
q11	606	556	506	506
q12	744	802	656	656
q13	3494	4005	3362	3362
q14	311	323	295	295
q15	555	518	520	518
q16	1055	1121	1060	1060
q17	1225	1591	1484	1484
q18	8089	7664	7178	7178
q19	778	755	917	755
q20	1902	2049	1852	1852
q21	4766	4485	4292	4292
q22	1078	1044	1016	1016
Total cold run time: 53116 ms
Total hot run time: 51283 ms

@doris-robot

TPC-DS: Total hot run time: 188197 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ff4fcbd21e33d5f5396840b3663fd2872ab23222, data reload: false

query1	1075	425	390	390
query2	6596	1694	1715	1694
query3	6747	224	218	218
query4	27030	23593	23236	23236
query5	4479	638	480	480
query6	331	248	229	229
query7	4638	502	300	300
query8	316	253	253	253
query9	8704	2635	2603	2603
query10	489	325	289	289
query11	15421	15249	15075	15075
query12	184	121	123	121
query13	1684	551	448	448
query14	10441	9463	9281	9281
query15	207	231	177	177
query16	7318	659	528	528
query17	1238	766	619	619
query18	1987	420	321	321
query19	208	200	190	190
query20	131	135	125	125
query21	217	145	115	115
query22	4093	4134	4050	4050
query23	34097	32989	33007	32989
query24	8429	2432	2437	2432
query25	613	538	448	448
query26	1240	276	155	155
query27	2768	491	352	352
query28	4389	2238	2191	2191
query29	862	604	482	482
query30	299	224	193	193
query31	916	797	710	710
query32	83	79	72	72
query33	592	371	338	338
query34	800	845	519	519
query35	814	845	767	767
query36	948	988	925	925
query37	128	113	91	91
query38	3557	3507	3467	3467
query39	1502	1430	1414	1414
query40	224	132	117	117
query41	62	59	64	59
query42	136	115	113	113
query43	493	515	468	468
query44	1253	758	754	754
query45	183	181	176	176
query46	882	980	654	654
query47	1757	1769	1718	1718
query48	390	425	328	328
query49	788	531	422	422
query50	651	677	407	407
query51	3886	3922	3882	3882
query52	113	110	104	104
query53	249	276	194	194
query54	313	291	282	282
query55	87	90	85	85
query56	328	314	311	311
query57	1182	1210	1135	1135
query58	292	272	275	272
query59	2534	2734	2607	2607
query60	344	350	348	348
query61	199	188	193	188
query62	792	720	688	688
query63	240	196	205	196
query64	4708	1213	874	874
query65	4050	3959	3963	3959
query66	1196	465	323	323
query67	15455	15185	14896	14896
query68	7048	863	603	603
query69	502	328	283	283
query70	1341	1273	1273	1273
query71	435	337	311	311
query72	5873	5000	4859	4859
query73	627	571	359	359
query74	8882	9063	8666	8666
query75	3298	3327	2813	2813
query76	3245	1166	749	749
query77	512	411	325	325
query78	9641	9670	8961	8961
query79	1398	837	600	600
query80	1555	589	497	497
query81	544	272	233	233
query82	398	162	131	131
query83	275	271	262	262
query84	255	112	96	96
query85	942	489	457	457
query86	379	299	302	299
query87	3726	3743	3622	3622
query88	2898	2303	2260	2260
query89	382	332	291	291
query90	1761	222	226	222
query91	176	171	143	143
query92	74	71	64	64
query93	1109	965	644	644
query94	739	462	350	350
query95	404	322	316	316
query96	486	572	293	293
query97	2907	2962	2849	2849
query98	231	214	209	209
query99	1360	1435	1310	1310
Total cold run time: 271761 ms
Total hot run time: 188197 ms

@doris-robot

ClickBench: Total hot run time: 27.22 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit ff4fcbd21e33d5f5396840b3663fd2872ab23222, data reload: false

query1	0.05	0.05	0.05
query2	0.09	0.06	0.05
query3	0.26	0.08	0.09
query4	1.60	0.12	0.11
query5	0.28	0.25	0.24
query6	1.20	0.64	0.64
query7	0.04	0.03	0.02
query8	0.06	0.04	0.05
query9	0.61	0.55	0.51
query10	0.58	0.58	0.57
query11	0.16	0.11	0.11
query12	0.15	0.12	0.12
query13	0.61	0.60	0.61
query14	1.01	1.01	0.99
query15	0.85	0.83	0.84
query16	0.38	0.40	0.40
query17	1.03	0.99	1.00
query18	0.22	0.21	0.20
query19	1.95	1.91	1.80
query20	0.02	0.02	0.01
query21	15.46	0.19	0.13
query22	5.04	0.07	0.05
query23	15.64	0.27	0.10
query24	3.41	0.61	0.32
query25	0.08	0.06	0.05
query26	0.14	0.13	0.14
query27	0.06	0.05	0.06
query28	4.61	1.12	0.92
query29	12.55	3.93	3.24
query30	0.28	0.13	0.12
query31	2.83	0.61	0.38
query32	3.24	0.55	0.47
query33	3.08	3.02	3.14
query34	15.79	5.23	4.49
query35	4.54	4.57	4.54
query36	0.67	0.51	0.51
query37	0.09	0.06	0.07
query38	0.06	0.05	0.03
query39	0.04	0.03	0.03
query40	0.17	0.14	0.15
query41	0.08	0.04	0.03
query42	0.04	0.03	0.02
query43	0.04	0.04	0.03
Total cold run time: 99.09 s
Total hot run time: 27.22 s

@hubgeter
Contributor Author

hubgeter commented Nov 9, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 34480 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit cd940840bd8937c8423935551b90bd9055113a15, data reload: false

------ Round 1 ----------------------------------
q1	18753	5229	5142	5142
q2	2019	334	214	214
q3	10308	1309	734	734
q4	10245	955	376	376
q5	7458	2416	2365	2365
q6	188	173	141	141
q7	952	770	629	629
q8	9342	1310	1139	1139
q9	6986	5243	5159	5159
q10	6920	2263	1793	1793
q11	507	297	289	289
q12	367	372	228	228
q13	17800	3669	3015	3015
q14	238	243	215	215
q15	587	499	516	499
q16	1003	1000	948	948
q17	591	863	362	362
q18	7704	7106	7188	7106
q19	1285	970	581	581
q20	346	336	222	222
q21	3771	3231	2330	2330
q22	1060	1013	993	993
Total cold run time: 108430 ms
Total hot run time: 34480 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5208	5226	5152	5152
q2	246	329	228	228
q3	2199	2656	2315	2315
q4	1350	1783	1326	1326
q5	4243	4572	4518	4518
q6	226	178	134	134
q7	2020	2025	1910	1910
q8	2702	2639	2562	2562
q9	7383	7300	7410	7300
q10	3036	3313	2779	2779
q11	606	547	519	519
q12	687	805	648	648
q13	3538	3874	3300	3300
q14	294	328	297	297
q15	659	524	509	509
q16	1037	1087	1069	1069
q17	1188	1526	1408	1408
q18	7851	7723	7469	7469
q19	808	784	888	784
q20	1909	1973	1820	1820
q21	4842	4386	4349	4349
q22	1069	1065	1023	1023
Total cold run time: 53101 ms
Total hot run time: 51419 ms

@doris-robot

TPC-DS: Total hot run time: 187984 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit cd940840bd8937c8423935551b90bd9055113a15, data reload: false

query1	1053	411	419	411
query2	6577	1653	1749	1653
query3	6751	229	218	218
query4	26434	23553	23464	23464
query5	4379	604	450	450
query6	343	233	216	216
query7	4648	492	307	307
query8	298	278	247	247
query9	8686	2584	2613	2584
query10	486	326	277	277
query11	15989	15028	14816	14816
query12	178	115	113	113
query13	1684	558	429	429
query14	10977	9257	9240	9240
query15	198	190	179	179
query16	7297	692	514	514
query17	1271	742	622	622
query18	1978	408	311	311
query19	211	193	175	175
query20	134	123	125	123
query21	211	134	109	109
query22	4071	4178	4120	4120
query23	33860	33426	33062	33062
query24	8384	2408	2417	2408
query25	595	518	442	442
query26	1230	269	158	158
query27	2757	493	343	343
query28	4402	2189	2183	2183
query29	822	613	485	485
query30	312	221	201	201
query31	903	793	720	720
query32	83	73	72	72
query33	596	361	325	325
query34	800	862	508	508
query35	820	855	735	735
query36	952	968	887	887
query37	129	105	91	91
query38	3540	3488	3450	3450
query39	1498	1439	1405	1405
query40	225	127	117	117
query41	62	61	60	60
query42	124	117	111	111
query43	481	499	462	462
query44	1214	739	737	737
query45	186	182	174	174
query46	882	981	628	628
query47	1789	1807	1710	1710
query48	404	418	317	317
query49	777	538	419	419
query50	639	690	401	401
query51	3843	3865	3925	3865
query52	111	114	105	105
query53	240	266	192	192
query54	303	290	269	269
query55	88	86	87	86
query56	332	311	323	311
query57	1190	1191	1131	1131
query58	282	271	270	270
query59	2594	2720	2471	2471
query60	343	341	334	334
query61	159	166	190	166
query62	780	735	679	679
query63	230	194	194	194
query64	4552	1167	853	853
query65	4032	3957	3977	3957
query66	1175	438	337	337
query67	15232	15146	15053	15053
query68	4760	904	616	616
query69	494	317	282	282
query70	1318	1294	1281	1281
query71	408	338	311	311
query72	6061	4931	5126	4931
query73	665	591	369	369
query74	8829	8985	8857	8857
query75	3290	3312	2840	2840
query76	3319	1146	723	723
query77	531	429	317	317
query78	9708	9908	8867	8867
query79	1441	834	617	617
query80	821	597	530	530
query81	498	262	232	232
query82	408	160	131	131
query83	283	273	259	259
query84	258	111	96	96
query85	998	488	441	441
query86	319	309	303	303
query87	3719	3687	3562	3562
query88	2820	2272	2232	2232
query89	376	344	291	291
query90	1741	218	233	218
query91	164	174	152	152
query92	73	68	68	68
query93	1119	997	643	643
query94	667	445	340	340
query95	400	330	321	321
query96	487	547	281	281
query97	2922	2979	2871	2871
query98	233	222	213	213
query99	1322	1376	1327	1327
Total cold run time: 268258 ms
Total hot run time: 187984 ms

@doris-robot

ClickBench: Total hot run time: 27.38 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit cd940840bd8937c8423935551b90bd9055113a15, data reload: false

query1	0.06	0.05	0.05
query2	0.10	0.05	0.05
query3	0.25	0.08	0.08
query4	1.62	0.11	0.11
query5	0.28	0.25	0.25
query6	1.15	0.64	0.63
query7	0.03	0.02	0.02
query8	0.05	0.05	0.04
query9	0.60	0.52	0.52
query10	0.58	0.57	0.57
query11	0.16	0.12	0.11
query12	0.15	0.12	0.11
query13	0.61	0.60	0.60
query14	1.00	0.99	1.00
query15	0.84	0.84	0.84
query16	0.38	0.38	0.39
query17	1.05	1.02	1.03
query18	0.22	0.19	0.20
query19	1.84	1.78	1.83
query20	0.02	0.01	0.02
query21	15.43	0.18	0.14
query22	5.08	0.07	0.05
query23	15.68	0.26	0.10
query24	2.78	0.81	0.29
query25	0.08	0.06	0.06
query26	0.14	0.13	0.13
query27	0.07	0.06	0.05
query28	4.15	1.14	0.93
query29	12.54	3.99	3.24
query30	0.28	0.14	0.11
query31	2.82	0.59	0.38
query32	3.23	0.54	0.47
query33	3.04	3.06	3.13
query34	15.80	5.22	4.58
query35	4.59	4.57	4.60
query36	0.67	0.51	0.49
query37	0.09	0.07	0.07
query38	0.07	0.04	0.04
query39	0.04	0.03	0.03
query40	0.17	0.16	0.15
query41	0.08	0.03	0.03
query42	0.04	0.03	0.02
query43	0.04	0.04	0.04
Total cold run time: 97.9 s
Total hot run time: 27.38 s

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 33.33% (145/435) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.77% (18225/34534)
Line Coverage 38.12% (165734/434818)
Region Coverage 33.14% (128965/389190)
Branch Coverage 33.86% (55299/163329)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 40.46% (176/435) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.56% (24296/33950)
Line Coverage 58.09% (253009/435549)
Region Coverage 53.51% (211210/394724)
Branch Coverage 54.83% (90150/164412)

morningman requested a review from Copilot on November 10, 2025, 04:20

Copilot AI left a comment

Pull Request Overview

This PR refactors the Parquet page index filtering implementation by improving the handling of OR predicates and removing the colname_to_value_range parameter from various reader initialization methods. The changes optimize page index filtering to better support complex predicate combinations.

Key Changes:

  • Refactored page index filtering logic to support OR predicates with a new evaluate_and method that works with CachedPageIndexStat and RowRanges
  • Removed the colname_to_value_range parameter from all reader init_reader methods (ORC, Parquet, Iceberg, Paimon, Hudi, Hive, etc.)
  • Introduced RowRanges as a unified structure for representing row ranges to read, replacing the previous vector<RowRange> approach
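
The idea behind supporting OR can be sketched in C++ as follows: each leaf predicate maps page min-max statistics to a set of candidate row ranges, AND intersects those sets, and OR unions them. This is an illustration under assumptions, not the code in this PR: `Range`, `Ranges`, `PageStat`, `pages_less_than`, `pages_greater_than`, `range_union`, and `range_intersection` are hypothetical names, and the real `RowRanges`, `CachedPageIndexStat`, and `evaluate_and` signatures may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Half-open row range [first, second). A sorted, non-overlapping list of them
// stands in for the RowRanges structure.
using Range = std::pair<int64_t, int64_t>;
using Ranges = std::vector<Range>;

// Hypothetical per-page statistics, as a column index would provide them.
struct PageStat {
    int64_t first_row;
    int64_t num_rows;
    int64_t min_value;
    int64_t max_value;
};

// Collects the rows of every page for which `may_match` returns true.
template <typename Pred>
Ranges matching_pages(const std::vector<PageStat>& pages, Pred may_match) {
    Ranges out;
    for (const PageStat& p : pages) {
        assert(p.num_rows > 0);
        if (!may_match(p)) continue;
        int64_t from = p.first_row;
        int64_t to = p.first_row + p.num_rows;
        if (!out.empty() && out.back().second == from) {
            out.back().second = to;  // extend the previous adjacent range
        } else {
            out.emplace_back(from, to);
        }
    }
    return out;
}

// col < bound can only hold on a page whose minimum is below the bound.
inline Ranges pages_less_than(const std::vector<PageStat>& pages, int64_t bound) {
    return matching_pages(pages, [bound](const PageStat& p) { return p.min_value < bound; });
}

// col > bound can only hold on a page whose maximum is above the bound.
inline Ranges pages_greater_than(const std::vector<PageStat>& pages, int64_t bound) {
    return matching_pages(pages, [bound](const PageStat& p) { return p.max_value > bound; });
}

// OR: a row is a candidate if either side keeps it.
inline Ranges range_union(Ranges a, const Ranges& b) {
    a.insert(a.end(), b.begin(), b.end());
    std::sort(a.begin(), a.end());
    Ranges out;
    for (const Range& r : a) {
        if (!out.empty() && r.first <= out.back().second) {
            out.back().second = std::max(out.back().second, r.second);
        } else {
            out.push_back(r);
        }
    }
    return out;
}

// AND: a row is a candidate only if both sides keep it.
inline Ranges range_intersection(const Ranges& a, const Ranges& b) {
    Ranges out;
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        int64_t lo = std::max(a[i].first, b[j].first);
        int64_t hi = std::min(a[i].second, b[j].second);
        if (lo < hi) out.emplace_back(lo, hi);
        if (a[i].second < b[j].second) ++i; else ++j;
    }
    return out;
}
```

For four 100-row pages with value ranges [0, 9], [10, 19], [20, 29], and [30, 39], `c < 5 OR c > 32` keeps rows [0, 100) and [300, 400), while `c > 12 AND c < 25` keeps rows [100, 300). Working on row ranges rather than page numbers matters because page boundaries generally differ between columns.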

Reviewed Changes

Copilot reviewed 38 out of 40 changed files in this pull request and generated 4 comments.

File	Description
regression-test/suites/external_table_p0/hive/test_hive_page_index.groovy	New test suite for Hive page index filtering with various predicate combinations
docker/thirdparties/docker-compose/hive/scripts/preinstalled_data/parquet_table/decimals_1_10/decimals_1_10.parquet	Binary test data file for decimal column tests
docker/thirdparties/docker-compose/hive/scripts/create_preinstalled_scripts/run82.hql	HQL script to create test table for decimals
be/src/vec/exec/format/parquet/vparquet_reader.{h,cpp}	Major refactoring of page index filtering, min-max-bloom filter processing, and row group iteration logic
be/src/vec/exec/format/parquet/vparquet_group_reader.{h,cpp}	Updated to use RowRanges instead of vector<RowRange>
be/src/vec/exec/format/parquet/vparquet_column_reader.{h,cpp}	Updated column readers to work with RowRanges
be/src/vec/exec/format/parquet/parquet_predicate.h	Added PageIndexStat and CachedPageIndexStat structures, renamed get_min_max_value to parse_min_max_value
be/src/vec/exec/format/parquet/vparquet_page_index.{h,cpp}	Removed unused create_skipped_row_range method, made parse methods const
be/src/vec/exec/format/parquet/parquet_common.h	Replaced custom RowRange with segment_v2::RowRange and RowRanges
be/src/vec/exec/format/orc/vorc_reader.{h,cpp}	Removed colname_to_value_range parameter from init_reader
be/src/vec/exec/format/table/*.{h,cpp}	Updated table format readers (Iceberg, Paimon, Hudi, Hive, TransactionalHive) to remove colname_to_value_range parameter
be/src/vec/exec/scan/file_scanner.cpp	Updated all reader initialization calls to remove colname_to_value_range
be/src/olap/rowset/segment_v2/row_ranges.h	Added get_range method to RowRanges
be/src/olap/column_predicate.h	Added new evaluate_and method signature for page index filtering
be/src/olap/block_column_predicate.{h,cpp}	Implemented page index filtering for AND/OR block predicates
be/src/olap/comparison_predicate.h	Added page index filtering support for comparison predicates
be/src/olap/in_list_predicate.h	Added page index filtering support for IN list predicates
be/src/olap/null_predicate.h	Added page index filtering support for NULL predicates
be/src/olap/push_handler.{h,cpp}	Removed unused colname_to_value_range member
be/test/vec/exec/*.cpp	Updated test code to remove colname_to_value_range parameter, removed obsolete test methods
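
As a rough illustration of what page index filtering for NULL predicates can look like, the C++ sketch below decides per page whether `IS NULL` or `IS NOT NULL` could match, using only the page's row count and null count. The Parquet column index records null pages and, optionally, null counts. The names `PageNullStat`, `may_match_is_null`, `may_match_is_not_null`, and `pages_to_read` are hypothetical and are not taken from `null_predicate.h`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-page null statistics.
struct PageNullStat {
    int64_t num_rows;
    int64_t null_count;
};

// IS NULL can only match a page that contains at least one null.
inline bool may_match_is_null(const PageNullStat& p) {
    assert(p.null_count >= 0 && p.null_count <= p.num_rows);
    return p.null_count > 0;
}

// IS NOT NULL can only match a page that is not entirely null.
inline bool may_match_is_not_null(const PageNullStat& p) {
    assert(p.null_count >= 0 && p.null_count <= p.num_rows);
    return p.null_count < p.num_rows;
}

// Returns the indices of the pages that still have to be decoded.
inline std::vector<std::size_t> pages_to_read(const std::vector<PageNullStat>& pages,
                                              bool is_null_predicate) {
    std::vector<std::size_t> keep;
    for (std::size_t i = 0; i < pages.size(); ++i) {
        bool may_match = is_null_predicate ? may_match_is_null(pages[i])
                                           : may_match_is_not_null(pages[i]);
        if (may_match) keep.push_back(i);
    }
    return keep;
}
```

For three 100-row pages with 0, 100, and 7 nulls, `IS NULL` keeps the second and third pages and `IS NOT NULL` keeps the first and third.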

std::function<bool(PageIndexStat**, int)> get_stat_func;
};

// The encoded Parquet min-max value is parsed into `fields`;

Copilot AI Nov 10, 2025

The comment "The encoded Parquet min-max value is parsed into fields" has an extra space between "into" and the backtick. It should be a single space.

Suggested change
// The encoded Parquet min-max value is parsed into  `fields`;
// The encoded Parquet min-max value is parsed into `fields`;

String enabled = context.config.otherConfigs.get("enableHiveTest")
if (enabled == null || !enabled.equalsIgnoreCase("true")) {
logger.info("diable Hive test.")

Copilot AI Nov 10, 2025

The spelling "diable" should be "disable".

Comment on lines 84 to 86
order_qt_q33 """ select * from decimals_1_10 where d_1 is null or d_10 is null ; """
order_qt_q33 """ select * from decimals_1_10 where d_1 is null"""
order_qt_q33 """ select * from decimals_1_10 where d_10 is null ; """

Copilot AI Nov 10, 2025

Duplicate query identifiers detected. Lines 84, 85, and 86 all use the same identifier order_qt_q33, but they are executing different queries. Each query should have a unique identifier, such as order_qt_q34 and order_qt_q35 for the second and third queries respectively.

RowRanges* candidate_row_ranges);

// Row Group Filter
// check this range contain this tow group.

Copilot AI Nov 10, 2025

The comment "check this range contain this tow group" has a spelling error. "tow" should be "row".

Suggested change
// check this range contain this tow group.
// check this range contain this row group.

@hubgeter
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34203 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 467a6586550dba161eb4bd40ce7ddf3383543658, data reload: false

------ Round 1 ----------------------------------
q1	17618	5215	5090	5090
q2	2045	347	200	200
q3	10205	1344	723	723
q4	10236	993	373	373
q5	7489	2273	2394	2273
q6	184	169	138	138
q7	917	787	620	620
q8	9340	1325	1101	1101
q9	7011	5118	5252	5118
q10	6891	2262	1849	1849
q11	504	303	290	290
q12	341	373	230	230
q13	17787	3679	3072	3072
q14	226	227	214	214
q15	603	509	505	505
q16	1024	1011	933	933
q17	620	893	362	362
q18	7636	7180	7032	7032
q19	1096	964	575	575
q20	367	345	235	235
q21	3683	3240	2297	2297
q22	1053	1037	973	973
Total cold run time: 106876 ms
Total hot run time: 34203 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5170	5133	5131	5131
q2	275	325	232	232
q3	2152	2694	2318	2318
q4	1350	1750	1368	1368
q5	4224	4544	4635	4544
q6	222	207	135	135
q7	2062	1987	1801	1801
q8	2625	2609	2600	2600
q9	7288	7393	7329	7329
q10	3089	3269	2804	2804
q11	600	541	515	515
q12	659	840	657	657
q13	3575	3960	3440	3440
q14	293	318	264	264
q15	580	523	510	510
q16	1109	1189	1066	1066
q17	1212	1592	1414	1414
q18	7963	7698	7629	7629
q19	864	831	980	831
q20	1993	2123	1960	1960
q21	5083	4422	4326	4326
q22	1109	1051	982	982
Total cold run time: 53497 ms
Total hot run time: 51856 ms

@doris-robot

TPC-DS: Total hot run time: 187676 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 467a6586550dba161eb4bd40ce7ddf3383543658, data reload: false

query1	1030	416	404	404
query2	6586	1740	1714	1714
query3	6756	226	232	226
query4	26153	23678	23035	23035
query5	4804	627	464	464
query6	333	230	210	210
query7	4638	496	302	302
query8	309	257	244	244
query9	8708	2622	2642	2622
query10	497	344	292	292
query11	15805	15139	14891	14891
query12	180	120	111	111
query13	1686	550	445	445
query14	11485	9301	9338	9301
query15	200	188	173	173
query16	7672	670	546	546
query17	1186	755	613	613
query18	2028	425	312	312
query19	206	202	174	174
query20	133	127	126	126
query21	216	128	114	114
query22	3860	4034	3852	3852
query23	34054	32826	33175	32826
query24	8450	2440	2434	2434
query25	591	518	454	454
query26	1242	275	160	160
query27	2750	504	348	348
query28	4358	2226	2201	2201
query29	806	592	488	488
query30	303	227	198	198
query31	923	805	728	728
query32	84	77	75	75
query33	606	381	341	341
query34	812	894	527	527
query35	825	843	770	770
query36	960	1002	892	892
query37	127	117	94	94
query38	3569	3532	3443	3443
query39	1466	1412	1425	1412
query40	229	158	126	126
query41	87	80	65	65
query42	131	115	116	115
query43	511	510	491	491
query44	1250	757	748	748
query45	200	190	177	177
query46	893	1009	663	663
query47	1778	1778	1712	1712
query48	399	434	318	318
query49	795	519	449	449
query50	655	694	408	408
query51	3889	3888	3855	3855
query52	122	114	107	107
query53	251	274	203	203
query54	327	325	302	302
query55	90	92	90	90
query56	347	357	339	339
query57	1175	1177	1119	1119
query58	294	292	287	287
query59	2603	2721	2536	2536
query60	374	386	350	350
query61	200	190	194	190
query62	818	743	685	685
query63	242	200	207	200
query64	4658	1305	998	998
query65	4064	3969	3960	3960
query66	1116	477	404	404
query67	15304	15309	14768	14768
query68	8504	945	596	596
query69	488	328	293	293
query70	1349	1267	1305	1267
query71	499	341	320	320
query72	5877	5095	4899	4899
query73	656	602	359	359
query74	9018	9131	8926	8926
query75	4007	3331	2801	2801
query76	3779	1132	750	750
query77	818	420	321	321
query78	9358	9686	8859	8859
query79	1978	825	609	609
query80	652	582	513	513
query81	483	267	236	236
query82	431	159	135	135
query83	260	261	243	243
query84	250	111	94	94
query85	901	492	436	436
query86	340	304	283	283
query87	3756	3753	3599	3599
query88	3260	2273	2237	2237
query89	398	326	313	313
query90	2029	225	220	220
query91	169	162	133	133
query92	81	70	65	65
query93	1105	991	646	646
query94	688	449	323	323
query95	404	326	308	308
query96	483	571	283	283
query97	2943	2981	2866	2866
query98	253	215	217	215
query99	1419	1414	1301	1301
Total cold run time: 275835 ms
Total hot run time: 187676 ms

@doris-robot

ClickBench: Total hot run time: 27.54 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 467a6586550dba161eb4bd40ce7ddf3383543658, data reload: false

query	cold run (s)	hot run 1 (s)	hot run 2 (s)
query1	0.06	0.06	0.05
query2	0.10	0.05	0.04
query3	0.25	0.09	0.08
query4	1.61	0.11	0.11
query5	0.27	0.25	0.25
query6	1.16	0.63	0.66
query7	0.04	0.02	0.02
query8	0.05	0.04	0.05
query9	0.58	0.53	0.52
query10	0.58	0.57	0.56
query11	0.16	0.11	0.12
query12	0.16	0.12	0.12
query13	0.64	0.60	0.60
query14	1.00	1.02	0.99
query15	0.85	0.85	0.83
query16	0.40	0.40	0.38
query17	1.01	1.04	1.06
query18	0.22	0.19	0.20
query19	1.91	1.83	1.87
query20	0.02	0.01	0.01
query21	15.45	0.18	0.13
query22	5.13	0.08	0.05
query23	15.68	0.26	0.10
query24	2.72	1.20	0.39
query25	0.07	0.06	0.07
query26	0.14	0.13	0.14
query27	0.08	0.06	0.05
query28	4.62	1.13	0.96
query29	12.62	3.88	3.27
query30	0.29	0.13	0.11
query31	2.81	0.59	0.38
query32	3.22	0.54	0.47
query33	3.03	3.06	3.05
query34	15.96	5.22	4.56
query35	4.59	4.68	4.59
query36	0.66	0.50	0.49
query37	0.10	0.06	0.06
query38	0.07	0.04	0.05
query39	0.04	0.03	0.02
query40	0.16	0.14	0.14
query41	0.08	0.03	0.03
query42	0.04	0.03	0.03
query43	0.04	0.04	0.03
Total cold run time: 98.67 s
Total hot run time: 27.54 s

@doris-robot

BE UT Coverage Report

Increment line coverage 33.33% (145/435) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.81% (18284/34622)
Line Coverage 38.17% (166238/435527)
Region Coverage 33.20% (129331/389581)
Branch Coverage 33.90% (55452/163579)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 40.46% (176/435) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.55% (24344/34024)
Line Coverage 58.05% (253231/436212)
Region Coverage 53.49% (211351/395105)
Branch Coverage 54.77% (90180/164654)

@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the `approved` label (Indicates a PR has been approved by one committer.) Nov 24, 2025
@github-actions
Contributor

PR approved by anyone and no changes requested.

@Gabriel39 Gabriel39 merged commit 05ee9e7 into apache:master Nov 24, 2025
29 of 31 checks passed
morningman pushed a commit that referenced this pull request Dec 1, 2025
…ition.columns prop table cause be core. (#58532)

### What problem does this PR solve?
Related PR: #57771
Problem Summary:
Fixed a BE core dump when reading Hudi Parquet tables whose
`hoodie.properties` sets
`hoodie.datasource.write.drop.partition.columns=false`.

```
*** SIGSEGV address not mapped to object (@0x18) received by PID 12234 (TID 38368 OR 0x7f0bd279e640) from PID 24; stack trace: ***
11:01:31    0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/zcp/repo_center/doris_master/doris/be/src/common/signal_handler.h:420
11:01:31    1# PosixSignals::chained_handler(int, siginfo*, void*) [clone .part.0] in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
11:01:31    2# JVM_handle_linux_signal in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
11:01:31    3# 0x00007F18963FB520 in /lib/x86_64-linux-gnu/libc.so.6
11:01:31    4# std::_Function_handler<bool (doris::vectorized::ParquetPredicate::PageIndexStat**, int), doris::vectorized::ParquetReader::_process_page_index_filter(tparquet::RowGroup const&, doris::vectorized::RowGroupReader::RowGroupIndex const&, std::vector<std::unique_ptr<doris::MutilColumnBlockPredicate, std::default_delete<doris::MutilColumnBlockPredicate> >, std::allocator<std::unique_ptr<doris::MutilColumnBlockPredicate, std::default_delete<doris::MutilColumnBlockPredicate> > > > const&, doris::segment_v2::RowRanges*)::$_1>::_M_invoke(std::_Any_data const&, doris::vectorized::ParquetPredicate::PageIndexStat**&&, int&&) at /usr/local/ldb-toolchain-v0.26/bin/../lib/gcc/x86_64-pc-linux-gnu/15/include/g++-v15/bits/std_function.h:292
11:01:31    5# doris::InListPredicateBase<(doris::PrimitiveType)2, (doris::PredicateType)7, doris::HybridSet<(doris::PrimitiveType)2, doris::FixedContainer<bool, 1ul>, doris::vectorized::PredicateColumnType<(doris::PrimitiveType)2> > >::evaluate_and(doris::vectorized::ParquetPredicate::CachedPageIndexStat*, doris::segment_v2::RowRanges*) const at /home/zcp/repo_center/doris_master/doris/be/src/olap/in_list_predicate.h:345
11:01:31    6# doris::AndBlockColumnPredicate::evaluate_and(doris::vectorized::ParquetPredicate::CachedPageIndexStat*, doris::segment_v2::RowRanges*) const at /home/zcp/repo_center/doris_master/doris/be/src/olap/block_column_predicate.cpp:148
11:01:31    7# doris::vectorized::ParquetReader::_process_page_index_filter(tparquet::RowGroup const&, doris::vectorized::RowGroupReader::RowGroupIndex const&, std::vector<std::unique_ptr<doris::MutilColumnBlockPredicate, std::default_delete<doris::MutilColumnBlockPredicate> >, std::allocator<std::unique_ptr<doris::MutilColumnBlockPredicate, std::default_delete<doris::MutilColumnBlockPredicate> > > > const&, doris::segment_v2::RowRanges*) in /mnt/hdd01/ci/doris-deploy-master-local/be/lib/doris_be
11:01:31    8# doris::vectorized::ParquetReader::_process_min_max_bloom_filter(doris::vectorized::RowGroupReader::RowGroupIndex const&, tparquet::RowGroup const&, std::vector<std::unique_ptr<doris::MutilColumnBlockPredicate, std::default_delete<doris::MutilColumnBlockPredicate> >, std::allocator<std::unique_ptr<doris::MutilColumnBlockPredicate, std::default_delete<doris::MutilColumnBlockPredicate> > > > const&, doris::segment_v2::RowRanges*) at /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/format/parquet/vparquet_reader.cpp:1082
11:01:31    9# doris::vectorized::ParquetReader::_next_row_group_reader() in /mnt/hdd01/ci/doris-deploy-master-local/be/lib/doris_be
11:01:31   10# doris::vectorized::ParquetReader::get_next_block(doris::vectorized::Block*, unsigned long*, bool*) at /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/format/parquet/vparquet_reader.cpp:598
11:01:31   11# doris::vectorized::HudiReader::get_next_block_inner(doris::vectorized::Block*, unsigned long*, bool*) at /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/format/table/hudi_reader.cpp:29
11:01:31   12# doris::vectorized::TableFormatReader::get_next_block(doris::vectorized::Block*, unsigned long*, bool*) at /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/format/table/table_format_reader.h:82
11:01:31   13# doris::vectorized::FileScanner::_get_block_wrapped(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/scan/file_scanner.cpp:472
```
nagisa-kunhah pushed a commit to nagisa-kunhah/doris that referenced this pull request Dec 14, 2025
…r parquet page index. (apache#57771)

Problem Summary:
1. Previously, page index filtering could only handle SQL WHERE conditions
composed solely of AND; this PR also handles conditions that contain OR.
2. Because the topn runtime filter is maintained dynamically, this PR
defers the topn RF min-max filter until the row group reader is
created.
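
To make point 1 concrete, here is a minimal sketch of how page-level row ranges could be combined for AND versus OR. It is not the Doris implementation: `Range`, `Ranges`, `intersect`, and `unite` are hypothetical names chosen for illustration, and the sketch assumes each leaf predicate has already been turned into a sorted, non-overlapping list of half-open row ranges that may contain matches.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical half-open row range [from, to); not the Doris RowRanges class.
struct Range {
    int64_t from;
    int64_t to;
    bool operator==(const Range& o) const { return from == o.from && to == o.to; }
};
using Ranges = std::vector<Range>;  // assumed sorted and non-overlapping

// AND: keep only rows that may match both predicates.
Ranges intersect(const Ranges& a, const Ranges& b) {
    Ranges out;
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        int64_t lo = std::max(a[i].from, b[j].from);
        int64_t hi = std::min(a[i].to, b[j].to);
        if (lo < hi) out.push_back({lo, hi});
        // Advance whichever range ends first.
        if (a[i].to < b[j].to) ++i; else ++j;
    }
    return out;
}

// OR: keep rows that may match either predicate, merging overlaps.
Ranges unite(const Ranges& a, const Ranges& b) {
    Ranges all(a);
    all.insert(all.end(), b.begin(), b.end());
    std::sort(all.begin(), all.end(),
              [](const Range& x, const Range& y) { return x.from < y.from; });
    Ranges out;
    for (const Range& r : all) {
        if (!out.empty() && r.from <= out.back().to) {
            out.back().to = std::max(out.back().to, r.to);
        } else {
            out.push_back(r);
        }
    }
    return out;
}
```

For example, if one predicate may match rows [0, 100) and another may match [50, 150) and [200, 250), AND keeps [50, 100) while OR keeps [0, 150) and [200, 250). A filter that only knows how to intersect has no safe way to evaluate the OR case.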
nagisa-kunhah pushed a commit to nagisa-kunhah/doris that referenced this pull request Dec 14, 2025
…ition.columns prop table cause be core. (apache#58532)

hubgeter added a commit to hubgeter/doris that referenced this pull request Jan 8, 2026
…r parquet page index. (apache#57771)

Problem Summary:
1. Previously, page index filtering could only handle SQL WHERE conditions
composed solely of AND; this PR also handles conditions that contain OR.
2. Because the topn runtime filter is maintained dynamically, this PR
defers the topn RF min-max filter until the row group reader is
created.
yiguolei pushed a commit that referenced this pull request Jan 9, 2026
…x filter for parquet page index. (#57771) (#59680)

bp #57771
### What problem does this PR solve?
Problem Summary:
1. Previously, page index filtering could only handle SQL WHERE conditions
composed solely of AND; this PR also handles conditions that contain OR.
2. Because the topn runtime filter is maintained dynamically, this PR
defers the topn RF min-max filter until the row group reader is
created.
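
Point 2 can be illustrated with a small sketch of why the timing matters. This is an assumption-laden illustration, not Doris code: `TopNFilter`, `RowGroupStats`, and `can_skip` are hypothetical names, and the sketch assumes a query of the form `ORDER BY x ASC LIMIT n`, where the filter's bound tightens as rows are consumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>

// Hypothetical stand-in for a top-N runtime filter on ORDER BY x ASC LIMIT n.
class TopNFilter {
public:
    explicit TopNFilter(std::size_t n) : _n(n) {}

    // Feed a value seen so far; keeps only the n smallest values.
    void add(int64_t v) {
        if (_n == 0) return;
        if (_heap.size() < _n) {
            _heap.push(v);
        } else if (v < _heap.top()) {
            _heap.pop();
            _heap.push(v);
        }
    }

    // A bound exists only once n values have been collected.
    bool has_bound() const { return _n > 0 && _heap.size() == _n; }

    // Current n-th smallest value; call only when has_bound() is true.
    int64_t bound() const { return _heap.top(); }

private:
    std::size_t _n;
    std::priority_queue<int64_t> _heap;  // max-heap of the n smallest values
};

// Hypothetical min/max statistics of one row group.
struct RowGroupStats {
    int64_t min;
    int64_t max;
};

// A row group can be skipped when even its smallest value cannot enter the top N.
bool can_skip(const TopNFilter& f, const RowGroupStats& s) {
    return f.has_bound() && s.min > f.bound();
}
```

If `can_skip` is evaluated once when the file is opened, the filter usually has no bound yet and nothing is pruned. Evaluating it when each row group reader is created lets later row groups benefit from the bound established by earlier ones.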
github-actions bot pushed a commit that referenced this pull request Jan 12, 2026
…ition.columns prop table cause be core. (#58532)


Labels

approved (Indicates a PR has been approved by one committer.), dev/4.0.3-merged, dev/4.1.x, reviewed


7 participants