@hubgeter
Contributor

What problem does this PR solve?

Related PR: #57771
Problem Summary:
Fixed a BE core dump (SIGSEGV) that occurs when reading Hudi tables stored in Parquet format whose `hoodie.properties` sets `hoodie.datasource.write.drop.partition.columns=false`.

*** SIGSEGV address not mapped to object (@0x18) received by PID 12234 (TID 38368 OR 0x7f0bd279e640) from PID 24; stack trace: ***
11:01:31    0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/zcp/repo_center/doris_master/doris/be/src/common/signal_handler.h:420
11:01:31    1# PosixSignals::chained_handler(int, siginfo*, void*) [clone .part.0] in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
11:01:31    2# JVM_handle_linux_signal in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
11:01:31    3# 0x00007F18963FB520 in /lib/x86_64-linux-gnu/libc.so.6
11:01:31    4# std::_Function_handler<bool (doris::vectorized::ParquetPredicate::PageIndexStat**, int), doris::vectorized::ParquetReader::_process_page_index_filter(tparquet::RowGroup const&, doris::vectorized::RowGroupReader::RowGroupIndex const&, std::vector<std::unique_ptr<doris::MutilColumnBlockPredicate, std::default_delete<doris::MutilColumnBlockPredicate> >, std::allocator<std::unique_ptr<doris::MutilColumnBlockPredicate, std::default_delete<doris::MutilColumnBlockPredicate> > > > const&, doris::segment_v2::RowRanges*)::$_1>::_M_invoke(std::_Any_data const&, doris::vectorized::ParquetPredicate::PageIndexStat**&&, int&&) at /usr/local/ldb-toolchain-v0.26/bin/../lib/gcc/x86_64-pc-linux-gnu/15/include/g++-v15/bits/std_function.h:292
11:01:31    5# doris::InListPredicateBase<(doris::PrimitiveType)2, (doris::PredicateType)7, doris::HybridSet<(doris::PrimitiveType)2, doris::FixedContainer<bool, 1ul>, doris::vectorized::PredicateColumnType<(doris::PrimitiveType)2> > >::evaluate_and(doris::vectorized::ParquetPredicate::CachedPageIndexStat*, doris::segment_v2::RowRanges*) const at /home/zcp/repo_center/doris_master/doris/be/src/olap/in_list_predicate.h:345
11:01:31    6# doris::AndBlockColumnPredicate::evaluate_and(doris::vectorized::ParquetPredicate::CachedPageIndexStat*, doris::segment_v2::RowRanges*) const at /home/zcp/repo_center/doris_master/doris/be/src/olap/block_column_predicate.cpp:148
11:01:31    7# doris::vectorized::ParquetReader::_process_page_index_filter(tparquet::RowGroup const&, doris::vectorized::RowGroupReader::RowGroupIndex const&, std::vector<std::unique_ptr<doris::MutilColumnBlockPredicate, std::default_delete<doris::MutilColumnBlockPredicate> >, std::allocator<std::unique_ptr<doris::MutilColumnBlockPredicate, std::default_delete<doris::MutilColumnBlockPredicate> > > > const&, doris::segment_v2::RowRanges*) in /mnt/hdd01/ci/doris-deploy-master-local/be/lib/doris_be
11:01:31    8# doris::vectorized::ParquetReader::_process_min_max_bloom_filter(doris::vectorized::RowGroupReader::RowGroupIndex const&, tparquet::RowGroup const&, std::vector<std::unique_ptr<doris::MutilColumnBlockPredicate, std::default_delete<doris::MutilColumnBlockPredicate> >, std::allocator<std::unique_ptr<doris::MutilColumnBlockPredicate, std::default_delete<doris::MutilColumnBlockPredicate> > > > const&, doris::segment_v2::RowRanges*) at /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/format/parquet/vparquet_reader.cpp:1082
11:01:31    9# doris::vectorized::ParquetReader::_next_row_group_reader() in /mnt/hdd01/ci/doris-deploy-master-local/be/lib/doris_be
11:01:31   10# doris::vectorized::ParquetReader::get_next_block(doris::vectorized::Block*, unsigned long*, bool*) at /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/format/parquet/vparquet_reader.cpp:598
11:01:31   11# doris::vectorized::HudiReader::get_next_block_inner(doris::vectorized::Block*, unsigned long*, bool*) at /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/format/table/hudi_reader.cpp:29
11:01:31   12# doris::vectorized::TableFormatReader::get_next_block(doris::vectorized::Block*, unsigned long*, bool*) at /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/format/table/table_format_reader.h:82
11:01:31   13# doris::vectorized::FileScanner::_get_block_wrapped(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/scan/file_scanner.cpp:472
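
The fault address (`@0x18`) is a small offset from zero, which is consistent with a field being read through a null pointer inside the statistics callback of type `bool(PageIndexStat**, int)` (frame 4), invoked from `evaluate_and` (frame 5). The following is a minimal sketch of the defensive pattern only, assuming a callback that can fail to produce statistics for a column. `PageStat`, `StatLookup`, `page_may_match`, and `make_lookup` are hypothetical names for illustration; they are not Doris APIs and this is not the actual patch.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <unordered_map>

// Hypothetical stand-in for per-column page statistics; not a Doris type.
struct PageStat {
    int64_t min_value;
    int64_t max_value;
};

// Hypothetical lookup callback: returns false and leaves *out untouched
// when no statistics are available for the requested column.
using StatLookup = std::function<bool(const PageStat** out, int column_id)>;

// Returns true when the page may contain `value` and therefore must be read.
// Missing statistics mean "cannot prune" rather than being dereferenced.
bool page_may_match(const StatLookup& lookup, int column_id, int64_t value) {
    const PageStat* stat = nullptr;
    if (!lookup(&stat, column_id) || stat == nullptr) {
        return true;  // no statistics: keep the page, never touch a null pointer
    }
    return value >= stat->min_value && value <= stat->max_value;
}

// Builds a lookup over a fixed map, for illustration only.
// The map must outlive the returned callback because it is captured by reference.
StatLookup make_lookup(const std::unordered_map<int, PageStat>& stats) {
    return [&stats](const PageStat** out, int column_id) {
        auto it = stats.find(column_id);
        if (it == stats.end()) {
            return false;
        }
        *out = &it->second;
        return true;
    };
}
```

With statistics registered only for column 0, a predicate on column 1 keeps the page instead of crashing, at the cost of skipping the pruning opportunity for that column.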

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Nov 30, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@hubgeter
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 33923 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit c2d240680e51b0ec51b97ecdefb96a235f96d7bc, data reload: false

------ Round 1 ----------------------------------
q1	17619	5142	4888	4888
q2	2036	324	211	211
q3	10259	1291	741	741
q4	10214	818	361	361
q5	7565	2449	2211	2211
q6	185	177	139	139
q7	939	809	639	639
q8	9346	1343	963	963
q9	7055	5243	5318	5243
q10	6871	2257	1811	1811
q11	517	310	301	301
q12	334	362	231	231
q13	17777	3650	2996	2996
q14	238	241	213	213
q15	574	532	515	515
q16	873	870	815	815
q17	594	722	508	508
q18	7692	7203	7082	7082
q19	1313	939	583	583
q20	331	332	233	233
q21	3588	3263	2309	2309
q22	1035	992	930	930
Total cold run time: 106955 ms
Total hot run time: 33923 ms
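
The four numeric columns in each row appear to be the cold run, two hot runs, and the faster of the two hot runs; under that reading, the two totals are the sums of the first and last columns. A small sketch that recomputes both totals from rows in this layout follows. `Totals` and `sum_runs` are hypothetical names, not part of the Doris benchmark tooling.

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>

// Accumulated run times in milliseconds.
struct Totals {
    long cold = 0;
    long hot = 0;
};

// Parses whitespace-separated rows of the form
//   <name> <cold> <hot1> <hot2> <best_hot>
// and sums the cold column and the faster of the two hot runs.
// Rows that do not start with a name followed by three integers
// (separators, blank lines, "Total ..." lines) are skipped.
Totals sum_runs(const std::string& text) {
    Totals totals;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string name;
        long cold = 0, hot1 = 0, hot2 = 0;
        if (!(fields >> name >> cold >> hot1 >> hot2)) {
            continue;
        }
        totals.cold += cold;
        totals.hot += std::min(hot1, hot2);
    }
    return totals;
}
```

Applied to the 22 rows of Round 1 above, the column sums match the reported totals of 106955 ms (cold) and 33923 ms (hot).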

----- Round 2, with runtime_filter_mode=off -----
q1	4993	4926	4914	4914
q2	342	413	324	324
q3	2163	2632	2324	2324
q4	1307	1726	1315	1315
q5	4180	4336	4481	4336
q6	221	172	134	134
q7	2115	1966	1916	1916
q8	2673	2527	2443	2443
q9	7452	7538	7447	7447
q10	3039	3363	2817	2817
q11	606	499	488	488
q12	709	723	612	612
q13	3595	3826	3321	3321
q14	301	307	281	281
q15	552	532	506	506
q16	934	955	901	901
q17	1248	1410	1350	1350
q18	7833	7628	7497	7497
q19	795	742	784	742
q20	1991	2120	2093	2093
q21	4763	4556	4256	4256
q22	1058	1050	993	993
Total cold run time: 52870 ms
Total hot run time: 51010 ms

@doris-robot

TPC-DS: Total hot run time: 181907 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit c2d240680e51b0ec51b97ecdefb96a235f96d7bc, data reload: false

query1	1046	416	401	401
query2	6642	1201	1153	1153
query3	6753	229	224	224
query4	25643	23363	22828	22828
query5	5175	690	512	512
query6	352	243	247	243
query7	4670	505	303	303
query8	321	270	257	257
query9	8743	2613	2645	2613
query10	568	374	302	302
query11	15328	14942	14803	14803
query12	190	125	114	114
query13	1701	570	453	453
query14	9358	5957	6103	5957
query15	217	203	186	186
query16	7440	716	538	538
query17	1219	787	652	652
query18	2017	438	353	353
query19	224	213	191	191
query20	130	126	126	126
query21	221	140	117	117
query22	3976	4002	3842	3842
query23	32959	32068	31776	31776
query24	8504	2436	2378	2378
query25	608	525	458	458
query26	1240	285	168	168
query27	2735	497	346	346
query28	4336	2159	2133	2133
query29	808	628	490	490
query30	314	234	209	209
query31	809	723	632	632
query32	81	75	73	73
query33	596	411	336	336
query34	810	879	547	547
query35	827	840	754	754
query36	891	935	860	860
query37	124	114	88	88
query38	3873	3924	3763	3763
query39	1476	1411	1398	1398
query40	232	131	121	121
query41	65	67	66	66
query42	122	111	118	111
query43	445	435	429	429
query44	1293	778	753	753
query45	199	194	188	188
query46	879	1009	652	652
query47	1696	1724	1655	1655
query48	411	428	342	342
query49	760	506	418	418
query50	664	694	414	414
query51	3791	3912	3864	3864
query52	116	112	105	105
query53	250	258	193	193
query54	321	300	276	276
query55	101	97	97	97
query56	340	352	350	350
query57	1134	1169	1151	1151
query58	296	308	280	280
query59	2307	2397	2347	2347
query60	368	345	347	345
query61	161	157	163	157
query62	800	750	661	661
query63	235	194	194	194
query64	4549	1191	946	946
query65	4074	3970	3957	3957
query66	1217	472	352	352
query67	15429	14870	14980	14870
query68	5602	966	641	641
query69	523	347	307	307
query70	1155	1047	981	981
query71	419	357	322	322
query72	5812	4896	4897	4896
query73	675	576	341	341
query74	8803	8796	8593	8593
query75	3006	3059	2557	2557
query76	3332	1147	748	748
query77	517	403	322	322
query78	9412	9824	8929	8929
query79	1357	839	615	615
query80	691	595	518	518
query81	509	276	239	239
query82	220	141	116	116
query83	280	276	260	260
query84	261	130	99	99
query85	887	494	454	454
query86	338	280	278	278
query87	4012	4106	3898	3898
query88	2850	2317	2353	2317
query89	390	335	309	309
query90	1795	236	230	230
query91	200	176	152	152
query92	72	68	72	68
query93	1109	978	675	675
query94	674	472	356	356
query95	531	438	430	430
query96	513	545	291	291
query97	2613	2709	2614	2614
query98	249	218	215	215
query99	1427	1425	1265	1265
Total cold run time: 265414 ms
Total hot run time: 181907 ms

@doris-robot

ClickBench: Total hot run time: 27.76 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit c2d240680e51b0ec51b97ecdefb96a235f96d7bc, data reload: false

query1	0.06	0.05	0.04
query2	0.10	0.04	0.04
query3	0.26	0.08	0.08
query4	1.61	0.12	0.11
query5	0.27	0.26	0.24
query6	1.17	0.63	0.62
query7	0.04	0.02	0.02
query8	0.06	0.04	0.05
query9	0.58	0.50	0.51
query10	0.55	0.54	0.55
query11	0.15	0.11	0.12
query12	0.16	0.12	0.12
query13	0.63	0.60	0.61
query14	0.99	0.98	0.99
query15	0.82	0.80	0.82
query16	0.39	0.42	0.40
query17	1.04	1.06	0.99
query18	0.25	0.22	0.22
query19	1.84	1.89	1.87
query20	0.02	0.01	0.01
query21	15.48	0.26	0.14
query22	4.97	0.05	0.05
query23	16.09	0.26	0.10
query24	1.15	0.60	1.42
query25	0.09	0.07	0.06
query26	0.14	0.14	0.13
query27	0.07	0.05	0.06
query28	5.47	1.22	1.02
query29	12.58	4.00	3.25
query30	0.27	0.14	0.11
query31	2.82	0.62	0.41
query32	3.23	0.55	0.47
query33	3.04	3.16	3.07
query34	16.97	5.18	4.55
query35	4.54	4.58	4.65
query36	0.67	0.50	0.48
query37	0.10	0.07	0.07
query38	0.08	0.04	0.04
query39	0.05	0.03	0.03
query40	0.17	0.14	0.13
query41	0.08	0.04	0.03
query42	0.04	0.03	0.02
query43	0.05	0.04	0.04
Total cold run time: 99.14 s
Total hot run time: 27.76 s

@hubgeter
Contributor Author

hubgeter commented Dec 1, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 34422 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 4b8b7a7fa6784e59a7cc592b02178114caa896af, data reload: false

------ Round 1 ----------------------------------
q1	17617	5017	4916	4916
q2	2049	304	210	210
q3	10249	1307	735	735
q4	10223	882	370	370
q5	7536	2527	2243	2243
q6	191	178	144	144
q7	972	811	644	644
q8	9347	1441	1071	1071
q9	7010	5406	5310	5310
q10	6869	2281	1814	1814
q11	522	305	291	291
q12	348	399	237	237
q13	17812	3725	3082	3082
q14	240	250	218	218
q15	605	513	512	512
q16	906	876	814	814
q17	599	769	517	517
q18	7471	7124	7150	7124
q19	1101	947	573	573
q20	354	340	232	232
q21	3919	3586	2419	2419
q22	1019	986	946	946
Total cold run time: 106959 ms
Total hot run time: 34422 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4973	4953	4919	4919
q2	322	386	308	308
q3	2176	2690	2317	2317
q4	1327	1785	1286	1286
q5	4254	4445	4546	4445
q6	225	187	135	135
q7	2018	2039	1834	1834
q8	2669	2571	2457	2457
q9	7613	7490	7560	7490
q10	3072	3339	2812	2812
q11	605	535	496	496
q12	695	794	623	623
q13	3503	3987	3376	3376
q14	287	305	282	282
q15	562	502	502	502
q16	904	943	882	882
q17	1155	1354	1437	1354
q18	8151	7681	7711	7681
q19	821	779	844	779
q20	2020	1971	1811	1811
q21	4574	4295	4122	4122
q22	1076	1059	996	996
Total cold run time: 53002 ms
Total hot run time: 50907 ms

@doris-robot

TPC-DS: Total hot run time: 182327 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 4b8b7a7fa6784e59a7cc592b02178114caa896af, data reload: false

query1	1054	412	431	412
query2	6573	1162	1151	1151
query3	6759	235	230	230
query4	25221	23365	22827	22827
query5	4957	672	487	487
query6	343	234	210	210
query7	4657	517	305	305
query8	300	253	234	234
query9	8762	2610	2627	2610
query10	535	352	306	306
query11	15301	15023	15088	15023
query12	194	116	113	113
query13	1692	556	459	459
query14	9414	6059	5983	5983
query15	210	200	184	184
query16	7504	710	533	533
query17	1219	780	651	651
query18	2031	452	347	347
query19	222	208	180	180
query20	132	126	123	123
query21	223	137	116	116
query22	3979	4028	3920	3920
query23	33132	31957	32377	31957
query24	8491	2458	2434	2434
query25	651	568	485	485
query26	1256	290	209	209
query27	2671	488	349	349
query28	4348	2158	2138	2138
query29	811	612	493	493
query30	304	240	214	214
query31	837	715	636	636
query32	82	72	73	72
query33	591	388	330	330
query34	821	873	539	539
query35	822	818	742	742
query36	896	919	846	846
query37	118	109	87	87
query38	3901	3849	3808	3808
query39	1465	1444	1399	1399
query40	230	143	124	124
query41	66	63	61	61
query42	126	117	110	110
query43	437	442	415	415
query44	1296	758	782	758
query45	195	196	188	188
query46	863	996	648	648
query47	1685	1725	1675	1675
query48	398	408	343	343
query49	774	508	427	427
query50	659	697	403	403
query51	4055	4043	3943	3943
query52	114	112	110	110
query53	237	276	195	195
query54	325	295	274	274
query55	96	95	91	91
query56	338	331	314	314
query57	1148	1174	1100	1100
query58	291	276	274	274
query59	2306	2404	2298	2298
query60	355	348	356	348
query61	157	156	155	155
query62	807	752	645	645
query63	222	196	199	196
query64	4600	1203	890	890
query65	4056	3966	3978	3966
query66	1147	444	335	335
query67	15231	15252	14973	14973
query68	8352	944	625	625
query69	506	338	308	308
query70	1105	1006	996	996
query71	431	338	316	316
query72	5833	4808	4843	4808
query73	655	563	344	344
query74	8715	8824	8763	8763
query75	3036	3039	2501	2501
query76	3310	1136	728	728
query77	521	403	311	311
query78	9368	9652	8931	8931
query79	2057	843	579	579
query80	617	599	487	487
query81	498	273	235	235
query82	251	138	114	114
query83	269	266	251	251
query84	255	119	99	99
query85	910	480	440	440
query86	330	295	275	275
query87	3997	4138	4044	4044
query88	3876	2294	2262	2262
query89	399	328	306	306
query90	1983	223	227	223
query91	172	196	144	144
query92	84	70	70	70
query93	1334	993	658	658
query94	765	453	346	346
query95	499	414	414	414
query96	505	561	280	280
query97	2616	2697	2605	2605
query98	244	240	212	212
query99	1303	1377	1251	1251
Total cold run time: 269489 ms
Total hot run time: 182327 ms

@doris-robot

ClickBench: Total hot run time: 27.48 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 4b8b7a7fa6784e59a7cc592b02178114caa896af, data reload: false

query1	0.05	0.05	0.05
query2	0.10	0.05	0.04
query3	0.25	0.09	0.09
query4	1.61	0.12	0.11
query5	0.28	0.26	0.27
query6	1.17	0.64	0.65
query7	0.04	0.03	0.03
query8	0.06	0.04	0.04
query9	0.59	0.51	0.51
query10	0.56	0.55	0.55
query11	0.17	0.10	0.12
query12	0.14	0.12	0.12
query13	0.63	0.61	0.62
query14	0.99	0.98	0.98
query15	0.81	0.80	0.80
query16	0.42	0.42	0.40
query17	1.06	1.00	1.01
query18	0.23	0.25	0.21
query19	1.81	1.83	1.87
query20	0.01	0.01	0.01
query21	15.44	0.26	0.15
query22	4.77	0.05	0.05
query23	15.97	0.26	0.10
query24	1.22	1.25	0.35
query25	0.08	0.11	0.07
query26	0.15	0.13	0.13
query27	0.06	0.05	0.05
query28	5.00	1.22	1.02
query29	12.57	3.93	3.15
query30	0.29	0.13	0.11
query31	2.82	0.61	0.41
query32	3.23	0.54	0.46
query33	2.99	3.02	3.04
query34	16.83	5.13	4.57
query35	4.61	4.60	4.58
query36	0.67	0.51	0.51
query37	0.10	0.06	0.06
query38	0.08	0.04	0.04
query39	0.04	0.03	0.03
query40	0.17	0.14	0.14
query41	0.09	0.04	0.03
query42	0.04	0.03	0.03
query43	0.05	0.04	0.04
Total cold run time: 98.25 s
Total hot run time: 27.48 s

@github-actions bot added the `approved` label (Indicates a PR has been approved by one committer.) on Dec 1, 2025
@github-actions
Contributor

github-actions bot commented Dec 1, 2025

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

github-actions bot commented Dec 1, 2025

PR approved by anyone and no changes requested.

@morningman
Contributor

run check_coverage

@morningman merged commit e04c716 into apache:master on Dec 1, 2025
29 of 31 checks passed
nagisa-kunhah pushed a commit to nagisa-kunhah/doris that referenced this pull request Dec 14, 2025
…ition.columns prop table cause be core. (apache#58532)
github-actions bot pushed a commit that referenced this pull request Jan 12, 2026
…ition.columns prop table cause be core. (#58532)
yiguolei pushed a commit that referenced this pull request Jan 12, 2026
…te.drop.partition.columns prop table cause be core. #58532 (#59749)

Cherry-picked from #58532

Co-authored-by: daidai <changyuwei@selectdb.com>

Labels

approved Indicates a PR has been approved by one committer. dev/4.0.3-merged reviewed
