@hubgeter
Contributor

@hubgeter hubgeter commented Aug 30, 2025

What problem does this PR solve?

Related PR: #45937

Problem Summary:
Fix an ingestion load error case that causes a BE core dump (heap-buffer-overflow) in the Parquet reader.

==8898==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x62f0020603fc at pc 0x55f634e64ded bp 0x7fba0d03c410 sp 0x7fba0d03bbd8
READ of size 4 at 0x62f0020603fc thread T768 (PUSH-9699)
    #0 0x55f634e64dec in __asan_memcpy (/mnt/hdd01/ci/doris-deploy-branch-3.1-local/be/lib/doris_be+0x39a24dec) (BuildId: 9b04e7f7d3075dac)
    #1 0x55f634eca93f in std::char_traits<char>::copy(char*, char const*, unsigned long) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/char_traits.h:409:33
    #2 0x55f634eca93f in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::_S_copy(char*, char const*, unsigned long) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:351:4
    #3 0x55f634eca93f in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::_S_copy_chars(char*, char const*, char const*) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:398:9
    #4 0x55f634eca93f in void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::_M_construct<char const*>(char const*, char const*, std::forward_iterator_tag) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.tcc:225:6
    #5 0x55f654a4f74d in void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::_M_construct_aux<char const*>(char const*, char const*, std::__false_type) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:247:11
    #6 0x55f654a4f74d in void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::_M_construct<char const*>(char const*, char const*) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:266:4
    #7 0x55f654a4f74d in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::basic_string(char const*, unsigned long, std::allocator<char> const&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:513:9
    #8 0x55f654a4f74d in doris::vectorized::parse_thrift_footer(std::shared_ptr, doris::vectorized::FileMetaData**, unsigned long*, doris::io::IOContext*) /home/zcp/repo_center/doris_branch-3.1/doris/be/src/vec/exec/format/parquet/parquet_thrift_util.h:55:17
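
The trace ends in a 4-byte `std::string` construction inside `parse_thrift_footer`. One way such a read can leave the buffer is when the bytes read from the file tail are fewer than the 8-byte Parquet footer (a 4-byte little-endian metadata length followed by the magic `PAR1`). The sketch below illustrates a length check placed before any access to the magic bytes. It is an assumption about the failure mode, not the code changed in this PR, and `read_footer_length`, `kFooterSize`, and `kMagic` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <string>
#include <vector>

// A Parquet file ends with a 4-byte little-endian metadata length
// followed by the 4-byte magic "PAR1", so the footer needs 8 bytes.
constexpr std::size_t kFooterSize = 8;
constexpr char kMagic[4] = {'P', 'A', 'R', '1'};

// Hypothetical helper (not the Doris API): returns the metadata length when
// `tail` ends in a valid footer, otherwise std::nullopt and sets *error.
std::optional<std::uint32_t> read_footer_length(const std::vector<std::uint8_t>& tail,
                                                std::string* error) {
    // Check the size first, so the pointer arithmetic below stays in bounds.
    if (tail.size() < kFooterSize) {
        *error = "file too small for a Parquet footer: " +
                 std::to_string(tail.size()) + " bytes";
        return std::nullopt;
    }
    const std::uint8_t* magic = tail.data() + tail.size() - 4;
    if (std::memcmp(magic, kMagic, sizeof(kMagic)) != 0) {
        // Copying 4 bytes into the message is safe only because of the check above.
        *error = "invalid magic: " +
                 std::string(reinterpret_cast<const char*>(magic), 4);
        return std::nullopt;
    }
    // Decode the little-endian length that precedes the magic.
    const std::uint8_t* len = magic - 4;
    return static_cast<std::uint32_t>(len[0]) |
           (static_cast<std::uint32_t>(len[1]) << 8) |
           (static_cast<std::uint32_t>(len[2]) << 16) |
           (static_cast<std::uint32_t>(len[3]) << 24);
}
```

With this ordering, a truncated or empty file is reported as an error instead of being read past the start of the buffer while the error message is built.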

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@hubgeter
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 32464 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit dfeeb005950bf3e987b5097343810e954b39d6d2, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	min hot (ms)
q1	17617	5591	5477	5477
q2	2046	278	167	167
q3	10441	1256	710	710
q4	10207	875	448	448
q5	7655	2366	2140	2140
q6	177	168	135	135
q7	907	741	631	631
q8	9309	1414	1171	1171
q9	5974	4893	4945	4893
q10	6747	2275	1799	1799
q11	473	277	261	261
q12	331	356	211	211
q13	17811	3637	2977	2977
q14	229	221	208	208
q15	526	477	461	461
q16	412	427	365	365
q17	580	864	362	362
q18	7029	6418	6360	6360
q19	1210	957	547	547
q20	329	348	203	203
q21	2882	2102	1932	1932
q22	1080	1077	1006	1006
Total cold run time: 103972 ms
Total hot run time: 32464 ms
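
The Round 1 totals can be reproduced from the rows, assuming the first timing column is the cold run, the next two are hot runs, and the last column is the faster of the two hot runs. That column reading is inferred from the numbers rather than stated by the bot, and `Row`, `Totals`, and `sum_runs` are hypothetical names used only for this sketch.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// One benchmark row: query name, cold run, and two hot runs (milliseconds).
struct Row {
    const char* query;
    long cold;
    long hot1;
    long hot2;
};

struct Totals {
    long cold;  // sum of cold runs
    long hot;   // sum of the faster hot run per query
};

template <std::size_t N>
Totals sum_runs(const std::array<Row, N>& rows) {
    Totals t{0, 0};
    for (const Row& r : rows) {
        t.cold += r.cold;
        t.hot += std::min(r.hot1, r.hot2);
    }
    return t;
}

// Round 1 rows copied from the table above.
const std::array<Row, 22> kRound1 = {{
    {"q1", 17617, 5591, 5477},  {"q2", 2046, 278, 167},
    {"q3", 10441, 1256, 710},   {"q4", 10207, 875, 448},
    {"q5", 7655, 2366, 2140},   {"q6", 177, 168, 135},
    {"q7", 907, 741, 631},      {"q8", 9309, 1414, 1171},
    {"q9", 5974, 4893, 4945},   {"q10", 6747, 2275, 1799},
    {"q11", 473, 277, 261},     {"q12", 331, 356, 211},
    {"q13", 17811, 3637, 2977}, {"q14", 229, 221, 208},
    {"q15", 526, 477, 461},     {"q16", 412, 427, 365},
    {"q17", 580, 864, 362},     {"q18", 7029, 6418, 6360},
    {"q19", 1210, 957, 547},    {"q20", 329, 348, 203},
    {"q21", 2882, 2102, 1932},  {"q22", 1080, 1077, 1006},
}};
```

Under that assumption, `sum_runs(kRound1)` gives 103972 ms cold and 32464 ms hot, matching the reported totals.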

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	min hot (ms)
q1	5563	5520	5731	5520
q2	236	327	226	226
q3	2268	2647	2298	2298
q4	1361	1748	1374	1374
q5	4389	4808	5032	4808
q6	166	161	128	128
q7	2093	1986	1793	1793
q8	2612	2802	2683	2683
q9	7265	7175	7104	7104
q10	2990	3276	2778	2778
q11	560	497	489	489
q12	667	755	595	595
q13	3329	3793	3189	3189
q14	286	290	275	275
q15	522	477	461	461
q16	439	490	434	434
q17	1259	1754	1236	1236
q18	7609	7497	7218	7218
q19	777	1126	1065	1065
q20	2003	2024	1918	1918
q21	5205	4909	4604	4604
q22	1139	1140	1021	1021
Total cold run time: 52738 ms
Total hot run time: 51217 ms

@doris-robot

TPC-DS: Total hot run time: 191651 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit dfeeb005950bf3e987b5097343810e954b39d6d2, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	min hot (ms)
query1	956	381	429	381
query2	6524	1957	1843	1843
query3	6712	220	220	220
query4	33757	24101	23591	23591
query5	4329	631	445	445
query6	260	193	179	179
query7	4617	488	313	313
query8	284	266	236	236
query9	9692	2595	2587	2587
query10	488	323	259	259
query11	18152	15462	15078	15078
query12	151	107	102	102
query13	1640	537	424	424
query14	9574	7487	7657	7487
query15	237	195	182	182
query16	8097	599	503	503
query17	1779	763	571	571
query18	2133	398	316	316
query19	229	177	162	162
query20	122	118	118	118
query21	208	140	105	105
query22	4637	4578	4554	4554
query23	35203	33811	33686	33686
query24	7342	2610	2621	2610
query25	514	471	429	429
query26	1205	294	167	167
query27	1985	472	344	344
query28	5145	2182	2148	2148
query29	756	579	463	463
query30	268	187	153	153
query31	979	919	785	785
query32	90	66	59	59
query33	523	361	309	309
query34	744	836	506	506
query35	768	800	728	728
query36	997	1023	940	940
query37	104	96	75	75
query38	3925	3894	3863	3863
query39	1460	1422	1435	1422
query40	206	119	108	108
query41	53	55	51	51
query42	119	105	106	105
query43	489	508	489	489
query44	1324	810	811	810
query45	184	176	173	173
query46	872	1038	682	682
query47	1888	1897	1854	1854
query48	417	435	356	356
query49	812	503	409	409
query50	672	682	430	430
query51	7174	7187	7131	7131
query52	108	107	93	93
query53	230	251	188	188
query54	546	555	494	494
query55	82	81	83	81
query56	282	278	267	267
query57	1221	1234	1151	1151
query58	250	223	230	223
query59	2886	3018	2986	2986
query60	301	294	289	289
query61	147	135	137	135
query62	811	720	670	670
query63	226	196	194	194
query64	4530	1038	626	626
query65	3324	3207	3182	3182
query66	1048	409	307	307
query67	15948	15621	15680	15621
query68	7197	843	550	550
query69	490	317	267	267
query70	1210	1115	1090	1090
query71	503	303	262	262
query72	5559	3748	3822	3748
query73	640	743	367	367
query74	10196	9349	8972	8972
query75	3184	3116	2646	2646
query76	3243	1168	777	777
query77	445	371	291	291
query78	10393	10504	9687	9687
query79	3019	893	603	603
query80	643	530	448	448
query81	499	261	223	223
query82	639	120	92	92
query83	165	165	145	145
query84	239	104	84	84
query85	801	369	309	309
query86	356	312	303	303
query87	4339	4237	4223	4223
query88	4988	2456	2405	2405
query89	413	345	299	299
query90	1891	190	188	188
query91	143	147	112	112
query92	64	56	51	51
query93	1695	931	552	552
query94	728	424	292	292
query95	338	279	268	268
query96	497	612	292	292
query97	3200	3297	3148	3148
query98	224	209	196	196
query99	1496	1412	1313	1313
Total cold run time: 292173 ms
Total hot run time: 191651 ms

@doris-robot

ClickBench: Total hot run time: 28.73 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit dfeeb005950bf3e987b5097343810e954b39d6d2, data reload: false

query	run 1, cold (s)	run 2, hot (s)	run 3, hot (s)
query1	0.03	0.03	0.03
query2	0.07	0.03	0.03
query3	0.23	0.07	0.06
query4	1.63	0.10	0.10
query5	0.53	0.52	0.52
query6	1.13	0.73	0.73
query7	0.02	0.01	0.01
query8	0.04	0.03	0.03
query9	0.57	0.50	0.50
query10	0.55	0.54	0.55
query11	0.14	0.09	0.10
query12	0.13	0.11	0.11
query13	0.61	0.61	0.60
query14	0.78	0.80	0.81
query15	0.84	0.82	0.82
query16	0.41	0.42	0.37
query17	1.02	1.00	1.01
query18	0.23	0.24	0.22
query19	1.88	1.86	1.88
query20	0.01	0.01	0.02
query21	15.42	0.93	0.62
query22	0.73	0.88	0.67
query23	15.05	1.41	0.58
query24	3.58	0.64	1.23
query25	0.13	0.10	0.12
query26	0.37	0.16	0.13
query27	0.05	0.05	0.05
query28	12.95	1.05	0.44
query29	12.60	3.95	3.26
query30	0.25	0.09	0.06
query31	2.83	0.60	0.38
query32	3.22	0.55	0.45
query33	3.01	3.02	3.04
query34	16.38	5.12	4.57
query35	4.61	4.54	4.56
query36	0.65	0.50	0.48
query37	0.09	0.06	0.06
query38	0.04	0.04	0.03
query39	0.03	0.02	0.02
query40	0.17	0.13	0.13
query41	0.08	0.02	0.02
query42	0.03	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 103.16 s
Total hot run time: 28.73 s

@doris-robot

BE UT Coverage Report

Increment line coverage 40.00% (2/5) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 45.48% (12723/27973)
Line Coverage 36.38% (113471/311934)
Region Coverage 34.00% (64946/191011)
Branch Coverage 31.05% (34093/109814)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 40.00% (2/5) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 76.36% (21000/27503)
Line Coverage 69.70% (216678/310883)
Region Coverage 67.70% (129818/191749)
Branch Coverage 61.22% (67532/110308)

@morningman morningman marked this pull request as ready for review August 30, 2025 17:40
@morningman morningman requested a review from morrySnow as a code owner August 30, 2025 17:40
@hubgeter hubgeter changed the title from "[fix](load)fix ingestion load error case cause be core." to "branch-3.1:[fix](load)fix ingestion load error case cause be core." Sep 4, 2025
@morrySnow morrySnow merged commit d5577ec into apache:branch-3.1 Sep 4, 2025
22 of 24 checks passed
hubgeter added a commit to hubgeter/doris that referenced this pull request Sep 11, 2025
…pache#55500)
morningman pushed a commit that referenced this pull request Sep 12, 2025
Related PR: #45937
branch-3.1: #55500
@morrySnow morrySnow mentioned this pull request Sep 22, 2025