@mrhhsg
Member

@mrhhsg mrhhsg commented Nov 18, 2025

What problem does this PR solve?

image
doris_be: /root/doris/thirdparty/installed/include/simdjson/generic/ondemand/json_iterator-inl.h:359: simdjson_result<std::string_view> simdjson::fallback::ondemand::json_iterator::unescape(raw_json_string, bool): Assertion `!parser->string_buffer_overflow(_string_buf_loc)' failed.
*** Query id: 24468c5d0b372cf0-3e6c777580238e96 ***
*** is nereids: 1 ***
*** tablet id: 0 ***
*** Aborted at 1763545679 (unix time) try "date -d @1763545679" if you are using GNU date ***
*** Current BE git commitID: cbc3d21884 ***
*** SIGABRT unknown detail explain (@0x3f8001618b8) received by PID 1448120 (TID 1450358 OR 0x7b0fb9f4e700) from PID 1448120; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /root/doris/be/src/common/signal_handler.h:420
 1# 0x00007F183C476D10 in /lib64/libpthread.so.0
 2# __GI_raise in /lib64/libc.so.6
 3# __GI_abort in /lib64/libc.so.6
 4# _nl_load_domain.cold.0 in /lib64/libc.so.6
 5# 0x00007F183BCC8E86 in /lib64/libc.so.6
 6# simdjson::fallback::ondemand::json_iterator::unescape(simdjson::fallback::ondemand::raw_json_string, bool) at /root/doris/thirdparty/installed/include/simdjson/generic/ondemand/json_iterator-inl.h:359
 7# simdjson::fallback::ondemand::raw_json_string::unescape(simdjson::fallback::ondemand::json_iterator&, bool) const at /root/doris/thirdparty/installed/include/simdjson/generic/ondemand/raw_json_string-inl.h:160
 8# simdjson::simdjson_result<simdjson::fallback::ondemand::raw_json_string>::unescape(simdjson::fallback::ondemand::json_iterator&, bool) const in /root/doris/be/output/lib/doris_be
 9# simdjson::fallback::ondemand::value_iterator::get_string(bool) at /root/doris/thirdparty/installed/include/simdjson/generic/ondemand/value_iterator-inl.h:514
10# simdjson::fallback::ondemand::value::get_string(bool) at /root/doris/thirdparty/installed/include/simdjson/generic/ondemand/value-inl.h:48
11# doris::Status doris::vectorized::NewJsonReader::_simdjson_write_data_to_column<false>(simdjson::fallback::ondemand::value&, std::shared_ptr<doris::vectorized::IDataType const> const&, doris::vectorized::IColumn*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::shared_ptr<doris::vectorized::DataTypeSerDe>, bool*) at /root/doris/be/src/vec/exec/format/json/new_json_reader.cpp:1080
12# doris::vectorized::NewJsonReader::_simdjson_write_columns_by_jsonpath(simdjson::fallback::ondemand::object*, std::vector<doris::SlotDescriptor*, std::allocator<doris::SlotDescriptor*> > const&, doris::vectorized::Block&, bool*) at /root/doris/be/src/vec/exec/format/json/new_json_reader.cpp:1460
13# doris::vectorized::NewJsonReader::_simdjson_handle_flat_array_complex_json_write_columns(doris::vectorized::Block&, std::vector<doris::SlotDescriptor*, std::allocator<doris::SlotDescriptor*> > const&, bool*, bool*) at /root/doris/be/src/vec/exec/format/json/new_json_reader.cpp:806
14# doris::vectorized::NewJsonReader::_simdjson_handle_flat_array_complex_json(doris::RuntimeState*, doris::vectorized::Block&, std::vector<doris::SlotDescriptor*, std::allocator<doris::SlotDescriptor*> > const&, bool*, bool*) at /root/doris/be/src/vec/exec/format/json/new_json_reader.cpp:753
15# doris::vectorized::NewJsonReader::_read_json_column(doris::RuntimeState*, doris::vectorized::Block&, std::vector<doris::SlotDescriptor*, std::allocator<doris::SlotDescriptor*> > const&, bool*, bool*) at /root/doris/be/src/vec/exec/format/json/new_json_reader.cpp:485
16# doris::vectorized::NewJsonReader::get_next_block(doris::vectorized::Block*, unsigned long*, bool*) at /root/doris/be/src/vec/exec/format/json/new_json_reader.cpp:209
17# doris::vectorized::FileScanner::_get_block_wrapped(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /root/doris/be/src/vec/exec/scan/file_scanner.cpp:480
18# doris::vectorized::FileScanner::_get_block_impl(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /root/doris/be/src/vec/exec/scan/file_scanner.cpp:417
19# doris::vectorized::Scanner::get_block(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /root/doris/be/src/vec/exec/scan/scanner.cpp:113
20# doris::vectorized::Scanner::get_block_after_projects(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /root/doris/be/src/vec/exec/scan/scanner.cpp:85
21# doris::vectorized::ScannerScheduler::_scanner_scan(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>) at /root/doris/be/src/vec/exec/scan/scanner_scheduler.cpp:182
22# doris::vectorized::ScannerScheduler::submit(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>)::$_0::operator()() const::{lambda()#1}::operator()() const::{lambda()#1}::operator()() const at /root/doris/be/src/vec/exec/scan/scanner_scheduler.cpp:96
23# doris::vectorized::ScannerScheduler::submit(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>)::$_0::operator()() const::{lambda()#1}::operator()() const at /root/doris/be/src/vec/exec/scan/scanner_scheduler.cpp:95
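The assertion in the trace above fires inside `json_iterator::unescape`, at the check on the parser's string buffer. One plausible way to reach it is unescaping the same JSON string more than once, so that each call consumes fresh buffer space. The sketch below is a toy model under that assumption, not simdjson code: `ToyParser`, `unescape_into_buffer`, and `read_same_field` are hypothetical names used only to illustrate how a bump-style buffer sized for a single pass runs out when one field is read repeatedly.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>
#include <vector>

// Toy model (hypothetical, not simdjson): a parser that copies every
// unescaped string into one fixed-capacity buffer and never rewinds
// the write position while a document is being read.
class ToyParser {
public:
    explicit ToyParser(std::size_t capacity) : _buf(capacity), _loc(0) {}

    // Copies `raw` into the buffer and returns a view of the copy,
    // or std::nullopt when the buffer has no room left. The trace above
    // shows an assertion at the analogous point in the real library.
    std::optional<std::string_view> unescape_into_buffer(std::string_view raw) {
        if (_loc + raw.size() > _buf.size()) {
            return std::nullopt;
        }
        char* dst = _buf.data() + _loc;
        raw.copy(dst, raw.size());
        _loc += raw.size();
        return std::string_view(dst, raw.size());
    }

    // Number of buffer bytes consumed so far.
    std::size_t used() const { return _loc; }

private:
    std::vector<char> _buf;
    std::size_t _loc;
};

// Unescapes the same field `times` times and reports whether every
// call found room in the buffer.
inline bool read_same_field(ToyParser& parser, std::string_view raw, int times) {
    for (int i = 0; i < times; ++i) {
        if (!parser.unescape_into_buffer(raw)) {
            return false;
        }
    }
    return true;
}
```

In this model a 10-byte buffer accepts a 4-byte field twice and rejects the third read, even though the field itself never changed. Reading each value once and reusing the result avoids the growth entirely.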

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@mrhhsg
Member Author

mrhhsg commented Nov 18, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 33954 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit cbc3d21884da9b98cd4d1fc943ef3da7efb551c0, data reload: false

------ Round 1 ----------------------------------
q1	17582	5044	4920	4920
q2	2068	311	200	200
q3	10249	1279	720	720
q4	10220	873	372	372
q5	7501	2380	2333	2333
q6	185	169	134	134
q7	911	773	617	617
q8	9327	1310	1070	1070
q9	6941	5080	5048	5048
q10	6862	2219	1828	1828
q11	500	308	290	290
q12	366	371	243	243
q13	17786	3693	3042	3042
q14	231	236	217	217
q15	584	508	500	500
q16	991	1023	933	933
q17	603	868	349	349
q18	7379	7190	6992	6992
q19	1100	969	582	582
q20	358	349	227	227
q21	3691	3213	2359	2359
q22	1068	1017	978	978
Total cold run time: 106503 ms
Total hot run time: 33954 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4993	5015	4975	4975
q2	254	328	235	235
q3	2230	2645	2327	2327
q4	1346	1752	1340	1340
q5	4201	4238	4487	4238
q6	210	171	132	132
q7	2039	1959	1829	1829
q8	2773	2574	2634	2574
q9	7188	7290	7245	7245
q10	3117	3256	2817	2817
q11	584	520	502	502
q12	710	762	633	633
q13	3703	4012	3340	3340
q14	278	303	280	280
q15	547	496	528	496
q16	1071	1103	1063	1063
q17	1157	1529	1382	1382
q18	7787	7794	7484	7484
q19	854	856	828	828
q20	2033	1941	1791	1791
q21	4669	4255	4356	4255
q22	1078	1076	1003	1003
Total cold run time: 52822 ms
Total hot run time: 50769 ms

@doris-robot

TPC-DS: Total hot run time: 188440 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit cbc3d21884da9b98cd4d1fc943ef3da7efb551c0, data reload: false

query1	1010	405	395	395
query2	6580	1673	1650	1650
query3	6752	228	235	228
query4	26060	23246	22950	22950
query5	4441	700	533	533
query6	370	254	238	238
query7	4672	502	315	315
query8	318	309	254	254
query9	8695	2996	2930	2930
query10	501	342	299	299
query11	15783	14925	15118	14925
query12	184	120	117	117
query13	1673	558	454	454
query14	10839	9158	9016	9016
query15	209	186	178	178
query16	7369	671	482	482
query17	1229	754	600	600
query18	1986	419	331	331
query19	218	208	190	190
query20	134	127	125	125
query21	219	139	117	117
query22	4101	4160	4026	4026
query23	34477	33062	32939	32939
query24	8233	2413	2396	2396
query25	648	570	503	503
query26	1246	277	169	169
query27	2737	509	368	368
query28	4415	2272	2241	2241
query29	846	671	530	530
query30	304	225	205	205
query31	922	830	731	731
query32	94	81	89	81
query33	605	401	356	356
query34	794	851	521	521
query35	836	869	751	751
query36	971	1021	930	930
query37	140	127	102	102
query38	3497	3522	3460	3460
query39	1491	1431	1412	1412
query40	239	146	126	126
query41	66	69	65	65
query42	135	122	120	120
query43	484	468	456	456
query44	1267	823	815	815
query45	193	185	176	176
query46	887	990	649	649
query47	1756	1803	1707	1707
query48	404	428	330	330
query49	777	547	437	437
query50	649	687	422	422
query51	3835	3892	3860	3860
query52	124	123	110	110
query53	272	279	206	206
query54	353	336	349	336
query55	93	93	89	89
query56	360	353	351	351
query57	1175	1206	1119	1119
query58	302	291	293	291
query59	2641	2679	2580	2580
query60	370	360	359	359
query61	164	161	155	155
query62	815	693	661	661
query63	224	199	193	193
query64	4480	1157	919	919
query65	4011	3971	3938	3938
query66	1155	448	351	351
query67	15043	14978	14972	14972
query68	8597	1018	644	644
query69	517	347	292	292
query70	1380	1243	1249	1243
query71	476	356	333	333
query72	5502	4910	4814	4814
query73	723	586	366	366
query74	8880	8923	8914	8914
query75	3945	3356	2820	2820
query76	3664	1129	725	725
query77	824	407	334	334
query78	9502	9696	8925	8925
query79	2110	874	616	616
query80	664	599	539	539
query81	492	272	226	226
query82	444	168	140	140
query83	270	269	262	262
query84	254	117	90	90
query85	945	502	455	455
query86	367	322	298	298
query87	3701	3679	3591	3591
query88	3112	2275	2232	2232
query89	390	338	307	307
query90	1959	230	239	230
query91	163	169	136	136
query92	89	80	75	75
query93	1218	1024	683	683
query94	721	423	335	335
query95	420	349	337	337
query96	491	586	284	284
query97	2902	2957	2867	2867
query98	245	220	216	216
query99	1315	1361	1270	1270
Total cold run time: 274155 ms
Total hot run time: 188440 ms

@doris-robot

ClickBench: Total hot run time: 27.83 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit cbc3d21884da9b98cd4d1fc943ef3da7efb551c0, data reload: false

query1	0.06	0.06	0.05
query2	0.09	0.05	0.04
query3	0.26	0.09	0.08
query4	1.60	0.11	0.12
query5	0.27	0.25	0.25
query6	1.20	0.65	0.64
query7	0.03	0.03	0.03
query8	0.06	0.04	0.04
query9	0.59	0.52	0.52
query10	0.58	0.57	0.57
query11	0.15	0.11	0.12
query12	0.16	0.12	0.12
query13	0.63	0.61	0.59
query14	1.02	0.98	0.99
query15	0.83	0.81	0.84
query16	0.40	0.39	0.38
query17	1.02	1.04	1.00
query18	0.22	0.20	0.20
query19	1.91	1.78	1.82
query20	0.02	0.02	0.01
query21	15.44	0.21	0.13
query22	4.92	0.06	0.04
query23	15.69	0.27	0.10
query24	2.67	0.87	0.95
query25	0.07	0.06	0.06
query26	0.15	0.13	0.14
query27	0.05	0.06	0.06
query28	5.05	1.17	0.95
query29	12.65	4.01	3.24
query30	0.29	0.14	0.11
query31	2.82	0.60	0.39
query32	3.24	0.54	0.47
query33	2.97	3.09	3.06
query34	15.60	5.15	4.52
query35	4.59	4.54	4.56
query36	0.66	0.51	0.49
query37	0.09	0.07	0.07
query38	0.07	0.04	0.04
query39	0.04	0.03	0.04
query40	0.18	0.15	0.13
query41	0.08	0.03	0.03
query42	0.04	0.03	0.03
query43	0.05	0.03	0.03
Total cold run time: 98.51 s
Total hot run time: 27.83 s

@doris-robot

BE UT Coverage Report

Increment line coverage 0.00% (0/10) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.73% (18270/34650)
Line Coverage 38.10% (166007/435703)
Region Coverage 33.02% (129026/390704)
Branch Coverage 33.81% (55387/163809)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (10/10) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.47% (24334/34050)
Line Coverage 57.93% (252819/436385)
Region Coverage 53.18% (210742/396286)
Branch Coverage 54.52% (89907/164892)


- std::string_view value_string = value.get_string();
+ const auto cache_key = value.raw_json().value();
+ std::string_view value_string;
+ if (_cached_string_values.contains(cache_key)) {
Member

A map lookup here will cause a performance issue, and this is the critical path. Maybe only modify `_simdjson_write_columns_by_jsonpath`.
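One way to read this suggestion is to keep the memoization out of the general write path and apply it only where the same JSON value can be requested more than once within a row. The sketch below is a hypothetical illustration, not the code merged in this PR: `RowStringCache`, `get_or_unescape`, and `strip_quotes` are invented names, and `strip_quotes` merely stands in for the library's once-only unescape call.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <string_view>
#include <unordered_map>

// Hypothetical per-row cache: maps the raw JSON text of a value to the
// string produced by the first (and only) unescape call for that text.
class RowStringCache {
public:
    // `unescape` stands in for the expensive, call-once library function.
    std::string_view get_or_unescape(
            std::string_view raw_json,
            const std::function<std::string(std::string_view)>& unescape) {
        std::string key(raw_json);
        auto it = _values.find(key);
        if (it == _values.end()) {
            ++_unescape_calls;
            it = _values.emplace(std::move(key), unescape(raw_json)).first;
        }
        // References to unordered_map elements stay valid across rehashes,
        // so the view remains usable until clear() is called.
        return it->second;
    }

    // Call once per row so cached views never outlive their row.
    void clear() { _values.clear(); }

    std::size_t unescape_calls() const { return _unescape_calls; }

private:
    std::unordered_map<std::string, std::string> _values;
    std::size_t _unescape_calls = 0;
};

// Stand-in for real unescaping: drops the surrounding double quotes.
inline std::string strip_quotes(std::string_view raw) {
    if (raw.size() >= 2 && raw.front() == '"' && raw.back() == '"') {
        raw = raw.substr(1, raw.size() - 2);
    }
    return std::string(raw);
}
```

Every lookup still hashes and copies the key, which is the cost the comment flags. Confining a cache like this to the jsonpath-based writer would leave the common path, where each value is read once, without that overhead.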

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (10/10) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.45% (24328/34050)
Line Coverage 57.93% (252785/436385)
Region Coverage 53.19% (210779/396286)
Branch Coverage 54.51% (89890/164892)

@mrhhsg
Member Author

mrhhsg commented Nov 19, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 34143 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 9c155eb64effca21b2ecc8f24e85a76dae23ba4c, data reload: false

------ Round 1 ----------------------------------
q1	17622	5089	4949	4949
q2	2040	305	223	223
q3	10257	1298	730	730
q4	10221	898	353	353
q5	7521	2395	2314	2314
q6	186	165	135	135
q7	915	758	616	616
q8	9343	1329	1015	1015
q9	7137	5436	5350	5350
q10	6802	2254	1820	1820
q11	482	299	283	283
q12	329	378	233	233
q13	17766	3643	3025	3025
q14	234	233	210	210
q15	571	501	489	489
q16	1009	1003	936	936
q17	574	869	356	356
q18	7376	7248	7020	7020
q19	1093	973	570	570
q20	369	333	222	222
q21	3744	3171	2310	2310
q22	1067	1037	984	984
Total cold run time: 106658 ms
Total hot run time: 34143 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4926	4939	4915	4915
q2	334	398	315	315
q3	2162	2649	2300	2300
q4	1318	1747	1315	1315
q5	4173	4348	4660	4348
q6	213	173	133	133
q7	2045	1932	1833	1833
q8	2644	2701	2588	2588
q9	7590	7631	7471	7471
q10	3080	3226	2825	2825
q11	581	527	522	522
q12	718	778	601	601
q13	3559	3891	3637	3637
q14	294	313	277	277
q15	534	516	514	514
q16	1050	1091	1061	1061
q17	1171	1561	1410	1410
q18	7897	7772	7685	7685
q19	798	897	1073	897
q20	1996	2048	1934	1934
q21	5082	4420	4256	4256
q22	1141	1036	994	994
Total cold run time: 53306 ms
Total hot run time: 51831 ms

@doris-robot

TPC-DS: Total hot run time: 188237 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 9c155eb64effca21b2ecc8f24e85a76dae23ba4c, data reload: false

query1	1032	445	403	403
query2	6565	1650	1693	1650
query3	6775	232	235	232
query4	26228	23293	22562	22562
query5	4397	673	515	515
query6	337	243	232	232
query7	4642	505	302	302
query8	298	263	250	250
query9	8638	2934	2954	2934
query10	499	387	303	303
query11	15535	15025	14802	14802
query12	177	122	119	119
query13	1671	566	451	451
query14	10393	9130	9173	9130
query15	207	195	176	176
query16	7419	647	514	514
query17	1232	781	625	625
query18	2003	450	357	357
query19	234	207	189	189
query20	135	128	131	128
query21	219	134	116	116
query22	4122	4218	4030	4030
query23	34112	33090	33139	33090
query24	8183	2366	2387	2366
query25	621	531	448	448
query26	1240	279	170	170
query27	2746	517	364	364
query28	4424	2250	2234	2234
query29	806	654	499	499
query30	296	224	201	201
query31	892	800	678	678
query32	99	83	82	82
query33	588	401	355	355
query34	787	865	533	533
query35	824	843	750	750
query36	952	1000	916	916
query37	131	111	104	104
query38	3547	3553	3432	3432
query39	1468	1407	1438	1407
query40	227	140	127	127
query41	71	64	63	63
query42	130	118	115	115
query43	463	497	470	470
query44	1254	804	807	804
query45	188	185	175	175
query46	887	998	643	643
query47	1787	1786	1755	1755
query48	391	424	333	333
query49	755	502	414	414
query50	641	686	401	401
query51	3869	3932	3929	3929
query52	117	116	114	114
query53	257	269	206	206
query54	333	315	332	315
query55	98	97	96	96
query56	359	354	346	346
query57	1202	1178	1104	1104
query58	305	293	288	288
query59	2552	2631	2523	2523
query60	382	379	365	365
query61	194	187	193	187
query62	805	717	670	670
query63	234	198	201	198
query64	4686	1303	1018	1018
query65	4016	3948	3943	3943
query66	1214	466	377	377
query67	15350	14915	15057	14915
query68	8016	943	648	648
query69	520	346	312	312
query70	1282	1277	1318	1277
query71	456	356	341	341
query72	6051	5059	4880	4880
query73	648	574	368	368
query74	9003	9150	8967	8967
query75	3315	3278	2738	2738
query76	3277	1161	741	741
query77	528	404	343	343
query78	9603	9718	8848	8848
query79	1857	860	610	610
query80	686	598	529	529
query81	505	258	235	235
query82	234	167	137	137
query83	277	271	247	247
query84	251	109	91	91
query85	888	500	456	456
query86	371	327	288	288
query87	3656	3708	3651	3651
query88	3519	2273	2251	2251
query89	395	326	289	289
query90	2022	242	236	236
query91	165	166	141	141
query92	98	78	76	76
query93	1820	1004	687	687
query94	729	424	336	336
query95	414	348	346	346
query96	485	592	280	280
query97	2912	2957	2858	2858
query98	254	230	228	228
query99	1310	1429	1259	1259
Total cold run time: 272788 ms
Total hot run time: 188237 ms

@doris-robot

ClickBench: Total hot run time: 27.33 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 9c155eb64effca21b2ecc8f24e85a76dae23ba4c, data reload: false

query1	0.05	0.05	0.05
query2	0.09	0.04	0.05
query3	0.26	0.08	0.08
query4	1.62	0.12	0.11
query5	0.26	0.25	0.26
query6	1.17	0.66	0.63
query7	0.04	0.03	0.02
query8	0.06	0.04	0.04
query9	0.59	0.53	0.51
query10	0.58	0.57	0.58
query11	0.16	0.11	0.10
query12	0.15	0.12	0.12
query13	0.63	0.61	0.61
query14	1.00	1.00	0.99
query15	0.84	0.84	0.83
query16	0.39	0.39	0.39
query17	1.01	0.99	1.02
query18	0.22	0.19	0.20
query19	1.86	1.73	1.86
query20	0.01	0.02	0.01
query21	15.47	0.21	0.14
query22	5.01	0.07	0.04
query23	15.66	0.25	0.11
query24	2.11	0.52	0.42
query25	0.08	0.07	0.06
query26	0.14	0.12	0.14
query27	0.07	0.06	0.06
query28	4.55	1.18	0.97
query29	12.59	3.84	3.20
query30	0.28	0.13	0.14
query31	2.81	0.57	0.38
query32	3.23	0.54	0.47
query33	2.98	3.07	3.10
query34	15.80	5.16	4.53
query35	4.58	4.53	4.56
query36	0.68	0.50	0.48
query37	0.09	0.06	0.06
query38	0.07	0.04	0.04
query39	0.04	0.03	0.03
query40	0.18	0.14	0.13
query41	0.08	0.03	0.03
query42	0.04	0.03	0.03
query43	0.04	0.04	0.04
Total cold run time: 97.57 s
Total hot run time: 27.33 s

@doris-robot

BE UT Coverage Report

Increment line coverage 0.00% (0/27) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.73% (18277/34660)
Line Coverage 38.10% (166073/435849)
Region Coverage 33.06% (129205/390857)
Branch Coverage 33.82% (55412/163851)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 88.89% (24/27) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.47% (24341/34060)
Line Coverage 57.95% (252977/436518)
Region Coverage 53.26% (211132/396433)
Branch Coverage 54.62% (90079/164930)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 88.89% (24/27) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.47% (24344/34060)
Line Coverage 57.96% (253004/436518)
Region Coverage 53.26% (211154/396433)
Branch Coverage 54.61% (90076/164930)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 88.89% (24/27) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.46% (24340/34060)
Line Coverage 57.94% (252915/436518)
Region Coverage 53.21% (210947/396433)
Branch Coverage 54.59% (90037/164930)

@github-actions github-actions bot added the `approved` label (Indicates a PR has been approved by one committer.) Nov 20, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

Member

@eldenmoon eldenmoon left a comment

LGTM

@yiguolei yiguolei merged commit adb8074 into apache:master Nov 20, 2025
26 of 28 checks passed
github-actions bot pushed a commit that referenced this pull request Nov 20, 2025
…ple calls to the get_string method (#58107)

### What problem does this PR solve?

<img width="2118" height="1222" alt="image"
src="https://github.com/user-attachments/assets/aeefb709-f59d-406e-824f-0c9b250bb5af"
/>

@mrhhsg mrhhsg deleted the fix_json_ub branch November 20, 2025 09:07
mrhhsg added a commit to mrhhsg/doris that referenced this pull request Nov 20, 2025
…ple calls to the get_string method (apache#58107)

<img width="2118" height="1222" alt="image"
src="https://github.com/user-attachments/assets/aeefb709-f59d-406e-824f-0c9b250bb5af"
/>

```text
doris_be: /root/doris/thirdparty/installed/include/simdjson/generic/ondemand/json_iterator-inl.h:359: simdjson_result<std::string_view> simdjson::fallback::ondemand::json_iterator::unescape(raw_json_string, bool): Assertion `!parser->string_buffer_overflow(_string_buf_loc)' failed.
*** Query id: 24468c5d0b372cf0-3e6c777580238e96 ***
*** is nereids: 1 ***
*** tablet id: 0 ***
*** Aborted at 1763545679 (unix time) try "date -d @1763545679" if you are using GNU date ***
*** Current BE git commitID: cbc3d21 ***
*** SIGABRT unknown detail explain (@0x3f8001618b8) received by PID 1448120 (TID 1450358 OR 0x7b0fb9f4e700) from PID 1448120; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /root/doris/be/src/common/signal_handler.h:420
 1# 0x00007F183C476D10 in /lib64/libpthread.so.0
 2# __GI_raise in /lib64/libc.so.6
 3# __GI_abort in /lib64/libc.so.6
 4# _nl_load_domain.cold.0 in /lib64/libc.so.6
 5# 0x00007F183BCC8E86 in /lib64/libc.so.6
 6# simdjson::fallback::ondemand::json_iterator::unescape(simdjson::fallback::ondemand::raw_json_string, bool) at /root/doris/thirdparty/installed/include/simdjson/generic/ondemand/json_iterator-inl.h:359
 7# simdjson::fallback::ondemand::raw_json_string::unescape(simdjson::fallback::ondemand::json_iterator&, bool) const at /root/doris/thirdparty/installed/include/simdjson/generic/ondemand/raw_json_string-inl.h:160
 8# simdjson::simdjson_result<simdjson::fallback::ondemand::raw_json_string>::unescape(simdjson::fallback::ondemand::json_iterator&, bool) const in /root/doris/be/output/lib/doris_be
 9# simdjson::fallback::ondemand::value_iterator::get_string(bool) at /root/doris/thirdparty/installed/include/simdjson/generic/ondemand/value_iterator-inl.h:514
10# simdjson::fallback::ondemand::value::get_string(bool) at /root/doris/thirdparty/installed/include/simdjson/generic/ondemand/value-inl.h:48
11# doris::Status doris::vectorized::NewJsonReader::_simdjson_write_data_to_column<false>(simdjson::fallback::ondemand::value&, std::shared_ptr<doris::vectorized::IDataType const> const&, doris::vectorized::IColumn*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::shared_ptr<doris::vectorized::DataTypeSerDe>, bool*) at /root/doris/be/src/vec/exec/format/json/new_json_reader.cpp:1080
12# doris::vectorized::NewJsonReader::_simdjson_write_columns_by_jsonpath(simdjson::fallback::ondemand::object*, std::vector<doris::SlotDescriptor*, std::allocator<doris::SlotDescriptor*> > const&, doris::vectorized::Block&, bool*) at /root/doris/be/src/vec/exec/format/json/new_json_reader.cpp:1460
13# doris::vectorized::NewJsonReader::_simdjson_handle_flat_array_complex_json_write_columns(doris::vectorized::Block&, std::vector<doris::SlotDescriptor*, std::allocator<doris::SlotDescriptor*> > const&, bool*, bool*) at /root/doris/be/src/vec/exec/format/json/new_json_reader.cpp:806
14# doris::vectorized::NewJsonReader::_simdjson_handle_flat_array_complex_json(doris::RuntimeState*, doris::vectorized::Block&, std::vector<doris::SlotDescriptor*, std::allocator<doris::SlotDescriptor*> > const&, bool*, bool*) at /root/doris/be/src/vec/exec/format/json/new_json_reader.cpp:753
15# doris::vectorized::NewJsonReader::_read_json_column(doris::RuntimeState*, doris::vectorized::Block&, std::vector<doris::SlotDescriptor*, std::allocator<doris::SlotDescriptor*> > const&, bool*, bool*) at /root/doris/be/src/vec/exec/format/json/new_json_reader.cpp:485
16# doris::vectorized::NewJsonReader::get_next_block(doris::vectorized::Block*, unsigned long*, bool*) at /root/doris/be/src/vec/exec/format/json/new_json_reader.cpp:209
17# doris::vectorized::FileScanner::_get_block_wrapped(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /root/doris/be/src/vec/exec/scan/file_scanner.cpp:480
18# doris::vectorized::FileScanner::_get_block_impl(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /root/doris/be/src/vec/exec/scan/file_scanner.cpp:417
19# doris::vectorized::Scanner::get_block(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /root/doris/be/src/vec/exec/scan/scanner.cpp:113
20# doris::vectorized::Scanner::get_block_after_projects(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /root/doris/be/src/vec/exec/scan/scanner.cpp:85
21# doris::vectorized::ScannerScheduler::_scanner_scan(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>) at /root/doris/be/src/vec/exec/scan/scanner_scheduler.cpp:182
22# doris::vectorized::ScannerScheduler::submit(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>)::$_0::operator()() const::{lambda()#1}::operator()() const::{lambda()#1}::operator()() const at /root/doris/be/src/vec/exec/scan/scanner_scheduler.cpp:96
23# doris::vectorized::ScannerScheduler::submit(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>)::$_0::operator()() const::{lambda()#1}::operator()() const at /root/doris/be/src/vec/exec/scan/scanner_scheduler.cpp:95
```
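
The trace aborts on a string-buffer overflow assertion inside `json_iterator::unescape`, reached through `value::get_string`, and the commit titles on this page attribute the crash to multiple calls to `get_string`. The sketch below is a toy model of that failure mode, not simdjson code: it assumes a fixed-size, append-only scratch buffer in which every unescape consumes fresh space. `ScratchBuffer`, `read_repeatedly`, and `read_cached` are hypothetical names introduced for illustration, and caching is only one possible mitigation, not necessarily the change made in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for a parser-owned scratch buffer: every unescape
// appends at the current write position and nothing is reclaimed.
class ScratchBuffer {
public:
    explicit ScratchBuffer(std::size_t capacity) : buf_(capacity), pos_(0) {}

    // Copies `raw` into the buffer and returns a view of the copy, or
    // std::nullopt when the copy would run past the end of the buffer.
    std::optional<std::string_view> unescape(std::string_view raw) {
        assert(pos_ <= buf_.size());  // invariant of this toy model
        if (raw.size() > buf_.size() - pos_) {
            return std::nullopt;  // models the overflow condition, as an error
        }
        char* dst = buf_.data() + pos_;
        raw.copy(dst, raw.size());
        pos_ += raw.size();
        return std::string_view(dst, raw.size());
    }

    // Number of bytes consumed so far.
    std::size_t used() const { return pos_; }

private:
    std::vector<char> buf_;
    std::size_t pos_;
};

// Reads the same field `times` times, unescaping it on every read.
// Returns how many reads succeeded before the buffer ran out of space.
std::size_t read_repeatedly(ScratchBuffer& buf, std::string_view field,
                            std::size_t times) {
    std::size_t ok = 0;
    for (std::size_t i = 0; i < times; ++i) {
        if (!buf.unescape(field)) break;
        ++ok;
    }
    return ok;
}

// Unescapes the field once, keeps an owning copy, and serves every later
// read from that copy, so buffer usage does not grow with `times`.
std::size_t read_cached(ScratchBuffer& buf, std::string_view field,
                        std::size_t times) {
    std::optional<std::string_view> first = buf.unescape(field);
    if (!first) return 0;
    const std::string cached(*first);
    std::size_t ok = 0;
    for (std::size_t i = 0; i < times; ++i) {
        if (cached == field) ++ok;
    }
    return ok;
}
```

With a 16-byte buffer and a 6-byte field, `read_repeatedly` succeeds twice and then fails on the third read, while `read_cached` serves all requested reads using only 6 bytes of the buffer.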

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

- Test <!-- At least one of them must be included. -->
    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason?  -->

- Behavior changed:
    - [ ] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [ ] No.
    - [ ] Yes. <!-- Add document PR link here. eg: apache/doris-website#1214 -->

Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into -->

yiguolei pushed a commit that referenced this pull request Nov 20, 2025

…sed by multiple calls to the get_string method #58107 (#58185)

Cherry-picked from #58107

Co-authored-by: Jerry Hu <hushenggang@selectdb.com>

morrySnow pushed a commit that referenced this pull request Nov 25, 2025

…sed by multiple calls to the get_string method #58107 (#58192)

picked from #58107

nagisa-kunhah pushed a commit to nagisa-kunhah/doris that referenced this pull request Dec 14, 2025

…ple calls to the get_string method (apache#58107)

Labels

approved Indicates a PR has been approved by one committer. dev/3.1.4-merged dev/4.0.2-merged reviewed
