[Fix](orc-reader) Fix StringRef nullptr data in orc-reader. #40857

Merged

Conversation

kaka11chen
Contributor

@kaka11chen kaka11chen commented Sep 14, 2024

Proposed changes

Issue

/var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:1046:9: runtime error: reference binding to null pointer of type 'doris::StringRef'
    #0 0x55ee63eb0418 in std::vector<doris::StringRef, std::allocator<doris::StringRef>>::operator[](unsigned long) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:1046:2
    #1 0x55ee63eb0418 in doris::Status doris::vectorized::OrcReader::_decode_string_non_dict_encoded_column<false>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, COW<doris::vectorized::IColumn>::mutable_ptr<doris::vectorized::IColumn> const&, orc::TypeKind const&, orc::EncodedStringVectorBatch*, unsigned long) /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/format/orc/vorc_reader.cpp:1172:39
    #2 0x55ee63ea2685 in doris::Status doris::vectorized::OrcReader::_decode_string_column<false>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, COW<doris::vectorized::IColumn>::mutable_ptr<doris::vectorized::IColumn> const&, orc::TypeKind const&, orc::ColumnVectorBatch*, unsigned long) /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/format/orc/vorc_reader.cpp:1124:16
    #3 0x55ee63e97e7a in doris::Status doris::vectorized::OrcReader::_fill_doris_data_column<false>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, COW<doris::vectorized::IColumn>::mutable_ptr<doris::vectorized::IColumn>&, std::shared_ptr<doris::vectorized::IDataType const> const&, orc::Type const*, orc::ColumnVectorBatch*, unsigned long) /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/format/orc/vorc_reader.cpp:1365:16
    #4 0x55ee63b0e450 in doris::Status doris::vectorized::OrcReader::_orc_column_to_doris_column<false>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, COW<doris::vectorized::IColumn>::immutable_ptr<doris::vectorized::IColumn>&, std::shared_ptr<doris::vectorized::IDataType const> const&, orc::Type const*, orc::ColumnVectorBatch*, unsigned long) /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/format/orc/vorc_reader.cpp:1532:5
    #5 0x55ee63e99622 in doris::Status doris::vectorized::OrcReader::_fill_doris_data_column<false>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, COW<doris::vectorized::IColumn>::mutable_ptr<doris::vectorized::IColumn>&, std::shared_ptr<doris::vectorized::IDataType const> const&, orc::Type const*, orc::ColumnVectorBatch*, unsigned long) /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/format/orc/vorc_reader.cpp:1410:9
    #6 0x55ee63b0e450 in doris::Status doris::vectorized::OrcReader::_orc_column_to_doris_column<false>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, COW<doris::vectorized::IColumn>::immutable_ptr<doris::vectorized::IColumn>&, std::shared_ptr<doris::vectorized::IDataType const> const&, orc::Type const*, orc::ColumnVectorBatch*, unsigned long) /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/format/orc/vorc_reader.cpp:1532:5
    #7 0x55ee63ad4f86 in doris::vectorized::OrcReader::get_next_block_impl(doris::vectorized::Block*, unsigned long*, bool*) /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/format/orc/vorc_reader.cpp:1714:13
    #8 0x55ee63ad093b in doris::vectorized::OrcReader::get_next_block(doris::vectorized::Block*, unsigned long*, bool*) /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/format/orc/vorc_reader.cpp:1547:5

Solution

[Fix](orc-reader) Fix StringRef nullptr data in orc-reader. When a string in an ORC row batch is empty, its data pointer can point anywhere and may be nullptr. StringRef has undefined behavior when its data is nullptr.

Related to #37845.
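
As a rough illustration of the kind of guard described above (a sketch, not the code merged in this PR): `StringRefSketch`, `empty_string_data`, `make_ref`, and `to_refs` are hypothetical names, and the parallel pointer/length arrays only approximate how an ORC string batch exposes its values. The assumption is that a zero-length entry may carry a null or meaningless pointer, so the pointer is replaced with a valid placeholder address before a view is built.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for doris::StringRef: a non-owning (pointer, length) view.
struct StringRefSketch {
    const char* data;
    std::size_t size;
};

// Shared, always-valid address used for every empty string.
inline const char* empty_string_data() {
    static const char kEmpty[1] = {'\0'};
    return kEmpty;
}

// Builds a view over (data, length). When length is zero the incoming pointer
// is ignored, because it may be null or stale; a valid placeholder is used.
inline StringRefSketch make_ref(const char* data, std::size_t length) {
    if (length == 0) {
        return {empty_string_data(), 0};
    }
    return {data, length};
}

// Converts parallel pointer/length arrays into views without ever storing nullptr.
inline std::vector<StringRefSketch> to_refs(const std::vector<const char*>& data,
                                            const std::vector<std::size_t>& lengths) {
    assert(data.size() == lengths.size());
    std::vector<StringRefSketch> refs;
    refs.reserve(data.size());
    for (std::size_t i = 0; i < data.size(); ++i) {
        refs.push_back(make_ref(data[i], lengths[i]));
    }
    return refs;
}
```

Substituting a placeholder address keeps later pointer arithmetic and copies well-defined for empty values, at the cost of one extra branch per row.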

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@kaka11chen
Contributor Author

run buildall

@kaka11chen kaka11chen force-pushed the fix_vorc_reader_string_ref_nullptr branch from f2752aa to fe3b88f Compare September 14, 2024 06:54
@kaka11chen
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 43254 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit fe3b88ffe7e0be6154e0cc5bb7bedac69e0bb402, data reload: false

------ Round 1 ----------------------------------
q1	17577	7300	7223	7223
q2	2032	182	183	182
q3	10556	1301	1440	1301
q4	10340	1074	1052	1052
q5	7724	3201	3169	3169
q6	243	156	152	152
q7	1041	641	611	611
q8	9448	2006	2041	2006
q9	6708	6337	6313	6313
q10	7055	2529	2588	2529
q11	435	249	255	249
q12	410	236	228	228
q13	17750	3042	3043	3042
q14	287	246	253	246
q15	572	529	532	529
q16	502	452	434	434
q17	1012	950	971	950
q18	7388	6915	6921	6915
q19	1386	1243	1231	1231
q20	613	335	333	333
q21	3933	3572	3563	3563
q22	1097	996	1003	996
Total cold run time: 108109 ms
Total hot run time: 43254 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7253	7205	7248	7205
q2	362	241	239	239
q3	3126	3149	3078	3078
q4	2134	2109	2026	2026
q5	5720	5596	5711	5596
q6	245	151	156	151
q7	2153	1813	1773	1773
q8	3380	3458	3445	3445
q9	8838	8912	8841	8841
q10	3523	3598	3596	3596
q11	598	484	506	484
q12	831	645	634	634
q13	9343	3234	3220	3220
q14	320	282	270	270
q15	583	528	552	528
q16	502	490	464	464
q17	1793	1745	1727	1727
q18	8369	7888	8136	7888
q19	1821	1767	1743	1743
q20	2110	1875	1891	1875
q21	5953	5650	5607	5607
q22	1202	1064	1033	1033
Total cold run time: 70159 ms
Total hot run time: 61423 ms

Contributor

@morningman morningman left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Sep 14, 2024
Contributor

PR approved by at least one committer and no changes requested.

Contributor

PR approved by anyone and no changes requested.

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.32% (9577/25662)
Line Coverage: 28.69% (79099/275720)
Region Coverage: 28.18% (40974/145410)
Branch Coverage: 24.80% (20883/84212)
Coverage Report: http://coverage.selectdb-in.cc/coverage/fe3b88ffe7e0be6154e0cc5bb7bedac69e0bb402_fe3b88ffe7e0be6154e0cc5bb7bedac69e0bb402/report/index.html

@yiguolei yiguolei force-pushed the fix_vorc_reader_string_ref_nullptr branch from fe3b88f to a514d7d Compare September 16, 2024 01:07
@yiguolei
Contributor

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.32% (9577/25662)
Line Coverage: 28.70% (79137/275740)
Region Coverage: 28.18% (40979/145429)
Branch Coverage: 24.80% (20884/84214)
Coverage Report: http://coverage.selectdb-in.cc/coverage/a514d7dea7816e0f3f14269a04972c3f274bed54_a514d7dea7816e0f3f14269a04972c3f274bed54/report/index.html

@doris-robot

TPC-H: Total hot run time: 41265 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit a514d7dea7816e0f3f14269a04972c3f274bed54, data reload: false

------ Round 1 ----------------------------------
q1	17581	7335	7265	7265
q2	2053	160	154	154
q3	10915	1087	1143	1087
q4	10582	789	771	771
q5	7770	3040	3051	3040
q6	229	145	146	145
q7	991	629	601	601
q8	9430	2038	2055	2038
q9	6812	6400	6439	6400
q10	6993	2322	2292	2292
q11	435	247	246	246
q12	398	211	211	211
q13	17785	2999	2964	2964
q14	240	217	219	217
q15	574	530	537	530
q16	488	425	414	414
q17	960	810	794	794
q18	7358	6767	6723	6723
q19	1389	1086	1001	1001
q20	582	305	286	286
q21	4016	3093	3186	3093
q22	1101	1041	993	993
Total cold run time: 108682 ms
Total hot run time: 41265 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7217	7468	7234	7234
q2	320	220	223	220
q3	2995	2948	2969	2948
q4	2010	1847	1869	1847
q5	5624	5691	5614	5614
q6	225	145	147	145
q7	2169	1810	1761	1761
q8	3307	3429	3426	3426
q9	8790	8909	8770	8770
q10	3546	3387	3458	3387
q11	579	492	472	472
q12	796	617	620	617
q13	9859	3211	3183	3183
q14	310	270	277	270
q15	559	522	531	522
q16	527	529	466	466
q17	1821	1562	1562	1562
q18	8264	7707	7855	7707
q19	1718	1614	1596	1596
q20	2117	1878	1878	1878
q21	5471	5272	5483	5272
q22	1154	1079	1029	1029
Total cold run time: 69378 ms
Total hot run time: 59926 ms

@doris-robot

TPC-DS: Total hot run time: 198422 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit a514d7dea7816e0f3f14269a04972c3f274bed54, data reload: false

query1	1274	928	850	850
query2	6327	2002	1977	1977
query3	10814	3926	3905	3905
query4	64212	29035	23486	23486
query5	5022	458	461	458
query6	401	181	157	157
query7	5467	314	293	293
query8	315	223	213	213
query9	8246	2597	2594	2594
query10	427	294	269	269
query11	17134	15190	15585	15190
query12	156	104	99	99
query13	1444	419	413	413
query14	10536	7488	7650	7488
query15	220	176	177	176
query16	6747	476	517	476
query17	1131	631	598	598
query18	1549	310	308	308
query19	208	151	158	151
query20	130	112	110	110
query21	208	101	102	101
query22	4664	4361	4784	4361
query23	34632	33992	34104	33992
query24	6020	2903	2885	2885
query25	509	400	406	400
query26	648	162	160	160
query27	1671	281	286	281
query28	3858	2146	2096	2096
query29	668	424	424	424
query30	231	151	149	149
query31	988	744	806	744
query32	76	53	55	53
query33	450	290	300	290
query34	914	498	478	478
query35	843	739	731	731
query36	1065	939	910	910
query37	142	83	84	83
query38	3979	3999	3929	3929
query39	1474	1398	1392	1392
query40	208	95	95	95
query41	49	46	47	46
query42	111	99	100	99
query43	525	492	492	492
query44	1167	807	797	797
query45	196	165	163	163
query46	1131	743	728	728
query47	1924	1790	1777	1777
query48	466	359	362	359
query49	702	412	390	390
query50	833	392	400	392
query51	7119	6989	6802	6802
query52	98	83	86	83
query53	249	181	178	178
query54	566	451	450	450
query55	73	72	76	72
query56	268	248	243	243
query57	1206	1054	1070	1054
query58	216	228	254	228
query59	3175	3021	2967	2967
query60	303	250	267	250
query61	102	99	110	99
query62	748	657	656	656
query63	210	182	181	181
query64	1359	639	659	639
query65	3269	3185	3143	3143
query66	679	310	298	298
query67	16126	15486	15369	15369
query68	1302	862	873	862
query69	437	349	342	342
query70	1201	1158	1176	1158
query71	344	326	323	323
query72	6124	3384	3350	3350
query73	589	568	574	568
query74	9331	9025	9003	9003
query75	2945	2918	2861	2861
query76	1149	856	845	845
query77	397	346	362	346
query78	9722	9335	9259	9259
query79	912	891	868	868
query80	580	556	560	556
query81	447	238	241	238
query82	191	193	193	193
query83	161	156	166	156
query84	309	101	93	93
query85	660	363	356	356
query86	309	314	310	310
query87	4481	4409	4387	4387
query88	4311	4002	3987	3987
query89	365	358	357	357
query90	1529	303	304	303
query91	160	159	157	157
query92	73	70	72	70
query93	917	890	891	890
query94	549	360	361	360
query95	421	418	439	418
query96	479	468	478	468
query97	3117	3105	3142	3105
query98	233	230	229	229
query99	1431	1299	1284	1284
Total cold run time: 303054 ms
Total hot run time: 198422 ms

@doris-robot

ClickBench: Total hot run time: 31.41 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit a514d7dea7816e0f3f14269a04972c3f274bed54, data reload: false

query1	0.04	0.05	0.05
query2	0.07	0.03	0.03
query3	0.23	0.06	0.06
query4	1.64	0.10	0.10
query5	0.52	0.52	0.51
query6	1.13	0.72	0.72
query7	0.02	0.01	0.02
query8	0.04	0.03	0.03
query9	0.56	0.50	0.48
query10	0.55	0.56	0.54
query11	0.15	0.10	0.10
query12	0.14	0.11	0.11
query13	0.61	0.58	0.57
query14	1.39	1.42	1.43
query15	0.84	0.81	0.82
query16	0.39	0.38	0.37
query17	1.05	1.05	1.02
query18	0.20	0.18	0.19
query19	1.88	1.87	1.74
query20	0.01	0.01	0.01
query21	15.39	0.60	0.60
query22	2.76	2.81	2.52
query23	17.16	0.81	0.90
query24	2.48	1.19	0.66
query25	0.34	0.09	0.09
query26	0.30	0.14	0.13
query27	0.04	0.04	0.03
query28	11.15	1.09	1.08
query29	12.56	3.28	3.29
query30	0.25	0.06	0.05
query31	2.88	0.37	0.38
query32	3.31	0.45	0.46
query33	2.97	3.04	3.00
query34	17.11	4.42	4.32
query35	4.41	4.39	4.44
query36	0.68	0.47	0.49
query37	0.08	0.06	0.06
query38	0.04	0.04	0.03
query39	0.03	0.02	0.02
query40	0.15	0.13	0.12
query41	0.08	0.02	0.02
query42	0.03	0.02	0.02
query43	0.03	0.03	0.03
Total cold run time: 105.69 s
Total hot run time: 31.41 s

@morningman morningman merged commit d5133be into apache:master Sep 18, 2024
22 of 26 checks passed
kaka11chen added a commit to kaka11chen/doris that referenced this pull request Sep 20, 2024
morningman pushed a commit that referenced this pull request Sep 20, 2024
kaka11chen added a commit to kaka11chen/doris that referenced this pull request Sep 25, 2024
morningman pushed a commit that referenced this pull request Sep 26, 2024
morningman pushed a commit that referenced this pull request Oct 21, 2024
…alues empty. (#42061)

## Proposed changes

Follow-up to #40857: fixes the StringRef nullptr data issue by checking
whether string_values is empty in the ORC reader.
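
A minimal sketch of what such an emptiness check can look like, assuming the failing frame in the trace above (`std::vector<doris::StringRef>::operator[]`) comes from indexing element 0 of an empty vector. `StringRefSketch`, `append_many`, and `append_all` are hypothetical names for illustration and are not taken from vorc_reader.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for doris::StringRef: a non-owning (pointer, length) view.
struct StringRefSketch {
    const char* data;
    std::size_t size;
};

// Hypothetical sink that consumes a contiguous array of views.
// Precondition: refs points at n valid elements.
inline std::size_t append_many(std::string& out, const StringRefSketch* refs, std::size_t n) {
    assert(refs != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        out.append(refs[i].data, refs[i].size);
    }
    return n;
}

// Forwards the whole vector to the sink. The early return avoids evaluating
// values[0] on an empty vector, which is undefined behavior and is the kind
// of access UBSan reports as "reference binding to null pointer".
inline std::size_t append_all(std::string& out, const std::vector<StringRefSketch>& values) {
    if (values.empty()) {
        return 0;
    }
    return append_many(out, &values[0], values.size());
}
```

With the guard in place, a batch that produces no string values becomes a no-op instead of reaching the indexing expression.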
morningman pushed a commit to morningman/doris that referenced this pull request Oct 21, 2024
morningman pushed a commit to morningman/doris that referenced this pull request Oct 21, 2024
morningman pushed a commit to morningman/doris that referenced this pull request Nov 21, 2024
Labels
approved Indicates a PR has been approved by one committer. dev/2.1.7-merged dev/3.0.2-merged reviewed

5 participants