@hubgeter
Contributor

@hubgeter hubgeter commented Dec 1, 2025

What problem does this PR solve?

Problem Summary:
This PR fixes a failure in load statements of the following form:

LOAD LABEL labelx
(
    DATA INFILE("s3://bucket/load/k1=10/k3=30/test.parquet")
    INTO TABLE tableName
    FORMAT AS "parquet"
    (k2)
    COLUMNS FROM PATH AS (k1,k3)
)
WITH S3
( xxx)

The load fails when a column named k1 or k3 (the columns taken from the path) also exists in the test.parquet file.

Error:

type:LOAD_RUN_FAIL; msg:errCode = 2, detailMessage = (127.0.0.1)[E-7412]
assert cast err:[E-7412] Bad cast from type:doris::vectorized::ColumnVector<(doris::PrimitiveType)5> to doris::vectorized::ColumnStr<unsigned int>

        0#  doris::Exception::Exception(int, std::basic_string_view<char, std::char_traits<char> > const&, bool) at /mnt/disk1/changyuwei/ldb_toolchain/bin/../lib/gcc/x86_64-pc-linux-gnu/15/include/g++-v15/bits/unique_ptr.h:193
        1#  doris::Exception::Exception(doris::Status const&) at /mnt/disk1/changyuwei/doris/be/src/common/exception.h:39
        2#  doris::vectorized::ColumnStr<unsigned int>& assert_cast<doris::vectorized::ColumnStr<unsigned int>&, (TypeCheckOnRelease)1, doris::vectorized::IColumn&>(doris::vectorized::IColumn&)::{lambda(auto:1&&)#1}::operator()<doris::vectorized::IColumn&>(doris::vectorized::IColumn&) const at /mnt/disk1/changyuwei/doris/be/src/vec/common/assert_cast.h:0
        3#  doris::vectorized::ColumnStr<unsigned int>& assert_cast<doris::vectorized::ColumnStr<unsigned int>&, (TypeCheckOnRelease)1, doris::vectorized::IColumn&>(doris::vectorized::IColumn&) at /mnt/disk1/changyuwei/doris/be/src/vec/common/assert_cast.h:72
        4#  doris::vectorized::DataTypeStringSerDeBase<doris::vectorized::ColumnStr<unsigned int> >::deserialize_one_cell_from_json(doris::vectorized::IColumn&, doris::Slice&, doris::vectorized::DataTypeSerDe::FormatOptions const&) const at /mnt/disk1/changyuwei/doris/be/src/vec/data_types/serde/data_type_string_serde.cpp:108
        5#  doris::vectorized::DataTypeNullableSerDe::deserialize_one_cell_from_json(doris::vectorized::IColumn&, doris::Slice&, doris::vectorized::DataTypeSerDe::FormatOptions const&) const at /mnt/disk1/changyuwei/doris/be/src/common/status.h:525
        6#  doris::vectorized::DataTypeNullableSerDe::deserialize_column_from_fixed_json(doris::vectorized::IColumn&, doris::Slice&, unsigned long, unsigned long*, doris::vectorized::DataTypeSerDe::FormatOptions const&) const at /mnt/disk1/changyuwei/doris/be/src/common/status.h:525
        7#  doris::vectorized::RowGroupReader::_fill_partition_columns(doris::vectorized::Block*, unsigned long, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, doris::SlotDescriptor const*>, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, doris::SlotDescriptor const*> > > > const&) at /mnt/disk1/changyuwei/doris/be/src/common/status.h:567
        8#  doris::vectorized::RowGroupReader::next_batch(doris::vectorized::Block*, unsigned long, unsigned long*, bool*) at /mnt/disk1/changyuwei/doris/be/src/common/status.h:525
        9#  doris::vectorized::ParquetReader::get_next_block(doris::vectorized::Block*, unsigned long*, bool*) at /mnt/disk1/changyuwei/doris/be/src/common/status.h:520
        10# doris::vectorized::FileScanner::_get_block_wrapped(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /mnt/disk1/changyuwei/doris/be/src/common/status.h:525
        11# doris::vectorized::FileScanner::_get_block_impl(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /mnt/disk1/changyuwei/doris/be/src/common/status.h:525
        12# doris::vectorized::Scanner::get_block(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /mnt/disk1/changyuwei/doris/be/src/common/status.h:525
        13# doris::vectorized::Scanner::get_block_after_projects(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /mnt/disk1/changyuwei/doris/be/src/vec/exec/scan/scanner.cpp:87
        14# doris::vectorized::ScannerScheduler::_scanner_scan(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>) at /mnt/disk1/changyuwei/doris/be/src/vec/exec/scan/scanner_scheduler.cpp:182
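
To make the failing scenario concrete, here is a minimal Python sketch (not Doris code) of how `COLUMNS FROM PATH` values could be derived from the file URI and how a name clash with the file's own columns can be detected. The function names `parse_path_partitions` and `split_columns`, and the assumed file schema `["k1", "k2"]`, are hypothetical and only for illustration. Path values are always text, which is consistent with the trace above, where `_fill_partition_columns` goes through the string SerDe and the cast to `ColumnStr` fails on a `ColumnVector`.

```python
from urllib.parse import urlparse


def parse_path_partitions(uri):
    """Collect key=value directory segments from a file URI.

    Values are kept as strings, because a path carries no type information.
    """
    segments = urlparse(uri).path.strip("/").split("/")
    partitions = {}
    for segment in segments[:-1]:  # the last segment is the file name
        key, sep, value = segment.partition("=")
        if sep and key:
            partitions[key] = value
    return partitions


def split_columns(path_columns, file_columns, uri):
    """Hypothetical planning step for COLUMNS FROM PATH AS (...).

    Returns the path-derived values plus the list of requested path columns
    that also exist inside the file (the situation described in this PR).
    """
    partitions = parse_path_partitions(uri)
    from_path = {}
    conflicts = []
    for name in path_columns:
        if name not in partitions:
            raise ValueError(f"column {name!r} not found in path {uri!r}")
        from_path[name] = partitions[name]
        if name in file_columns:
            conflicts.append(name)
    return from_path, conflicts


# Assumed file schema: the Parquet file itself contains k1 and k2.
values, clashes = split_columns(
    ["k1", "k3"],
    ["k1", "k2"],
    "s3://bucket/load/k1=10/k3=30/test.parquet",
)
print(values)   # {'k1': '10', 'k3': '30'}
print(clashes)  # ['k1']
```

In this sketch a clash is only reported; whether the reader should prefer the path value or the file value is a design decision that the sketch does not make.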

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Dec 1, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@hubgeter
Contributor Author

hubgeter commented Dec 1, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 34154 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 2fa74897de525b188f198175cc6a85007f654b94, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17632	5071	4998	4998
q2	2107	314	213	213
q3	10210	1349	730	730
q4	10237	932	334	334
q5	7609	2492	2140	2140
q6	210	177	138	138
q7	947	782	641	641
q8	9351	1409	1045	1045
q9	6922	5299	5342	5299
q10	6786	2208	1833	1833
q11	522	328	290	290
q12	340	366	232	232
q13	17782	3681	3043	3043
q14	234	241	221	221
q15	596	532	509	509
q16	892	879	844	844
q17	589	805	487	487
q18	7489	7281	7039	7039
q19	1096	967	604	604
q20	346	346	234	234
q21	3946	3284	2323	2323
q22	1003	983	957	957
Total cold run time: 106846 ms
Total hot run time: 34154 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	5019	5000	4987	4987
q2	322	390	306	306
q3	2149	2726	2291	2291
q4	1314	1757	1328	1328
q5	4213	4449	4723	4449
q6	222	174	132	132
q7	2118	2027	1871	1871
q8	2625	2497	2532	2497
q9	7549	7415	7617	7415
q10	2950	3249	2872	2872
q11	592	519	504	504
q12	727	757	612	612
q13	3636	3971	3491	3491
q14	295	314	282	282
q15	545	535	521	521
q16	920	911	896	896
q17	1174	1523	1455	1455
q18	7932	7824	7615	7615
q19	834	841	863	841
q20	2052	2079	1951	1951
q21	4979	4603	4147	4147
q22	1108	1053	964	964
Total cold run time: 53275 ms
Total hot run time: 51427 ms

@doris-robot

TPC-DS: Total hot run time: 182769 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 2fa74897de525b188f198175cc6a85007f654b94, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	1071	415	406	406
query2	6598	1200	1150	1150
query3	6741	238	225	225
query4	25809	23031	23047	23031
query5	5675	656	494	494
query6	359	247	249	247
query7	4666	526	306	306
query8	311	257	244	244
query9	8728	2657	2663	2657
query10	571	370	302	302
query11	15340	14954	15207	14954
query12	184	121	113	113
query13	1713	568	447	447
query14	10594	6175	6113	6113
query15	216	204	188	188
query16	7669	698	533	533
query17	1633	796	646	646
query18	2059	447	343	343
query19	234	220	184	184
query20	127	126	122	122
query21	224	136	114	114
query22	3823	3879	3886	3879
query23	32935	32207	32162	32162
query24	8271	2452	2414	2414
query25	658	554	507	507
query26	1253	295	173	173
query27	2688	508	356	356
query28	4324	2194	2166	2166
query29	845	664	527	527
query30	310	236	212	212
query31	814	717	646	646
query32	88	78	127	78
query33	598	394	338	338
query34	838	890	546	546
query35	786	831	743	743
query36	891	931	832	832
query37	130	123	90	90
query38	3839	3916	3829	3829
query39	1473	1406	1405	1405
query40	223	133	122	122
query41	70	64	65	64
query42	123	113	114	113
query43	446	456	422	422
query44	1314	759	755	755
query45	204	200	193	193
query46	941	1007	644	644
query47	1695	1737	1609	1609
query48	396	425	342	342
query49	811	508	411	411
query50	691	710	432	432
query51	3841	3936	3970	3936
query52	115	114	106	106
query53	253	261	191	191
query54	327	306	297	297
query55	103	91	88	88
query56	327	348	352	348
query57	1143	1189	1105	1105
query58	294	288	302	288
query59	2288	2427	2255	2255
query60	375	357	332	332
query61	167	162	204	162
query62	783	714	677	677
query63	235	201	193	193
query64	4497	1219	925	925
query65	4076	3999	3978	3978
query66	1072	441	341	341
query67	15090	14857	14777	14777
query68	8495	968	623	623
query69	544	343	318	318
query70	1112	1005	1019	1005
query71	527	352	321	321
query72	5806	4932	4815	4815
query73	692	578	352	352
query74	8786	8741	8656	8656
query75	3761	3047	2542	2542
query76	3840	1144	747	747
query77	822	415	341	341
query78	9553	9606	8798	8798
query79	1986	860	575	575
query80	654	588	506	506
query81	510	272	243	243
query82	506	142	114	114
query83	283	272	263	263
query84	263	111	99	99
query85	910	511	461	461
query86	343	293	307	293
query87	4039	4098	4036	4036
query88	3801	2314	2302	2302
query89	399	338	296	296
query90	1984	232	226	226
query91	174	176	139	139
query92	91	73	70	70
query93	1472	1025	667	667
query94	724	461	343	343
query95	503	440	417	417
query96	517	567	297	297
query97	2634	2718	2591	2591
query98	246	219	214	214
query99	1295	1399	1271	1271
Total cold run time: 274258 ms
Total hot run time: 182769 ms

@doris-robot

ClickBench: Total hot run time: 27.27 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 2fa74897de525b188f198175cc6a85007f654b94, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.06	0.04	0.04
query2	0.10	0.04	0.05
query3	0.27	0.09	0.09
query4	1.60	0.11	0.11
query5	0.27	0.27	0.25
query6	1.20	0.64	0.64
query7	0.04	0.03	0.02
query8	0.06	0.04	0.05
query9	0.58	0.52	0.50
query10	0.56	0.56	0.56
query11	0.16	0.11	0.12
query12	0.15	0.11	0.12
query13	0.63	0.60	0.61
query14	1.00	1.01	0.97
query15	0.83	0.80	0.80
query16	0.40	0.39	0.42
query17	1.07	1.08	1.04
query18	0.24	0.22	0.21
query19	1.89	1.81	1.81
query20	0.02	0.01	0.01
query21	15.45	0.27	0.14
query22	4.84	0.06	0.05
query23	16.14	0.27	0.10
query24	0.94	0.67	0.23
query25	0.11	0.06	0.05
query26	0.14	0.13	0.13
query27	0.07	0.06	0.04
query28	3.05	1.23	1.02
query29	12.58	3.87	3.19
query30	0.28	0.14	0.12
query31	2.82	0.62	0.39
query32	3.23	0.54	0.46
query33	3.07	3.08	3.13
query34	17.01	5.25	4.53
query35	4.51	4.55	4.62
query36	0.65	0.50	0.48
query37	0.10	0.07	0.07
query38	0.08	0.04	0.04
query39	0.05	0.03	0.03
query40	0.18	0.15	0.14
query41	0.09	0.04	0.03
query42	0.04	0.03	0.03
query43	0.04	0.03	0.04
Total cold run time: 96.6 s
Total hot run time: 27.27 s

@doris-robot

BE UT Coverage Report

Increment line coverage 60.00% (3/5) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.82% (18492/35009)
Line Coverage 38.35% (169339/441560)
Region Coverage 33.12% (131364/396609)
Branch Coverage 34.09% (56596/166021)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (5/5) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.66% (24598/34326)
Line Coverage 58.21% (256804/441166)
Region Coverage 53.37% (214237/401420)
Branch Coverage 54.91% (91661/166918)

Contributor

@morningman morningman left a comment

LGTM

@github-actions
Contributor

github-actions bot commented Dec 2, 2025

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved and reviewed labels Dec 2, 2025
@github-actions
Contributor

github-actions bot commented Dec 2, 2025

PR approved by anyone and no changes requested.

@morningman morningman merged commit 7bebd3d into apache:master Dec 2, 2025
30 of 32 checks passed
github-actions bot pushed a commit that referenced this pull request Dec 2, 2025
… already exists in file. (#58579)
yiguolei pushed a commit that referenced this pull request Dec 2, 2025
…path> column already exists in file. #58579 (#58621)

Cherry-picked from #58579

Co-authored-by: daidai <changyuwei@selectdb.com>
hubgeter added a commit to hubgeter/doris that referenced this pull request Dec 10, 2025
… already exists in file. (apache#58579)
nagisa-kunhah pushed a commit to nagisa-kunhah/doris that referenced this pull request Dec 14, 2025
… already exists in file. (apache#58579)
morrySnow pushed a commit that referenced this pull request Dec 15, 2025

Labels

approved (Indicates a PR has been approved by one committer), dev/3.0.x, dev/3.1.4-merged, dev/4.0.2-merged, reviewed
