[fix](broker load)fix broker load fail when <column from path> column already exists in file. #58579
run buildall
TPC-H: Total hot run time: 34154 ms
TPC-DS: Total hot run time: 182769 ms
ClickBench: Total hot run time: 27.27 s
BE UT Coverage Report: Increment line coverage, Increment coverage report
BE Regression && UT Coverage Report: Increment line coverage, Increment coverage report
morningman
left a comment
LGTM
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
BE Regression && UT Coverage Report: Increment line coverage, Increment coverage report
… already exists in file. (#58579)

### What problem does this PR solve?

Problem Summary:

fix :

```
LOAD LABEL labelx
(
    DATA INFILE("s3://bucket/load/k1=10/k3=30/test.parquet")
    INTO TABLE tableName
    FORMAT AS "parquet"
    (k2)
    COLUMNS FROM PATH AS (k1,k3)
)
WITH S3 ( xxx)
```

and k1 or k3 column exists in test.parquet file.

error:

```
type:LOAD_RUN_FAIL; msg:errCode = 2, detailMessage = (127.0.0.1)[E-7412] assert cast err:[E-7412] Bad cast from type:doris::vectorized::ColumnVector<(doris::PrimitiveType)5> to doris::vectorized::ColumnStr<unsigned int>
0# doris::Exception::Exception(int, std::basic_string_view<char, std::char_traits<char> > const&, bool) at /mnt/disk1/changyuwei/ldb_toolchain/bin/../lib/gcc/x86_64-pc-linux-gnu/15/include/g++-v15/bits/unique_ptr.h:193
1# doris::Exception::Exception(doris::Status const&) at /mnt/disk1/changyuwei/doris/be/src/common/exception.h:39
2# doris::vectorized::ColumnStr<unsigned int>& assert_cast<doris::vectorized::ColumnStr<unsigned int>&, (TypeCheckOnRelease)1, doris::vectorized::IColumn&>(doris::vectorized::IColumn&)::{lambda(auto:1&&)#1}::operator()<doris::vectorized::IColumn&>(doris::vectorized::IColumn&) const at /mnt/disk1/changyuwei/doris/be/src/vec/common/assert_cast.h:0
3# doris::vectorized::ColumnStr<unsigned int>& assert_cast<doris::vectorized::ColumnStr<unsigned int>&, (TypeCheckOnRelease)1, doris::vectorized::IColumn&>(doris::vectorized::IColumn&) at /mnt/disk1/changyuwei/doris/be/src/vec/common/assert_cast.h:72
4# doris::vectorized::DataTypeStringSerDeBase<doris::vectorized::ColumnStr<unsigned int> >::deserialize_one_cell_from_json(doris::vectorized::IColumn&, doris::Slice&, doris::vectorized::DataTypeSerDe::FormatOptions const&) const at /mnt/disk1/changyuwei/doris/be/src/vec/data_types/serde/data_type_string_serde.cpp:108
5# doris::vectorized::DataTypeNullableSerDe::deserialize_one_cell_from_json(doris::vectorized::IColumn&, doris::Slice&, doris::vectorized::DataTypeSerDe::FormatOptions const&) const at /mnt/disk1/changyuwei/doris/be/src/common/status.h:525
6# doris::vectorized::DataTypeNullableSerDe::deserialize_column_from_fixed_json(doris::vectorized::IColumn&, doris::Slice&, unsigned long, unsigned long*, doris::vectorized::DataTypeSerDe::FormatOptions const&) const at /mnt/disk1/changyuwei/doris/be/src/common/status.h:525
7# doris::vectorized::RowGroupReader::_fill_partition_columns(doris::vectorized::Block*, unsigned long, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, doris::SlotDescriptor const*>, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, doris::SlotDescriptor const*> > > > const&) at /mnt/disk1/changyuwei/doris/be/src/common/status.h:567
8# doris::vectorized::RowGroupReader::next_batch(doris::vectorized::Block*, unsigned long, unsigned long*, bool*) at /mnt/disk1/changyuwei/doris/be/src/common/status.h:525
9# doris::vectorized::ParquetReader::get_next_block(doris::vectorized::Block*, unsigned long*, bool*) at /mnt/disk1/changyuwei/doris/be/src/common/status.h:520
10# doris::vectorized::FileScanner::_get_block_wrapped(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /mnt/disk1/changyuwei/doris/be/src/common/status.h:525
11# doris::vectorized::FileScanner::_get_block_impl(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /mnt/disk1/changyuwei/doris/be/src/common/status.h:525
12# doris::vectorized::Scanner::get_block(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /mnt/disk1/changyuwei/doris/be/src/common/status.h:525
13# doris::vectorized::Scanner::get_block_after_projects(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /mnt/disk1/changyuwei/doris/be/src/vec/exec/scan/scanner.cpp:87
14# doris::vectorized::ScannerScheduler::_scanner_scan(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>) at /mnt/disk1/changyuwei/doris/be/src/vec/exec/scan/scanner_scheduler.cpp:182
```
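
To make the failing condition concrete, here is a minimal Python sketch of the name collision this PR describes: `k1` and `k3` are taken from the `key=value` directory segments of the load path, and the load breaks when the file itself also carries one of those names. This is an illustration only, not the Doris implementation and not the fix in this PR. The helper names `columns_from_path` and `conflicting_columns` are hypothetical, and the file schema below is an assumed example.

```python
from urllib.parse import urlparse


def columns_from_path(uri, wanted):
    """Extract key=value directory segments from a load path (hypothetical helper)."""
    # Drop the last segment, which is the file name rather than a directory.
    segments = urlparse(uri).path.strip("/").split("/")[:-1]
    found = {}
    for segment in segments:
        key, sep, value = segment.partition("=")
        if sep and key in wanted:
            found[key] = value  # path values are always strings
    missing = [name for name in wanted if name not in found]
    if missing:
        raise ValueError(f"columns not found in path: {missing}")
    return found


def conflicting_columns(path_columns, file_schema):
    """Return path-derived column names that also exist in the file (hypothetical helper)."""
    return sorted(set(path_columns) & set(file_schema))


path_columns = columns_from_path(
    "s3://bucket/load/k1=10/k3=30/test.parquet", ["k1", "k3"]
)

# Assumed schema for illustration: the Parquet file also stores k1 as a non-string column.
file_schema = {"k1": "INT", "k2": "INT"}

print(path_columns)
print(conflicting_columns(path_columns, file_schema))
```

In this sketch `k1` is reported as conflicting. That matches the situation in the trace above, where `RowGroupReader::_fill_partition_columns` runs string deserialization against a column that is not a string column and the `assert_cast` fails.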
What problem does this PR solve?
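Problem Summary:
Fixes a broker load failure that occurs when a column named in `COLUMNS FROM PATH AS` already exists in the source file. For example, a `LOAD LABEL` statement that reads `s3://bucket/load/k1=10/k3=30/test.parquet` with `FORMAT AS "parquet"`, column list `(k2)`, and `COLUMNS FROM PATH AS (k1,k3)` fails when test.parquet itself contains a k1 or k3 column.
Error: `type:LOAD_RUN_FAIL; msg:errCode = 2, detailMessage = (127.0.0.1)[E-7412] assert cast err:[E-7412] Bad cast from type:doris::vectorized::ColumnVector<(doris::PrimitiveType)5> to doris::vectorized::ColumnStr<unsigned int>`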
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)