-
Notifications
You must be signed in to change notification settings - Fork 75
[NSE-848] Optimize performance for Column2Row #908
Conversation
Thanks for opening a pull request! Could you open an issue for this pull request on Github Issues? https://github.com/oap-project/native-sql-engine/issues Then could you also rename commit message and pull request title in the following format?
See also: |
native-sql-engine/cpp/src/operators/columnar_to_row_converter.cc
Outdated
Show resolved
Hide resolved
const std::vector<int64_t>& GetOffsets() { return offsets_; } | ||
const std::vector<int64_t>& GetLengths() { return lengths_; } | ||
const std::vector<int32_t>& GetOffsets() { return offsets_; } | ||
const std::vector<int32_t, boost::alignment::aligned_allocator<int32_t, 32>>& |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the second template parameter with value 32 is misspelled.
boost::alignment::aligned_allocator<int32_t, 32>>
It should be 4?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's used here for 256 bit aligned.
gazelle_plugin/native-sql-engine/cpp/src/operators/columnar_to_row_converter_avx512.cc
Line 161 in 47af257
__m256i dst_length_8x = _mm256_loadu_si256((__m256i*)length_data); |
@zhixingheyi-tian here should be _mm256_loada_si256
not laodu
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your explanation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's used here for 256 bit aligned.
gazelle_plugin/native-sql-engine/cpp/src/operators/columnar_to_row_converter_avx512.cc
Line 161 in 47af257
__m256i dst_length_8x = _mm256_loadu_si256((__m256i*)length_data); @zhixingheyi-tian here should be
_mm256_loada_si256
not laodu
Yes, updated in #937.
Use _mm256_load_si256 instead of _mm256_loadu_si256.
What changes were proposed in this pull request?
Optimize C2R performance:
How was this patch tested?
(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)