This repository has been archived by the owner on Sep 18, 2023. It is now read-only.

[NSE-848] Optimize performance for Column2Row #908

Merged
merged 29 commits into from
May 18, 2022

@zhixingheyi-tian (Collaborator) commented May 9, 2022

What changes were proposed in this pull request?

Optimize C2R performance:

  • Avoid branches to reduce branch mispredictions
  • Inline small functions
  • Use AVX2 and AVX-512 instructions
  • Prefetch data into the CPU cache
  • Make better use of the instruction cache
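
As a rough illustration of the first two bullets (this is not the code from the PR), the sketch below replaces a data-dependent branch with arithmetic on a 0/1 flag inside a small inline helper. The function names and the one-byte-per-row `valid` layout are assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: total byte length of the non-null values.
// Branchy version: the `if` depends on the data, so it can mispredict.
inline int64_t TotalLengthBranchy(const std::vector<int32_t>& lengths,
                                  const std::vector<uint8_t>& valid) {
  int64_t total = 0;
  for (std::size_t i = 0; i < lengths.size(); ++i) {
    if (valid[i]) {
      total += lengths[i];
    }
  }
  return total;
}

// Branch-free version: the validity flag is turned into 0 or 1 and used
// as a multiplier, so the loop body has no data-dependent branch.
inline int64_t TotalLengthBranchFree(const std::vector<int32_t>& lengths,
                                     const std::vector<uint8_t>& valid) {
  int64_t total = 0;
  for (std::size_t i = 0; i < lengths.size(); ++i) {
    const int64_t flag = (valid[i] != 0);  // 0 or 1
    total += static_cast<int64_t>(lengths[i]) * flag;
  }
  return total;
}
```

Both functions return the same result; an optimizing compiler may already emit branch-free code for the first form, so any benefit should be confirmed by measurement.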


github-actions bot commented May 9, 2022

Thanks for opening a pull request!

Could you open an issue for this pull request on GitHub Issues?

https://github.com/oap-project/native-sql-engine/issues

Then could you also rename the commit message and pull request title in the following format?

[NSE-${ISSUES_ID}] ${detailed message}

@zhouyuan zhouyuan merged commit 4b2a9df into oap-project:main May 18, 2022
const std::vector<int64_t>& GetOffsets() { return offsets_; }
const std::vector<int64_t>& GetLengths() { return lengths_; }
const std::vector<int32_t>& GetOffsets() { return offsets_; }
const std::vector<int32_t, boost::alignment::aligned_allocator<int32_t, 32>>&
@copperybean commented May 26, 2022

I think the second template parameter, 32, is a mistake:
boost::alignment::aligned_allocator<int32_t, 32>>
Should it be 4?

Collaborator

It's used here for 256-bit (32-byte) alignment:

__m256i dst_length_8x = _mm256_loadu_si256((__m256i*)length_data);

@zhixingheyi-tian, this should be _mm256_load_si256 (the aligned load), not _mm256_loadu_si256.

Thanks for your explanation.
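
To make the point about the alignment parameter concrete: it is given in bytes, so 32 means a 32-byte (256-bit) boundary, which is stricter than the 4-byte natural alignment of int32_t. The sketch below uses a small hypothetical `AlignedAllocator`, built on C++17 aligned `operator new`, as a stand-in for `boost::alignment::aligned_allocator` so that it has no Boost dependency. `IsAligned` is likewise a helper invented for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>
#include <vector>

// Hypothetical stand-in for boost::alignment::aligned_allocator<T, Alignment>.
// Alignment is in bytes and must be a power of two.
template <typename T, std::size_t Alignment>
struct AlignedAllocator {
  using value_type = T;

  AlignedAllocator() = default;
  template <typename U>
  AlignedAllocator(const AlignedAllocator<U, Alignment>&) noexcept {}

  // Explicit rebind is required because of the non-type template parameter.
  template <typename U>
  struct rebind {
    using other = AlignedAllocator<U, Alignment>;
  };

  T* allocate(std::size_t n) {
    return static_cast<T*>(
        ::operator new(n * sizeof(T), std::align_val_t(Alignment)));
  }
  void deallocate(T* p, std::size_t) noexcept {
    ::operator delete(p, std::align_val_t(Alignment));
  }
};

template <typename T, typename U, std::size_t A>
bool operator==(const AlignedAllocator<T, A>&,
                const AlignedAllocator<U, A>&) noexcept {
  return true;
}
template <typename T, typename U, std::size_t A>
bool operator!=(const AlignedAllocator<T, A>&,
                const AlignedAllocator<U, A>&) noexcept {
  return false;
}

// Returns true if p sits on an `alignment`-byte boundary.
inline bool IsAligned(const void* p, std::size_t alignment) {
  return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

With this allocator, `std::vector<int32_t, AlignedAllocator<int32_t, 32>>` always hands out storage whose `data()` pointer is 32-byte aligned, which is the precondition for an aligned 256-bit load.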

Collaborator Author

It's used here for 256-bit (32-byte) alignment:

__m256i dst_length_8x = _mm256_loadu_si256((__m256i*)length_data);

@zhixingheyi-tian, this should be _mm256_load_si256 (the aligned load), not _mm256_loadu_si256.

Yes, updated in #937.
Use _mm256_load_si256 instead of _mm256_loadu_si256.
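
For context on the two intrinsics: `_mm256_load_si256` requires its address to be 32-byte aligned, while `_mm256_loadu_si256` accepts any address. The following is a minimal sketch of the aligned form, not the code from #937. `SumEightLengths` is a hypothetical function, and the intrinsic path is compiled only when AVX2 is enabled (for example with `-mavx2`); otherwise a scalar fallback is used.

```cpp
#include <cassert>
#include <cstdint>
#if defined(__AVX2__)
#include <immintrin.h>
#endif

// Hypothetical example: sum eight 32-bit lengths stored at a
// 32-byte-aligned address.
inline int32_t SumEightLengths(const int32_t* length_data) {
  // The aligned load below is only valid on a 32-byte boundary.
  assert(reinterpret_cast<std::uintptr_t>(length_data) % 32 == 0);
#if defined(__AVX2__)
  // One aligned 256-bit load brings in all eight values.
  const __m256i v =
      _mm256_load_si256(reinterpret_cast<const __m256i*>(length_data));
  // Spill the lanes to an aligned buffer and add them up.
  alignas(32) int32_t lanes[8];
  _mm256_store_si256(reinterpret_cast<__m256i*>(lanes), v);
  int32_t sum = 0;
  for (int32_t x : lanes) sum += x;
  return sum;
#else
  // Scalar fallback when the build does not enable AVX2.
  int32_t sum = 0;
  for (int i = 0; i < 8; ++i) sum += length_data[i];
  return sum;
#endif
}
```

The aligned form is only safe because the backing vector uses a 32-byte aligned allocator; with a default allocator, `_mm256_loadu_si256` would be the safe choice.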
