Skip to content

re-implement crc32c merge algorithm with SIMD #701

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 22, 2025

Conversation

lihuiba
Copy link
Collaborator

@lihuiba lihuiba commented Jan 21, 2025

to reduce memory & cache footprint, as well as lines of code.

@lihuiba lihuiba requested review from beef9999 and Coldwings January 21, 2025 03:06
crc = crc32c(crc2, ptr[blksz_8 * 3 - 1] ^ t);
auto k = &crc32_merge_table_pclmulqdq[(blksz/8 - 1)*2];
__m128i c0 = {(long)crc}, c1 = {(long)crc1};
auto t = SSE::pclmulqdq<0x00>(c0, k) ^
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using

constexpr auto k = crc_merge_table_pclmulqdq[(blksz/8 - 1)*2];
__m128i c0 = {(long)crc}, c1 = {(long)crc1};
auto t = SSE::pclmulqdq<0x00>(c0, &k) ^ SSE::pclmulqdq<0x10>(c1, &k);

make k as a constexpr uint64_t, not a reference,
should be better showing k will be a compile time const and crc32_merge_table_pclmulqdq will be drop by optimization.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blksz_8 could also be constexpr

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using

constexpr auto k = crc_merge_table_pclmulqdq[(blksz/8 - 1)*2];
__m128i c0 = {(long)crc}, c1 = {(long)crc1};
auto t = SSE::pclmulqdq<0x00>(c0, &k) ^ SSE::pclmulqdq<0x10>(c1, &k);

make k as a constexpr uint64_t, not a reference, should be better showing k will be a compile time const and crc32_merge_table_pclmulqdq will be drop by optimization.

Seems crc_merge_table_pclmulqdq is a uint64_t table, but actually used as a uint128 table.
so constexpr is not a available choice.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It needs the address of the elements in the table (as uint64_t*), and loads 2 uint64_ts from that address, so we can not take the address of local variable k.

@lihuiba lihuiba merged commit 1ed0b2e into alibaba:main Jan 22, 2025
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants