understanding SIMD intrinsics #62

Closed
CeleritasCelery opened this issue Aug 28, 2020 · 2 comments

@CeleritasCelery

I read the blog post about fast byte counting and was reading through the code to better understand the SIMD intrinsics. I was mostly able to follow along, but I got confused at this part:

    counts = _mm256_sub_epi8(
        counts,
        _mm256_cmpeq_epi8(mm256_from_offset(haystack, offset), needles)
    );

I don't understand why we are subtracting the results of the compare instead of adding them. It seems like this will give incorrect results, but obviously doesn't, and I can't explain why. Could someone explain it to me?

@llogiq
Owner

llogiq commented Aug 28, 2020

The reason is that _mm256_cmpeq_epi8 sets each byte lane of its result to either all 1 bits or all 0 bits. An i8 with all bits set is -1, so subtracting the comparison result adds 1 to the counter in every lane where the bytes are equal.
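To make the sign trick concrete, here is a minimal scalar sketch of what happens in a single byte lane. It is not the crate's actual code: the helper names cmpeq_lane and count_by_subtracting_mask are made up for illustration, and the loop handles one byte at a time where the AVX2 version handles 32 lanes per instruction.

```rust
/// Scalar model of one byte lane of `_mm256_cmpeq_epi8` (hypothetical helper):
/// all bits set on a match, all bits clear otherwise.
fn cmpeq_lane(a: u8, b: u8) -> i8 {
    if a == b {
        -1 // bit pattern 0xFF
    } else {
        0 // bit pattern 0x00
    }
}

/// Hypothetical helper: counts matches the way a single SIMD lane would,
/// by subtracting the comparison mask from an 8-bit counter.
fn count_by_subtracting_mask(haystack: &[u8], needle: u8) -> u8 {
    let mut count: i8 = 0;
    for &byte in haystack {
        // Subtracting -1 adds 1; subtracting 0 leaves the counter unchanged.
        count = count.wrapping_sub(cmpeq_lane(byte, needle));
    }
    // Reinterpret the lane as unsigned to read the count.
    count as u8
}

fn main() {
    // An i8 with every bit set is -1.
    assert_eq!(-1i8 as u8, 0xFF);
    // counts - (-1) == counts + 1
    assert_eq!(0i8.wrapping_sub(-1), 1);

    let n = count_by_subtracting_mask(b"hello world", b'o');
    println!("{}", n); // prints 2
}
```

Because each lane is only 8 bits wide, a counter like this wraps around after 255 matches, so any real implementation has to widen or sum the per-lane counters before that can happen.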

@CeleritasCelery
Author

Thanks, that makes sense now.
