ARM64 __popcnt intrinsics #4683

Closed
StephanTLavavej opened this issue May 22, 2024 · 0 comments · Fixed by #4695
Labels
ARM64 Related to the ARM64 architecture fixed Something works now, yay! performance Must go faster

Comments

@StephanTLavavej
Member

In VS 2022 17.11 Preview 1, MSVC-PR-530436 updated intrin0.inl.h to extend the previously x86/x64 __popcnt intrinsic family to ARM64:

__MACHINEX86_X64_ARM64(unsigned int __popcnt(unsigned int))
__MACHINEX86_X64_ARM64(unsigned short __popcnt16(unsigned short))
__MACHINEARM64_X64(unsigned __int64 __popcnt64(unsigned __int64))

It's unclear to me whether it would be simpler and/or faster to replace our existing ARM64 codepath with these intrinsics:

#if _HAS_NEON_INTRINSICS
_NODISCARD inline int _Arm64_popcount(const unsigned long long _Val) noexcept {
    const __n64 _Temp = neon_cnt(__uint64ToN64_v(_Val));
    return neon_addv8(_Temp).n8_i8[0];
}
#endif // _HAS_NEON_INTRINSICS
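
For comparison, here is a rough sketch of what an intrinsic-based replacement might look like. It is not the STL's actual code: the name popcount64_sketch is hypothetical, the MSVC branch assumes a toolset that declares __popcnt64 for ARM64 (VS 2022 17.11 Preview 1 or later, per the above), and the #else branch is a generic bit-twiddling fallback added only so the sketch compiles elsewhere. Whether this is actually faster than the NEON version would still need to be measured.

```cpp
#include <cstdint>

#if defined(_MSC_VER) && !defined(__clang__) && defined(_M_ARM64)
#include <intrin.h>
#endif

// Hypothetical helper for illustration; not part of the STL.
inline int popcount64_sketch(const std::uint64_t val) noexcept {
#if defined(_MSC_VER) && !defined(__clang__) && defined(_M_ARM64)
    // Assumption: the toolset declares __popcnt64 for ARM64
    // (VS 2022 17.11 Preview 1 or later).
    return static_cast<int>(__popcnt64(val));
#else
    // Portable fallback: sum bits in pairs, then nibbles, then bytes.
    std::uint64_t v = val;
    v = v - ((v >> 1) & 0x5555555555555555ull);
    v = (v & 0x3333333333333333ull) + ((v >> 2) & 0x3333333333333333ull);
    v = (v + (v >> 4)) & 0x0F0F0F0F0F0F0F0Full;
    // The multiply accumulates all byte sums into the top byte.
    return static_cast<int>((v * 0x0101010101010101ull) >> 56);
#endif
}
```

The appeal of this shape is that the ARM64 path would become a single intrinsic call with no __n64 round trip, leaving instruction selection to the compiler.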
