Fix: Windows does not define __ARM_NEON #363

Merged: 3 commits, Sep 5, 2024

JVital2013
Contributor

On MSVC for ARM/ARM64, __ARM_NEON is not defined. However, Windows requires NEON on ARM, so it should be safe to assume that if CPU_FEATURES_ARCH_ANY_ARM is defined and we're in MSVC, NEON is also available.

There may be a better way to do this, but this patch has helped get my project running at full speed on Windows ARM64 machines and wanted to share. Thanks for this great library!
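To check the situation described above on a given toolchain, a small compile-time probe can help. This is only a sketch: the `PROBE_*` names and `probe_report_neon` are hypothetical and not part of the library, while `__ARM_NEON`, `_MSC_VER`, `_M_ARM` and `_M_ARM64` are the compiler-predefined macros the probe relies on.

```c
#include <assert.h>
#include <stdio.h>

/* 1 if the compiler itself advertises NEON through the ACLE macro. */
#if defined(__ARM_NEON)
#define PROBE_HAS_ARM_NEON_MACRO 1
#else
#define PROBE_HAS_ARM_NEON_MACRO 0
#endif

/* 1 if this is MSVC targeting 32-bit or 64-bit ARM. */
#if defined(_MSC_VER) && (defined(_M_ARM) || defined(_M_ARM64))
#define PROBE_IS_MSVC_ARM 1
#else
#define PROBE_IS_MSVC_ARM 0
#endif

/* Hypothetical combined flag: trust the macro, or assume NEON on MSVC/ARM. */
#define PROBE_ASSUME_NEON (PROBE_HAS_ARM_NEON_MACRO || PROBE_IS_MSVC_ARM)

/* Prints the three flags and returns the combined flag (0 or 1). */
int probe_report_neon(void) {
  printf("__ARM_NEON defined : %d\n", PROBE_HAS_ARM_NEON_MACRO);
  printf("MSVC targeting ARM : %d\n", PROBE_IS_MSVC_ARM);
  printf("assume NEON        : %d\n", PROBE_ASSUME_NEON);
  /* The combined flag can never be weaker than either input. */
  assert(PROBE_ASSUME_NEON >= PROBE_HAS_ARM_NEON_MACRO);
  assert(PROBE_ASSUME_NEON >= PROBE_IS_MSVC_ARM);
  return PROBE_ASSUME_NEON;
}
```

Calling `probe_report_neon()` from `main` shows, for the compiler in use, whether NEON is reported by the macro, assumed because of MSVC on ARM, or neither.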

However, Windows requires NEON to work, so assume it is there if built with MSVC
@@ -235,7 +235,7 @@
#endif // defined(CPU_FEATURES_ARCH_X86)

#if defined(CPU_FEATURES_ARCH_ANY_ARM)
-#if defined(__ARM_NEON)
+#if defined(__ARM_NEON) || defined(_MSC_VER)
Contributor

A few notes about Arm versions and Windows support:

Armv8-a and higher

NEON/AdvSIMD is MANDATORY for armv8-a and higher:

FP/SIMD must be implemented on all Armv8.0 implementations, but implementations targeting specialized markets may support the following combinations:

  • No NEON or floating-point.
  • Full floating-point and SIMD support with exception trapping.
  • Full floating-point and SIMD support without exception trapping.

Armv7

According to the ARM Architecture Reference Manual, ARMv7-A and ARMv7-R edition, section A1.4.1 (Instruction set architecture extensions):

Advanced SIMDv1:

  • It is an OPTIONAL extension to the ARMv7-A and ARMv7-R profiles

Hence, an Armv7 CPU may not support NEON instructions.

The likely target platform for 32-bit Windows on Arm is UWP (Universal Windows Platform). However, Microsoft has announced that Arm32 UWP is deprecated and will be removed:

Windows devices running on an Arm processor (for example, Snapdragon processors from Qualcomm) will no longer support AArch32 (Arm32). This change impacts Universal Windows Platform apps that presently target AArch32 (Arm32). Support for 32-bit Arm versions of applications will be removed in a future release of Windows 11. System binaries for ARM32 support (present in the sysarm32 folder) will also be removed. After this change, for the small number of applications affected, app features might be different and you might notice a difference in performance. Therefore, we recommend updating your targeted platforms to AArch64 (Arm64), which is supported on all Windows on Arm devices, as soon as possible in order to ensure your customers can continue to enjoy the best possible experience. Follow the guidance on this page to update your applications to AArch64 (Arm64).

ref:
https://www.microsoft.com/en-us/windows/windows-11-specifications?r=1#table3
https://learn.microsoft.com/en-us/windows/arm/arm32-to-arm64

Therefore, an AArch32-state application running on Windows on Armv8+ is unlikely.
Also, please note that we do not support Windows ARM, only ARM64.

So, I think it is safe to add this patch.

@gchatelet, @Mizux, any objections?
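For comparison, the Armv7 caveat discussed above could be encoded by assuming NEON under MSVC only for 64-bit ARM targets. This is a more conservative sketch and not what the patch does; the `SKETCH_*` names and `sketch_compiled_neon` are hypothetical, and only the compiler-predefined macros (`_M_ARM64`, `_M_ARM`, `__aarch64__`, `__arm__`, `_MSC_VER`, `__ARM_NEON`) are real.

```c
#include <assert.h>

/* Hypothetical names (SKETCH_ prefix) so nothing here looks like library API. */
#if defined(_M_ARM64) || defined(__aarch64__)
#define SKETCH_ARCH_AARCH64 1
#else
#define SKETCH_ARCH_AARCH64 0
#endif

#if defined(_M_ARM) || defined(__arm__)
#define SKETCH_ARCH_ARM32 1
#else
#define SKETCH_ARCH_ARM32 0
#endif

/* Trust __ARM_NEON everywhere; otherwise assume NEON only for MSVC + ARM64. */
#if defined(__ARM_NEON)
#define SKETCH_COMPILED_NEON 1
#elif defined(_MSC_VER) && SKETCH_ARCH_AARCH64
#define SKETCH_COMPILED_NEON 1
#else
#define SKETCH_COMPILED_NEON 0
#endif

/* Returns the compile-time NEON decision as a runtime value (0 or 1). */
int sketch_compiled_neon(void) {
  /* A single target cannot be both 32-bit and 64-bit ARM. */
  assert(!(SKETCH_ARCH_AARCH64 && SKETCH_ARCH_ARM32));
  return SKETCH_COMPILED_NEON;
}
```

With this variant, a 32-bit MSVC ARM build would report no NEON unless the compiler defines `__ARM_NEON`, which trades possible speed on such targets for not relying on the platform requirement.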

Contributor Author

Hello! Thanks for the positive review. It is a good point that NEON is technically not required on armv7, from an architecture perspective.

From my research, though, Windows on ARM does seem to require NEON to be present on armv7 despite the architecture's requirements.

Just my two cents on why I was comfortable with this code change. I welcome other thoughts!

Collaborator

LGTM for the rationale of the fix.
Can we rewrite it like the following? I think it would be clearer

// Note: MSVC targeting ARM does not define `__ARM_NEON` but Windows on ARM requires it.
// In that case we force NEON detection.
#if defined(__ARM_NEON) || (defined(CPU_FEATURES_COMPILER_MSC) && defined(CPU_FEATURES_ARCH_ANY_ARM))
#define CPU_FEATURES_COMPILED_ANY_ARM_NEON 1
#else
#define CPU_FEATURES_COMPILED_ANY_ARM_NEON 0
#endif
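To illustrate why a 0/1 compile-time flag of this shape matters to downstream code, here is a minimal sketch of a function that takes a NEON path when the flag is 1 and a scalar path otherwise. The `EXAMPLE_*` macros and `example_add_f32` are hypothetical stand-ins that only mirror the roles of the macros in the snippet above; how the library actually defines its compiler and architecture macros is an assumption here. The intrinsics `vld1q_f32`, `vaddq_f32` and `vst1q_f32` come from `<arm_neon.h>`.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins, defined here only so that the sketch compiles on its own. */
#if defined(_MSC_VER)
#define EXAMPLE_COMPILER_MSC
#endif
#if defined(__arm__) || defined(_M_ARM) || defined(__aarch64__) || defined(_M_ARM64)
#define EXAMPLE_ARCH_ANY_ARM
#endif

/* Same shape as the suggested check, using the stand-in names. */
#if defined(__ARM_NEON) || \
    (defined(EXAMPLE_COMPILER_MSC) && defined(EXAMPLE_ARCH_ANY_ARM))
#define EXAMPLE_COMPILED_NEON 1
#else
#define EXAMPLE_COMPILED_NEON 0
#endif

#if EXAMPLE_COMPILED_NEON
#include <arm_neon.h>
#endif

/* Writes a[i] + b[i] into out[i] for n floats. */
void example_add_f32(const float *a, const float *b, float *out, size_t n) {
  size_t i = 0;
  assert(n == 0 || (a != NULL && b != NULL && out != NULL));
#if EXAMPLE_COMPILED_NEON
  /* Vector path: four lanes per iteration. */
  for (; i + 4 <= n; i += 4) {
    vst1q_f32(out + i, vaddq_f32(vld1q_f32(a + i), vld1q_f32(b + i)));
  }
#endif
  /* Scalar path: the whole array without NEON, or the tail with it. */
  for (; i < n; ++i) {
    out[i] = a[i] + b[i];
  }
}
```

If the flag evaluates to 0 on a NEON-capable target, as with MSVC before this change, code written this way silently falls back to the scalar loop.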

Contributor Author

@JVital2013 JVital2013 Sep 5, 2024

I don't see any problem with this change; I changed it

@gchatelet
Collaborator

@JVital2013 do you mind clang-formatting the patch?

@JVital2013
Contributor Author

@gchatelet I fixed the formatting, sorry about that!

@gchatelet gchatelet merged commit 6aecde5 into google:main Sep 5, 2024
31 checks passed
@gchatelet gchatelet added the bug Something isn't working label Sep 5, 2024
@gchatelet gchatelet added this to the v0.10.0 milestone Sep 5, 2024
gchatelet added a commit that referenced this pull request Sep 10, 2024
This is a fix for #363. The preprocessor check must be done in two steps.