Fix ARM Neon compilation and MulAdd implementation #3

Open
wants to merge 1 commit into base: master

Conversation

Rogiel commented Dec 26, 2023

This fixes an incorrect implementation for FMA on NEON.

I also had issues compiling the library due to the enable_if constraint on the NEON f32 register constructor. Looking at the other implementations, they don't have the enable_if, but I wasn't entirely sure what its purpose was.

Here's a before and after for comparison:
image
image
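
For readers unfamiliar with why FMA wrappers are easy to get wrong on NEON: the comment above does not say what the bug was, so the following is only a sketch of one common pitfall, not necessarily the issue fixed in this PR. The x86 intrinsic `_mm_fmadd_ps(a, b, c)` computes `a * b + c`, while the NEON intrinsic `vfmaq_f32(a, b, c)` computes `a + b * c`, so the addend moves from the last argument to the first. The function names below are hypothetical scalar stand-ins so the example runs on any platform.

```cpp
#include <cmath>

// Scalar model of the x86 argument order: _mm_fmadd_ps(a, b, c) == a * b + c.
inline float FmaddX86Order(float a, float b, float c)
{
    return std::fma(a, b, c);
}

// Scalar model of the NEON argument order: vfmaq_f32(acc, x, y) == acc + x * y.
inline float FmaNeonOrder(float acc, float x, float y)
{
    return std::fma(x, y, acc);
}

// Hypothetical wrapper with the contract MulAdd(a, b, c) == a * b + c,
// built on the NEON-style primitive: the addend has to be passed first.
inline float MulAddOnNeon(float a, float b, float c)
{
    return FmaNeonOrder(c, a, b);
}

// What happens if the arguments are forwarded unchanged (assumed bug pattern):
// the result is a + b * c instead of a * b + c.
inline float MulAddForwardedUnchanged(float a, float b, float c)
{
    return FmaNeonOrder(a, b, c);
}
```

With `a = 2`, `b = 3`, `c = 4`, the correctly ordered wrapper gives `2 * 3 + 4 = 10`, while forwarding the arguments unchanged gives `2 + 3 * 4 = 14`.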

Auburn (Owner) commented Dec 29, 2023

Thanks. Did you run the tests on macOS ARM? I don't have a Mac device to test it myself.

Rogiel (Author) commented Dec 29, 2023

Yes, I ran FastNoise2 (NewFastSIMD branch) on a MacBook with the M1 CPU.

Auburn (Owner) commented Jan 3, 2024

The enable_if on the mask register is there to avoid ambiguous implicit conversions. The AVX512 m32 also uses it; without it there are compiler errors. What error are you getting with the enable_if?
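
As a rough illustration of the technique described above, here is a minimal sketch of how an `enable_if`-constrained constructor blocks implicit conversions. This is not FastSIMD's actual code: `NativeMask`, `Unconstrained`, and `Constrained` are hypothetical names, and `std::uint16_t` is only a stand-in for a native mask type that is an alias of an integer type.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for a native mask type that is a plain integer alias.
using NativeMask = std::uint16_t;

// Plain converting constructor: any arithmetic value (int, bool, float, ...)
// converts implicitly, which can make overloaded calls ambiguous.
struct Unconstrained
{
    Unconstrained(NativeMask v) : native(v) {}
    NativeMask native;
};

// Constrained constructor: template deduction sees the exact argument type,
// and enable_if removes the constructor unless it is exactly NativeMask.
struct Constrained
{
    template<typename T,
             typename = typename std::enable_if<std::is_same<T, NativeMask>::value>::type>
    Constrained(T v) : native(v) {}
    NativeMask native;
};

static_assert(std::is_convertible<int, Unconstrained>::value,
              "int silently converts to the unconstrained wrapper");
static_assert(!std::is_convertible<int, Constrained>::value,
              "int is rejected by the constrained wrapper");
static_assert(std::is_convertible<NativeMask, Constrained>::value,
              "the exact native type is still accepted");
```

The trade-off is that the constraint only helps when the intended native type is distinct from the other types being passed in; if the argument type at the call site does not match exactly, the constructor disappears and the compiler reports that no matching constructor exists.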
