M2 Compatibility #5

Closed
sfjohnson opened this issue Mar 8, 2023 · 5 comments

@sfjohnson

Hi,

On my M2 (2022 MacBook Air) I'm getting the following:

AMX_LDX: fail
AMX_LDY: fail
AMX_LDZ: pass
AMX_LDZI: pass
AMX_STX: pass
AMX_STY: pass
AMX_STZ: pass
AMX_STZI: pass
AMX_EXTRX: fail
AMX_EXTRY: fail
AMX_MAC16: pass
AMX_FMA16: pass
AMX_FMA32: pass
AMX_FMA64: pass
AMX_FMS16: pass
AMX_FMS32: pass
AMX_FMS64: pass
AMX_VECINT: fail
AMX_VECFP: fail
AMX_MATINT: pass
AMX_MATFP: fail
AMX_GENLUT: fail

This is with clang 14.0.0 on macOS 13.1. I could be doing something wrong here, or there might be a minor architectural difference between AMX on M1 and M2.
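For reference, the failing set can be pulled out of that output with a short script. This is only a minimal sketch in Python, assuming the test program prints one `NAME: pass` or `NAME: fail` line per instruction exactly as shown above; the helper name `failing_tests` is made up for illustration and is not part of the project.

```python
def failing_tests(output: str) -> list[str]:
    """Return the names of tests reported as 'fail', in order of appearance."""
    failed = []
    for line in output.splitlines():
        # Each line is expected to look like "AMX_LDX: fail".
        name, sep, status = line.partition(":")
        if sep and status.strip() == "fail":
            failed.append(name.strip())
    return failed


# Output copied verbatim from the M2 run above.
M2_OUTPUT = """\
AMX_LDX: fail
AMX_LDY: fail
AMX_LDZ: pass
AMX_LDZI: pass
AMX_STX: pass
AMX_STY: pass
AMX_STZ: pass
AMX_STZI: pass
AMX_EXTRX: fail
AMX_EXTRY: fail
AMX_MAC16: pass
AMX_FMA16: pass
AMX_FMA32: pass
AMX_FMA64: pass
AMX_FMS16: pass
AMX_FMS32: pass
AMX_FMS64: pass
AMX_VECINT: fail
AMX_VECFP: fail
AMX_MATINT: pass
AMX_MATFP: fail
AMX_GENLUT: fail
"""

if __name__ == "__main__":
    for name in failing_tests(M2_OUTPUT):
        print(name)
```

That gives 8 failures out of 22 tests: the X/Y loads, both extracts, both vector ops, MATFP and GENLUT.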

I'm going to investigate further to see if I can get everything to pass on M2, but first I was wondering whether any work has already been done on M2?

Thanks.

@corsix
Owner

corsix commented Mar 8, 2023

I've just rented an M2 machine in the cloud for a month, and I see the same thing. My money would be on "minor architectural difference"...

@corsix corsix self-assigned this Mar 8, 2023
@sfjohnson
Author

Ok cool, I'll explore a bit and let you know if I find anything!

@corsix
Owner

corsix commented Mar 10, 2023

Some of the changes seem to be:

  • extrh, extrv, vecfp, matfp, genlut gaining bf16 modes (the CPU cores also gain BF16 support in M2)
  • extrh and extrv gaining some f32 -> f16 (or bf16) mixed lane-width modes
  • ldx and ldy gaining a "load four at a time" mode
  • extrh, extrv, vecint, vecfp gaining support for operating on two or four vectors in a single instruction, rather than just a single vector
  • vecint and vecfp gaining some new ALU modes (albeit not particularly interesting modes, e.g. z = x * y, z = z + x, z = z + y)
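To make the bf16 items above concrete: bf16 has the same sign bit and 8-bit exponent as f32 but only 7 mantissa bits, so a bf16 value is the upper 16 bits of the corresponding f32 bit pattern. The following is a rough Python sketch of an f32 -> bf16 narrowing and the reverse widening. It is not taken from the emulation code; the function names are hypothetical, and round-to-nearest-even is an assumption here, since this thread does not say which rounding the hardware applies.

```python
import struct


def f32_to_bf16_bits(x: float) -> int:
    """Narrow an f32 value to a 16-bit bf16 pattern.

    Assumption: round-to-nearest-even; the rounding used by the
    hardware is not stated in this thread.
    """
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    is_nan = (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF) != 0
    if is_nan:
        # Force a mantissa bit on so the result stays a NaN after truncation.
        return ((bits >> 16) | 0x0040) & 0xFFFF
    # Add half of the discarded range, plus the lowest kept bit so ties go to even.
    bits += 0x7FFF + ((bits >> 16) & 1)
    return (bits >> 16) & 0xFFFF


def bf16_bits_to_f32(b: int) -> float:
    """Widen a bf16 bit pattern to f32 by appending 16 zero bits (exact)."""
    (x,) = struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))
    return x


if __name__ == "__main__":
    for value in (1.0, 3.14159, 1.00390625):
        b = f32_to_bf16_bits(value)
        print(f"{value!r} -> 0x{b:04X} -> {bf16_bits_to_f32(b)!r}")
```

Widening is always exact, while narrowing drops 16 mantissa bits, which is why only the f32 -> bf16 direction needs a rounding decision.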

@corsix
Owner

corsix commented Mar 12, 2023

Commits pushed to reflect the changes in the above comment, in both the documentation and the emulation code.

@corsix corsix closed this as completed Mar 12, 2023
@56789KD

56789KD commented Sep 13, 2024

Add SIMD comment
