[ARM64] Incorrect HFA/HVA property calculation #35144
Comments
@AntonLapounov and I spent a lot of time discussing this earlier this week. I believe his assertions above are correct. It should also be noted that @AntonLapounov compared the VC++ compiler's and our JIT's handling of doubles and vector types to confirm our analysis. |
@CarolEidt needs to be aware of this. |
For the following case:

```cs
// Correctly treated as HVA(simd16): passed in q0, q1, q2
struct S3 { Vector128<byte> x; WrappedVector256 y; }
```
|
Thanks @davidwrighton - I saw this. This is unfortunate, as it will require a more complex interaction between the VM and the JIT. I suspect it needs some VM/JIT collaboration to address. |
For reference, the C++ definitions below are equivalent to the cases above:

```cpp
#include <arm_neon.h>

// HVA(__n64): passed in d0, d1 registers
struct S1 {
    struct { uint8x8_t z; } x;
    uint8x8_t y;
};

// Non-HFA/HVA: passed in x0, x1 registers
struct S2 {
    struct { uint8x8_t z; } x;
    double y;
};

void Foo(S1 x) { }
void Foo(S2 x) { }
```
|
Yes, but that might not be future-proof if one day Neon registers are extended to 256 bits. |
Possibly true, although we have Arm64 feature bits to control which architecture we are targeting.
It is unlikely. The next-generation SIMD extension for ARM is the Scalable Vector Extension (SVE). It generalizes the handling of vectors and is designed to allow hardware to add support for longer vectors (up to 2048 bits). Extending the Neon registers would require new instructions, as Neon encodes the vector length in the instruction. |
@sdmaclea Sounds reasonable. Anyway, we still have a bug: the two structs below are handled differently at present. That is caused by ignoring the size of the nested struct:

```cs
// Treated as HVA(simd16): passed by value in q0, q1, q2
struct S3 { Vector128<byte> x; WrappedVector256 y; }

// Treated as non-HVA: passed by reference in x0
struct S4 { Vector128<byte> x; Vector256<byte> y; }
```
|
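The classification rule being discussed can be sketched as a recursive walk over fields. This is a hypothetical, simplified model for illustration, not the actual `EEClass::CheckForHFA` code: an aggregate qualifies as an HFA/HVA only when every field, recursing through nested structs, has the same element kind. Comparing sizes alone is not enough, since a simd8 vector and a `double` are both 8 bytes.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sketch (not the actual EEClass::CheckForHFA code): an
// aggregate is an HFA/HVA only if every field, recursing through nested
// structs, has the same element kind. A simd8 vector and a double are
// both 8 bytes, so the element kind must be compared, not just the size.
enum class Elem { None, Float, Double, Simd8, Simd16 };

struct Type {
    Elem elem = Elem::None;    // element kind for leaf (non-struct) types
    std::vector<Type> fields;  // non-empty for struct types
    bool isStruct() const { return !fields.empty(); }
};

// Returns the common element kind, or Elem::None if not an HFA/HVA.
Elem Classify(const Type& t, Elem seen = Elem::None) {
    for (const Type& f : t.fields) {
        Elem cur = f.isStruct() ? Classify(f, seen) : f.elem;
        if (cur == Elem::None) return Elem::None;  // nested mismatch
        if (seen != Elem::None && cur != seen) return Elem::None;
        seen = cur;
    }
    return seen;
}
```

Under this rule, a struct shaped like `S1` above (`{{simd8}, simd8}`) classifies as HVA(simd8), while `S2` (`{{simd8}, double}`) is correctly rejected.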
Calling conventions for Vector256 are quite complex. For instance, on X64 the appropriate calling convention changes based on whether or not the architecture has AVX support. I'm slowly working on a fix to make the runtime/JIT handle this correctly, but it's slow going. |
* Fix HFA/HVA classification Fix #35144
While reviewing changes in dotnet/coreclr#23675, I noticed that the code added to `EEClass::CheckForHFA` does not handle wrapped `Vector64`, `Vector128`, and `Vector256` intrinsic types correctly. For instance, it does not distinguish a wrapped `Vector64` and a `double`. Moreover, the `elemSize` check is skipped for wrapped `Vector128` and `Vector256` types. As a result, the HFA/HVA property may be calculated incorrectly. For instance (here `{ }` denotes a struct):

* `{{Vector64}, Vector64}` is incorrectly treated as non-HVA.
* `{{Vector64}, double}` is incorrectly treated as HFA(double).
* `{Vector128, {Vector256}}` is incorrectly treated as HVA(simd16).

The code in question:

runtime/src/coreclr/src/vm/class.cpp
Lines 1345 to 1381 in ce416f4

The repro program (set `COMPlus_JITDisasm=C::*` to see HFA/HVA properties and registers used):

@CarolEidt @echesakovMSFT @sdmaclea
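The skipped `elemSize` check can be illustrated with a hypothetical helper (an illustration of the rule, not runtime code): an HFA/HVA candidate of N elements of a given element size must occupy exactly N * elemSize bytes, with N between 1 and 4. A struct containing a `Vector128` plus a wrapped 32-byte `Vector256` is 48 bytes, which cannot be a whole number of consistent 16-byte elements passing this check as two, so `{Vector128, {Vector256}}` must not be classified as HVA(simd16).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not runtime code) illustrating the size-consistency
// rule: an HFA/HVA of elemCount elements of elemSize bytes each must be
// exactly elemCount * elemSize bytes, with 1 to 4 elements.
bool IsValidHfaShape(size_t structSize, size_t elemCount, size_t elemSize) {
    return elemCount >= 1 && elemCount <= 4 &&
           structSize == elemCount * elemSize;
}
```

For example, two `Vector128` fields (32 bytes, two simd16 elements) pass, while `Vector128` plus a wrapped `Vector256` (48 bytes) fails as two simd16 elements.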