-
Notifications
You must be signed in to change notification settings - Fork 12.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AArch64][FMV] Incorrect SVE detection on SVE2 cores with FMV #93651
Comments
@llvm/issue-subscribers-backend-aarch64 Author: Kinoshita Kotaro (kinoshita-fj)
# Problem
Compiling the following program using FMV with #include <stdio.h>
#include <stdlib.h>
__attribute__ ((target_version("sve")))
void func1(void){
printf("SVE\n");
}
__attribute__ ((target_version("default")))
void func1(void){
printf("not SVE\n");
}
int main(int argc, char **argv){
func1();
return 0;
} Expected behaviorOn SVE2 cores and SVE cores: The func1 with Actual behaviorOn SVE2 cores: The func1 with On SVE cores: The func1 with EnvironmentAt least Clang 18.1.2 and the latest main branch (commit b80d982). CauseThe most likely cause is that the conditions to determine SVE support are incorrect. They are used at
Possible FixCores that support SVE2 also support SVE. The code should be modified as follows. // ID_AA64PFR0_EL1.SVE != 0b0000
if (extractBits(ftr, 32, 4) != 0x0) {
// get ID_AA64ZFR0_EL1, that name supported
// if sve enabled only
getCPUFeature(S3_0_C0_C4_4, ftr);
- // ID_AA64ZFR0_EL1.SVEver == 0b0000
- if (extractBits(ftr, 0, 4) == 0x0)
setCPUFeature(FEAT_SVE);
// ID_AA64ZFR0_EL1.SVEver == 0b0001
if (extractBits(ftr, 0, 4) == 0x1)
setCPUFeature(FEAT_SVE2);
// ID_AA64ZFR0_EL1.BF16 != 0b0000
if (extractBits(ftr, 20, 4) != 0x0)
setCPUFeature(FEAT_SVE_BF16);
} A similar issue is posted in the ACLE, but it is not yet fixed. |
Agreed - one must never use equality when comparing CPUID registers since newer versions imply older versions, so the correct sequence would be like:
However this code should simply use the HWCAPS rather than trying to interpret CPUID registers incorrectly. |
Thank you for report, HWCAP detection is used in __init_cpu_features_constructor when available and corresponds to FMV feature. Seems we need to amend ACLE FMV specification first and then fix detection in runtime libraries. |
@ilinpv Thank you for your comment. I will wait for amending ACLE specification and runtime libraries. |
ACLE specification amendment on review ARM-software/acle#322 |
See also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115342 - this has now been fixed in GCC for all currently specified features. Additionally the initialization must be done atomically to avoid race conditions. |
To detect features we either use HWCAPs or directly extract system register bitfields and compare with a value. In many cases equality comparisons give wrong results for example FEAT_SVE is not set if SVE2 is available (see the issue llvm#93651). I am also making the access to __aarch64_cpu_features atomic. The corresponding PR for the ACLE specification is ARM-software/acle#322.
To detect features we either use HWCAPs or directly extract system register bitfields and compare with a value. In many cases equality comparisons give wrong results for example FEAT_SVE is not set if SVE2 is available (see the issue #93651). I am also making the access to __aarch64_cpu_features atomic. The corresponding PR for the ACLE specification is ARM-software/acle#322.
@labrinea I confirmed that SVE detection was fixed. Thank you. |
To detect features we either use HWCAPs or directly extract system register bitfields and compare with a value. In many cases equality comparisons give wrong results for example FEAT_SVE is not set if SVE2 is available (see the issue llvm#93651). I am also making the access to __aarch64_cpu_features atomic. The corresponding PR for the ACLE specification is ARM-software/acle#322.
Problem
Compiling the following program using FMV with
-rtlib=compiler-rt
and running it on cores supporting SVE2 results in the execution of the default function.This indicates that the
target_version("SVE")
attribute is not being correctly detected on SVE2 cores, because SVE2 contains SVE. However, the detection appears to be accurate on SVE-only cores.Expected behavior
On SVE2 cores and SVE cores:
The func1 with
target_version("sve")
should be executed. And it should printSVE
.Actual behavior
On SVE2 cores:
The func1 with
target_version("default")
is executed instead. And it printednot SVE
.On SVE cores:
The func1 with
target_version("sve")
is executed correctly. And it printedSVE
.Environment
At least Clang 18.1.2 and the latest main branch (commit b80d982).
Cause
The most likely cause is that the conditions to determine SVE support are incorrect.
They are used at
__init_cpu_features_constructor
function which is generated when FMV features.llvm-project/compiler-rt/lib/builtins/cpu_model/aarch64/fmv/mrs.inc
Possible Fix
Cores that support SVE2 also support SVE. The code should be modified as follows.
A similar issue is posted in the ACLE, but it is not yet fixed.
The text was updated successfully, but these errors were encountered: