-
Notifications
You must be signed in to change notification settings - Fork 287
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
std_detect: Use riscv_hwprobe on RISC-V Linux/Android #1762
Conversation
c29e969
to
41e579e
Compare
41e579e
to
3203fd5
Compare
// FIXME: e is not exposed in any of asm/hwcap.h, uapi/asm/hwcap.h, uapi/asm/hwprobe.h | ||
#[cfg(target_arch = "riscv32")] | ||
enable_feature( | ||
&mut value, | ||
Feature::rv32e, | ||
bit::test(auxv.hwcap, (b'e' - b'a').into()), | ||
); | ||
// FIXME: h is not exposed in uapi/asm/hwcap.h and uapi/asm/hwprobe.h | ||
enable_feature( | ||
&mut value, | ||
Feature::h, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess there is no chance for these to be enabled in the current Linux user mode, although...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would argue that all the supervisor features should probably just be removed from feature detection and from the compiler. They are not relevant for the language/compiler.
flags: libc::c_uint, | ||
) -> libc::c_long { | ||
unsafe { | ||
libc::syscall( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could use inline assembly for Linux here (like portable-atomic does), but since this module is currently behind the libc feature, and since std currently always depends on libc on all Linux targets, there is no need to complicate the code here.
3203fd5
to
1ece49d
Compare
That's pretty much the same as what I've testing and thanks for making progress earlier than me. |
We need to discuss about one topic before merging this: the fact that Linux supports disabling vector support per thread (current one / next context created by following The point is, support for vector extensions retrieved by a IMHO, the solution here would be, checking (on the first feature test call) whether vector extensions for the current thread are enabled by |
Given the structure of Rust/LLVM's target feature, I wonder if it is UB to call code that is not written in (inline or external) assembly in the time between disallowing vector instructions and re-allowing it. For example, if I compile code with |
My reading of https://docs.kernel.org/arch/riscv/vector.html is that it's mainly intended for system-level management of vector availability. They explicitly say that:
So we should just do that: hwcaps will reflect whether vector support is enabled for the current process. Note that hwprobe does not disable V if vector support is disabled for the process, so we need to use hwcaps in this case (or manually check with |
@Amanieu I confirmed that |
@taiki-e I think that's correct and I'm not worrying about static feature checking (if we use |
525a848
to
e26a29f
Compare
It reflects Vector enablement status, unlike hwprobe.
e26a29f
to
1474c1d
Compare
Updated PR (mostly based on feedback).
I thought that the use of hwprobe and prctl is probably more preferable because hwcap only supports detection of V extension, but hwprobe also supports detection of cases where only subset of V extension are available. However, qemu-user 9.2.1 does not seem to support that, so for now I implemented it using hwcap.
|
Let me give a time to test that (I'm struggling to prepare RISC-V 32-bit Linux environment with QEMU system emulation while preparing 64-bit counterpart worked well). |
(I tested this with qemu-riscv32 (user mode) using toolchain used in setup-cross-toolchain-action, and updated PR description to include the result.) |
Okay, finally I've built full Linux 6.14 + Buildroot + glibc RISC-V environments (32/64) + Rust toolchain and tested your code is working as expected (I could not accept your changes while no real Linux environment is tested). I will likely make some tidying but after you make most of the changes (because I found your Recommendation: Approval Sidenote: not just changing the |
// const RISCV_HWPROBE_EXT_SUPM: u64 = 1 << 49; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, Linux 6.15 will also add the following.
(Although we cannot include them in this PR because 6.15 is not yet released and could be changed.)
const RISCV_HWPROBE_EXT_ZICNTR: u64 = 1 << 50;
const RISCV_HWPROBE_EXT_ZIHPM: u64 = 1 << 51;
const RISCV_HWPROBE_EXT_ZFBFMIN: u64 = 1 << 52;
const RISCV_HWPROBE_EXT_ZVFBFMIN: u64 = 1 << 53;
const RISCV_HWPROBE_EXT_ZVFBFWMA: u64 = 1 << 54;
const RISCV_HWPROBE_EXT_ZICBOM: u64 = 1 << 55;
const RISCV_HWPROBE_EXT_ZAAMO: u64 = 1 << 56;
const RISCV_HWPROBE_EXT_ZALRSC: u64 = 1 << 57;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, I'm writing full riscv_hwprobe
feature probing (including OS-independent logic for stdarch
) and related feature sets (for Rust) and... I think that will exceed 93, the current limit of the 12-byte cache.
// FIXME: we can implement this by getting the current vlen | ||
// zvl*b: Minimum Vector Length Standard Extensions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Despite that I can allow this comment alone (so that you can keep up with this PR), I disagree (actually) adding all Zvl*b
extensions (in total of 12; *
being a power of two between 32
and 65536
inclusive) for being too redundant (e.g. 256
implies 128
, 128
implies 64
and so on) and either:
- Unlikely to be referred by real-world programs (due to the software-side nature of RVV) or
- Likely that (for specialized programs) just knowing the vector length is not enough (it is said that implementing RVV to the hardware is harder than implementing regular SIMD instructions and some reports suggested that the actual performance boost heavily depends on the actual vector engine implementation) so that adding
Zvl*b
alone will not help them much.
I acknowledge that there's still a need to retrieve the vector register size and I think an intrinsic (which returns the vector register size) would be a better solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It indeed makes more sense to have it as a core::arch intrinsic, since nothing OS-specific is needed to get the VLEN, and the actual length information can be obtained in a single call.
Removed that code comment.
// const RISCV_HWPROBE_EXT_SUPM: u64 = 1 << 49; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, I'm writing full riscv_hwprobe
feature probing (including OS-independent logic for stdarch
) and related feature sets (for Rust) and... I think that will exceed 93, the current limit of the 12-byte cache.
I noticed that I haven't pressed the "Submit review" button. |
} | ||
}; | ||
if out[0].key != -1 { | ||
let ima_ext_0 = out[0].value; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems using this key alone is not robust enough (checking the IMA
base is necessary).
I opened #1770 for my similar proposal with OS-independent extension implication logic and with fixes including this part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. Earlier versions of this PR actually included that check, but it was accidentally removed when I changed to use hwcap.
3203fd5#diff-dfab5b675012c4be84a409e8c4f92c8e172e797caf07a23516d1fda390fb9818R152
I opened #1770 to shape the best feature detection logic in my mind. |
Interestingly, in the process reviewing your PR, I found that your direct |
Closing in favor of #1770. |
On RISC-V, detection using auxv only supports single-letter extensions.
So, this PR uses riscv_hwprobe that supports multi-letter extensions if available to support more target features. https://www.kernel.org/doc/html/latest/arch/riscv/hwprobe.html
I had originally planned to open a PR after the nightly release that included rust-lang/rust#138742 merged today, but there was a question about this (rust-lang/rust#139139), so I opened it earlier.
Closes rust-lang/rust#139139
Tested with QEMU 9.2.1 user mode:
riscv64gc-unknown-linux-gnu
riscv32gc-unknown-linux-gnu