x86_64 intrinsic missing #1009

Closed
raphaelcohn opened this issue Feb 17, 2021 · 6 comments
@raphaelcohn

The intrinsic _mm512_cvtsi512_si32 is missing from x86_64 std::arch.

I suspect this is because it wasn't always present; for example, this godbolt example (from this stack overflow answer) defines it if missing.

I'm in no place to implement this myself, so it needs to go on a backlog list somewhere.
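
For readers who need the value in the meantime, here is a minimal sketch of what the missing intrinsic computes: it returns the lowest 32-bit element of the 512-bit vector. A common way to get the same result with intrinsics that already exist is to narrow to 128 bits with `_mm512_castsi512_si128` and then call `_mm_cvtsi128_si32`, assuming both are available in your toolchain. The function names `cvtsi512_si32_model` and `low_lane_128` below are hypothetical, and the second one only demonstrates the extraction step on a 128-bit vector so that it runs on any x86_64 machine without AVX-512.

```rust
/// Hypothetical scalar model (not the stdarch implementation): a 512-bit
/// integer vector viewed as sixteen i32 lanes. The intrinsic returns lane 0,
/// i.e. the lowest 32 bits of the vector.
fn cvtsi512_si32_model(lanes: [i32; 16]) -> i32 {
    lanes[0]
}

/// The same extraction one size down, using SSE2 intrinsics from std::arch.
/// On a 512-bit vector the assumed equivalent would be
/// `_mm_cvtsi128_si32(_mm512_castsi512_si128(v))`.
#[cfg(target_arch = "x86_64")]
fn low_lane_128(lanes: [i32; 4]) -> i32 {
    use std::arch::x86_64::{__m128i, _mm_cvtsi128_si32, _mm_loadu_si128};
    // SAFETY: SSE2 is part of the x86_64 baseline, `lanes` provides 16
    // readable bytes, and the load does not require alignment.
    unsafe { _mm_cvtsi128_si32(_mm_loadu_si128(lanes.as_ptr() as *const __m128i)) }
}

/// Fallback for other architectures so the sketch still compiles.
#[cfg(not(target_arch = "x86_64"))]
fn low_lane_128(lanes: [i32; 4]) -> i32 {
    lanes[0]
}
```

Calling `low_lane_128([-5, 2, 3, 4])` yields the first element, which is the behaviour the 512-bit intrinsic is expected to have on its own lane 0.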

@Lokathor
Contributor

All the AVX-512 intrinsics are still being implemented. We'd be happy to have a PR if you have the time.

@minybot
Contributor

minybot commented Feb 17, 2021

> The intrinsic _mm512_cvtsi512_si32 is missing from x86_64 std::arch.
>
> I suspect this is because it wasn't always present; for example, this godbolt example (from this stack overflow answer) defines it if missing.
>
> I'm in no place to implement this myself, so it needs to go a backlog list somewhere.

I will implement it, and it will be available after the next merge.
Thanks for the suggestion; I missed that intrinsic.

@minybot
Contributor

minybot commented Feb 17, 2021

You can check https://github.com/rust-lang/stdarch/blob/master/crates/core_arch/avx512f.md
It shows which intrinsics are implemented and which we cannot currently do.

@raphaelcohn
Author

@minybot - that's very kind of you. It seems a reasonable oversight - googling reveals that it was omitted from LLVM as well, so if one were working from a particular LLVM revision's C intrinsic headers, it would have been absent. We may need to be careful if we add this intrinsic, in case it changes the minimum supported version of LLVM we need to use.

The list in your link omits intrinsics like _ktestc_mask8_u8() (see https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_ktest), which I think might need what that list calls i1 - although Intel document it as returning an unsigned char, it actually sets one of the x86 flags (probably ZF, but I'm guessing).

As an aside, Intel's intrinsics aren't that well designed or named IMV, and they make it impossible to use the variants of the underlying instructions that operate directly on a memory operand (e.g. a parallel population count) rather than loading via a register first. Whilst there's a very good argument for maintaining naming and function signature compatibility with them, it does mean one doesn't get the full benefit of the underlying architecture.

@minybot
Contributor

minybot commented Feb 18, 2021

> The list in your link omits intrinsics like _ktestc_mask8_u8() (see https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_ktest) which I think might need what you call on that list i1 - although Intel document it as returning an unsigned char, it actually sets one of the X86 flags (probably ZF, but I'm guessing).

_ktestc_mask8_u8 is AVX512DQ, which is not counted as part of avx512f here.
According to LLVM, the call is "call i32 @llvm.x86.avx512.ktestc.b(<8 x i1> %1, <8 x i1> %2)"; it requires i1, which Rust does not currently support. The community is trying to solve this issue: #989

@raphaelcohn
Author

@minybot ta. I realised after I read your reply above you'd got more than one list. I'll look at implementing the _ktest intrinsic for my own needs for now in assembler as a call to KTEST followed by SETcc. Fugly. And then I'll discover I want to use a different algorithm and it won't matter!
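
As a reference point for the KTEST plus SETcc approach, here is a scalar sketch of the documented semantics, assuming 8-bit masks are represented as plain `u8` values. KTESTB sets ZF when `a & b` is all zeroes and CF when `!a & b` is all zeroes, so `_ktestc_mask8_u8` corresponds to reading CF (SETC) and `_ktestz_mask8_u8` to reading ZF (SETZ). The function names below are hypothetical models for checking results, not the stdarch implementation.

```rust
/// Scalar model of the flags KTESTB produces for two 8-bit masks.
/// Returns (zf, cf).
fn ktest_mask8_flags(a: u8, b: u8) -> (bool, bool) {
    let zf = (a & b) == 0; // ZF: the AND of the two masks is all zeroes
    let cf = (!a & b) == 0; // CF: (NOT a) AND b is all zeroes
    (zf, cf)
}

/// Hypothetical model of `_ktestc_mask8_u8`: KTESTB followed by SETC.
fn ktestc_mask8_u8_model(a: u8, b: u8) -> u8 {
    ktest_mask8_flags(a, b).1 as u8
}

/// Hypothetical model of `_ktestz_mask8_u8`: KTESTB followed by SETZ.
fn ktestz_mask8_u8_model(a: u8, b: u8) -> u8 {
    ktest_mask8_flags(a, b).0 as u8
}
```

A model like this is handy for testing a hand-written assembler version: feed both the same mask pairs and compare the returned bytes.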
