x86::bmi2::_mulx_u32 doesn't lower to mulx #27

Closed
gnzlbg opened this issue Sep 19, 2017 · 4 comments

Comments

@gnzlbg
Contributor

gnzlbg commented Sep 19, 2017

This intrinsic currently generates an imulq instruction instead of a mulx instruction.

The mulx instruction can operate on both 32-bit and 64-bit registers; see http://www.felixcloutier.com/x86/MULX.html

The 64-bit version works fine.
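To make the report concrete, here is a minimal sketch of what the 32-bit intrinsic computes and why a plain 64-bit multiply produces the same result. The helper names `mulx_u32_reference` and `mulx_u32_intrinsic` are hypothetical, and the sketch uses today's `std::arch::x86_64::_mulx_u32` path rather than the crate path from the issue title. It says nothing about which instruction the compiler actually emits.

```rust
/// Portable reference for what `_mulx_u32` computes: the full 64-bit
/// product of two unsigned 32-bit values, split into (low, high) halves.
/// A single widening 64-bit multiply is enough, which is why a compiler
/// may legally pick an ordinary multiply instead of mulx here.
pub fn mulx_u32_reference(a: u32, b: u32) -> (u32, u32) {
    // The product of two u32 values always fits in a u64.
    let wide = u64::from(a) * u64::from(b);
    (wide as u32, (wide >> 32) as u32)
}

/// Calls the real intrinsic when BMI2 is detected at run time.
/// Returns `None` when the CPU does not report BMI2.
#[cfg(target_arch = "x86_64")]
pub fn mulx_u32_intrinsic(a: u32, b: u32) -> Option<(u32, u32)> {
    if is_x86_feature_detected!("bmi2") {
        let mut hi = 0u32;
        // SAFETY: BMI2 support was checked just above.
        let lo = unsafe { std::arch::x86_64::_mulx_u32(a, b, &mut hi) };
        Some((lo, hi))
    } else {
        None
    }
}

/// Fallback for other architectures so the sketch still compiles.
#[cfg(not(target_arch = "x86_64"))]
pub fn mulx_u32_intrinsic(_a: u32, _b: u32) -> Option<(u32, u32)> {
    None
}
```

Both functions should agree on every input wherever the intrinsic is available; the only open question in this issue is the instruction selected, not the result.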

@gnzlbg
Contributor Author

gnzlbg commented Sep 20, 2017

So this is LLVM bug 34232. Whether imulq is faster than mulx for 32-bit integers is unclear at the moment.

Since this case is not clear-cut and might change over time, I think the only way to guarantee a mulx is to use inline assembly.
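A minimal sketch of the inline-assembly route, assuming a current Rust toolchain with the stable `asm!` macro (which did not exist in this form when the issue was filed). The function name `mulx_u32_asm` is hypothetical. The template uses Intel operand order, where the first destination receives the high half, the second receives the low half, and the implicit multiplicand lives in `edx`.

```rust
/// Forces a 32-bit `mulx` via inline assembly and returns (low, high).
/// Returns `None` if the CPU does not report BMI2 at run time.
#[cfg(target_arch = "x86_64")]
pub fn mulx_u32_asm(a: u32, b: u32) -> Option<(u32, u32)> {
    use std::arch::asm;

    if !is_x86_feature_detected!("bmi2") {
        return None;
    }
    let lo: u32;
    let hi: u32;
    // SAFETY: BMI2 was detected above; mulx reads only its register
    // operands, touches no memory, and leaves the flags unchanged.
    unsafe {
        asm!(
            // mulx hi, lo, src  (implicit second source: edx)
            "mulx {hi:e}, {lo:e}, {src:e}",
            hi = out(reg) hi,
            lo = out(reg) lo,
            src = in(reg) b,
            in("edx") a,
            options(pure, nomem, nostack, preserves_flags),
        );
    }
    Some((lo, hi))
}

/// Fallback for other architectures so the sketch still compiles.
#[cfg(not(target_arch = "x86_64"))]
pub fn mulx_u32_asm(_a: u32, _b: u32) -> Option<(u32, u32)> {
    None
}
```

The trade-off is that the optimizer treats the `asm!` block as opaque, so this pins the instruction at the cost of any better lowering the backend might otherwise find.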

@gnzlbg changed the title from "x86::bmi2::_mulx_u32" to "x86::bmi2::_mulx_u32 doesn't lower to mulx" on Sep 20, 2017
@BurntSushi
Member

@gnzlbg That's interesting. Is that linked llvm bug report saying that they will use imulq because it's faster even though the vendor intrinsic API says it should emit a mulx instruction? If so, should we be adopting a similar policy or do we want to treat "vendor intrinsic always maps to X" as the supreme directive in all things?

@gnzlbg
Contributor Author

gnzlbg commented Sep 26, 2017

Do the vendor intrinsics in clang/gcc etc. guarantee that a specific instruction will be generated for a given intrinsic?

I think they would actively avoid guaranteeing this, since leaving it open allows using newer instructions on newer processors for some operations.

> If so, should we be adopting a similar policy or do we want to treat "vendor intrinsic always maps to X" as the supreme directive in all things?

I think we should not guarantee any concrete assembly, so that these optimizations remain possible (but we should maybe have a general discussion about this in its own issue). Those who want a specific assembly instruction should use asm!.

We have the code-gen tests on CI, and a comment next to the code-gen test linking to the LLVM bug, so I am going to close this. If LLVM changes its behavior we will know, and the people from the future will deal with it.

@gnzlbg closed this as completed on Sep 26, 2017
@BurntSushi
Member

@gnzlbg Punting to the future sounds great. :-) Thanks!
