x86::bmi2::_mulx_u32 doesn't lower to mulx #27

gnzlbg · 2017-09-19T21:03:25Z

This intrinsic currently generates an imulq instruction instead of a mulx instruction.

The mulx instruction can operate on both 32-bit and 64-bit registers; see, e.g., http://www.felixcloutier.com/x86/MULX.html

The 64 bit version works just fine.
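For reference, here is a minimal sketch in portable Rust of what the 32-bit operation computes. The name `mulx_u32_reference` is hypothetical and not part of any library; it only mirrors the low/high split of the result.

```rust
/// Hypothetical helper (an illustration, not a library API):
/// unsigned 32x32 -> 64-bit multiply, returned as (low 32 bits, high 32 bits).
pub fn mulx_u32_reference(a: u32, b: u32) -> (u32, u32) {
    // Widen both operands so the full 64-bit product is kept.
    let product = u64::from(a) * u64::from(b);
    // Truncation keeps the low half; the shift extracts the high half.
    (product as u32, (product >> 32) as u32)
}
```

Because the full product of two 32-bit values fits in one 64-bit register, a backend can plausibly implement this with a single 64-bit multiply, which would be consistent with the imulq seen here.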

gnzlbg · 2017-09-20T08:18:14Z

So this is LLVM bug 34232. Whether imulq is faster than mulx for 32-bit integers is unclear at the moment.

Since this case is not clear-cut and might change over time, I think the only way to guarantee a mulx would be to use inline assembly.

BurntSushi · 2017-09-25T21:10:15Z

@gnzlbg That's interesting. Is that linked llvm bug report saying that they will use imulq because it's faster even though the vendor intrinsic API says it should emit a mulx instruction? If so, should we be adopting a similar policy or do we want to treat "vendor intrinsic always maps to X" as the supreme directive in all things?

gnzlbg · 2017-09-26T09:56:51Z

Do the vendor intrinsics in clang/gcc etc. guarantee that any concrete instruction will be generated for a given intrinsic?

I think this is something they would actively avoid guaranteeing, since leaving it unspecified allows using more modern instructions on modern processors for some operations.

> If so, should we be adopting a similar policy or do we want to treat "vendor intrinsic always maps to X" as the supreme directive in all things?

I think we should not guarantee any concrete assembly, to allow these optimizations (but we should maybe have a general discussion about this in its own issue). Those who want a concrete assembly instruction should use asm!.
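A rough sketch of the inline-assembly route, assuming a toolchain with the stable `core::arch::asm!` syntax (newer than what existed when this issue was opened). The names `mulx_u32_asm` and `mul_wide_u32` are hypothetical, and the portable fallback is an assumption added so the example also builds on targets without BMI2.

```rust
/// Hypothetical helper: forces a 32-bit mulx through inline assembly.
/// Returns (low 32 bits, high 32 bits). The caller must ensure BMI2 is available.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "bmi2")]
unsafe fn mulx_u32_asm(a: u32, b: u32) -> (u32, u32) {
    let lo: u32;
    let hi: u32;
    unsafe {
        core::arch::asm!(
            // Intel syntax: mulx hi, lo, src; the other factor is implicit in edx.
            "mulx {hi:e}, {lo:e}, {b:e}",
            b = in(reg) b,
            in("edx") a,
            lo = out(reg) lo,
            hi = out(reg) hi,
            // mulx reads no memory and leaves the arithmetic flags untouched.
            options(pure, nomem, nostack, preserves_flags),
        );
    }
    (lo, hi)
}

/// Hypothetical safe wrapper: uses mulx when BMI2 is detected at runtime,
/// otherwise falls back to a plain widening multiply.
pub fn mul_wide_u32(a: u32, b: u32) -> (u32, u32) {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("bmi2") {
            // Sound because BMI2 support was just checked.
            return unsafe { mulx_u32_asm(a, b) };
        }
    }
    let product = u64::from(a) * u64::from(b);
    (product as u32, (product >> 32) as u32)
}
```

With this shape the instruction choice no longer depends on how the backend decides to lower the intrinsic, at the cost of hiding the multiply from the optimizer.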

We have the code-gen tests on CI, and a comment next to the code-gen test linking to the LLVM bug, so I am going to close this. If LLVM changes its behavior we will know, and the people from the future will deal with it.

BurntSushi · 2017-09-26T17:04:30Z

@gnzlbg Punting to the future sounds great. :-) Thanks!

gnzlbg changed the title ~~x86::bmi2::_mulx_u32~~ x86::bmi2::_mulx_u32 doesn't lower to mulx Sep 20, 2017

gnzlbg closed this as completed Sep 26, 2017

gnzlbg mentioned this issue Nov 2, 2017

Minimal path to stabilization #159

Closed

3 tasks

jethrogb mentioned this issue Apr 20, 2019

Check for adcx instruction in the addcarryx intrinsics instead of just adc #666

Open
