Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

[RyuJIT] Correct names of AVX that stand for AVX2 #13879

Closed
wants to merge 1 commit into from

Conversation

fiigii
Copy link

@fiigii fiigii commented Sep 8, 2017

Now, the name of AVX actually stands for AVX2 instruction set somewhere in RyuJIT. Because we introduced System.Runtime.Intrinsics.X86 in #13576 , RyuJIT should distinguish AVX and AVX2 ISA.

Currently, RyuJIT SIMD and floating point codegen emit AVX instructions (VEX encoding) when AVX2 is available, so RyuJIT uses "AVX" and "AVX2" interchangeably which can cause confusion with Avx intrinsics be introduced. But intrinsics in class System.Runtime.Intrinsics.X86.Avx may change this strategy, so we should distinguish AVX and AVX2 in RyuJIT (names in vm side are correct).

During implementing System.Runtime.Intrinsics.X86, I found InstructionSet_AVX has to be changed to InstructionSet_AVX2 (a new InstructionSet_AVX will be added). I am not sure if any other name of "AVX" should be changed to "AVX2" (e.g., canUseAVX), so we can discuss in this PR.

@fiigii
Copy link
Author

fiigii commented Sep 8, 2017

/cc @russellhadley @jkotas @CarolEidt

@fiigii fiigii changed the title [RyuJIT] Correct names of AVX that stands for AVX2 [RyuJIT] Correct names of AVX that stand for AVX2 Sep 8, 2017
Copy link
Member

@BruceForstall BruceForstall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I don't think the JIT ever generates AVX (but not AVX2) instructions currently, and certainly not for SIMD.

@CarolEidt comments?

@fiigii
Copy link
Author

fiigii commented Sep 15, 2017

I don't think the JIT ever generates AVX (but not AVX2) instructions currently, and certainly not for SIMD.

@BruceForstall I think this codegen strategy should be changed when we enable Intel hardware intrinsics, which allows developers to use AVX intrinsics on non-AVX2 machine (Sandy Bridge and Ivy Bridge). Under the current codegen strategy, mixing scalar floating-point calculations with AVX intrinsics may trigger AVX-SSE transition penalties.

@CarolEidt
Copy link

As I mentioned in the intrinsics discussion, we will have to change our strategy. We chose to use AVX2 rather than AVX because the AVX2 support for 256-bit vectors was more complete. However, if we are going to broadly support AVX intrinsics, then I believe that we need to:

  • Base the encoding selection on whether we are on an AVX-capable system, and
  • Base the size of Vector<T> on whether we are on an AVX2-capable system.

I'm not comfortable with this change as it is, because 1) I don't think it improves anything, and 2) it just leads to further confusion.

What we need is to clarify (and separately address) the question of the encoding and available instruction set (i.e. the actual target hardware) vs. the size of Vector<T>

@fiigii
Copy link
Author

fiigii commented Sep 15, 2017

However, if we are going to broadly support AVX intrinsics, then I believe that we need to:

  • Base the encoding selection on whether we are on an AVX-capable system, and
  • Base the size of Vector<T> on whether we are on an AVX2-capable system.

@CarolEidt Agree with this new codegen strategy. We should generate VEX-encoding instructions that operate over xmm registers once AVX is supported by underlying hardware. But only enable 256-bit Vector<T> when AVX2 is available. I will open a new issue and make the change of encoding selection.

I'm not comfortable with this change as it is, because 1) I don't think it improves anything, and 2) it just leads to further confusion.

We have to change something because the current RyuJIT uses "AVX" and "AVX2" interchangeably and #14020 requires InstructionSet_AVX and InstructionSet_AVX2 both.

@@ -1158,7 +1158,7 @@ void CodeGen::genSIMDIntrinsic32BitConvert(GenTreeSIMD* simdNode)
getEmitter()->emitIns_R_R_I(INS_pinsrw, emitTypeSize(TYP_INT), tmpReg, tmpIntReg, 3);
}
#endif
if (compiler->getSIMDInstructionSet() == InstructionSet_AVX)
if (compiler->getSIMDInstructionSet() == InstructionSet_AVX2)
{
inst_RV_RV(INS_vpbroadcastd, tmpReg, tmpReg, targetType, emitActualTypeSize(targetType));
}
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CarolEidt For example, once we distinguish InstructionSet_AVX and InstructionSet_AVX2, we will have to change here because vpbroadcastd is an AVX2 instruction.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CarolEidt ping?

@CarolEidt
Copy link

We have to change something because the current RyuJIT uses "AVX" and "AVX2" interchangeably and #14020 requires InstructionSet_AVX and InstructionSet_AVX2 both.

I agree that changes are needed, but I don't think this is an appropriate change. Instead, we should:

  • Add InstructionSet_AVX2 (instead of just replacing InstructionSet_AVX
  • Leave the code as-is that determines whether/when to emit vzeroupper instructions
  • Modify the code that determines the size of Vector<T> to check for InstructionSet_AVX2 before setting to 32 bytes.
  • Ensure that we are using AVX encodings when compiler->getSIMDInstructionSet() >= InstructionSet_AVX

There may be other changes required as well.

@fiigii
Copy link
Author

fiigii commented Sep 18, 2017

@CarolEidt Got it, I will close this PR and provide a better solution.

@fiigii
Copy link
Author

fiigii commented Sep 19, 2017

Move to #14065

@fiigii fiigii closed this Sep 19, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants