Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Implement SSE4.1 Insert/Extract and simplify SSE/SSE2 intrinsic #16646

Merged
merged 3 commits into from
Mar 2, 2018

Conversation

fiigii
Copy link

@fiigii fiigii commented Feb 28, 2018

This PR

@CarolEidt @tannergooding PTAL

// Select base type using argument type
HW_Flag_BaseTypeFromArg = 0x400,
// Select base type using the first argument type
HW_Flag_BaseTypeFromFirstArg = 0x400,
Copy link
Author

@fiigii fiigii Feb 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this change, all the intrinsics (except MemoryStore) that return a scalar value (e.g., bool) should explicitly have flag HW_Flag_BaseTypeFromFirstArg or HW_Flag_BaseTypeFromSecondArg. cc @4creators

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Speaking of asserts (which we were a few hours ago on another thread), this seems like something that could be asserted somewhere - e.g. a pass over the table that validates these kinds of properties. There are probably other consistency checks across signature, category and flag that could be checked.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fiigii, could you log an issue tracking this?

@@ -3125,6 +3117,7 @@ class Compiler
GenTree* impNonConstFallback(NamedIntrinsic intrinsic, var_types simdType, var_types baseType);
static bool isImmHWIntrinsic(NamedIntrinsic intrinsic, GenTree* lastOp);
GenTree* addRangeCheckIfNeeded(NamedIntrinsic intrinsic, GenTree* lastOp, bool mustExpand);
bool signatureTypeSupproted(var_types retType, CORINFO_SIG_INFO* sig, HWIntrinsicFlag flags);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Supproted->Supported

//
// Return Value:
// On 32-bit platforms, return true if the intrinsic does not use any 64-bit registers (r64)
bool Compiler::signatureTypeSupproted(var_types retType, CORINFO_SIG_INFO* sig, HWIntrinsicFlag flags)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These needs to have HWIntrinsic in the name somewhere.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

// 64-bit intrinsics
// Intrinsics that operate over 64-bit general purpose registers are not supported on 32-bit platform
HW_Flag_64BitOnly = 0x4000,
HW_Flag_firstArgMaybe64Bit = 0x8000,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Fix casing on First

HARDWARE_INTRINSIC(SSE_ConvertToInt32, "ConvertToInt32", SSE, -1, 16, 1, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_cvtss2si, INS_invalid}, HW_Category_SIMDScalar, HW_Flag_BaseTypeFromFirstArg)
HARDWARE_INTRINSIC(SSE_ConvertToInt64, "ConvertToInt64", SSE, -1, 16, 1, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_cvtss2si, INS_invalid}, HW_Category_SIMDScalar, HW_Flag_BaseTypeFromFirstArg)
HARDWARE_INTRINSIC(SSE_ConvertToSingle, "ConvertToSingle", SSE, -1, 16, 1, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_movss, INS_invalid}, HW_Category_Helper, HW_Flag_BaseTypeFromFirstArg)
HARDWARE_INTRINSIC(SSE_ConvertScalarToVector128Single, "ConvertScalarToVector128Single", SSE, -1, 16, 2, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_cvtsi2ss, INS_invalid}, HW_Category_SIMDScalar, HW_Flag_SecondArgMaybe64Bit|HW_Flag_BaseTypeFromFirstArg)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be marked as CopyUpperBits

HARDWARE_INTRINSIC(SSE2_ConvertScalarToVector128Double, "ConvertScalarToVector128Double", SSE2, -1, 16, 2, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_cvtsi2sd, INS_invalid, INS_cvtsi2sd, INS_invalid, INS_cvtss2sd, INS_invalid}, HW_Category_Special, HW_Flag_NoFlag)
HARDWARE_INTRINSIC(SSE2_ConvertToVector128Int32, "ConvertToVector128Int32", SSE2, -1, 16, 1, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_cvtps2dq, INS_cvtpd2dq}, HW_Category_SimpleSIMD, HW_Flag_BaseTypeFromArg)
HARDWARE_INTRINSIC(SSE2_ConvertToDouble, "ConvertToDouble", SSE2, -1, 16, 1, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_movsd}, HW_Category_Helper, HW_Flag_BaseTypeFromFirstArg)
HARDWARE_INTRINSIC(SSE2_ConvertToInt32, "ConvertToInt32", SSE2, -1, 16, 1, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_mov_xmm2i, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_cvtsd2si}, HW_Category_SIMDScalar, HW_Flag_BaseTypeFromFirstArg|HW_Flag_MultiIns)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this one (and the following) MultiIns?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because these intrinsics need special CodeGen (but can be table-driven in the importer). Probably, I should add a new flag?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably, I should add a new flag

I think so. I don't believe we want to use MultiIns for anything other than intrinsics which generate multiple instructions

HARDWARE_INTRINSIC(SSE2_ConvertToUInt32, "ConvertToUInt32", SSE2, -1, 16, 1, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_mov_xmm2i, INS_invalid, INS_invalid, INS_invalid, INS_invalid}, HW_Category_SIMDScalar, HW_Flag_BaseTypeFromFirstArg|HW_Flag_MultiIns)
HARDWARE_INTRINSIC(SSE2_ConvertToUInt64, "ConvertToUInt64", SSE2, -1, 16, 1, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_mov_xmm2i, INS_invalid, INS_invalid}, HW_Category_SIMDScalar, HW_Flag_BaseTypeFromFirstArg|HW_Flag_MultiIns)
HARDWARE_INTRINSIC(SSE2_ConvertToVector128Double, "ConvertToVector128Double", SSE2, -1, 16, 1, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_cvtdq2pd, INS_invalid, INS_invalid, INS_invalid, INS_cvtps2pd, INS_invalid}, HW_Category_SimpleSIMD, HW_Flag_BaseTypeFromFirstArg)
HARDWARE_INTRINSIC(SSE2_ConvertScalarToVector128Double, "ConvertScalarToVector128Double", SSE2, -1, 16, 2, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_cvtsi2sd, INS_invalid, INS_cvtsi2sd, INS_invalid, INS_cvtss2sd, INS_invalid}, HW_Category_Special, HW_Flag_SecondArgMaybe64Bit|HW_Flag_MultiIns)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You changed the corresponding SSE one off of Special, why not this one?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Will fix.

@fiigii fiigii force-pushed the insertextract branch 2 times, most recently from 87b2e82 to 556065f Compare February 28, 2018 11:33
@fiigii fiigii changed the title Update the table-driven framework and simplify SSE/SSE2 intrinsic Implement SSE4.1 Insert/Extract and simplify SSE/SSE2 intrinsic Feb 28, 2018
@fiigii
Copy link
Author

fiigii commented Feb 28, 2018

Implemented SSE4.1 Insert and Extract. Since these two intrinsics rely on the above work (64-bit intrinsics), so push the implementation in this PR to avoid blocking.

@CarolEidt @tannergooding PTAL

("InsertScalarTest.template",new string[] { "Sse2", "Sse2", "Insert", "Vector128", "Int16", "Vector128", "Int16", "1", "(short)2", "16", "(short)0", "(i == 1 ? result[i] != 2 : result[i] != 0)"}),
("InsertScalarTest.template",new string[] { "Sse2", "Sse2", "Insert", "Vector128", "UInt16", "Vector128", "UInt16", "1", "(ushort)2","16", "(ushort)0", "(i == 1 ? result[i] != 2 : result[i] != 0)"}),
("InsertScalarTest.template",new string[] { "Sse2", "Sse2", "Insert", "Vector128", "Int16", "Vector128", "Int16", "129","(short)2", "16", "(short)0", "(i == 1 ? result[i] != 2 : result[i] != 0)"}),
("InsertScalarTest.template",new string[] { "Sse2", "Sse2", "Insert", "Vector128", "UInt16", "Vector128", "UInt16", "129","(ushort)2","16", "(ushort)0", "(i == 1 ? result[i] != 2 : result[i] != 0)"}),
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made two new templates for insert and extract.
@4creators I implemented SSE2 insert and extract in this PR as well, if you have done on them, I can remove here.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fiigii I have implemented them as well

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@4creators Sounds Good, could you submit your current work or open an issue to keep sync with us?

HARDWARE_INTRINSIC(SSE41_Floor, "Floor", SSE41, 9, 16, 1, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_roundps, INS_roundpd}, HW_Category_SimpleSIMD, HW_Flag_NoRMWSemantics)
HARDWARE_INTRINSIC(SSE41_FloorScalar, "FloorScalar", SSE41, 9, 16, -1, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_roundss, INS_roundsd}, HW_Category_Special, HW_Flag_CopyUpperBits)
HARDWARE_INTRINSIC(SSE41_Insert, "Insert", SSE41, -1, 16, 3, {INS_pinsrb, INS_pinsrb, INS_invalid, INS_invalid, INS_pinsrd, INS_pinsrd, INS_pinsrq, INS_pinsrq, INS_insertps, INS_invalid}, HW_Category_IMM, HW_Flag_FullRangeIMM|HW_Flag_SecondArgMaybe64Bit)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tannergooding Shall I give Insert a HW_Flag_NoRMWSemantics as well?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should only be on instructions that don't need hasDelayFreeSrc on op2.

I think pinsrb/pinsrd/pinsrq qualify here (since op2 is an integer reg and taget is an xmm reg).

However, insertps shouldn't have it, since op2 is also an xmm reg and we need to ensure that op2 is different from targetReg

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, looks like Insert should not have it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll need to ensure that pinsrb/d/q are specially handled

@fiigii
Copy link
Author

fiigii commented Mar 1, 2018

@CarolEidt could you please take a look at this PR?

@CarolEidt
Copy link

@fiigii I am just getting on a plane. I should be able to review it in a few hours. Sorry for the delay.

@fiigii
Copy link
Author

fiigii commented Mar 1, 2018

@CarolEidt No worries, thank you 😄

#ifndef LEGACY_BACKEND
if (ins == INS_extractps)
{
mReg = id->idReg1();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't we handle this in codegen?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm..we can. The special case extractps uses ModRM:r/m for the dst register (store semantics).
I will handle this in codgen, thanks!

case NI_SSE41_Extract:
if (intrinsicTree->gtSIMDBaseType == TYP_FLOAT)
{
info->internalIntCount += 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed to get the value back in an XMM register, correct?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the original extractps instruction writes the result (a floating-point number) into a GPR, but our API returns a float. So we need the int-register as the dst of extractps.

INST3( pextrb, "pextrb" , 0, IUM_WR, 0, 0, SSE3A(0x14), BAD_CODE, BAD_CODE) // Extract Byte
INST3( pextrd, "pextrd" , 0, IUM_WR, 0, 0, SSE3A(0x16), BAD_CODE, BAD_CODE) // Extract Dword
INST3( pextrq, "pextrq" , 0, IUM_WR, 0, 0, SSE3A(0x16), BAD_CODE, BAD_CODE) // Extract Qword
INST3( extractps, "extractps" , 0, IUM_WR, 0, 0, BAD_CODE, BAD_CODE, SSE3A(0x17)) // Extract float
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I believe most of these names have been copied out of the Arch manual... So this should probably be Extract Packed Floating-Point Values

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thanks!

@@ -677,17 +702,25 @@ GenTree* Compiler::impHWIntrinsic(NamedIntrinsic intrinsic,
// table-driven importer of simple intrinsics
if (impIsTableDrivenHWIntrinsic(category, flags))
{
if (!varTypeIsSIMD(retType) || ((flags & HW_Flag_BaseTypeFromArg) != 0))
if (category == HW_Category_MemoryStore ||
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOTE: I had to move this block in my AVX PR (#16655) in order to deal with the intrinsics that have both generic and non-generic overloads.

Copy link

@CarolEidt CarolEidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall, but needs minor changes: comments & parentheses.
I mention a suggestion for a consistency check - but that's not something needed for this PR.

// Select base type using argument type
HW_Flag_BaseTypeFromArg = 0x400,
// Select base type using the first argument type
HW_Flag_BaseTypeFromFirstArg = 0x400,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Speaking of asserts (which we were a few hours ago on another thread), this seems like something that could be asserted somewhere - e.g. a pass over the table that validates these kinds of properties. There are probably other consistency checks across signature, category and flag that could be checked.

// the intrinsics need special rules in CodeGen,
// but can be table-driven in the front-end
HW_Flag_SpecialCodeGen = 0x20000,

// No Read/Modify/Write Semantics
// the intrinsic does not have read/modify/write semantics and doesn't need

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pre-existing, but this comment isn't complete.
I would qualify it with "the intrinsic doesn't have read/modify/write semantics in two-operand form." (or non-AVX, or whatever the appropriate characterization is).

@@ -494,6 +493,7 @@ bool Compiler::isFullyImplmentedISAClass(InstructionSet isa)
case InstructionSet_SSE:
case InstructionSet_SSE3:
case InstructionSet_SSSE3:
case InstructionSet_SSE41:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more moved to the happy state! :-)

// flags - flags of the intrinsics
//
// Return Value:
// On 32-bit platforms, return true if the intrinsic does not use any 64-bit registers (r64)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This description doesn't reflect the name or the real purpose. I think it should say:

// Return Value:
//    Returns true iff the given type signature is supported
//
// Notes:
//    - This is only used on 32-bit systems to determine whether the signature uses no 64-bit registers.
//    - The `retType` is passed to avoid another call to the type system, as it has already been retrieved.

@@ -677,17 +702,25 @@ GenTree* Compiler::impHWIntrinsic(NamedIntrinsic intrinsic,
// table-driven importer of simple intrinsics
if (impIsTableDrivenHWIntrinsic(category, flags))
{
if (!varTypeIsSIMD(retType) || ((flags & HW_Flag_BaseTypeFromArg) != 0))
if (category == HW_Category_MemoryStore ||

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This expression needs more parentheses to be fully-parenthesized.

assert(category == HW_Category_MemoryStore);
baseType =
getBaseTypeOfSIMDType(info.compCompHnd->getArgClass(sig, info.compCompHnd->getArgNext(sig->args)));
assert(category == HW_Category_MemoryStore || ((flags & HW_Flag_BaseTypeFromSecondArg) != 0));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also needs more parens.

@@ -2323,7 +2324,7 @@ void LinearScan::BuildHWIntrinsic(GenTreeHWIntrinsic* intrinsicTree)
assert(numArgs != 0);
assert(numArgs != 1);

if (info->srcCount >= 2)
if (info->srcCount >= 2 && (genActualType(intrinsicTree->TypeGet()) == genActualType(op2->TypeGet())))
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tannergooding I think we can have a stricter check to avoid handling special cases like Insert.

Copy link
Author

@fiigii fiigii Mar 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never mind, we cannot have this check.

@fiigii
Copy link
Author

fiigii commented Mar 2, 2018

test Windows_NT x64 Checked jitincompletehwintrinsic
test Windows_NT x64 Checked jitx86hwintrinsicnoavx
test Windows_NT x64 Checked jitx86hwintrinsicnoavx2
test Windows_NT x64 Checked jitx86hwintrinsicnosimd
test Windows_NT x64 Checked jitnox86hwintrinsic

test Windows_NT x86 Checked jitincompletehwintrinsic
test Windows_NT x86 Checked jitx86hwintrinsicnoavx
test Windows_NT x86 Checked jitx86hwintrinsicnoavx2
test Windows_NT x86 Checked jitx86hwintrinsicnosimd
test Windows_NT x86 Checked jitnox86hwintrinsic

test Ubuntu x64 Checked jitincompletehwintrinsic
test Ubuntu x64 Checked jitx86hwintrinsicnoavx
test Ubuntu x64 Checked jitx86hwintrinsicnoavx2
test Ubuntu x64 Checked jitx86hwintrinsicnosimd
test Ubuntu x64 Checked jitnox86hwintrinsic

test OSX10.12 x64 Checked jitincompletehwintrinsic
test OSX10.12 x64 Checked jitx86hwintrinsicnoavx
test OSX10.12 x64 Checked jitx86hwintrinsicnoavx2
test OSX10.12 x64 Checked jitx86hwintrinsicnosimd
test OSX10.12 x64 Checked jitnox86hwintrinsic

@fiigii
Copy link
Author

fiigii commented Mar 2, 2018

CI seems to have some IO problems. All tests passed on my local machines.

@fiigii
Copy link
Author

fiigii commented Mar 2, 2018

@tannergooding @CarolEidt Addressed all the feedback, does the current code look good to you?

@tannergooding
Copy link
Member

CC. @dotnet/dnceng, are the Mac machines in some bad state?

if ((ins == INS_pextrq || ins == INS_pinsrq) && !UseVEXEncoding())
{
assert(UseSSE4());
sz += 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

separate issue: We really need to cleanup the code size estimation code (CC. @CarolEidt).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, let me just remind you that mis-estimates can lead to odd CHK/RET codegen diffs (#13399).

@4creators
Copy link

I have logged issue #16696 due to failing OSX jobs

@mmitche
Copy link
Member

mmitche commented Mar 2, 2018

@tannergooding I think there is something wrong in the lab. This is happening across multiple jenkins instances.

@@ -593,17 +598,6 @@ void CodeGen::genSSEIntrinsic(GenTreeHWIntrinsic* node)

switch (intrinsicID)
{
case NI_SSE_ConvertScalarToVector128Single:
Copy link
Member

@tannergooding tannergooding Mar 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to confirm. This is now considered table driven, correct?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

case NI_SSE41_Extract:
if (baseType == TYP_FLOAT)
{
info->internalIntCount += 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we explicitly set the internal candidates, as we have done elsewhere?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No necessary, this internal register can be any GPR.

Copy link
Member

@tannergooding tannergooding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks reasonable to me overall.

I would like to spend some time cleaning up this code (as it is a bit messy right now) after we get the currently approved intrinsics in.

@fiigii
Copy link
Author

fiigii commented Mar 2, 2018

I would like to spend some time cleaning up this code

Don't you mean to clean up the code-size estimation code?

@tannergooding
Copy link
Member

Don't you mean to clean up the code-size estimation code?

I meant the intrinsic code in general. I've gone over it a couple times and it looks like there is some room for cleanup/improvement in a few places.

@CarolEidt
Copy link

test OSX10.12 x64 Checked jitincompletehwintrinsic
test OSX10.12 x64 Checked jitx86hwintrinsicnoavx
test OSX10.12 x64 Checked jitx86hwintrinsicnoavx2
test OSX10.12 x64 Checked jitx86hwintrinsicnosimd
test OSX10.12 x64 Checked jitnox86hwintrinsic

@CarolEidt
Copy link

test OSX10.12 x64 Checked Innerloop Build and Test

@fiigii
Copy link
Author

fiigii commented Mar 2, 2018

Can we merge this PR?

@CarolEidt
Copy link

Can we merge this PR?

I will go ahead and merge. I think the odds of breaking something on OSX with this change but not the other OS's is pretty small.

@CarolEidt CarolEidt merged commit 1116aa7 into dotnet:master Mar 2, 2018
@CarolEidt
Copy link

It seems that we had a bad merge with PR #16669. @fiigii @tannergooding could one of you ensure consistent use of HW_Flag_BaseTypeFromArg and HW_Flag_BaseTypeFromFirstArg / HW_Flag_BaseTypeFromSecondArg? SSE2_MaskMove is using the former.

@fiigii
Copy link
Author

fiigii commented Mar 2, 2018

@CarolEidt I have changed HW_Flag_BaseTypeFromArg to HW_Flag_BaseTypeFromFirstArg, let me fix SSE2_MaskMove. Thank you.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[RyuJIT] Update the table-driven framework to unify throwing PNSE from r64 instructions on 32-bit platform
7 participants