Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Implement scalar Sse2 hardware intrinsics #16237

Merged
merged 2 commits into from
Feb 13, 2018

Conversation

4creators
Copy link

No description provided.

INST3( movd, "movd" , 0, IUM_WR, 0, 0, PCKDBL(0x7E), BAD_CODE, PCKDBL(0x6E))
INST3( movq, "movq" , 0, IUM_WR, 0, 0, PCKDBL(0xD6), BAD_CODE, SSEFLT(0x7E))
INST3( movd, "movd" , 0, IUM_WR, 0, 0, PCKDBL(0x6E), BAD_CODE, PCKDBL(0xFE)) // Move doubleword from r/m32 to xmm or move doubleword from xmm register to r/m32
INST3( movq, "movq" , 0, IUM_WR, 0, 0, PCKDBL(0xD6), BAD_CODE, SSEFLT(0x7E)) // Move quadword from r/m64 to xmm or move quadword from xmm register to r/m64
Copy link
Author

@4creators 4creators Feb 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bad merge - will fix

@tannergooding
Copy link
Member

Just a general nit. It is helpful to break these PRs into two logical commits (one containing the product code changes and one containing the test changes).

The GitHub UI (and even various external code review tools) tend to not do so well with large commits, and making it easier to review the product code is generally desirable (IMO).

@@ -267,9 +268,12 @@ bool emitter::Is4ByteSSE4Instruction(instruction ins)
bool emitter::TakesVexPrefix(instruction ins)
{
// special case vzeroupper as it requires 2-byte VEX prefix
if (ins == INS_vzeroupper)
switch (ins)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems to be in a weird state. The sfence and prefetch instructions are missing.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, rebasing to latest head which includes commit containing memory instructions - will fix

{
return true;
case INS_cvttsd2si:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cvttss2si is missing.

Additionally, given that you aren't adding anything new here, it would be generally desirable to move it into its own refactoring PR.

break;
}

case NI_SSE2_CompareEqualOrderedScalar:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be merged with the SSE implementations.

Copy link
Author

@4creators 4creators Feb 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, but since there quite a lot refactoring opportunities still coming I have created issue to track this work item #16014 and I plan to work on it once all SSE2 and other similar SSE intrinsics are in.

baseType = JITtype2varType(corType);

#ifdef _TARGET_X86_
if (varTypeIsLong(JITtype2varType(corType)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This case should already be handled by impHWIntrinsic

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This case should already be handled by impHWIntrinsic

Currently, impHWIntrinsic only handle r64 for return types. Maybe we can make the change latter.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a bug tracking this yet?

Do we also have any mechanism for determining when the intrinsic takes/returns a 64-bit value but also works on 32-bit machines (there is at least a couple of them)?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a bug tracking this yet?

I do not think this is a bug. We can introduce more flags to hnit the framework that the intrinsic may have an r64 oprand on first/second/.. position.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me make this change after this PR merged.

@4creators
Copy link
Author

Just a general nit. It is helpful to break these PRs into two logical commits

Do you prefer to break this PR or next ones?

@tannergooding
Copy link
Member

tannergooding commented Feb 6, 2018

Do you prefer to break this PR or next ones?

Breaking this one into two should be fairly trivial, since the split is only along a directory:

git fetch --all
git reset --hard dotnet/master
git cherry-pick --no-commit c797d8632e992603e0d5b37d2fb2a130065bf1c7
<resolve conflicts>
git reset
git add src
git commit
git add tests
git commit

@4creators 4creators force-pushed the sse2scalar branch 2 times, most recently from 43bf732 to dc4e7cd Compare February 6, 2018 20:27
=======
// these are the only ISA extension instructions taking zero operands
|| ins == INS_vzeroupper
>>>>>>> Implement scalar Sse2 hardware intrinsics
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please solve this conflict.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks will fix - next bad merge

baseType = JITtype2varType(corType);

#ifdef _TARGET_X86_
if (varTypeIsLong(JITtype2varType(corType)))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This case should already be handled by impHWIntrinsic

Currently, impHWIntrinsic only handle r64 for return types. Maybe we can make the change latter.


if (baseType == TYP_STRUCT)
{
baseType = getBaseTypeOfSIMDType(argClass);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that the baseType is always double on Sse2.ConvertScalarToVector128Double.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's the case however I try to write code which will be easy to refactor or abstract to handle more cases later see issue #16014. If it is acceptable I would prefer to defer to refactoring PR

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never mind. CVTSS2SD xmm, xmm/m32 and CVTSI2SD xmm, reg/m64 need different encoding.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case I use argType as baseType to use it later during instruction emission

CorInfoType corType =
strip(info.compCompHnd->getArgType(sig, argList, &argClass)); // type of the second argument

baseType = getBaseTypeOfSIMDType(argClass);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The baseType is float (retType's base type) on ConvertScalarToVector128Single.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need arg types to use correct encoding later (r64 versus r32) during instruction emitting.

Copy link

@fiigii fiigii Feb 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you talking about ConvertScalarToVector128Double?

Sse2.ConvertScalarToVector128Single just has one signature Vector128<float> ConvertScalarToVector128Single(Vector128<float> upper, Vector128<double> value)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once I will work on refactoring I will have overloaded - where I need to use second argType as base type
ConvertScalarToVector128Single(Vector128<float> upper, int value)
ConvertScalarToVector128Single(Vector128<float> upper, long value)

If you prefer to simplify it for this PR and not defer changes to refactoring I will do it.

Copy link

@fiigii fiigii Feb 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. However, I do not think we should combine Sse.ConvertScalarToVector128Single and Sse2.ConvertScalarToVector128Single because they have different signatures and codgen consideration.

Furthermore, combining HW intrinsic code cross-ISA should rely on the current category/flag mechanism rather than "intrinsic names". Meanwhile, a little bit duplicated code is fine to me, it may be not worthwhile to complicate the framework just for a few special cases. We can implement these "special cases" individually, then add categories/flags to combine them if there are some obvious patterns.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK than I will simplify it and leave any changes during refactoring to the result of discussion of its scope.

assert(sig->numArgs == 1);
op1 = impSIMDPopStack(TYP_SIMD16);
retType = JITtype2varType(sig->retType);
baseType = getBaseTypeOfSIMDType(info.compCompHnd->getArgClass(sig, sig->args));
Copy link

@fiigii fiigii Feb 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can directly use the return type as base type.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are overloads where it is necessary to select different instruction based on argument type - I use baseType to handle this. AFAIR during debugging I have seen problems with baseType detection and that's why I have explicitly recovered it from sig->retType.

Copy link

@fiigii fiigii Feb 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see there are two special guys

 /// <summary>
        /// int _mm_cvtsd_si32 (__m128d a)
        ///   CVTSD2SI r32, xmm/m64
        /// </summary>
        public static int ConvertToInt32(Vector128<double> value) => ConvertToInt32(value);
        /// <summary>
        /// int _mm_cvtsi128_si32 (__m128i a)
        ///   MOVD reg/m32, xmm
        /// </summary>
        public static int ConvertToInt32(Vector128<int> value) => ConvertToInt32(value);
        /// <summary>
        /// __int64 _mm_cvtsd_si64 (__m128d a)
        ///   CVTSD2SI r64, xmm/m64
        /// </summary>
        public static long ConvertToInt64(Vector128<double> value) => ConvertToInt64(value);
        /// <summary>
        /// __int64 _mm_cvtsi128_si64 (__m128i a)
        ///   MOVQ reg/m64, xmm
        /// </summary>
        public static long ConvertToInt64(Vector128<long> value) => ConvertToInt64(value);

Perhasps, we can separate these two groups

case NI_SSE2_ConvertToInt32:
case NI_SSE2_ConvertToInt64:
    // get baseType from arg

case NI_SSE2_ConvertToUInt32:
case NI_SSE2_ConvertToUInt64:
    // use the return type as base type

because calling getBaseTypeOfSIMDType and info.compCompHnd->getArgClass are relatively expensive

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you are right. Will fix

@4creators 4creators force-pushed the sse2scalar branch 6 times, most recently from 1b76e1a to b7d95c7 Compare February 7, 2018 20:50
case NI_SSE2_ConvertScalarToVector128Int64:
case NI_SSE2_ConvertScalarToVector128UInt64:
{
assert(sig->numArgs == 1);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to above I have had to handle above conversion intrinsics as HW_Category_Special

@4creators 4creators force-pushed the sse2scalar branch 2 times, most recently from a7c3f15 to 14fab5b Compare February 8, 2018 01:10
@4creators
Copy link
Author

@tannergooding There are strange test results. Could you help verify if I should report them as issues to resolve or dig in and look for the reasons myself?

All OSX and Ubuntu stress jobs failed due to the following failure:

A downstream run failed

Miscellaneous build error - see console log for others

D:\j\workspace\checked_windo---2b2fae86\bin\tests\Windows_NT.x64.Checked\TestWrappers\JIT.Performance\JIT.Performance.XUnitWrapper.cs(5777): error : JIT_Performance._CodeQuality_V8_Richards_Richards_Richards_._CodeQuality_V8_Richards_Richards_Richards_cmd [FAIL] [D:\j\workspace\checked_windo---2b2fae86\tests\runtest.proj]

While Windows_NT x64 Checked Build and Test (Jit - EnableIncompleteISAClass=1 FeatureSIMD=0) failure is quite strange:

Test Result (28 failures / +28)
baseservices_exceptions._generics_try_fault_struct01_try_fault_struct01_._generics_try_fault_struct01_try_fault_struct01_cmd
CoreMangLib_cti._system_collections_generic_queueenumerator_EnumeratorCurrent_EnumeratorCurrent_._system_collections_generic_queueenumerator_EnumeratorCurrent_EnumeratorCurrent_cmd
CoreMangLib_cti._system_indexoutofrangeexception_IndexOutOfRangeExceptionctor1_IndexOutOfRangeExceptionctor1_._system_indexoutofrangeexception_IndexOutOfRangeExceptionctor1_IndexOutOfRangeExceptionctor1_cmd
GC_Scenarios._GCSimulator_GCSimulator_300_GCSimulator_300_._GCSimulator_GCSimulator_300_GCSimulator_300_cmd
GC_Scenarios._GCSimulator_GCSimulator_418_GCSimulator_418_._GCSimulator_GCSimulator_418_GCSimulator_418_cmd
GC_Scenarios._GCSimulator_GCSimulator_356_GCSimulator_356_._GCSimulator_GCSimulator_356_GCSimulator_356_cmd
GC_Scenarios._GCSimulator_GCSimulator_342_GCSimulator_342_._GCSimulator_GCSimulator_342_GCSimulator_342_cmd
GC_Scenarios._GCSimulator_GCSimulator_172_GCSimulator_172_._GCSimulator_GCSimulator_172_GCSimulator_172_cmd
GC_Scenarios._GCSimulator_GCSimulator_89_GCSimulator_89_._GCSimulator_GCSimulator_89_GCSimulator_89_cmd
GC_Scenarios._GCSimulator_GCSimulator_58_GCSimulator_58_._GCSimulator_GCSimulator_58_GCSimulator_58_cmd
GC_Scenarios._GCSimulator_GCSimulator_351_GCSimulator_351_._GCSimulator_GCSimulator_351_GCSimulator_351_cmd
GC_Scenarios._GCSimulator_GCSimulator_344_GCSimulator_344_._GCSimulator_GCSimulator_344_GCSimulator_344_cmd
GC_Scenarios._GCSimulator_GCSimulator_162_GCSimulator_162_._GCSimulator_GCSimulator_162_GCSimulator_162_cmd
GC_Scenarios._StringCreator_stringcreator_stringcreator_._StringCreator_stringcreator_stringcreator_cmd
GC_Scenarios._THDChaos_thdchaos_thdchaos_._THDChaos_thdchaos_thdchaos_cmd
GC_Scenarios._GCSimulator_GCSimulator_123_GCSimulator_123_._GCSimulator_GCSimulator_123_GCSimulator_123_cmd
GC_Scenarios._GCSimulator_GCSimulator_397_GCSimulator_397_._GCSimulator_GCSimulator_397_GCSimulator_397_cmd
GC_Scenarios._GCSimulator_GCSimulator_318_GCSimulator_318_._GCSimulator_GCSimulator_318_GCSimulator_318_cmd
GC_Scenarios._GCSimulator_GCSimulator_350_GCSimulator_350_._GCSimulator_GCSimulator_350_GCSimulator_350_cmd
GC_Scenarios._GCSimulator_GCSimulator_141_GCSimulator_141_._GCSimulator_GCSimulator_141_GCSimulator_141_cmd
JIT_HardwareIntrinsics._X86_Sse2_Average_r_Average_r_._X86_Sse2_Average_r_Average_r_cmd
JIT_HardwareIntrinsics._X86_Sse2_UnpackHigh_r_UnpackHigh_r_._X86_Sse2_UnpackHigh_r_UnpackHigh_r_cmd
JIT_jit64._localloc_call_call04_small_call04_small_._localloc_call_call04_small_call04_small_cmd
JIT_Methodical._MDArray_DataTypes_uint_cs_r_uint_cs_r_._MDArray_DataTypes_uint_cs_r_uint_cs_r_cmd
JIT_Performance._CodeQuality_Roslyn_CscBench_CscBench_._CodeQuality_Roslyn_CscBench_CscBench_cmd
JIT_Performance._CodeQuality_Benchstones_BenchF_SqMtx_SqMtx_SqMtx_._CodeQuality_Benchstones_BenchF_SqMtx_SqMtx_SqMtx_cmd
Regressions_coreclr._9414_readonlyPrefix_readonlyPrefix_._9414_readonlyPrefix_readonlyPrefix_cmd
Regressions_coreclr._1549_Test1549_Test1549_._1549_Test1549_Test1549_cmd

@tannergooding
Copy link
Member

Most of these appear to be Test Infrastructure Failure: The paging file is too small for this operation to complete.

@jkotas, are you aware of any changes that went in recently that could impact this?

@jkotas
Copy link
Member

jkotas commented Feb 8, 2018

No idea what can be causing this.

@4creators 4creators force-pushed the sse2scalar branch 2 times, most recently from 8a1c97f to 0f0e6a4 Compare February 9, 2018 01:00
@4creators
Copy link
Author

@dotnet-bot test Alpine.3.6 x64 Debug Build please

@4creators
Copy link
Author

@dotnet-bot test OSX10.12 x64 Checked Innerloop Build and Test please

emit->emitIns_SIMD_R_R_R_I(ins, emitTypeSize(TYP_SIMD16), targetReg, op1Reg, op2Reg, ival);

break;
}

case NI_SSE2_CompareEqualOrderedScalar:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to have an explicit bug that tracks merging this with the SSE implementation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue #16330 tracks this work item. Could you assign it to me within hardware intrinsics project?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

assert(op2 != nullptr);
op2Reg = op2->gtRegNum;
instruction ins = Compiler::insOfHWIntrinsic(intrinsicID, baseType);
emit->emitIns_SIMD_R_R_R(ins, emitTypeSize(TYP_SIMD16), targetReg, op1Reg, op2Reg);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't this one go through genHWIntrinsic_R_R_RM?

assert(op2 != nullptr);
op2Reg = op2->gtRegNum;
instruction ins = Compiler::insOfHWIntrinsic(intrinsicID, baseType);
emit->emitIns_SIMD_R_R_R(ins, emitTypeSize(TYP_SIMD16), targetReg, op1Reg, op2Reg);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, genHWIntrinsic_R_R_RM?

}
else
{
emit->emitIns_R_R(INS_mov_xmm2i, emitActualTypeSize(baseType), op1Reg, targetReg);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't this one using insOfHWIntrinsic?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's due to #16322 which I plan to fix in #16329

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I follow. We are using INS_mov_xmm2i in both the TYP_LONG and TYP_ULONG case, so it shouldn't matter which base type is used in the lookup...

That is, if only TYP_LONG is being set, we can fill that column of the ConvertToUInt64 table entry with a TODO-XArch-Bug note that it is a workaround.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if only TYP_LONG is being set, we can fill that column of the ConvertToUInt64 table entry with a TODO-XArch-Bug note that it is a workaround

It should be done for TYP_UINT as well. In principle proposed change is equivalent workaround to that used in current code as we always select INS_mov_xmm2i. The workaround currently requires one change but in case we would do it in table we should change instruction selection for multiple intrinsics having unsigned integral base types.

If you agree that current solution is simpler I will add TODO-XArch-Bug comment here.

Copy link

@fiigii fiigii Feb 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this problem is from baseType = JITtype2varType(sig->retType);. We should have a new sign-awared function (e.g., JITtype2BaseVarType, or other name) that only used for HW intrinsics.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should have a new sign-awared function (e.g., JITtype2BaseVarType, or other name) that only used for HW intrinsics.

Yes, it is tracked by #16329

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tannergooding Could you reply to my comment #16237 (comment)

@@ -2283,9 +2283,13 @@ void LinearScan::BuildHWIntrinsic(GenTreeHWIntrinsic* intrinsicTree)
switch (intrinsicID)
{
case NI_SSE_CompareEqualOrderedScalar:
case NI_SSE2_CompareEqualOrderedScalar:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: It might be better to keep these grouped as NI_SSE, then NI_SSE2

@4creators 4creators force-pushed the sse2scalar branch 2 times, most recently from 2947cb9 to 8d8ea6b Compare February 11, 2018 19:35
@4creators
Copy link
Author

@dotnet-bot test Tizen armel Cross Checked Innerloop Build and Test please

break;
}

case NI_SSE2_ConvertScalarToVector128Single:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This path could be merged with ConvertScalarToVector128Double above. The only difference is the baseType assertion.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do

}
#endif // _TARGET_X86_

if (baseType == TYP_STRUCT)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could simplify this to use getArgForHWIntrinsic

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, getArgForHWIntrinsic does not handle this logic properly

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about it isn't being handled properly?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It returns TYP_STRUCT instead of TYP_FLOAT - didn't dig deeper to solve it. Plan to have closer look while working on #16329

@tannergooding
Copy link
Member

Couple more pieces of feedback/questions. Will probably be good for sign-off after the next round.

@4creators
Copy link
Author

All OSX and Ubuntu jitstress jobs timed out.

@4creators
Copy link
Author

test Ubuntu x64 Checked jitincompletehwintrinsic
test Ubuntu x64 Checked jitx86hwintrinsicnosimd

@4creators
Copy link
Author

Again 2 Ubuntu jitstress jobs timeouts

@@ -33,7 +33,7 @@ static const wchar_t *coreCLRDll = W("CoreCLR.dll");
static const wchar_t *coreCLRInstallDirectory = W("%windir%\\system32\\");

// Encapsulates the environment that CoreCLR will run in, including the TPALIST
class HostEnvironment
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file only contains whitespace changes and isn't related to the PR. It should probably be dropped.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is my workflow error, as I use DebugBreak() in corerun.cpp to start debugging and my editor automatically corrects all whitespace errors. Just forgot to undo changes before commit. Will revert.

Copy link
Member

@tannergooding tannergooding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this LGTM.

It would be good to undo the corerun.cpp changes, since they are unrelated and whitespace changes only.

It would also probably be good (long term) to have explicit comments as to why particular intrinsics aren't/can't be table driven.

@4creators
Copy link
Author

test Windows_NT x64 Checked jitincompletehwintrinsic
test Windows_NT x64 Checked jitx86hwintrinsicnoavx
test Windows_NT x64 Checked jitx86hwintrinsicnoavx2
test Windows_NT x64 Checked jitx86hwintrinsicnosimd
test Windows_NT x64 Checked jitnox86hwintrinsic

test Windows_NT x86 Checked jitincompletehwintrinsic
test Windows_NT x86 Checked jitx86hwintrinsicnoavx
test Windows_NT x86 Checked jitx86hwintrinsicnoavx2
test Windows_NT x86 Checked jitx86hwintrinsicnosimd
test Windows_NT x86 Checked jitnox86hwintrinsic

test Ubuntu x64 Checked jitincompletehwintrinsic
test Ubuntu x64 Checked jitx86hwintrinsicnoavx
test Ubuntu x64 Checked jitx86hwintrinsicnoavx2
test Ubuntu x64 Checked jitx86hwintrinsicnosimd
test Ubuntu x64 Checked jitnox86hwintrinsic

test OSX10.12 x64 Checked jitincompletehwintrinsic
test OSX10.12 x64 Checked jitx86hwintrinsicnoavx
test OSX10.12 x64 Checked jitx86hwintrinsicnoavx2
test OSX10.12 x64 Checked jitx86hwintrinsicnosimd
test OSX10.12 x64 Checked jitnox86hwintrinsic

Copy link

@CarolEidt CarolEidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @4creators and also thanks @tannergooding for the reviews!

Copy link

@fiigii fiigii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much for the work. I logged the PNSE issue at https://github.com/dotnet/coreclr/issues/16342.

@4creators
Copy link
Author

test Windows_NT x86 Checked jitx86hwintrinsicnosimd

@4creators
Copy link
Author

test Windows_NT x64 Checked jitx86hwintrinsicnosimd

@tannergooding tannergooding merged commit e6d3bd9 into dotnet:master Feb 13, 2018
@tannergooding
Copy link
Member

Thanks @4creators

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants