Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hardware instruction set support for crossgen2 #33274

Merged
merged 48 commits into from
Apr 3, 2020
Merged
Show file tree
Hide file tree
Changes from 39 commits
Commits
Show all changes
48 commits
Select commit Hold shift + click to select a range
ad99cce
Progress towards -instruction-set command line argument
davidwrighton Feb 25, 2020
b3231bd
Specified instruction set support
davidwrighton Feb 28, 2020
79f6040
Specification of instruction set on command line
davidwrighton Mar 4, 2020
8427ef9
Take advantage of InstructionSetSupport class to reduce hardcoding th…
davidwrighton Mar 4, 2020
f3afae3
Funnel baseline instruction set support to R2R image so that the imag…
davidwrighton Mar 4, 2020
eff6176
Basic infra for detecting advanced instruction set usage instead of p…
davidwrighton Mar 27, 2020
2c1052a
R2RDump Support for ReadyToRunFixupKind.Check_InstructionSetSupport
davidwrighton Mar 5, 2020
e3bc096
Tweak build until it actually builds
davidwrighton Mar 5, 2020
51583ef
Fix build build in r2rdump
davidwrighton Mar 5, 2020
923c53b
Tweaks to JIT handling of simd vector info so that code which only in…
davidwrighton Mar 6, 2020
28e7800
Emit CHECK_InstructionSetSupport in correct format
davidwrighton Mar 6, 2020
3481eb7
Replace existing scheme for generating hardware intrinsics into CoreL…
davidwrighton Mar 6, 2020
5b09c68
Update ReadyToRun format description for Check_InstructionSetSupport
davidwrighton Mar 6, 2020
dafa2a2
Fix Linux build break
davidwrighton Mar 6, 2020
75d25db
Address feedback from Michal
davidwrighton Mar 6, 2020
4e73d85
Fix runtime handling of InstructionSetSupport to actually work
davidwrighton Mar 7, 2020
e8d8cae
Fix formatting issues
davidwrighton Mar 27, 2020
c505b2c
New BotR chapter on vectors and intrinsics
davidwrighton Mar 16, 2020
7347efc
Update to latest names for internal compiler behavior that is instruc…
davidwrighton Mar 16, 2020
ab4a9bb
Update JITEEVersionIdentifier
davidwrighton Mar 16, 2020
4146867
Managed side changes for rebase
davidwrighton Mar 27, 2020
8f763bf
Rework native side of changes after rebase
davidwrighton Mar 27, 2020
d0eb39d
Fix arm build failure
davidwrighton Mar 20, 2020
e011f1b
Address formatting issues
davidwrighton Mar 20, 2020
d6dfaa9
Fix two issues
davidwrighton Mar 20, 2020
b45ac9b
Fix formatting
davidwrighton Mar 22, 2020
d0c54ab
Finish up changes to BotR chapter
davidwrighton Mar 23, 2020
c7875f8
Fix formatting
davidwrighton Mar 23, 2020
93f44a2
Add notifyInstructionSetUsage JITDUMP
davidwrighton Mar 27, 2020
8bb2d1c
Apply suggestions from code review
davidwrighton Mar 26, 2020
113173d
Update docs/design/coreclr/botr/vectors-and-intrinsics.md
davidwrighton Mar 26, 2020
c6662ef
Remove unnecessary using changes
davidwrighton Mar 27, 2020
e9c19a0
Remove aggressive optimization support
davidwrighton Mar 27, 2020
0d7d587
Remove unneeded modified line
davidwrighton Mar 27, 2020
f68ac24
Fixup aggressive optimization handling removal
davidwrighton Mar 28, 2020
3e998f5
Michal's feedback
davidwrighton Mar 31, 2020
617c2ac
Code review feedback
davidwrighton Apr 1, 2020
670d042
Improve clarity around computing if a type is a SIMD type
davidwrighton Apr 1, 2020
f200242
Fix x86 and arm build break
davidwrighton Apr 1, 2020
4138f7c
Update BotR chapter in response to feedback
davidwrighton Apr 1, 2020
4814331
Code Review feedback from Carol
davidwrighton Apr 2, 2020
0ba367f
Fix formatting error
davidwrighton Apr 2, 2020
e535940
Michal feedback
davidwrighton Apr 2, 2020
b8e07bc
Last nits
davidwrighton Apr 2, 2020
9feef51
Fix uninitialized field
davidwrighton Apr 2, 2020
0b8b9f4
Tweak additional discovered difference
davidwrighton Apr 3, 2020
f399ccd
#ifdef
davidwrighton Apr 3, 2020
67ed9e4
Use conservative fix for isSubRegisterSIMDType
davidwrighton Apr 3, 2020
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions docs/design/coreclr/botr/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,7 @@ Below is a table of contents.
- [Cross-platform Minidumps](xplat-minidump-generation.md)
- [Mixed Mode Assemblies](mixed-mode.md)
- [Guide For Porting](guide-for-porting.md)
- [Vectors and Intrinsics](vectors-and-intrinsics.md)


It may be possible that this table is not complete. You can get a complete list
Expand Down
1 change: 1 addition & 0 deletions docs/design/coreclr/botr/readytorun-format.md
Original file line number Diff line number Diff line change
Expand Up @@ -246,6 +246,7 @@ fixup kind, the rest of the signature varies based on the fixup kind.
| READYTORUN_FIXUP_DeclaringTypeHandle | 0x2D | Dictionary lookup for method declaring type. Followed by the type signature.
| READYTORUN_FIXUP_IndirectPInvokeTarget | 0x2E | Target (indirect) of an inlined PInvoke. Followed by method signature.
| READYTORUN_FIXUP_PInvokeTarget | 0x2F | Target of an inlined PInvoke. Followed by method signature.
| READYTORUN_FIXUP_Check_InstructionSetSupport | 0x30 | Specify the instruction sets that must be supported/unsupported to use the R2R code associated with the fixup.
| READYTORUN_FIXUP_ModuleOverride | 0x80 | When or-ed to the fixup ID, the fixup byte in the signature is followed by an encoded uint with assemblyref index, either within the MSIL metadata of the master context module for the signature or within the manifest metadata R2R header table (used in cases inlining brings in references to assemblies not seen in the input MSIL).

#### Method Signatures
Expand Down
189 changes: 189 additions & 0 deletions docs/design/coreclr/botr/vectors-and-intrinsics.md

Large diffs are not rendered by default.

2 changes: 2 additions & 0 deletions src/coreclr/src/inc/corcompile.h
Original file line number Diff line number Diff line change
Expand Up @@ -691,6 +691,8 @@ enum CORCOMPILE_FIXUP_BLOB_KIND
ENCODE_INDIRECT_PINVOKE_TARGET, /* For calling a pinvoke method ptr indirectly */
ENCODE_PINVOKE_TARGET, /* For calling a pinvoke method ptr */

ENCODE_CHECK_INSTRUCTION_SET_SUPPORT, /* Define the set of instruction sets that must be supported/unsupported to use the fixup */

ENCODE_MODULE_HANDLE = 0x50, /* Module token */
ENCODE_STATIC_FIELD_ADDRESS, /* For accessing a static field */
ENCODE_MODULE_ID_FOR_STATICS, /* For accessing static fields */
Expand Down
61 changes: 61 additions & 0 deletions src/coreclr/src/inc/corinfoinstructionset.h
Original file line number Diff line number Diff line change
Expand Up @@ -428,4 +428,65 @@ inline const char *InstructionSetToString(CORINFO_InstructionSet instructionSet)
#endif
}

inline CORINFO_InstructionSet InstructionSetFromR2RInstructionSet(ReadyToRunInstructionSet r2rSet)
{
#ifdef _MSC_VER
#pragma warning(push)
#pragma warning(disable: 4065) // disable warning for switch statement with only default label.
#endif

switch (r2rSet)
{
#ifdef TARGET_ARM64
case READYTORUN_INSTRUCTION_ArmBase: return InstructionSet_ArmBase;
case READYTORUN_INSTRUCTION_AdvSimd: return InstructionSet_AdvSimd;
case READYTORUN_INSTRUCTION_Aes: return InstructionSet_Aes;
case READYTORUN_INSTRUCTION_Crc32: return InstructionSet_Crc32;
case READYTORUN_INSTRUCTION_Sha1: return InstructionSet_Sha1;
case READYTORUN_INSTRUCTION_Sha256: return InstructionSet_Sha256;
case READYTORUN_INSTRUCTION_Atomics: return InstructionSet_Atomics;
#endif // TARGET_ARM64
#ifdef TARGET_AMD64
case READYTORUN_INSTRUCTION_Sse: return InstructionSet_SSE;
case READYTORUN_INSTRUCTION_Sse2: return InstructionSet_SSE2;
case READYTORUN_INSTRUCTION_Sse3: return InstructionSet_SSE3;
case READYTORUN_INSTRUCTION_Ssse3: return InstructionSet_SSSE3;
case READYTORUN_INSTRUCTION_Sse41: return InstructionSet_SSE41;
case READYTORUN_INSTRUCTION_Sse42: return InstructionSet_SSE42;
case READYTORUN_INSTRUCTION_Avx: return InstructionSet_AVX;
case READYTORUN_INSTRUCTION_Avx2: return InstructionSet_AVX2;
case READYTORUN_INSTRUCTION_Aes: return InstructionSet_AES;
case READYTORUN_INSTRUCTION_Bmi1: return InstructionSet_BMI1;
case READYTORUN_INSTRUCTION_Bmi2: return InstructionSet_BMI2;
case READYTORUN_INSTRUCTION_Fma: return InstructionSet_FMA;
case READYTORUN_INSTRUCTION_Lzcnt: return InstructionSet_LZCNT;
case READYTORUN_INSTRUCTION_Pclmulqdq: return InstructionSet_PCLMULQDQ;
case READYTORUN_INSTRUCTION_Popcnt: return InstructionSet_POPCNT;
#endif // TARGET_AMD64
#ifdef TARGET_X86
case READYTORUN_INSTRUCTION_Sse: return InstructionSet_SSE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you may have addressed this in an earlier discussion, but I believe that these need to be the same as the TARGET_AMD64 ones - is there a reason they can't share code here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are autogenerated, and it would add a fair amount of complexity to have the autogenerator write them out once. In this particular part of the code, they don't actually have to be the same (that's restricted to a small bit of logic that is in crossgen2, which have explicit asserts for the details that must remain the same.) At the same time, the way they are authored in the InstructionSetDesc.txt, the autogenerator will happen to keep things the same and as long as new code follows the pattern in there, stuff will remain the same.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, I recall that now. Seems reasonable.

case READYTORUN_INSTRUCTION_Sse2: return InstructionSet_SSE2;
case READYTORUN_INSTRUCTION_Sse3: return InstructionSet_SSE3;
case READYTORUN_INSTRUCTION_Ssse3: return InstructionSet_SSSE3;
case READYTORUN_INSTRUCTION_Sse41: return InstructionSet_SSE41;
case READYTORUN_INSTRUCTION_Sse42: return InstructionSet_SSE42;
case READYTORUN_INSTRUCTION_Avx: return InstructionSet_AVX;
case READYTORUN_INSTRUCTION_Avx2: return InstructionSet_AVX2;
case READYTORUN_INSTRUCTION_Aes: return InstructionSet_AES;
case READYTORUN_INSTRUCTION_Bmi1: return InstructionSet_BMI1;
case READYTORUN_INSTRUCTION_Bmi2: return InstructionSet_BMI2;
case READYTORUN_INSTRUCTION_Fma: return InstructionSet_FMA;
case READYTORUN_INSTRUCTION_Lzcnt: return InstructionSet_LZCNT;
case READYTORUN_INSTRUCTION_Pclmulqdq: return InstructionSet_PCLMULQDQ;
case READYTORUN_INSTRUCTION_Popcnt: return InstructionSet_POPCNT;
#endif // TARGET_X86

default:
return InstructionSet_ILLEGAL;
}
#ifdef _MSC_VER
#pragma warning(pop)
#endif
}

#endif // CORINFOINSTRUCTIONSET_H
5 changes: 5 additions & 0 deletions src/coreclr/src/inc/corjitflags.h
Original file line number Diff line number Diff line change
Expand Up @@ -151,6 +151,11 @@ class CORJIT_FLAGS
instructionSetFlags.AddInstructionSet(instructionSet);
}

bool IsSet(CORINFO_InstructionSet instructionSet) const
{
return instructionSetFlags.HasInstructionSet(instructionSet);
}

void Clear(CORINFO_InstructionSet instructionSet)
{
instructionSetFlags.RemoveInstructionSet(instructionSet);
Expand Down
4 changes: 4 additions & 0 deletions src/coreclr/src/inc/readytorun.h
Original file line number Diff line number Diff line change
Expand Up @@ -202,6 +202,8 @@ enum ReadyToRunFixupKind

READYTORUN_FIXUP_IndirectPInvokeTarget = 0x2E, /* Target (indirect) of an inlined pinvoke */
READYTORUN_FIXUP_PInvokeTarget = 0x2F, /* Target of an inlined pinvoke */

READYTORUN_FIXUP_Check_InstructionSetSupport= 0x30, /* Define the set of instruction sets that must be supported/unsupported to use the fixup */
};

//
Expand Down Expand Up @@ -359,6 +361,8 @@ enum ReadyToRunHelper
READYTORUN_HELPER_StackProbe = 0x111,
};

#include "readytoruninstructionset.h"

//
// Exception info
//
Expand Down
4 changes: 2 additions & 2 deletions src/coreclr/src/jit/codegenarm64.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2723,7 +2723,7 @@ void CodeGen::genLockedInstructions(GenTreeOp* treeNode)

emitAttr dataSize = emitActualTypeSize(data);

if (compiler->compSupports(InstructionSet_Atomics))
if (compiler->compOpportunisticallyDependsOn(InstructionSet_Atomics))
{
assert(!data->isContainedIntOrIImmed());

Expand Down Expand Up @@ -2860,7 +2860,7 @@ void CodeGen::genCodeForCmpXchg(GenTreeCmpXchg* treeNode)
genConsumeRegs(data);
genConsumeRegs(comparand);

if (compiler->compSupports(InstructionSet_Atomics))
if (compiler->compOpportunisticallyDependsOn(InstructionSet_Atomics))
{
emitAttr dataSize = emitActualTypeSize(data);

Expand Down
2 changes: 1 addition & 1 deletion src/coreclr/src/jit/codegenxarch.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -7256,7 +7256,7 @@ void CodeGen::genSSE2BitwiseOp(GenTree* treeNode)
void CodeGen::genSSE41RoundOp(GenTreeOp* treeNode)
{
// i) SSE4.1 is supported by the underlying hardware
assert(compiler->compSupports(InstructionSet_SSE41));
assert(compiler->compIsaSupportedDebugOnly(InstructionSet_SSE41));

// ii) treeNode oper is a GT_INTRINSIC
assert(treeNode->OperGet() == GT_INTRINSIC);
Expand Down
5 changes: 3 additions & 2 deletions src/coreclr/src/jit/compiler.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2194,10 +2194,11 @@ void Compiler::compSetProcessor()
#endif // TARGET_X86

CORINFO_InstructionSetFlags instructionSetFlags = jitFlags.GetInstructionSetFlags();
opts.compSupportsISA = 0;
opts.compSupportsISAReported = 0;

#ifdef TARGET_XARCH
// Instruction set flags for Intel hardware intrinsics
opts.compSupportsISA = 0;
bool avxSupported = false;

if (JitConfig.EnableHWIntrinsic())
{
Expand Down
103 changes: 81 additions & 22 deletions src/coreclr/src/jit/compiler.h
Original file line number Diff line number Diff line change
Expand Up @@ -7635,12 +7635,12 @@ XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
SIMDLevel getSIMDSupportLevel()
{
#if defined(TARGET_XARCH)
if (compSupports(InstructionSet_AVX2))
if (compOpportunisticallyDependsOn(InstructionSet_AVX2))
{
return SIMD_AVX2_Supported;
}

if (compSupports(InstructionSet_SSE42))
if (compOpportunisticallyDependsOn(InstructionSet_SSE42))
{
return SIMD_SSE4_Supported;
}
Expand Down Expand Up @@ -7786,7 +7786,7 @@ XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
unreached();
}
}
assert(emitTypeSize(simdType) <= maxSIMDStructBytes());
assert(emitTypeSize(simdType) <= largestEnregisterableStructSize());
switch (simdBaseType)
{
case TYP_FLOAT:
Expand Down Expand Up @@ -8029,26 +8029,18 @@ XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

GenTree* getOp1ForConstructor(OPCODE opcode, GenTree* newobjThis, CORINFO_CLASS_HANDLE clsHnd);

// Whether SIMD vector occupies part of SIMD register.
// SSE2: vector2f/3f are considered sub register SIMD types.
// AVX: vector2f, 3f and 4f are all considered sub register SIMD types.
bool isSubRegisterSIMDType(CORINFO_CLASS_HANDLE typeHnd)
{
unsigned sizeBytes = 0;
var_types baseType = getBaseTypeAndSizeOfSIMDType(typeHnd, &sizeBytes);
return (baseType == TYP_FLOAT) && (sizeBytes < getSIMDVectorRegisterByteLength());
}

bool isSubRegisterSIMDType(GenTreeSIMD* simdNode)
{
return (simdNode->gtSIMDSize < getSIMDVectorRegisterByteLength());
return (simdNode->gtSIMDSize < maxSIMDStructBytes());
}

// Get the type for the hardware SIMD vector.
// This is the maximum SIMD type supported for this target.
var_types getSIMDVectorType()
{
#if defined(TARGET_XARCH)
// Fix the exact level of simd vector support by using compExactlyDependsOn on the Avx2 isa
compExactlyDependsOn(InstructionSet_AVX2);
if (getSIMDSupportLevel() == SIMD_AVX2_Supported)
{
return TYP_SIMD32;
Expand Down Expand Up @@ -8095,6 +8087,7 @@ XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
else
{
assert(getSIMDSupportLevel() >= SIMD_SSE2_Supported);
compExactlyDependsOn(InstructionSet_AVX2); // Record that AVX2 isn't supported
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This really feels uncomfortable. Might it be worth adding a compExactlyForbids? I know we've been over this before, but this looks like an assertion that it is present, not an assertion that it is not.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We haven't been over this particular usage before, as it might have been obscured by some of the other changes you were looking at. I share your discomfort here, Perhaps

// Ensure that code will not execute if an instruction set is useable. Call only
// if the instruction set has previously reported as unuseable, but when
// that that status has not yet been recorded to the AOT compiler
void compVerifyInstructionSetUnuseable(CORINFO_InstructionSet isa)
{
    // use compExactlyDependsOn to ensure that the exact state of the useability of isa is recorded
    bool isaUseable = compExactlyDependsOn(isa);
    // Assert that the useability of the isa is false. If true, this function should never be called.
    assert(!isaUseable);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidwrighton - that sounds like a great alternative; wordy but certainly a context where a little wordiness is probably justified.

return XMM_REGSIZE_BYTES;
}
#elif defined(TARGET_ARM64)
Expand All @@ -8115,7 +8108,7 @@ XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
unsigned int maxSIMDStructBytes()
{
#if defined(FEATURE_HW_INTRINSICS) && defined(TARGET_XARCH)
if (compSupports(InstructionSet_AVX))
if (compOpportunisticallyDependsOn(InstructionSet_AVX))
{
return YMM_REGSIZE_BYTES;
}
Expand All @@ -8128,6 +8121,7 @@ XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
return getSIMDVectorRegisterByteLength();
#endif
}

unsigned int minSIMDStructBytes()
{
return emitTypeSize(TYP_SIMD8);
Expand Down Expand Up @@ -8187,14 +8181,40 @@ XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
unsigned largestEnregisterableStructSize()
{
#ifdef FEATURE_SIMD
unsigned vectorRegSize = getSIMDVectorRegisterByteLength();
#if defined(FEATURE_HW_INTRINSICS) && defined(TARGET_XARCH)
if (opts.IsReadyToRun())
{
// Return constant instead of maxSIMDStructBytes, as maxSIMDStructBytes performs
// checks that are effected by the current level of instruction set support would
// otherwise cause the highest level of instruction set support to be reported to crossgen2.
// and this api is only ever used as an optimization or assert, so no reporting should
// ever happen.
return YMM_REGSIZE_BYTES;
}
#endif // defined(FEATURE_HW_INTRINSICS) && defined(TARGET_XARCH)
unsigned vectorRegSize = maxSIMDStructBytes();
assert(vectorRegSize >= TARGET_POINTER_SIZE);
return vectorRegSize;
#else // !FEATURE_SIMD
return TARGET_POINTER_SIZE;
#endif // !FEATURE_SIMD
}

// Use to determine if a struct *might* be a SIMD type. As this function only takes a size, many
// structs will fit the criteria.
bool structSizeMightRepresentSIMDType(size_t structSize)
{
#ifdef FEATURE_SIMD
// Do not use maxSIMDStructBytes as that api in R2R on X86 and X64 may notify the JIT
// about the size of a struct under the assumption that the struct size needs to be recorded.
// By using largestEnregisterableStructSize here, the detail of whether or not Vector256<T> is
// enregistered or not will not be messaged to the R2R compiler.
return (structSize >= minSIMDStructBytes()) && (structSize <= largestEnregisterableStructSize());
#else
return false;
#endif // FEATURE_SIMD
}

#ifdef FEATURE_SIMD
static bool vnEncodesResultTypeForSIMDIntrinsic(SIMDIntrinsicID intrinsicId);
#endif // !FEATURE_SIMD
Expand Down Expand Up @@ -8272,21 +8292,63 @@ XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
return false;
}

bool compSupports(CORINFO_InstructionSet isa) const
#ifdef DEBUG
// Answer the question: Is a particular ISA supported?
// Use this api when asking the question so that future
// ISA questions can be asked correctly or when asserting
// support/nonsupport for an instruction set
bool compIsaSupportedDebugOnly(CORINFO_InstructionSet isa) const
{
#if defined(TARGET_XARCH) || defined(TARGET_ARM64)
return (opts.compSupportsISA & (1ULL << isa)) != 0;
#else
return false;
#endif
}
#endif // DEBUG

void notifyInstructionSetUsage(CORINFO_InstructionSet isa, bool supported) const;

// Answer the question: Is a particular ISA supported?
// The result of this api call will exactly match the target machine
// on which the function is executed (except for CoreLib, where there are special rules)
bool compExactlyDependsOn(CORINFO_InstructionSet isa) const
{

#if defined(TARGET_XARCH) || defined(TARGET_ARM64)
uint64_t isaBit = (1ULL << isa);
bool isaSupported = (opts.compSupportsISA & (1ULL << isa)) != 0;
if ((opts.compSupportsISAReported & isaBit) == 0)
{
notifyInstructionSetUsage(isa, isaSupported);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a jitdump here? Eg

Suggested change
notifyInstructionSetUsage(isa, isaSupported);
JITDUMP("Notifying runtime that codegen for this method is %susable if runtime HW supports %s\n",
isaSupported ? "" : "NOT ", InstructionSetToString(isa));
notifyInstructionSetUsage(isa, isaSupported);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

notifyInstructionSetUsage does a jitdump internally. (All calls to the api are dumped)

((Compiler*)this)->opts.compSupportsISAReported |= isaBit;
}

return isaSupported;
#else
return false;
#endif
}

// Answer the question: Is a particular ISA supported?
// The result of this api call will match the target machine if the result is true
// If the result is false, then the target machine may have support for the instruction
bool compOpportunisticallyDependsOn(CORINFO_InstructionSet isa) const
{
if ((opts.compSupportsISA & (1ULL << isa)) != 0)
{
return compExactlyDependsOn(isa);
}
else
{
return false;
}
}

bool canUseVexEncoding() const
{
#ifdef TARGET_XARCH
return compSupports(InstructionSet_AVX);
return compOpportunisticallyDependsOn(InstructionSet_AVX);
#else
return false;
#endif
Expand Down Expand Up @@ -8381,14 +8443,11 @@ XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
{
JitFlags* jitFlags; // all flags passed from the EE

#if defined(TARGET_XARCH) || defined(TARGET_ARM64)
uint64_t compSupportsISA;
#endif
uint64_t compSupportsISAReported;
void setSupportedISAs(CORINFO_InstructionSetFlags isas)
{
#if defined(TARGET_XARCH) || defined(TARGET_ARM64)
compSupportsISA = isas.GetFlagsRaw();
#endif
}

unsigned compFlags; // method attributes
Expand Down
4 changes: 2 additions & 2 deletions src/coreclr/src/jit/gentree.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -17066,7 +17066,7 @@ GenTree* Compiler::gtGetSIMDZero(var_types simdType, var_types baseType, CORINFO
switch (simdType)
{
case TYP_SIMD16:
if (compSupports(InstructionSet_SSE))
if (compExactlyDependsOn(InstructionSet_SSE))
{
// We only return the HWIntrinsicNode if SSE is supported, since it is possible for
// the user to disable the SSE HWIntrinsic support via the COMPlus configuration knobs
Expand All @@ -17075,7 +17075,7 @@ GenTree* Compiler::gtGetSIMDZero(var_types simdType, var_types baseType, CORINFO
}
return nullptr;
case TYP_SIMD32:
if (compSupports(InstructionSet_AVX))
if (compExactlyDependsOn(InstructionSet_AVX))
{
// We only return the HWIntrinsicNode if AVX is supported, since it is possible for
// the user to disable the AVX HWIntrinsic support via the COMPlus configuration knobs
Expand Down
2 changes: 1 addition & 1 deletion src/coreclr/src/jit/hwintrinsic.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -309,7 +309,7 @@ NamedIntrinsic HWIntrinsicInfo::lookupId(Compiler* comp,
return NI_Illegal;
}

bool isIsaSupported = comp->compSupports(isa) && comp->compSupportsHWIntrinsic(isa);
bool isIsaSupported = comp->compExactlyDependsOn(isa) && comp->compSupportsHWIntrinsic(isa);

if (strcmp(methodName, "get_IsSupported") == 0)
{
Expand Down
Loading