This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Create RefPositions without TreeNodeInfo #16517

Merged
merged 3 commits into from
May 23, 2018

CarolEidt
@CarolEidt CarolEidt commented Feb 23, 2018

This is the next phase of building RefPositions incrementally.
The big picture is that, instead of creating TreeNodeInfo with the register requirements for each node, the Build methods in LinearScan build the RefPositions directly, putting the defs in a DefList for when the consuming node builds the corresponding uses.
There are zero diffs for crossgen of frameworks & tests across all the x64 & x86 + altjits (arm64, arm and x64/ux), aside from a small number of improvements due to some RMW handling changes.
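
As an illustration of the def-list flow described above, here is a minimal sketch. The types `Node`, `RefPosition` and `DefListSketch` are hypothetical stand-ins chosen for this example and are not the actual LinearScan code; only the shape of the idea (the producer records a def, the consumer later looks it up and builds the use) is taken from the description.

```cpp
#include <cassert>
#include <list>

// Hypothetical stand-ins for the JIT types; only the shape of the idea matters.
struct Node
{
    int id;
};

struct RefPosition
{
    const Node* node;
    bool        isDef;
};

class DefListSketch
{
    std::list<RefPosition> pending; // defs whose consumer has not been built yet

public:
    // Called while building the producing node: record its def.
    void buildDef(const Node& n)
    {
        pending.push_back({&n, true});
    }

    // Called while building the consuming node: find the operand's def,
    // remove it from the list, and return the matching use.
    RefPosition buildUse(const Node& operand)
    {
        for (auto it = pending.begin(); it != pending.end(); ++it)
        {
            if (it->node == &operand)
            {
                pending.erase(it);
                return {&operand, false};
            }
        }
        assert(!"buildUse didn't find the def");
        return {nullptr, false};
    }

    int count() const
    {
        return static_cast<int>(pending.size());
    }
};
```

In this sketch no per-node requirements record is kept; the list only holds defs that are still waiting for their consumer.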

@CarolEidt
Author

This results in the following throughput improvements (crossgen of SPC.dll) as measured by pin instruction count:

  • MinOpts x86: 4.3% improvement
  • Opt x86: 2.5% improvement
  • MinOpts x64: 4.3% improvement
  • Opt x64: 2.7% improvement

@CarolEidt
Author

@dotnet/jit-contrib PTAL (for preliminary feedback)

}
#endif
break;
RefPosition* def = BuildDef(tree);

Did you intend to use the def variable for anything?

Author

No; at one point I had thought I would need to keep the defs around, but realized it was not necessary.

assert(info->dstCount == 1);
srcCount = 0;
assert(dstCount == 1);
dstCandidates = RBM_NONE;

Placing this in an "else" ifdef would be better IMO

// to special case them.
// These tree nodes will have their op1 marked as isDelayFree=true.
// That is, op1's reg remains in use until the subsequent instruction.
GenTree* addr = tree->gtOp.gtOp1;

I would suggest using functions like gtGetOp1 and gtGetOp2 consistently

Author

Will do; I was programmed to avoid it back before we added gtGetOp2IfPresent(), and I guess I'm a creature of habit.

dstCount = 1;
if (!data->isContained())
{
RefPosition* dataUse = dataUse = BuildUse(data);

There's something odd with this line, dataUse appears twice. And it's not used anywhere anyway.

Author

I was initially setting delay free on the data instead of addr, which was a mistake - no clue about the double assignment, though.

for (; sourceInfo != nullptr; sourceInfo = sourceInfo->Next())
srcCandidates = allRegs(TYP_INT) & ~RBM_RCX;
dstCandidates = allRegs(TYP_INT) & ~RBM_RCX;
if (tree->IsReverseOp())

Reverse op? Wasn't this removed from LIR?

Author

Well, I feel a little silly - I had either forgotten or had missed that this happened. There's code in LSRA that attempts to handle it, but I see that the Rationalizer is clearing it on all nodes, and CheckLIR is validating it. I added an issue #16528


assert(!call->isContained());
info->srcCount = 0;
srcCount = 0;

Redundant

}
if (!tree->isContained())
{
info->srcCount = srcCount;
RefPosition* def = BuildDef(tree, dstCandidates);

Unused variable?

bool isUnsignedMultiply = ((tree->gtFlags & GTF_UNSIGNED) != 0);
bool requiresOverflowCheck = tree->gtOverflowEx();
// Only non-floating point mul has special requirements
if (!varTypeIsFloating(tree->TypeGet()))

Make it consistent with BuildDivMod that does if (is float) { build simple; return; }? Or move the float check inside the switch case and rename BuildMul and BuildDivMod to BuildIntegerMul and BuildIntegerDivMod?

Note that FP add/sub/mul/div use the same instruction format but they're handled more or less differently throughout lower/ra/codegen. I have a PR to clean this up but I was waiting for your lsra refactoring to finish it.

reinterpret_cast<LocationInfoListNode*>(compiler->compGetMem(preallocateSize, CMK_LSRA));
size_t preallocateSize = sizeof(RefInfoListNode) * preallocate;
RefInfoListNode* preallocatedNodes =
reinterpret_cast<RefInfoListNode*>(compiler->compGetMem(preallocateSize, CMK_LSRA));

Existing code but note that casting from void* to X* is normally done via static_cast. reinterpret_cast is usually reserved for "weird" stuff such as casting int* to float*.
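
To make the cast suggestion concrete, here is a minimal sketch. `rawAlloc` is a hypothetical allocator standing in for `compGetMem`, and `ListNode` is an invented type for illustration; the only point shown is that converting the returned `void*` to a typed pointer needs no more than `static_cast`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Invented node type for illustration only.
struct ListNode
{
    int       value;
    ListNode* next;
};

// Hypothetical allocator standing in for compGetMem: returns raw void* storage.
inline void* rawAlloc(std::size_t bytes)
{
    return std::malloc(bytes);
}

inline ListNode* preallocate(std::size_t count)
{
    // void* -> ListNode* is a well-defined conversion, so static_cast is enough.
    ListNode* nodes = static_cast<ListNode*>(rawAlloc(sizeof(ListNode) * count));
    assert(nodes != nullptr);

    // Link the preallocated nodes into a simple free list.
    for (std::size_t i = 0; i < count; i++)
    {
        nodes[i].value = 0;
        nodes[i].next  = (i + 1 < count) ? &nodes[i + 1] : nullptr;
    }
    return nodes;
}
```

Keeping `reinterpret_cast` for conversions between unrelated pointer types makes the unusual cases easier to spot when reading the code.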

Author

Made the change

@CarolEidt
Author

@mikedn - thanks so much for the quick and thorough review; it's really appreciated! Clearly I need to fix some bugs ;-) and will incorporate your feedback in the next round.

@mikedn

mikedn commented Feb 23, 2018

> quick and thorough review

Far from thorough, I only took a quick look in the morning and then went to work. Maybe I'll take another look this evening :)

}

//------------------------------------------------------------------------
// getKillSetForMul: Determine the liveness kill set for a mod or div node.

Wrong function name in comment. Applies to subsequent functions as well.

case GT_MUL_LONG:
#endif
killMask = RBM_RAX | RBM_RDX;
killMask = getKillSetForMul(tree->AsOp());

Not an issue with this particular line per se but this whole function seemed a bit strange to me when I looked at it in the past. I would have expected BuildNode/TreeNodeInfoInit to somehow deal with this instead of having this whole switch here, it looks like a special case.

Looking now through this function's uses I see that one is likely redundant: buildUpperVectorSaveRefPositions is only called from BuildDefsWithKills and this one has a killMask that's supposed to be the same as the one returned by getKillSetForNode due to an assert.

And then there's the said assert in BuildDefsWithKills. Does this function actually need the killMask parameter?

Apparently the only "real" use of getKillSetForNode is in some stress related code in buildRefPositionsForNode. Maybe there should be another way to communicate the kill mask from build node (e.g. via a class member similar to pendingDelayFree)?

This may then allow moving functions like getKillSetForMul to where they really belong - lsraxarch.cpp.

Author

I don't think it's necessary to maintain the kill mask as a class member. I'm making the general getKillSetForNode DEBUG-only; it will only be used in the assert and in the stress modes where we constrain nodes.

#ifdef _TARGET_XARCH_
if (tgtPrefUse != nullptr)
{
defRefPosition->getInterval()->assignRelatedIntervalIfUnassigned(tgtPrefUse->getInterval());

Isn't defRefPosition->getInterval() same as interval?

Author

Yes it is.

RefPosition* useRefPos = newRefPosition(interval, currentLoc, RefTypeUse, operand, candidates, multiRegIdx);
if (regOptional)
{
useRefPos->setAllocateIfProfitable(true);

It might be better performance wise to always call setAllocateIfProfitable, always setting that bit might very well be cheaper than a branch.
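
The two forms being compared can be sketched as follows. `RefPos` here is a hypothetical cut-down type with a single bit-field, not the JIT's `RefPosition`, and the sketch assumes the bit starts out cleared on a newly created object; under that assumption both helpers leave the object in the same state.

```cpp
#include <cassert>

// Hypothetical cut-down type; the real RefPosition has many more fields.
struct RefPos
{
    unsigned char allocateIfProfitable : 1;

    void setAllocateIfProfitable(bool b)
    {
        allocateIfProfitable = b;
    }
};

// Branching form, as in the diff: only touch the bit when regOptional is set.
inline void markWithBranch(RefPos& rp, bool regOptional)
{
    if (regOptional)
    {
        rp.setAllocateIfProfitable(true);
    }
}

// Unconditional form suggested in the review: always store the flag.
inline void markAlways(RefPos& rp, bool regOptional)
{
    rp.setAllocateIfProfitable(regOptional);
}
```

Whether the unconditional store is actually cheaper depends on the compiler and target, so it would need to be measured rather than assumed.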

Author

Sounds good.

//
void LinearScan::BuildSimple(GenTree* tree)
int LinearScan::BuildSimple(GenTree* tree)

Is this function really useful? For example, I look at xarch build code and I see that it handles GT_CNS_INT, GT_CNS_DBL, GT_CNS_LNG (huh? looks like this one shouldn't reach LSRA due to decomposition) so attempting to handle GTK_CONST in BuildSimple is probably useless.

In general, BuildNode's switch already handles a ton of opers, so it might be better (from a performance point of view and even for clarity) to add whatever opers are missing to those BuildNode switches and make the default case unreached.

Author

Sounds reasonable, but probably not something I want to undertake with this PR.

src/jit/lsra.h Outdated
{
#ifdef _TARGET_XARCH_
RefPosition* tgtPrefUse = nullptr;
if (node->OperIsBinary() && isRMWRegOper(node))

Similar to BuildSimple, it's not clear if this kind of special casing in what's supposed to be a general purpose function is useful.

It may be better to simply call BuildRMWUses directly from BuildNode's switch relevant cases (e.g. case GT_XOR) rather than calling BuildBinaryUses from many places and having to decide again what to do.

int dstCount = 0;
regMaskTP dstCandidates = RBM_NONE;
regMaskTP killMask = RBM_NONE;
pendingDelayFree = false;

These member assignments should be visually separated from the local variable above, I'd move them above locals and add a blank line. Or, perhaps they should not even be in BuildNode but in its caller.

Also, the need for these data members is a bit unfortunate. It's kind of difficult to track down where they are assigned values and where they are used, the delay free ones in particular.

Author

I agree that they are somewhat unfortunate; I tried to come up with an approach that didn't require these data members on LinearScan, but could only come up with models that involved excessive passing around of state.
I'd prefer to keep them in BuildNode, however, as that's the main entry point for all the building functionality. I'll separate them and document them more clearly.

assert(!tree->IsUnusedValue() || (dstCount != 0));
assert(dstCount == tree->GetRegisterDstCount());
INDEBUG(dumpNodeInfo(tree, dstCandidates, srcCount, dstCount));
return srcCount;

It looks like the return value of this function is used only in an assert. I hope that assert is worth all the added complication :)

Author

Yes - it's something I think should probably be removed, but I found it useful for debugging and comparison as I was making the transition from TreeNodeInfo

isLocalDefUse = true;
tree->SetUnusedValue();
}
RefPosition* def = BuildDef(tree);

Vertical alignment of all assignments is the worst formatting rule the JIT code uses...

#endif // _TARGET_X86_
BuildDef(tree, dstCandidates);

Or just pass RBM_BYTE_REGS directly? It's RBM_ALLINT on x64 so it should do the right thing.

int srcCount = 0;
int internalCount = 0;
const int MaxInternalCount = 4;
RefPosition* internalDefs[MaxInternalCount];

Not used it seems. Same for internalCount above. Strange that they have the same name as 2 LinearScan data members.

Author

Yes; I had originally added these to each method needing internal temps, but it was quite a bit messier than handling it pseudo-globally. I'll delete these.

bool hasMultiRegRetVal = false;
ReturnTypeDesc* retTypeDesc = nullptr;
RefPosition* internalDefs[MAX_ARG_REG_COUNT];

This is populated in the code below but then it doesn't appear to be used.

{
return;
}
int srcCount = BuildRMWUses(tree->AsOp());
@mikedn mikedn Feb 23, 2018

Hmm, calling BuildRMWUses without calling isRMWRegOper that has special handling for GT_MUL?

@mikedn

mikedn commented Feb 26, 2018

Hmm, I left a few comments related to RMW but I'm not sure I'm seeing the big picture, there may be a few issues that need to be considered:

  • BuildHWIntrinsic seems broken when it comes to RMW - many intrinsics are probably RMW when VEX is not available but that code doesn't handle this case
  • FP scalar ops (add, sub, mul, div) are currently treated as RMW but when VEX is available they need not be, this is something that we should improve in the future.

Now, neither issue is directly related to this PR but you may want to consider the impact they have on your refactoring. It may be that isRMWRegOper needs to go away and BuildX functions should call BuildRMWUses directly when they need it.

// Example1: GT_EQ(int, op1 of type ubyte, op2 of type ubyte) - in this case codegen uses
// ubyte as the result of comparison and if the result needs to be materialized into a reg
// simply zero extend it to TYP_INT size. Here is an example of generated code:
// cmp dl, byte ptr[addr mode]
@mikedn mikedn Mar 5, 2018

The setcc instruction is missing from the example

@mikedn mikedn Mar 5, 2018

Also, I'm not sure why "in this case codegen uses ubyte as the result of comparison and if the result needs to be materialized into a reg simply zero extend it to TYP_INT size" appears here, this comment is more suitable where dstCandidates is set.

// Example2: GT_EQ(int, op1 of type ubyte, op2 is GT_CNS_INT) - in this case codegen uses
// ubyte as the result of the comparison and if the result needs to be materialized into a reg
// simply zero extend it to TYP_INT size.
else if (varTypeIsByte(op1) && op2->IsCnsIntOrI())

Why was this case added? Codegen should not attempt to generate a byte instruction unless both operands are byte. If the constant isn't a byte then an int compare instruction should be generated even if the first operand is a byte, it's supposed to have been extended to int.

Author

These cases were copied from LinearScan::ExcludeNonByteableRegisters(), previously in lsraxarch.cpp. It may be that these can be improved/eliminated, but I am aiming for zero-diffs for now.

assert(!sourceLo->isContained() && !sourceHi->isContained());
RefPosition* sourceLoUse = BuildUse(sourceLo, srcCandidates);
RefPosition* sourceHiUse = BuildUse(sourceHi, srcCandidates);

if (tree->OperGet() == GT_LSH_HI)
if (!tree->isContained())

Hmm, when is a shift node contained?

Author

A shift node can be contained under a STOREIND when it is part of a memory RMW op. In most cases this is just handled in BuildIndir, but shift ops have special register requirements.

unsigned size = blkNode->gtBlkSize;
GenTree* source = blkNode->Data();
int srcCount = 0;
int internalCount = 0;

Shouldn't this have been removed?

pgosupport.cmake Outdated
@@ -30,8 +30,8 @@ function(add_pgo TargetName)
# If we don't have profile data availble, gracefully fall back to a non-PGO opt build
if(EXISTS ${ProfilePath})
if(WIN32)
# set_property(TARGET ${TargetName} APPEND_STRING PROPERTY LINK_FLAGS_RELEASE " /LTCG /USEPROFILE:PGD=${ProfilePath}")
# set_property(TARGET ${TargetName} APPEND_STRING PROPERTY LINK_FLAGS_RELWITHDEBINFO " /LTCG /USEPROFILE:PGD=${ProfilePath}")
set_property(TARGET ${TargetName} APPEND_STRING PROPERTY LINK_FLAGS_RELEASE " /LTCG /USEPROFILE:PGD=${ProfilePath}")

Presumably you wanted to build without PGO and accidentally checked in the commented out lines? There's a build.cmd if you need that - -nopgooptimize.

@CarolEidt CarolEidt force-pushed the ElimNodeInfo branch 5 times, most recently from e1d63a4 to 238a4dc March 19, 2018 21:22
@CarolEidt
Author

@dotnet/jit-contrib ping.
All of the x64_arm64_altjit legs failed on the same 11 tests with Assertion failed 'NYI_ARM64: Arm64 does not support tail calls via helpers.'
Re-trying the ubuntu jitstressregs8 leg - it failed with no explicable message besides "category flow".

@sdmaclea

> All of the x64_arm64_altjit legs failed on the same 11 tests with Assertion failed 'NYI_ARM64: Arm64 does not support tail calls via helpers.'

This was fixed recently

@dotnet-bot test Windows_NT x64_arm64_altjit Checked jitstressregs0x1000
@dotnet-bot test Windows_NT x64_arm64_altjit Checked jitstressregs0x80
@dotnet-bot test Windows_NT x64_arm64_altjit Checked jitstressregs1
@dotnet-bot test Windows_NT x64_arm64_altjit Checked jitstressregs2
@dotnet-bot test Windows_NT x64_arm64_altjit Checked jitstressregs3
@dotnet-bot test Windows_NT x64_arm64_altjit Checked jitstressregs4
@dotnet-bot test Windows_NT x64_arm64_altjit Checked jitstressregs8

@CarolEidt
Author

The 'x64_arm64_altjit Checked jitstressregs3' failures also occur in master.
Filed #18052

@CarolEidt
Author

@dotnet-bot test Windows_NT x64_arm64_altjit Checked jitstressregs3

Member

@BruceForstall BruceForstall left a comment

couple minor things noticed so far

src/jit/lsra.h Outdated
@@ -82,118 +82,117 @@ inline regMaskTP calleeSaveRegs(RegisterType rt)
//------------------------------------------------------------------------
// LocationInfo: Captures the necessary information for a node that is "in-flight"
Member

Change type name in comment

Author

done.

}
prevListNode = listNode;
}
assert(!"GetRefPosition didn't find the node");
Member

assert has wrong function name

Author

Fixed

}
prevListNode = listNode;
}
assert(!"GetRefPosition didn't find the node");
Member

assert has wrong function name

Author

Fixed

//
// Return Value: a register mask of the registers killed
//
regMaskTP LinearScan::getKillSetForProfilerHook()
Member

Seems like this should be named getKillSetForProfilerHookTailcall

Author

Except that the node name is GT_PROF_HOOK, so this seems like the better name to me.

static bool OperIsMul(genTreeOps gtOper)
{
return (gtOper == GT_MUL) || (gtOper == GT_MULHI)
#if !defined(_TARGET_64BIT_) && !defined(LEGACY_BACKEND)

This will be a conflict with @BruceForstall change to remove LEGACY_BACKEND

Author

Yes, there are probably more conflicts as well; that's how it goes ;-)


if (op2 != nullptr)
{
return 2;

You may want to add:

assert(!op1->OperIsList());

Member

Are there scenarios where op1->OperIsList() and op2 != nullptr?

Author

No, if op1 is a list, op2 must be null.

return 2;
}

if (op1 != nullptr)
Member

nit: This would be easier to read (less nesting) if we did:

if (op1 == nullptr)
{
    return 0;
}

if (op1->OperIsList())
{
    // logic
}
else
{
    return 1;
}

Author

Need to account for op2, but will restructure.
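
One way the flattened structure could look once op2 is taken into account is sketched below. `Node`, `isList` and `listNext` are hypothetical names standing in for the GenTree API, and the list-walking loop is an assumption about what the elided "logic" does; the early returns and the list assert follow the suggestions made in this thread.

```cpp
#include <cassert>

// Hypothetical node type; in the JIT this would be a GenTree.
struct Node
{
    bool  isList   = false;
    Node* listNext = nullptr; // next list entry (only meaningful when isList)
};

// Count operands using early returns instead of nested conditionals.
inline int operandCount(const Node* op1, const Node* op2)
{
    if (op2 != nullptr)
    {
        // Per the discussion: if op1 is a list, op2 must be null.
        assert((op1 == nullptr) || !op1->isList);
        return 2;
    }

    if (op1 == nullptr)
    {
        return 0;
    }

    if (!op1->isList)
    {
        return 1;
    }

    // Assumed list handling: one operand per list entry.
    int count = 0;
    for (const Node* entry = op1; entry != nullptr; entry = entry->listNext)
    {
        count++;
    }
    return count;
}
```

Handling op2 first keeps the binary case out of the way, so the remaining checks only deal with op1.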

@@ -2263,9 +2290,8 @@ void LinearScan::BuildSIMD(GenTreeSIMD* simdTree)
// Return Value:
// None.

void LinearScan::BuildHWIntrinsic(GenTreeHWIntrinsic* intrinsicTree)
int LinearScan::BuildHWIntrinsic(GenTreeHWIntrinsic* intrinsicTree)
Member

I cleaned up this method here: #18078

It might be worthwhile merging the PRs together (some of the changes were fixing "correctness" issues I found, and the other changes are required to support the FMA instruction set, where any of the three operands can be contained).

Member

The changes I did to support FMA are here: https://github.com/tannergooding/coreclr/commit/e402548087f2bf9554e83052911561c016c60346#diff-498fd4859cb147c9bf082f5ebf32ca8fR2507

(The PR isn't up as it requires the CoreFX change adding back the unimplemented APIs to the ref assembly).

Author

> It might be worthwhile merging the PRs together

I think it might be easier to serialize them.

}

assert(numArgs > 0);
Member

Shouldn't this be: assert(numArgs >= 3) or are there cases where we (inefficiently) use a list for less than 2 args?

Member

@BruceForstall BruceForstall left a comment

A few more comments.

{
info->srcCount = appendBinaryLocationInfoToList(tree->AsOp());
assert((kind & GTK_SMPOP) != 0);
int srcCount = BuildBinaryUses(tree->AsOp());
Member

This definition hides the enclosing one; that seems wrong, or at least confusing.

Author

Fixed. (This default is rarely used, and @mikedn recommended we eliminate it, which is probably a good idea, but perhaps for another day.)

}
#endif // DEBUG
int newDefListCount = defList.Count();
int produce = newDefListCount - oldDefListCount;
Member

So the only use (and assert using) produce is for amd64? Should you define this below, then?

Author

This assert was pre-existing, though before there were other uses of produce. However, I'd like to strengthen this assert to apply more broadly, but it requires some cleanup of the various nodes that produce multiple results. I've added a note to issue #13183 that deals with multi-reg nodes to strengthen this assert as well.

{
internalCandidates = allRegs(TYP_INT);
}
printf(" +<TreeNodeInfo %d=%d %di %df", dstCount, srcCount, internalIntCount, internalFloatCount);
Member

There are currently still many references in comments to the TreeNodeInfo struct -- and the type itself is still there. But the type is unused. Can you update the comments, and remove the references and type?

@@ -166,73 +184,99 @@ void LinearScan::BuildShiftLongCarry(GenTree* tree)
// requirements needed by LSRA to build the Interval Table (source,
// destination and internal [temp] register counts).
//
void LinearScan::BuildNode(GenTree* tree)
int LinearScan::BuildNode(GenTree* tree)
Member

The return type isn't described in the comments above

}

//------------------------------------------------------------------------
// GetOperandInfo: Get the source registers for an operand that might be contained.
// BuildDef: Build one or more RefTypeDef RefPositions for the given node,
Member

name in comment doesn't match function name

}

//------------------------------------------------------------------------
// GetOperandInfo: Get the source registers for an operand that might be contained.
// BuildDef: Build one or more RefTypeDef RefPositions for the given node
Member

name in comment doesn't match function name

Member

@BruceForstall BruceForstall left a comment

LGTM

I've only had a few minor comments, but overall it looks good to me.

@CarolEidt
Author

@dotnet-bot test Windows_NT x64 Formatting

@CarolEidt CarolEidt merged commit b39a5b2 into dotnet:master May 23, 2018
@CarolEidt CarolEidt deleted the ElimNodeInfo branch May 23, 2018 14:39
CarolEidt added a commit to CarolEidt/coreclr that referenced this pull request Jul 3, 2018
This is no longer used after dotnet#16517
@erozenfeld
Member

@CarolEidt We need to update ryujit-overview.md that still refers to TreeNodeInfo and gtLsraInfo.

@CarolEidt
Author

@erozenfeld - thanks for pointing that out. I'm going to open an issue and assign it to myself so that it doesn't get lost.

7 participants