Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Pull struct type info out of GenTreeObj #21705

Merged
merged 11 commits into from
Sep 5, 2019
Merged

Conversation

mikedn
Copy link

@mikedn mikedn commented Dec 29, 2018

Contributes to #19875
Contributes to #19412

@mikedn mikedn force-pushed the struct-layout branch 2 times, most recently from 835f679 to c6f9bf4 Compare December 31, 2018 19:33
@mikedn mikedn force-pushed the struct-layout branch 2 times, most recently from 7a9b46a to 11f70ea Compare February 6, 2019 19:51
@mikedn mikedn force-pushed the struct-layout branch 5 times, most recently from 3952d51 to 7955d32 Compare February 28, 2019 22:46
@mikedn mikedn force-pushed the struct-layout branch 2 times, most recently from e4c6b97 to 0087864 Compare March 9, 2019 15:50
@mikedn mikedn force-pushed the struct-layout branch 6 times, most recently from 8fd34c3 to df25389 Compare March 30, 2019 18:07
@mikedn mikedn force-pushed the struct-layout branch 2 times, most recently from 5a4fc3f to d62b071 Compare April 6, 2019 08:00
@mikedn mikedn mentioned this pull request Apr 8, 2019
@mikedn
Copy link
Author

mikedn commented Apr 22, 2019

@CarolEidt When you have some time it would be useful to have a discussion about this change. Originally it was primarily aimed at supporting struct typed GT_LCL_FLD nodes. These are already full and the only options we have is to make them large nodes (not ideal) or steal 2 bytes from gtLclOffs (which cannot be greater than 65535 anyway) to encode a "layout index".

Now, if the node size was the only concern perhaps making GTL_LCL_FLD large wouldn't be that bad. GT_OBJ is already large, so be it. But the approach GT_OBJ took is IMO impractical:

  • There are multiple data members that are related (gtClass, gtGcPtrs, _gtGcPtrCount, gtSlots and gtBlkSize) that require some care to keep in sync, copy when cloning the node.
  • Some are initialized lazily in order to minimize work and memory usage. This can be risky if someone forgets to do it.
  • The use of a BYTE* for the GC pointers adds more gas to the fire - for example, there's no consistent checking that no out-of-bounds access occur. It's also non-const and thus someone could change it, make it impossible to share between nodes having the same struct handle.
  • This data also appears in other JIT data structures, in slightly different forms (LclVarDsc, GenTreePutArgStk and probably more), just to make things inconsistent and require more work to pass it around (notably LclVarDsc stores the number of GC pointers in a 3-bit bitfield - lvStructGcCount - so if there are more than 7 you have to traverse lvGcLayout and count them).

I don't think there's any doubt that this information needs to be better handled in the JIT. That's the simple part. In summary:

class ClassLayout {
    CORINFO_CLASS_HANDLE m_classHandle;
    unsigned m_size;
    unsigned m_gcPtrCount;
    BYTE* m_gcPtrs;
    // Other technical details left out for simplicity.
    // Class is immutable and can be shared by, say, multiple `GT_OBJ` nodes.
    // A bunch of convenience functions are available HasGCPtr(), IsGCPtr(slot),
    // GetSlotCount() etc.
};

class ClassLayoutTable {
    // Basically a JitHashTable<CORINFO_CLASS_HANDLE, ClassLayout*>
    // but also allows access by index, similar to the LclVarDsc table
    // Users of `ClassLayout` can store a pointer or an index, depending
    // on the available space.
};

With this, we could probably replace pretty much every use of CORINFO_CLASS_HANDLE in the JIT's IR with ClassLayout* (or an index). For now I looked at GenTreeObj, LclVarDsc and GenTreePutArgStk:

  • In LclVarDsc this replaces lvGcLayout and lvStructGcCount.
  • In GenTreeObj/Blk this replaces everything, including gtBlkSize.
  • I'm still looking at GenTreePutArgStk (that's the worst) but it looks like there's no need for layout in it as it should obtain it from the source node.

The impact this can have on GenTreeObj/Blk is the interesting part.

  • GenTreeObj inherits from GenTreeBlk. In order to keep things clean I need to actually put the layout in GenTreeBlk, having size in GenTreeBlk and layout in GenTreeObj is asking for trouble. And in order to make layout work for GenTreeBlk nodes I need a class layout with only size and no class handle. That works but it's extra complication in the layout table, there are actually 2 hashtables - one with CORINFO_CLASS_HANDLE key and the other with size key.
  • GenTreeObj becomes a small node. In fact it no longer has any data members of its own.
  • There's also GenTreeDynBlk. This has nullptr layout so it requires extra care not to dereference it.
  • Recently we've seen that it may be problematic to have struct nodes without layout info in IR (CSE)

It seems to me that we may want to change how this works in the future:

  • GenTreeObj, GenTreeBlk and GenTreeDynBlk should all inherit from GenTreeIndir. Obj should have a layout pointer, Blk should have a constant size and DynBlk a size operand. This avoids having to add layout to GenTreeBlk and GenTreeDynBlk. Besides, especially in the GenTreeDynBlk the current hierarchy does not help in anyway, it's special cases all other the place.
  • With GenTreeObj becoming small I don't see any reason to use GenTreeBlk for more than cpblk operations. It's trivial to check if GenTreeObj contains GC pointers or not - AsObj()->GetLayout()->HasGCPtr().
  • We probably don't gain anything by using struct typed IND nodes, they should be OBJ nodes, even on the RHS.
  • We could probably add ClassLayout* to fgArgTabEntry and get rid of the need to wrap call args in OBJ nodes.

That said, in order to keep this PR manageable we'll probably want to keep the current Obj/Blk/DynBlk hierarchy for now.

@CarolEidt
Copy link

@mikedn - it appears (famous last words) that things have lightened up a bit, and I'm hoping to spend some time in the coming weeks to sort out and prioritize work items particularly in the area of structs and register allocation (union not intersection of those), and this change is one that I'd really like to see prioritized. So this would be a great time to discuss this. What form of discussion did you have in mind?

I'm still looking at GenTreePutArgStk (that's the worst) but it looks like there's no need for layout in it as it should obtain it from the source node.

Yes, I think it would be cleaner and more efficient.

I think it's a good start to use this in all the places where we currently have class handles and/or GC layout, and then subsequently we can add it to other nodes (e.g. GT_IND) that can have struct type but don't currently retain handle info, and then do some cleanup of the handling of SIMD types (whether that means adding to this layout structure or something else I'm not entirely sure).

I need a class layout with only size and no class handle. That works but it's extra complication in the layout table, there are actually 2 hashtables - one with CORINFO_CLASS_HANDLE key and the other with size key.

That seems like a minor additional complication - am I missing something that makes this more costly than I'm thinking?

GenTreeObj, GenTreeBlk and GenTreeDynBlk should all inherit from GenTreeIndir

They already do: GenTreeBlk inherits directly from GenTreeIndir and the others inherit from it.

With GenTreeObj becoming small I don't see any reason to use GenTreeBlk for more than cpblk operations. It's trivial to check if GenTreeObj contains GC pointers or not - AsObj()->GetLayout()->HasGCPtr().
We probably don't gain anything by using struct typed IND nodes, they should be OBJ nodes, even on the RHS.

I'd propose this as a possible subsequent round of change, as I believe there are still places where OBJ poisons optimizations.

We could probably add ClassLayout* to fgArgTabEntry and get rid of the need to wrap call args in OBJ nodes.

Agreed. Thinking further out, I'd also like to see if we can improve and simplify the handling of structs with mismatched ABI requirements (e.g. multiple fields that are passed in a single register, or vice-versa), possibly by creating a "fake" ClassLayout that reflects the way the struct is broken up for ABI purposes. Supporting assignment between the "real" and "fake" layout seems like a cleaner approach to dealing with these arguments.

@mikedn
Copy link
Author

mikedn commented Apr 24, 2019

and then subsequently we can add it to other nodes (e.g. GT_IND) that can have struct type but don't currently retain handle info

Adding layout to GenTreeIndir is something I considered in the early stages, it's technically possible because there's enough free space in GenTreeIndir to put a ClassLayout*. But that kind of makes GenTreeBlk/Obj useless, if GenTreeIndir has all it's needed, why the extra classes and opers?

I'm not convinced this is a good idea. In general I see "overloading" opers based on types with vastly different behavior to be dubious. It requires lots of if (isStruct) handleStruct; else handlePrimitive; kind of code. I have the same problem with using the same opers for integral and floating point operations - they support different optimizations, require different registers and generate different code. And thus require if (varTypeIsFloat(node)) checks pretty much everywhere.

That seems like a minor additional complication - am I missing something that makes this more costly than I'm thinking?

The main implementation complication is in ClassLayoutTable, it basically has 2 different, parallel code paths to deal with layouts with and without class handle. But I'm also concerned about the side effects this bifurcation can have:

  • Anyone trying to get a handle out of a ClassLayout has to check or otherwise be sure that a handle is actually present. Hopefully if we use ClassLayout consistently throughout the JIT this is less of an issue, there would be little reason to attempt to get a handle to begin with.
  • Layout identity issues. The JIT is currently discarding the handle in various cases and a word-by-word translation of the existing code to the ClassLayout world can result in the creation of 2 distinct layouts for a struct: a handle layout and a size layout. Not only that this wastes memory but it makes it impossible to recognize if 2 struct nodes have the same type.

They already do: GenTreeBlk inherits directly from GenTreeIndir and the others inherit from it.

Yes but what I'm saying is that they should directly inherit from GenTreeIndir:

struct GenTreeObj : public GenTreeIndir {
    ClassLayout* m_layout;
    Kind gtBlkOpKind;
};
struct GenTreeBlk : public GenTreeIndir {
    unsigned m_size;
    Kind gtBlkOpKind;
};
struct GenTreeDynBlk : public GenTreeIndir {
    GenTree* gtDynamicSize;
};

For GenTreeDynBlk this seems like a no brainer because it's special cased all other the place anyway. For the other 2 it's a bit more complicated because they actually share the logic in the "no GC pointers" case. But it shouldn't be much of a problem to restructure the existing code:

case GT_STORE_OBJ:
    if (node->AsObj()->GetLayout()->HasGCPtr())
        LowerGCBlockStore(node);
    else
        node->AsObj()->gtBlkOpKind = LowerBlockStore(node, node->AsObj()->GetLayout()->GetSize());
    break;
case GT_STORE_BLK:
    node->AsBlk()->gtBlkOpKind = LowerBlockStore(node, node->AsBlk()->GetSize());
    break;

But yes, such a change should be separate. Attempting to do it now would make this PR unmanageable. So for the time being we'll need to live this this ClassLayout handle/no-handle bifurcation.

Thinking further out, I'd also like to see if we can improve and simplify the handling of structs with mismatched ABI requirements (e.g. multiple fields that are passed in a single register, or vice-versa), possibly by creating a "fake" ClassLayout that reflects the way the struct is broken up for ABI purposes.

I'm not sure about creating "fake" layouts, it may complicate things rather than help. I already have to do such a thing to deal with something named "PPP Quirk". I'm still not comfortable enough with all the call handling code but my current impression is that too many low level ABI details leak into the HIR when at least some should be confined to LIR and/or codegen. Something like - have a struct that has to be passed into an integer register? Leave the HIR alone (no LCL_FLD, no OBJ etc.) and deal with it in lowering or codegen. But we'll see. I have another bone to pick with calls - their use of GT_LIST nodes and the side arg table - so one way or another I'm going to dig a lot in this area of the JIT.

@mikedn mikedn force-pushed the struct-layout branch 2 times, most recently from 81ee3ee to c784647 Compare June 9, 2019 08:23
@mikedn mikedn force-pushed the struct-layout branch 2 times, most recently from b2b41be to 37b67a9 Compare July 21, 2019 06:53
@mikedn mikedn force-pushed the struct-layout branch 2 times, most recently from 3bf1e9c to ea04b68 Compare August 2, 2019 05:42
Copy link

@CarolEidt CarolEidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like this; indeed I've added a reference to this PR in my updated first-class-structs doc.
I'm coming around to your way of thinking that we should only use GT_BLK for truly opaque block nodes (i.e. from cpblk or initblk nodes). That said, it seems like it still makes the backend simpler if GenTreeObj still inherits from GenTreeBlk.
My comments are mostly minor, and I think the timing is right for getting this in and building on it for further improvements to struct codegen.

void SetLayout(ClassLayout* layout)
{
assert(varTypeIsStruct(lvType));
m_layout = layout;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eventually I'd like to see use remove the use of the lvVerTypeInfo to get struct handles (I'm not even certain whether there's any need for lvVerTypeInfo to separately maintain the class handle after this change), but at least for now we should assert that they are the same.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, I was just working on finishing up this PR and this was something I was intending to put in comments. Yes, we should definitely avoid the lvVerTypeInfo handle as it's duplicate and duplicate stuff is always at risk of getting out of sync and cause problems.

I actually did this but decided to keep it our of this PR to avoid some weirdness that goes on in lvaSetStruct - it has a setTypeInfo parameter that controls the setting of the handle in lvVerTypeInfo independently of layout. It would seem that when verification is enabled the handle in lvVerTypeInfo can perhaps be somehow different. I need to double check what's going on, though verification is AFAIK not a thing in .NET Core.

I also run into an annoyance caused by the implicit byref transformation - I would prefer to keep the assert(varTypeIsStruct(lvType)); asserts in Set/GetLayout but during that transformation the LclVarDsc is temporarily in a weird state where it no longer has struct type but the handle from layout is needed. For this and other reasons I suspect that I may have to clean up the implicit byref transformation first.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that's unfortunate. Do you have thoughts on what that cleanup might look like?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only high level - we likely need to move this transform out of global morph, in front of it. The temporary weird state of the LclVarDsc might be unavoidable but I'd fell more comfortable if it's dealt with before global morph.

This would also tie in with #20957. Overall I think it should work like this:

  • LocalAddressVisitor deals with all indirect lclvar access and produces GT_LCL_VAR/GT_LCL_FLD whenever possible. When not possible the lclvar is left address exposed. It should also leave some breadcrumbs to drive promotion (e.g. collect reference counts).
  • Promotion of LclVarDscs
  • Transform IR according to promotion decisions above. Implicit byref transform should happen at the same time (since it's related to promotion).
  • Global morph - at this point most, if not all, lclvar related transforms have already been applied. Morph has less work to do and we avoid some chicken & egg situations (e.g. having to kill assertions due to promotion).
  • No more demotion of implicit by ref args

This does require an extra traversal of the IR so it would probably have a small throughput hit. But I keep finding ways to improve throughput in other parts of the JIT so I'm not worried about that.

src/jit/importer.cpp Outdated Show resolved Hide resolved
@@ -300,6 +300,153 @@ class FieldSeqStore
}
};

class ClassLayout

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems odd for this to be in GenTree, though I guess it's used there exclusively.
It might almost merit its own separate (header and cpp) files, including moving the cache there from compiler.cpp.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll confess that I simply dumped this class in the first place where it worked :). I'd be OK with adding new files, though I remember a discussion about the JIT source directory already having lots of files and no subdirectory structure to make it more manageable. Up to the team.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved all new code to layout.h/layout.cpp, seems reasonable. Especially if we're going to add more info/logic to ClassLayout:

  • HFA (and maybe other ABI related info that now gets duplicated in fgArgTabEntry) info
  • SIMD info
  • maybe field info - when promoting we query EE for fields and other info for every promotable variable, even if it has the same type as a previously promoted variable. We could try to add the necessary info to ClassLayout the first time we get it.
  • maybe it would be even worthwhile to have an abstraction for fields too, instead of using CORINFO_FIELD_HANDLE directly

src/jit/codegenxarch.cpp Outdated Show resolved Hide resolved
src/jit/simd.cpp Show resolved Hide resolved
@mikedn
Copy link
Author

mikedn commented Aug 29, 2019

Some other notes:

  • I think the "layout" name is a bit of misnomer. It's origin is in the getClassGClayout EE name. I would expect that types having the same layout share a ClassLayout object but that's not how it works now. I don't know if it's feasible to share the layout, probably VM would need to provide support for that otherwise we'd need to do a bunch of queries to figure out that 2 layouts are the same. And as long as some parts of the JIT IR still use CORINFO_CLASS_HANDLE sharing isn't an option anyway because we need to be able to recover the handle from the layout. But who knows, maybe in the future ClassLayout would represent a "true" layout.
  • The PPP quirk thing is dubious, I hope we can get rid of that at some point. Either by completely removing it (must be quite old and inherited form .NET Framework) or by implementing it in a different manner - setting a bit in LclVarDsc indicating that more stack space should be allocated to the variable. Increasing the size of the layout and lvExactSize sounds risky - what if the variable is LHS, would this increase mean that it copies more bytes from the RHS?
  • I've been looking at using ClassLayout in more places in the JIT IR (e.g. GT_INDEX) but decided to not include that in this PR to keep it smaller. The current set of changes (OBJ, PUTARGSTK, LclVarDsc) is due to the fact that that there's common code that involves all of them so it's easier and safer to change all of them at the same time).
  • Initially the GC info in ClassLayout was lazily initialized, that's more or less how OBJ did it. That turned out to be an assert farm, somehow I kept hitting code paths that needed GC info from layout objects that did not have it so I got rid of it. The fact that layout objects are shared between multiple OBJ nodes and lclvars makes highly unlikely that lazy initialization has any benefits.

lvaTable[shadowVar].lvVerTypeInfo = varDsc->lvVerTypeInfo;
if (varTypeIsStruct(type))
{
lvaTable[shadowVar].SetLayout(varDsc->GetLayout());
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this should not be needed, gsParamsToShadows calls lvaSetStruct so both lvVerTypeInfo and the layout would be set anyway. Question is why isn't lvaSetStruct simply called here, makes no sense to me to delay it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I"m a bit confused by your comment. You said that it calls lvaSetStruct, but it doesn't seem to. Did you mean that it should do so?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does call it but in a different place:

  • first it creates the variables, without calling lvaSetStruct
  • then it changes the IR to use the new variables (ugly because the IR will now temporarily be in an invalid state - the newly created variables may have not been fully setup without lvaSetStruct)
  • then it generates copies from the old variables to the new variables and here it also calls lvaSetStruct -
    CORINFO_CLASS_HANDLE clsHnd = varDsc->lvVerTypeInfo.GetClassHandle();
    // We don't need unsafe value cls check here since we are copying the params and this flag
    // would have been set on the original param before reaching here.
    lvaSetStruct(shadowVar, clsHnd, false);

    I don't see any reason not to call lvaSetStruct when the variables are created.

But this code is ancient. There are still traces of old struct handling - it wraps the lclvar nodes in ADDR nodes and then calls gtNewCpObjNode. Which will of course remove the ADDR nodes as they're not necessary.

A bunch of extra complication basically.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - I didn't look down far enough. It should probably be cleaned up. It's not clear how much it matters, but I agree that it's unwise to have the inconsistent state, even temporarily.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mikedn mikedn changed the title [WIP] Pull struct type info out of GenTreeObj Pull struct type info out of GenTreeObj Sep 4, 2019
@mikedn
Copy link
Author

mikedn commented Sep 4, 2019

I've added a few more comments and made some small tweaks to ClassLayout code. There's obviously a lot more than can be done with ClassLayout but this should be enough for one PR.

Copy link

@CarolEidt CarolEidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have one question, but I'm going to go ahead and merge this.

lvaTable[shadowVar].lvVerTypeInfo = varDsc->lvVerTypeInfo;
if (varTypeIsStruct(type))
{
lvaTable[shadowVar].SetLayout(varDsc->GetLayout());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I"m a bit confused by your comment. You said that it calls lvaSetStruct, but it doesn't seem to. Did you mean that it should do so?

@CarolEidt CarolEidt merged commit 9479f67 into dotnet:master Sep 5, 2019
@mikedn mikedn deleted the struct-layout branch September 28, 2019 19:09
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
* Introduce ClassLayout

* Delete unused getStructGcPtrsFromOp

* Use ClassLayout in GenTreeObj/Blk

* Use ClassLayout in LclVarDsc

* Remove layout info from GenTreePutArgStk

* Always initialze ClassLayout GC layout

* Make ClassLayout::GetGCPtrs private

* Restore genAdjustStackForPutArgStk asserts

* Comments

* Put layout related code in new files

* More comments and small tweaks


Commit migrated from dotnet/coreclr@9479f67
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants