Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[interp] Rework the allocation of offsets for variables #49072

Merged
merged 17 commits into from
Mar 12, 2021

Conversation

BrzVlad
Copy link
Member

@BrzVlad BrzVlad commented Mar 3, 2021

Before this change, the offset was determined by offset on the IL stack, limiting the optimizations that can be applied to these vars. Now we allocate them separately at the end of codegen.

This also optimizes the newobj opcodes.

Aside from the newobj optimization, this change should not have any meaningful performance impact.

@ghost
Copy link

ghost commented Mar 3, 2021

Tagging subscribers to this area: @BrzVlad
See info in area-owners.md if you want to be subscribed.

Issue Details

Before this change, the offset was determined by offset on the IL stack, limiting the optimizations that can be applied to these vars. Now we allocate them separately at the end of codegen.

This also optimizes the newobj opcodes.

Aside from the newobj optimization, this change should not have any meaningful performance impact.

Author: BrzVlad
Assignees: -
Labels:

area-Codegen-Interpreter-mono

Milestone: -

@SamMonoRT
Copy link
Member

Contributes to #47520

@BrzVlad BrzVlad force-pushed the feature-interp-local-offset-allocator branch 2 times, most recently from e6ddd53 to aaeed43 Compare March 3, 2021 20:48
Copy link
Member

@lambdageek lambdageek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm.

think about breaking up transform.c it's getting pretty massive

}

static void
initialize_global_var (TransformData *td, int var, int bb_index)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's time to think about moving some of the compiler pipeline out of transform.c?

Probably be enough to just have xyz.inline.c files and then #include "xyz.inline.c" into this file if you want to keep everything in one place.

}
}
td->total_locals_size = ALIGN_TO (final_total_locals_size, MINT_STACK_SLOT_SIZE);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

offset alloc lgtm

@BrzVlad BrzVlad force-pushed the feature-interp-local-offset-allocator branch from aaeed43 to 23c97db Compare March 4, 2021 12:06
@BrzVlad BrzVlad force-pushed the feature-interp-local-offset-allocator branch 3 times, most recently from b4b38aa to 576adc6 Compare March 10, 2021 08:43
BrzVlad added 17 commits March 10, 2021 20:12
Use it just for allocating an offset for a var, at the top of the locals space.
This will aid later optimizations, in order to easily detect the call args for an opcode. This is stored as a -1 terminated array of var indexes. Also change the structure of newobj_reg_map array, so it can reuse this format (newobj_reg_map should be killed at some point anyway).
Making it consistent with other calli opcodes and simplifies a little bit the code generation path.
Before this change, calls used to receive a single special dreg argument. This was resolved to an offset. At this offset, the call could find all the parameters and the return value was also written at the same offset. With this change we move towards having an explicit dreg return. For calls, the last sreg must be of the special type MINT_CALL_ARGS_SREG. The var offset allocator should ensure all call args are allocated one after the other and that this special reg type is resolved to the offset where these args reside.
…ases

This flag should only be relevant to the var offset allocator
This change aims to simplify the handling of vars during optimizations. Before this change we had different types of vars : managed locals, var residing on the execution stack, vars that are the argument of a call. Multiple restrictions applied to vars residing on the execution stack and to call args. Following this change, all vars share the same semantics during optimizations passes. At the very end, we allocate offsets for them and we will end up with 3 types of vars : global vars (used from multiple bblocks), local vars (used in a single bblock) and call arg vars. Call arg vars are always local.

The first step of the allocator is to detect all global vars and allocate offsets for them by doing a full iteration over the code. They will reside in the first section of the stack frame and they are allocated one after the other in the order they are detected. The param area (containing call arg vars) will have to be allocated after the local var space, otherwise a call would overwrite vars in the calling method. These vars are allocated for one basic block at a time.

For simple local vars we do an initial iteration over the bblock instructions and we set the liveness information for each referenced var (live_start and live_end). We will maintain a list of active vars and the current top of stack. As a var becomes alive we allocate it at the current offset and add it to the active_vars array. As a var becomes dead, we remove entries from the active_vars array and update the current top of stack, if space has been freed at the end of the stack.

For call args, because we must control the offset at which these vars are allocated, in the initial pass we generate MOVs from the var to a new local var, if the call arg was initially global. Afterwards, call arg vars are allocated in a similar manner to normal local vars. The space for them is tied to the param area of the call, so the entire space is allocated at once. A call become active when any of its args is first written. The liveness of the call ends when the actual call is done, at which point we resolve the offset of every arg relative to the start of the param area of the method. Once all normal local vars are allocated, we will compute the final offset of the call arg vars.
…degen

They are no longer needed. We generate offsets for every var at the very end.
The offset allocator is allocating the vars at the right offset in the param area. We also used `push_types` to add the arguments back on the stack, which was allocating new vars for each argument. We no longer do this, so newobj_reg_map is not needed anymore.
For object ctors, MINT_NEWOBJ_INLINED allocates an object which will be used both as a `this` arg to the ctor as well as the return var from the newobj operation.

For valuetype ctors, we need to first inform the var offset allocator that the valuetype exists before the MINT_NEWOBJ_VT_INLINED invocation, which will take its address, which will be used as `this` arg to the inline method. We also need to dummy use the valuetype, so it never dies before the ctor is inlined, otherwise `this` points to garbage. We use this def/dummy_use mechanism in order to avoid promoting the valuetype to a global var, as it happens with normal vars that have their address taken (via ldloca).
MINT_NEWOBJ* should not store into a local if the ctor might throw, because we set the return value before the ctor starts executing, and a guarding clause can see this variable as being set.
When passing an argument or returning a value from a method. The stack contents are not necessarily matching the signature type, in which case we add conversions.
This test was exceeding the stack limit even before the new offset allocator, it was just not reported.
@BrzVlad BrzVlad force-pushed the feature-interp-local-offset-allocator branch from 576adc6 to 7cb1406 Compare March 10, 2021 18:12
@BrzVlad BrzVlad merged commit 686f752 into dotnet:main Mar 12, 2021
radical added a commit to radical/runtime that referenced this pull request Mar 17, 2021
…et#49072)"

This reverts commit 686f752.

Reverting this for now, as a workaround to get the AOT library tests
working again.

Running library tests with AOT+EnableAggressiveTrimming broke with:

```
  info: Arguments: --run,WasmTestRunner.dll,System.Buffers.Tests.dll,-notrait,category=OuterLoop,-notrait,category=failing
  info: Initializing.....
  fail: System.AggregateException: AggregateException_ctor_DefaultMessage (Arg_NullReferenceException)
         ---> System.NullReferenceException: Arg_NullReferenceException
           at Xunit.Sdk.ReflectionAttributeInfo.GetNamedArgument[Int32](String argumentName)
        --- End of stack trace from previous location ---
           at Xunit.Sdk.ReflectionAttributeInfo.GetNamedArgument[Int32](String argumentName)
        --- End of stack trace from previous location ---
           at Xunit.Sdk.ReflectionAttributeInfo.GetNamedArgument[Int32](String argumentName)
           Exception_EndOfInnerExceptionStack
  info: Discovering: System.Buffers.Tests.dll (method display = ClassAndMethod, method display options = None)
  info: WASM EXIT 1
  fail: Application has finished with exit code TESTS_FAILED but 0 was expected
```

More info: dotnet#49770
@ghost ghost locked as resolved and limited conversation to collaborators Apr 11, 2021
@karelz karelz added this to the 6.0.0 milestone May 20, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants