-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[interp] Rework the allocation of offsets for variables #49072
[interp] Rework the allocation of offsets for variables #49072
Conversation
Tagging subscribers to this area: @BrzVlad Issue DetailsBefore this change, the offset was determined by offset on the IL stack, limiting the optimizations that can be applied to these vars. Now we allocate them separately at the end of codegen. This also optimizes the newobj opcodes. Aside from the newobj optimization, this change should not have any meaningful performance impact.
|
Contributes to #47520 |
e6ddd53
to
aaeed43
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm.
think about breaking up transform.c it's getting pretty massive
} | ||
|
||
static void | ||
initialize_global_var (TransformData *td, int var, int bb_index) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it's time to think about moving some of the compiler pipeline out of transform.c?
Probably be enough to just have xyz.inline.c
files and then #include "xyz.inline.c"
into this file if you want to keep everything in one place.
} | ||
} | ||
td->total_locals_size = ALIGN_TO (final_total_locals_size, MINT_STACK_SLOT_SIZE); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
offset alloc lgtm
aaeed43
to
23c97db
Compare
b4b38aa
to
576adc6
Compare
Use it just for allocating an offset for a var, at the top of the locals space.
This will aid later optimizations, in order to easily detect the call args for an opcode. This is stored as a -1 terminated array of var indexes. Also change the structure of newobj_reg_map array, so it can reuse this format (newobj_reg_map should be killed at some point anyway).
Making it consistent with other calli opcodes and simplifies a little bit the code generation path.
Before this change, calls used to receive a single special dreg argument. This was resolved to an offset. At this offset, the call could find all the parameters and the return value was also written at the same offset. With this change we move towards having an explicit dreg return. For calls, the last sreg must be of the special type MINT_CALL_ARGS_SREG. The var offset allocator should ensure all call args are allocated one after the other and that this special reg type is resolved to the offset where these args reside.
…ases This flag should only be relevant to the var offset allocator
This change aims to simplify the handling of vars during optimizations. Before this change we had different types of vars : managed locals, var residing on the execution stack, vars that are the argument of a call. Multiple restrictions applied to vars residing on the execution stack and to call args. Following this change, all vars share the same semantics during optimizations passes. At the very end, we allocate offsets for them and we will end up with 3 types of vars : global vars (used from multiple bblocks), local vars (used in a single bblock) and call arg vars. Call arg vars are always local. The first step of the allocator is to detect all global vars and allocate offsets for them by doing a full iteration over the code. They will reside in the first section of the stack frame and they are allocated one after the other in the order they are detected. The param area (containing call arg vars) will have to be allocated after the local var space, otherwise a call would overwrite vars in the calling method. These vars are allocated for one basic block at a time. For simple local vars we do an initial iteration over the bblock instructions and we set the liveness information for each referenced var (live_start and live_end). We will maintain a list of active vars and the current top of stack. As a var becomes alive we allocate it at the current offset and add it to the active_vars array. As a var becomes dead, we remove entries from the active_vars array and update the current top of stack, if space has been freed at the end of the stack. For call args, because we must control the offset at which these vars are allocated, in the initial pass we generate MOVs from the var to a new local var, if the call arg was initially global. Afterwards, call arg vars are allocated in a similar manner to normal local vars. The space for them is tied to the param area of the call, so the entire space is allocated at once. A call become active when any of its args is first written. The liveness of the call ends when the actual call is done, at which point we resolve the offset of every arg relative to the start of the param area of the method. Once all normal local vars are allocated, we will compute the final offset of the call arg vars.
…degen They are no longer needed. We generate offsets for every var at the very end.
The offset allocator is allocating the vars at the right offset in the param area. We also used `push_types` to add the arguments back on the stack, which was allocating new vars for each argument. We no longer do this, so newobj_reg_map is not needed anymore.
For object ctors, MINT_NEWOBJ_INLINED allocates an object which will be used both as a `this` arg to the ctor as well as the return var from the newobj operation. For valuetype ctors, we need to first inform the var offset allocator that the valuetype exists before the MINT_NEWOBJ_VT_INLINED invocation, which will take its address, which will be used as `this` arg to the inline method. We also need to dummy use the valuetype, so it never dies before the ctor is inlined, otherwise `this` points to garbage. We use this def/dummy_use mechanism in order to avoid promoting the valuetype to a global var, as it happens with normal vars that have their address taken (via ldloca).
MINT_NEWOBJ* should not store into a local if the ctor might throw, because we set the return value before the ctor starts executing, and a guarding clause can see this variable as being set.
When passing an argument or returning a value from a method. The stack contents are not necessarily matching the signature type, in which case we add conversions.
This test was exceeding the stack limit even before the new offset allocator, it was just not reported.
576adc6
to
7cb1406
Compare
As expected, this doesn't have an invasive performance impact, with a tendency however to improve performance https://pvscmdupload.blob.core.windows.net/drewtest/report_Daily_ca%3Dx64_cb%3Drefs-heads-vlbrez-interp-offset-allocator_co%3DUbuntu1804_cr%3Ddotnetruntime_cc%3DCompilationMode%3Dtiered-LLVM%3Dfalse-MonoAOT%3Dfalse-MonoInterpreter%3Dtrue-RunKind%3Dmicro_mono_bb%3Drefs-heads-main_2021-03-11.html |
…et#49072)" This reverts commit 686f752. Reverting this for now, as a workaround to get the AOT library tests working again. Running library tests with AOT+EnableAggressiveTrimming broke with: ``` info: Arguments: --run,WasmTestRunner.dll,System.Buffers.Tests.dll,-notrait,category=OuterLoop,-notrait,category=failing info: Initializing..... fail: System.AggregateException: AggregateException_ctor_DefaultMessage (Arg_NullReferenceException) ---> System.NullReferenceException: Arg_NullReferenceException at Xunit.Sdk.ReflectionAttributeInfo.GetNamedArgument[Int32](String argumentName) --- End of stack trace from previous location --- at Xunit.Sdk.ReflectionAttributeInfo.GetNamedArgument[Int32](String argumentName) --- End of stack trace from previous location --- at Xunit.Sdk.ReflectionAttributeInfo.GetNamedArgument[Int32](String argumentName) Exception_EndOfInnerExceptionStack info: Discovering: System.Buffers.Tests.dll (method display = ClassAndMethod, method display options = None) info: WASM EXIT 1 fail: Application has finished with exit code TESTS_FAILED but 0 was expected ``` More info: dotnet#49770
Before this change, the offset was determined by offset on the IL stack, limiting the optimizations that can be applied to these vars. Now we allocate them separately at the end of codegen.
This also optimizes the newobj opcodes.
Aside from the newobj optimization, this change should not have any meaningful performance impact.