[interp] Rework the allocation of offsets for variables #49072

BrzVlad · 2021-03-03T18:41:45Z

Before this change, the offset of a var was determined by its offset on the IL stack, which limited the optimizations that could be applied to these vars. Now we allocate the offsets separately at the end of codegen.

This also optimizes the newobj opcodes.

Aside from the newobj optimization, this change should not have any meaningful performance impact.

SamMonoRT · 2021-03-03T18:49:52Z

Contributes to #47520

lambdageek

lgtm.

Think about breaking up transform.c; it's getting pretty massive.

lambdageek · 2021-03-04T01:51:37Z

src/mono/mono/mini/interp/transform.c

+}
+
+static void
+initialize_global_var (TransformData *td, int var, int bb_index)


Maybe it's time to think about moving some of the compiler pipeline out of transform.c?

It would probably be enough to just have xyz.inline.c files and then #include "xyz.inline.c" into this file if you want to keep everything in one place.

lambdageek · 2021-03-04T02:01:52Z

src/mono/mono/mini/interp/transform.c

+		}
+	}
+	td->total_locals_size = ALIGN_TO (final_total_locals_size, MINT_STACK_SLOT_SIZE);
+}


offset alloc lgtm

This will aid later optimizations, in order to easily detect the call args for an opcode. This is stored as a -1 terminated array of var indexes. Also change the structure of newobj_reg_map array, so it can reuse this format (newobj_reg_map should be killed at some point anyway).
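As a rough illustration of the -1 terminated format mentioned above, the following sketch walks such an array of var indexes. The helper names `sketch_count_call_args` and `sketch_is_call_arg` are hypothetical and are not taken from transform.c; they only show how a consumer might scan the array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from transform.c): counts the var indexes
 * stored in a -1 terminated call args array. */
static int
sketch_count_call_args (const int *call_args)
{
	int n = 0;

	assert (call_args != NULL);
	while (call_args [n] != -1)
		n++;
	return n;
}

/* Hypothetical helper (not from transform.c): returns 1 if var is one
 * of the call args, 0 otherwise. */
static int
sketch_is_call_arg (const int *call_args, int var)
{
	assert (call_args != NULL);
	for (int i = 0; call_args [i] != -1; i++) {
		if (call_args [i] == var)
			return 1;
	}
	return 0;
}
```

Because the terminator is part of the array, an optimization pass can detect the args of a call without carrying a separate length field.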

Before this change, calls used to receive a single special dreg argument. This was resolved to an offset. At this offset, the call could find all the parameters and the return value was also written at the same offset. With this change we move towards having an explicit dreg return. For calls, the last sreg must be of the special type MINT_CALL_ARGS_SREG. The var offset allocator should ensure all call args are allocated one after the other and that this special reg type is resolved to the offset where these args reside.
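A minimal sketch of the contiguity requirement described above, assuming an 8-byte stack slot: the args of one call are laid out one after the other, and the offset returned is what the special call args sreg would resolve to. `SketchCallArg`, `sketch_layout_call_args`, and `SKETCH_SLOT_SIZE` are hypothetical names used for illustration only, not the interpreter's actual types.

```c
#include <assert.h>

/* Assumed slot size, for illustration only. */
#define SKETCH_SLOT_SIZE 8
#define SKETCH_ALIGN_TO(val, align) (((val) + (align) - 1) & ~((align) - 1))

/* Hypothetical stand-in for a call arg var. */
typedef struct {
	int size;   /* size of the var in bytes */
	int offset; /* filled in by the layout function */
} SketchCallArg;

/* Places the args of a single call one after the other, starting at
 * param_area_start, and returns the offset where the args reside. */
static int
sketch_layout_call_args (SketchCallArg *args, int n_args, int param_area_start)
{
	int offset = param_area_start;

	assert (param_area_start % SKETCH_SLOT_SIZE == 0);
	for (int i = 0; i < n_args; i++) {
		args [i].offset = offset;
		offset += SKETCH_ALIGN_TO (args [i].size, SKETCH_SLOT_SIZE);
	}
	return param_area_start;
}
```

In this sketch the return value of the call would be written to a separate dreg offset, rather than back at the start of the args as before.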

This change aims to simplify the handling of vars during optimizations. Before this change we had different types of vars: managed locals, vars residing on the execution stack, and vars that are the argument of a call. Multiple restrictions applied to vars residing on the execution stack and to call args. Following this change, all vars share the same semantics during optimization passes. At the very end, we allocate offsets for them and end up with 3 types of vars: global vars (used from multiple bblocks), local vars (used in a single bblock) and call arg vars. Call arg vars are always local.

The first step of the allocator is to detect all global vars and allocate offsets for them by doing a full iteration over the code. They reside in the first section of the stack frame and are allocated one after the other in the order they are detected. The param area (containing call arg vars) has to be allocated after the local var space, otherwise a call would overwrite vars in the calling method.

Non-global vars are allocated one basic block at a time. For simple local vars we do an initial iteration over the bblock instructions and set the liveness information for each referenced var (live_start and live_end). We maintain a list of active vars and the current top of stack. As a var becomes alive we allocate it at the current offset and add it to the active_vars array. As a var becomes dead, we remove entries from the active_vars array and update the current top of stack, if space has been freed at the end of the stack.

For call args, because we must control the offset at which these vars are allocated, in the initial pass we generate MOVs from the var to a new local var, if the call arg was initially global. Afterwards, call arg vars are allocated in a similar manner to normal local vars. The space for them is tied to the param area of the call, so the entire space is allocated at once. A call becomes active when any of its args is first written. The liveness of the call ends when the actual call is done, at which point we resolve the offset of every arg relative to the start of the param area of the method. Once all normal local vars are allocated, we compute the final offset of the call arg vars.
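The allocation of simple local vars within one bblock can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code from transform.c: `SketchVar`, `sketch_alloc_local_offsets`, the 8-byte slot size, and the fixed-size active array are all hypothetical, and liveness (live_start, live_end) is assumed to have been computed already. Vars born at an instruction are allocated before vars dying at that instruction are released, which is a conservative choice of this sketch.

```c
#include <assert.h>

#define SKETCH_MAX_VARS 16
#define SKETCH_SLOT 8 /* assumed slot size, for illustration only */
#define SKETCH_ALIGN(val, align) (((val) + (align) - 1) & ~((align) - 1))

/* Hypothetical stand-in for a local var with precomputed liveness. */
typedef struct {
	int live_start; /* index of the first instruction referencing the var */
	int live_end;   /* index of the last instruction referencing the var */
	int size;       /* size in bytes */
	int offset;     /* assigned by the allocator */
} SketchVar;

/* Assigns offsets to the vars of one bblock, starting at base.
 * Returns the highest top of stack reached. */
static int
sketch_alloc_local_offsets (SketchVar *vars, int n_vars, int n_ins, int base)
{
	int active [SKETCH_MAX_VARS]; /* var indexes ordered by offset, -1 = dead */
	int n_active = 0;
	int top = base;
	int max_top = base;

	assert (n_vars <= SKETCH_MAX_VARS);
	for (int ins = 0; ins < n_ins; ins++) {
		/* A var becoming alive is allocated at the current top of stack. */
		for (int v = 0; v < n_vars; v++) {
			if (vars [v].live_start != ins)
				continue;
			vars [v].offset = top;
			top += SKETCH_ALIGN (vars [v].size, SKETCH_SLOT);
			active [n_active++] = v;
			if (top > max_top)
				max_top = top;
		}
		/* Mark the vars whose liveness ends at this instruction. */
		for (int i = 0; i < n_active; i++) {
			if (active [i] != -1 && vars [active [i]].live_end == ins)
				active [i] = -1;
		}
		/* Space is only reclaimed when it was freed at the end of the stack. */
		while (n_active > 0 && active [n_active - 1] == -1)
			n_active--;
		if (n_active > 0) {
			SketchVar *last = &vars [active [n_active - 1]];
			top = last->offset + SKETCH_ALIGN (last->size, SKETCH_SLOT);
		} else {
			top = base;
		}
	}
	return max_top;
}
```

With three vars whose live ranges are [0,3], [1,2] and [3,4], the second and third var end up sharing the same offset in this sketch, because the second one is dead before the third becomes alive.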

The offset allocator now allocates the vars at the right offset in the param area. Previously we also used `push_types` to add the arguments back on the stack, which allocated new vars for each argument. We no longer do this, so newobj_reg_map is not needed anymore.

For object ctors, MINT_NEWOBJ_INLINED allocates an object which is used both as the `this` arg to the ctor and as the return var of the newobj operation. For valuetype ctors, we first need to inform the var offset allocator that the valuetype exists before the MINT_NEWOBJ_VT_INLINED invocation, which takes its address; that address is then used as the `this` arg to the inlined method. We also need a dummy use of the valuetype, so that it never dies before the ctor is inlined; otherwise `this` would point to garbage. We use this def/dummy_use mechanism in order to avoid promoting the valuetype to a global var, as happens with normal vars that have their address taken (via ldloca).

BrzVlad · 2021-03-12T09:12:04Z

As expected, this doesn't have a major performance impact, though it does tend to improve performance: https://pvscmdupload.blob.core.windows.net/drewtest/report_Daily_ca%3Dx64_cb%3Drefs-heads-vlbrez-interp-offset-allocator_co%3DUbuntu1804_cr%3Ddotnetruntime_cc%3DCompilationMode%3Dtiered-LLVM%3Dfalse-MonoAOT%3Dfalse-MonoInterpreter%3Dtrue-RunKind%3Dmicro_mono_bb%3Drefs-heads-main_2021-03-11.html

…et#49072)"

This reverts commit 686f752.

Reverting this for now, as a workaround to get the AOT library tests working again. Running library tests with AOT+EnableAggressiveTrimming broke with:

```
info: Arguments: --run,WasmTestRunner.dll,System.Buffers.Tests.dll,-notrait,category=OuterLoop,-notrait,category=failing
info: Initializing.....
fail: System.AggregateException: AggregateException_ctor_DefaultMessage (Arg_NullReferenceException)
 ---> System.NullReferenceException: Arg_NullReferenceException
   at Xunit.Sdk.ReflectionAttributeInfo.GetNamedArgument[Int32](String argumentName)
--- End of stack trace from previous location ---
   at Xunit.Sdk.ReflectionAttributeInfo.GetNamedArgument[Int32](String argumentName)
--- End of stack trace from previous location ---
   at Xunit.Sdk.ReflectionAttributeInfo.GetNamedArgument[Int32](String argumentName)
   Exception_EndOfInnerExceptionStack
info: Discovering: System.Buffers.Tests.dll (method display = ClassAndMethod, method display options = None)
info: WASM EXIT 1
fail: Application has finished with exit code TESTS_FAILED but 0 was expected
```

More info: dotnet#49770

BrzVlad requested a review from vargaz as a code owner March 3, 2021 18:41

dotnet-issue-labeler bot added the area-Codegen-Interpreter-mono label Mar 3, 2021

BrzVlad force-pushed the feature-interp-local-offset-allocator branch 2 times, most recently from e6ddd53 to aaeed43 Compare March 3, 2021 20:48

lambdageek approved these changes Mar 4, 2021

BrzVlad force-pushed the feature-interp-local-offset-allocator branch from aaeed43 to 23c97db Compare March 4, 2021 12:06

runfoapp bot mentioned this pull request Mar 4, 2021

System.Threading.Tasks.Tests timed out on net5.0-Linux-Debug-arm64-Mono_release #42024

Closed

BrzVlad force-pushed the feature-interp-local-offset-allocator branch 3 times, most recently from b4b38aa to 576adc6 Compare March 10, 2021 08:43

BrzVlad added 17 commits March 10, 2021 20:12

[interp] Cleanup get_interp_local_offset

d3e00b4

Use it just for allocating an offset for a var, at the top of the locals space.

[interp] Pass target_ip in a normal var to MINT_CALLI_NAT_FAST

7f717e9

This makes it consistent with other calli opcodes and simplifies the code generation path a little.

[interp] Remove call args flag from code generation / optimization ph…

9e62315

…ases This flag should only be relevant to the var offset allocator

[interp] Improve dumping for call instructions

5db91b7

[interp] Fix var type of valuetype this

4ebc821

[interp] Re-enable copy propagation

0985aa8

[interp] Rename MINT_NEWOBJ opcodes

10de448

[interp] Disable tracking of offsets on the execution stack during co…

8d23d86

…degen They are no longer needed. We generate offsets for every var at the very end.

[interp] Avoid optimization if newobj is guarded

f18dd20

MINT_NEWOBJ* should not store into a local if the ctor might throw, because we set the return value before the ctor starts executing, and a guarding clause can see this variable as being set.

[interp] Refactor the active vars code a bit

8ef7f96

[interp] Add missing implicit conversion

a30b921

When passing an argument or returning a value from a method, the stack contents do not necessarily match the signature type, in which case we add conversions.

[interp] Disable test using excessive stack space

7cb1406

This test was exceeding the stack limit even before the new offset allocator; it was just not reported.

BrzVlad force-pushed the feature-interp-local-offset-allocator branch from 576adc6 to 7cb1406 Compare March 10, 2021 18:12

vargaz approved these changes Mar 12, 2021

BrzVlad merged commit 686f752 into dotnet:main Mar 12, 2021

runfoapp bot mentioned this pull request Mar 12, 2021

[tests] System.Text.Json.Tests segfault, for Libraries Test Run release coreclr OSX x64 Release #47805

Closed

tqiu8 mentioned this pull request Mar 17, 2021

[wasm][AOT] Xunit.Sdk.ReflectionAttributeInfo.GetNamedArgument Exception #49770

Closed

radical mentioned this pull request Mar 17, 2021

[wasm] Revert commit to fix library AOT tests #49784

Closed

This was referenced Mar 22, 2021

Native Asset failure while System.Collections.Concurrent.Tests #48614

Closed

RunContinueWithStressTestsNoState timing out in CI #2271

Closed

System.Collections.Concurrent.Tests crashing in CI #45517

Closed

ghost locked as resolved and limited conversation to collaborators Apr 11, 2021

karelz added this to the 6.0.0 milestone May 20, 2021
