Rename tiers and config options, add config option JitTier0ForLoops #23597
Conversation
I would split this into two PRs:
QuickJit is disabled by default, but if someone wanted to enable it in the startup tier for perf, I would still like the default behavior to be to disable QuickJit for methods that contain loops, to avoid exposing people to that issue. Enabling QuickJit for loops would be a more advanced/risky thing to do. So I'm thinking about including the loop detection and the switch from tier 0 to tier 1 in preview 4 in some fashion.
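To make the policy described above concrete, here is a minimal sketch of the intended decision. The helper names (`IsQuickJitEnabled`, `QuickJitForLoopsEnabled`, `MethodHasLoops`) are hypothetical stand-ins for the actual runtime config and loop detection, not the real CoreCLR implementation:

```cpp
#include <cstdio>

// Hypothetical config switches standing in for the runtime's tiering options.
bool IsQuickJitEnabled()       { return true;  } // e.g. opted in for startup perf
bool QuickJitForLoopsEnabled() { return false; } // off by default, per the discussion above
bool MethodHasLoops(const char* /*method*/) { return true; } // placeholder loop detection

// Returns true if the method should be quick-jitted (startup tier), false if it
// should go straight to the optimizing JIT.
bool ShouldQuickJit(const char* method)
{
    if (!IsQuickJitEnabled())
        return false;                                        // quick JIT disabled entirely
    if (MethodHasLoops(method) && !QuickJitForLoopsEnabled())
        return false;                                        // avoid cold-method-with-hot-loop stalls
    return true;
}

int main()
{
    printf("Quick-jit 'HotLoopMethod'? %s\n",
           ShouldQuickJit("HotLoopMethod") ? "yes" : "no");
}
```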
I agree that the loop detection is useful. I do not think that the implementation in this PR is good because it depends on exception handling for control flow. My suggestion to split this was to reduce risk for P4: getting the right defaults in place is high priority; improving the heuristics for optional QuickJit is lower priority.
Ok sounds good
I have separated the minimal changes into #23599 for preview 4, and rebased the changes here on top of that commit, so ignore the first commit in this PR.
Here are some perf results for startup time. I have excluded steady-state perf results because they don't have any significant diffs, and I have excluded cases with the cold-methods-with-hot-loops issue that are fully resolved by this change and #23599. Perf results:

- JitBench startup
- ASP.NET startup
- JitBench startup - R2R disabled
- ASP.NET startup - R2R disabled
@dotnet-bot test Ubuntu x64 Checked CoreFX Tests
Aside from the preview 4 / subsequent split, I would like to see the changes here that are not part of #23599 split into a pure renaming part and then the policy/logic changes. In the meantime:
Adding a set of tables that compare perf after this change directly against before this change, and against tiering disabled. Also added some steady-state perf numbers. Lower is better for all diffs.

- JitBench startup perf - time (ms)
- ASP.NET startup perf - time (ms), server start + first request
- JitBench startup perf - R2R disabled
- ASP.NET startup perf - R2R disabled - time (ms), server start + first request
- JitBench steady-state perf - time (ms)
- ASP.NET steady-state perf - time (ns) per request
Done, the second commit now has only the renames, and the commit titled "Functional changes" has the first set of functional changes.
The call to

Ends up here and call counting for the startup tier is disabled for the method:

So the method will not move up a tier. For the default
I think I will move the

Yes, in the future the JIT may decide to turn on opts at that point, maybe if the code would be much larger or if missed optimizations would be too expensive. If the tier 0 code is good enough and it would not be beneficial to rejit, then it can inform the VM in the same way and the VM will treat it as tier 1 code.
I was thinking that if we add a new tier between tier 0 and tier 1 in the future, that would be a mess. Giving each tier a name that reflects what it does would, I think, be good for code readability as well. I skipped renaming the JIT-side flags for now, as I'm not sure whether it would be preferred to call them tier 0 and tier 1. The VM-side tiers don't necessarily have to match JIT-side tiers (or flags). For example, there could be a "CallCountingTier" in the future where the entry point would be a stub that counts calls before going to the code entry point (just one possible implementation; it could be done in other ways).
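As a rough illustration of keeping VM-side tier names decoupled from JIT-side flags, here is a sketch; the tier names (including the speculative CallCountingTier mentioned above) and the flag values are hypothetical, not the actual CoreCLR definitions:

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical VM-side tiers, named for what they do rather than numbered,
// so a future tier (e.g. a call-counting stub tier) can be added without renumbering.
enum class NativeCodeTier { StartupTier, CallCountingTier, OptimizedTier };

// Hypothetical JIT-side flags; the VM-to-JIT mapping lives in one place.
enum JitFlags : uint32_t { JIT_FLAG_MIN_OPT = 0x1, JIT_FLAG_FULL_OPT = 0x2 };

uint32_t GetJitFlagsForTier(NativeCodeTier tier)
{
    switch (tier)
    {
    case NativeCodeTier::StartupTier:      return JIT_FLAG_MIN_OPT;
    case NativeCodeTier::CallCountingTier: return JIT_FLAG_MIN_OPT; // counting done by a stub, not by the JIT
    case NativeCodeTier::OptimizedTier:    return JIT_FLAG_FULL_OPT;
    }
    return JIT_FLAG_MIN_OPT;
}

int main()
{
    printf("OptimizedTier -> jit flags 0x%x\n",
           (unsigned)GetJitFlagsForTier(NativeCodeTier::OptimizedTier));
}
```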
That function is also called for recursive functions, so it's saying that a recursive function has a loop. I was intending to exclude those because they will build up call counts during the recursion to be promoted to the optimized tier.
I had similar thoughts -- at least for me the right analog in naming (at least internally) is the optimization level settings used by C/C++ compilers, generally a numeric scale where larger numbers imply more optimization & slower compilation. Instrumentation (call or block counts) I think would be a somewhat separate concern. Generally systems tend to instrument less optimized code, but it is not a hard and fast rule.
Perhaps -- but we might also blow the stack in the meantime. So I would be in favor of considering explicitly tail recursive methods as being methods with loops. And similar concerns for methods with explicit tail call sites. I think one of the things we're seeing with the issues coming up with Tier0 -- at least from the jit standpoint -- is that we don't have as good a handle as we'd like on what codegen features end up being "requirements" in release, and when we vary from what Tier1 does, we end up with surprises. If we're lucky most of the "required" things will actually be fairly cheap to implement; certainly the box-related opts and devirt fall into that category, as well as tail call opts.
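A minimal sketch of the suggestion above, treating explicit tail recursion and explicit tail call sites as loops for tiering purposes; the `ILMethod` shape and the helper are invented for illustration and are not the JIT's actual IL scanner:

```cpp
#include <cstdio>

// Hypothetical, simplified view of a method's IL, for illustration only.
struct ILMethod
{
    bool hasBackwardBranch;       // a branch to an earlier point in the IL stream
    bool hasExplicitTailCallSite; // a "tail."-prefixed call
    bool tailCallsSelf;           // explicit tail recursion
};

// Treat explicit tail recursion (and, conservatively, any explicit tail call
// site) the same as a loop, so such methods are not left as unoptimized code.
bool ShouldTreatAsHavingLoops(const ILMethod& m)
{
    return m.hasBackwardBranch || m.tailCallsSelf || m.hasExplicitTailCallSite;
}

int main()
{
    ILMethod tailRecursive{};
    tailRecursive.hasExplicitTailCallSite = true;
    tailRecursive.tailCallsSelf = true;
    printf("treat as having loops: %d\n", (int)ShouldTreatAsHavingLoops(tailRecursive));
}
```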
Yea good point, will fix. Should
Rebased on top of #23599, this PR begins from the commit titled "Rename tiers and config options, add config option QuickJitForLoops"
I would like to separate the concept of "optimization level" from "tier". Perhaps achieving that would involve some more renames, something along the lines of:
I think that will give some flexibility. For example, if we were to introduce instrumentation, that could be a separate tier. What do you think?
The jit needs to call

This recursive call detection in the importer comes too late to switch opt levels, as aspects of importer behavior depend on minopts (eg use of the box temp, possibly more). So if you want to detect this early you will need something customized.
I think instrumentation should be a flag (as it is now for IBC). My preference is generally for the runtime to specify intent and the jit to map that to behavior, so that on the jit side we have some level of control over the combinatorics possible in codegen. Whether that is called a tier or an opt level is less critical, but we should probably at the same time revamp the other "intent" style jit flags like
Ok, I feel that your opinion does not necessarily disagree with my proposal. I agree that there needs to be some sort of mapping between VM concepts and JIT concepts that makes sense. My motivation was mostly to separate those two things such that the VM can conceptualize things in a way that makes sense there, independently of JIT concepts. A component may do the mapping from VM concepts to JIT concepts, and the mapping from a VM concept could simply be an intent that the JIT would remap to specific behaviors. Does that align with your thinking?
I think so...? Once we have a concrete proposal we can see if we actually agree or not.
I'll take a stab at it, others feel free to weigh in
I'm fine with the rename, but if the goal is future-proofing, I'm both accepting and expecting that future changes to the functionality will probably result in more name changes. For example, if we introduced an interpreter in the future, the names would probably churn again, because in some places StartupTier refers to 'the 1st tier' and in other places it refers to 'the tier that has low-opt jitted code'.
I'm not sure what tiers would be referring to if they didn't refer to optimization levels? I do think there is value in separating instrumentation from optimization, but I would just label code with 2 variables: JitCompile(JITFLAG_INSTRUMENTATION_X | JITFLAG_OPTIMIZATION_TIER_Y).
I think as tiering matures we are likely to want a variety of instrumented options. For example in my tier2 experiment I had three that I treated distinctly (T0+low instrumentation, T1+low instrumentation, T1+high instrumentation). We could define a combo of instrumentation+optimization level as a tier and give each one a unique name, but personally I found it easier to program against the underlying two variables without trying to abstract it further.
👍
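For comparison, here is a sketch of the two modeling options discussed above: passing instrumentation and optimization level as two independent flag variables, versus naming each combination as a tier. All flag and tier names are illustrative and are not actual JIT interface flags:

```cpp
#include <cstdint>
#include <cstdio>

// Option A: two orthogonal variables combined into one flag word, along the
// lines of JitCompile(JITFLAG_INSTRUMENTATION_X | JITFLAG_OPTIMIZATION_TIER_Y).
enum : uint32_t
{
    JITFLAG_OPT_TIER0  = 0x01,
    JITFLAG_OPT_TIER1  = 0x02,
    JITFLAG_INSTR_LOW  = 0x10,
    JITFLAG_INSTR_HIGH = 0x20,
};

void JitCompile(uint32_t flags)
{
    printf("compiling with flags 0x%x\n", (unsigned)flags);
}

// Option B: give each instrumentation+optimization combination its own named tier
// (mirroring the three distinct combinations mentioned in the tier2 experiment).
enum class Tier { Tier0LowInstr, Tier1LowInstr, Tier1HighInstr };

uint32_t FlagsForTier(Tier t)
{
    switch (t)
    {
    case Tier::Tier0LowInstr:  return JITFLAG_OPT_TIER0 | JITFLAG_INSTR_LOW;
    case Tier::Tier1LowInstr:  return JITFLAG_OPT_TIER1 | JITFLAG_INSTR_LOW;
    case Tier::Tier1HighInstr: return JITFLAG_OPT_TIER1 | JITFLAG_INSTR_HIGH;
    }
    return JITFLAG_OPT_TIER0;
}

int main()
{
    JitCompile(JITFLAG_OPT_TIER1 | JITFLAG_INSTR_LOW); // option A: caller combines the two variables
    JitCompile(FlagsForTier(Tier::Tier1LowInstr));     // option B: the named tier implies the combination
}
```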
src/inc/corinfo.h (outdated)

```diff
@@ -832,7 +832,7 @@ enum CorInfoFlag
     CORINFO_FLG_INTRINSIC = 0x00400000, // This method MAY have an intrinsic ID
     CORINFO_FLG_CONSTRUCTOR = 0x00800000, // This method is an instance or type initializer
     CORINFO_FLG_AGGRESSIVE_OPT = 0x01000000, // The method may contain hot code and should be aggressively optimized if possible
-    // CORINFO_FLG_UNUSED = 0x02000000,
+    CORINFO_FLG_ALLOW_TIER0_TO_TIER1 = 0x02000000, // Indicates that for a tier 0 compilation request, the JIT may choose to switch to tier 1 if appropriate
```
I think of this slightly differently - that the runtime made a request for an indeterminate optimization level and the JIT resolved it to a specific optimization level based on some criteria. Although arguably this is just semantics, my concern about describing it this way is that new developers are likely to take flags/constants that say Tier0 at face value without necessarily noticing that there is a special case that converts 0 into 1. Using alternate terminology such as TIER0_OR_1 or UNRESOLVED_TIER (and not specifying TIER0) feels much harder to misunderstand.
The requested intent from the VM seems to be "JIT quickly". The JIT may choose to do so or not for various reasons. For example if the method only has a small loop and a large amount of code outside it may choose to optimize the loop and to not optimize code outside. In that case the VM would still have to consider that as unoptimized code. Or if the JIT finds that the tier 0 code would be too slow because of too many high-impact optimizations being missed, it may choose to generate fully optimized code, in which case the VM would consider that as optimized code. Maybe something like "PREFER_TIER0" would be clearer intent, or something more generic like "JIT_QUICKLY".
I think I'll leave the JIT config flags as they are for now. For this and the preview 4 PR I would like to start with some decent names for the config options. It may not be a big deal though if we have to rename them again.
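To illustrate the "intent" reading discussed in this sub-thread, here is a hypothetical sketch of a JIT resolving a "prefer tier 0, upgrade allowed" request to a concrete optimization level; the flag and function names are made up for illustration and do not reflect the actual corinfo interface:

```cpp
#include <cstdio>

// Hypothetical request flags from the runtime; names are illustrative only.
enum RequestFlags : unsigned { REQUEST_PREFER_TIER0 = 0x1, REQUEST_ALLOW_UPGRADE_TO_TIER1 = 0x2 };

// The concrete level the JIT ends up producing, reported back so the VM knows
// whether to treat the code as optimized or unoptimized.
enum class ResolvedTier { Tier0, Tier1 };

bool MethodHasLoops() { return true; } // placeholder for real analysis

ResolvedTier ResolveTier(unsigned requestFlags)
{
    if ((requestFlags & REQUEST_PREFER_TIER0) == 0)
        return ResolvedTier::Tier1;                  // full optimization was requested outright

    if ((requestFlags & REQUEST_ALLOW_UPGRADE_TO_TIER1) && MethodHasLoops())
        return ResolvedTier::Tier1;                  // upgrade, e.g. to avoid slow unoptimized loops

    return ResolvedTier::Tier0;
}

int main()
{
    ResolvedTier t = ResolveTier(REQUEST_PREFER_TIER0 | REQUEST_ALLOW_UPGRADE_TO_TIER1);
    printf("resolved to tier %d\n", t == ResolvedTier::Tier0 ? 0 : 1);
}
```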
Yea it could be done in different ways:
(1) may be easier to work with, but it kind of makes the tier irrelevant, and configuring the whole combination would be a bit challenging; for example, the call counting threshold for the interpreter and for quick JIT to transition to the next tier probably would not be the same. (2) makes it easy to tie config options to the tier, such as the call counting threshold and options for configuring modes of the interpreter, quick JIT, or instrumentation in that tier. Tiers would also map to code versions, so it would be easy to describe a code version with the tier name. I'm leaning towards (2) at the moment; with that, perhaps I would rename StartupTier to QuickJittedTier, and that would keep those config options more stable in the future. Thoughts?
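A small sketch of option (2), where each named tier carries its own configuration such as a call counting threshold; the tier names and values here are illustrative only, not the runtime's actual settings:

```cpp
#include <cstdio>

// Hypothetical per-tier configuration: each named tier owns its own settings,
// e.g. the call counting threshold for promotion to the next tier.
struct TierConfig
{
    const char* name;
    int         callCountThreshold; // calls before promotion; 0 = terminal tier
    bool        instrumented;
};

static const TierConfig g_tiers[] =
{
    { "InterpretedTier", 10, false }, // hypothetical future tier
    { "QuickJittedTier", 30, false },
    { "OptimizedTier",    0, false },
};

int main()
{
    for (const TierConfig& t : g_tiers)
        printf("%-16s threshold=%d instrumented=%d\n",
               t.name, t.callCountThreshold, (int)t.instrumented);
}
```

With this shape, adding an interpreter or an instrumented tier later would be a data change rather than a rename of the existing tiers.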
…T for loops

- Renamed tier 0 / tier 1 to StartupTier, OptimizedTier
- Added config option JitTier0ForLoops, which determines whether quick JIT, when enabled, may be used for methods that contain loops. Off by default, so after this change, QuickJit=1 would still not use quick JIT for methods that contain loops by default.
It looks like a tier 0 compilation is respecting explicit tail calls at least in some cases. I tried the following with QuickJit enabled.

IL:

```
.method private hidebysig static int32
CalculateSumWithTailCall(int32 n,
[opt] int32 sum) cil managed
{
.param [2] = int32(0x00000000)
// Code size 27 (0x1b)
.maxstack 8
IL_0000: ldarg.0
IL_0001: ldc.i4.0
IL_0002: bgt.s IL_0006
IL_0004: ldc.i4.0
IL_0005: ret
IL_0006: ldarg.1
IL_0007: ldarg.0
IL_0008: add
IL_0009: starg.s sum
IL_000b: ldarg.0
IL_000c: ldc.i4.1
IL_000d: bne.un.s IL_0011
IL_000f: ldarg.1
IL_0010: ret
IL_0011: ldarg.0
IL_0012: ldc.i4.1
IL_0013: sub
IL_0014: ldarg.1
IL_0015: tail. call int32 ExplicitTailCallNoSO::CalculateSumWithTailCall(int32,
int32)
IL_001a: ret
} // end of method ExplicitTailCallNoSO::CalculateSumWithTailCall
```

Effectively equivalent to the following, except with an explicit tail call:

```csharp
private static int CalculateSumWithTailCall(int n, int sum = 0)
{
if (n <= 0)
return 0;
sum += n;
if (n == 1)
return sum;
return CalculateSumWithTailCall(n - 1, sum);
}
```

```
; Assembly listing for method ExplicitTailCallNoSO:CalculateSumWithTailCall(int,int):int
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; Tier-0 compilation
; compiler->opts.MinOpts() is true
; rbp based frame
; fully interruptible
; Final local variable assignments
;
; V00 arg0 [V00 ] ( 1, 1 ) int -> [rbp+0x10]
; V01 arg1 [V01 ] ( 1, 1 ) int -> [rbp+0x18]
;# V02 OutArgs [V02 ] ( 1, 1 ) lclBlk ( 0) [rsp+0x00] "OutgoingArgSpace"
; V03 tmp1 [V03 ] ( 1, 1 ) int -> [rbp-0x04] "arg temp"
;
; Lcl frame size = 16
G_M30597_IG01:
55 push rbp
4883EC10 sub rsp, 16
488D6C2410 lea rbp, [rsp+10H]
894D10 mov dword ptr [rbp+10H], ecx
895518 mov dword ptr [rbp+18H], edx
G_M30597_IG02:
837D1000 cmp dword ptr [rbp+10H], 0
7F08 jg SHORT G_M30597_IG04
33C0 xor eax, eax
G_M30597_IG03:
488D6500 lea rsp, [rbp]
5D pop rbp
C3 ret
G_M30597_IG04:
8B4518 mov eax, dword ptr [rbp+18H]
034510 add eax, dword ptr [rbp+10H]
894518 mov dword ptr [rbp+18H], eax
837D1001 cmp dword ptr [rbp+10H], 1
7509 jne SHORT G_M30597_IG06
8B4518 mov eax, dword ptr [rbp+18H]
G_M30597_IG05:
488D6500 lea rsp, [rbp]
5D pop rbp
C3 ret
G_M30597_IG06:
8B4510 mov eax, dword ptr [rbp+10H]
FFC8 dec eax
8945FC mov dword ptr [rbp-04H], eax
8B45FC mov eax, dword ptr [rbp-04H]
894510 mov dword ptr [rbp+10H], eax
EBCA jmp SHORT G_M30597_IG02
```

Are there explicit tail call cases where a tier 0 compilation would overflow the stack and a tier 1 compilation would not?
Although methods like the above should be treated as though they contain a loop.
Fine by me. To whatever limited degree I can predict the future, I agree this tier name sounds less likely than others to need a future change.
Re QuickJittedTier: using that would probably also mean adding a separate tier, PrecompiledTier, such that precompiled code would not be in the QuickJittedTier. I think it would be useful anyway in PerfView to be able to distinguish time spent in precompiled vs quick-jitted code, although there may be other ways to get that info. There may be other benefits to distinguishing those. I figure if for a method the precompiled code is good enough to be considered fully optimized, it would instead be in the OptimizedTier.
Closing for now
Depends on #23599