
Compiler should emit access to auto-prop backing store for access in same compilation unit #15644

Closed
stephentoub opened this issue Dec 2, 2016 · 38 comments
Labels
Area-Compilers · Feature Request · Resolution-By Design (The behavior reported in the issue matches the current design)

Comments

@stephentoub
Member

stephentoub commented Dec 2, 2016

Version Used:
1.3.1.60616

Steps to Reproduce:
Compile:

class Test
{
    public static int SomeProp { get; } = 42;
    public static int Main() => SomeProp;
}

Expected Behavior:
Generates the IL for Main:

.method public hidebysig static int32  Main() cil managed
{
  .entrypoint
  // Code size       6 (0x6)
  .maxstack  8
  IL_0000:  ldsfld     int32 Test::<SomeProp>k__BackingField
  IL_0005:  ret
} // end of method Test::Main

Actual Behavior:
Generates the IL for Main:

.method public hidebysig static int32  Main() cil managed
{
  .entrypoint
  // Code size       6 (0x6)
  .maxstack  8
  IL_0000:  call       int32 Test::get_SomeProp()
  IL_0005:  ret
} // end of method Test::Main

The primary benefit here would be alleviating a bit of work for the JIT, which would otherwise have to inline the accessor, and removing any concern on the developer's part about whether that inlining actually happens. This could be done wherever the caller in the same compilation unit has visibility into the backing field.
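For illustration (not part of the original report), the requested lowering corresponds roughly to what one would write by hand if the backing store were an ordinary field; the field name below is a stand-in, since the real <SomeProp>k__BackingField is not a legal C# identifier:

class TestLowered
{
    // Stand-in for the compiler-generated backing field.
    private static readonly int s_someProp = 42;

    // The accessor is kept for callers outside the type / compilation unit.
    public static int SomeProp => s_someProp;

    // Inside the type, the proposal is to read the field directly (ldsfld)
    // instead of calling get_SomeProp.
    public static int Main() => s_someProp;
}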

@0xd4d

0xd4d commented Dec 2, 2016

This only works if the backing field is accessible to the caller. Compiler generated fields (at least generated by C# Roslyn) are private.
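As a quick check (a sketch, not from the thread), reflection shows the generated backing field and its accessibility; the "<SomeProp>k__BackingField" name matches the IL shown above:

using System;
using System.Reflection;

class Test
{
    public static int SomeProp { get; } = 42;
}

static class Program
{
    static void Main()
    {
        // Look up the compiler-generated backing field by its IL name.
        FieldInfo field = typeof(Test).GetField(
            "<SomeProp>k__BackingField",
            BindingFlags.NonPublic | BindingFlags.Static);

        Console.WriteLine(field.IsPrivate);      // True: callers outside Test can't use it
        Console.WriteLine(field.GetValue(null)); // 42
    }
}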

@stephentoub
Member Author

This only works if the backing field is accessible to the caller.

Yes. I'm suggesting that when they are accessible, this could be done, similar to how it's done for events.
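For context, a minimal sketch of the event analogy: inside the declaring type a field-like event compiles to direct accesses of the backing delegate field, while external subscribers go through the generated add/remove accessors (class names here are invented for illustration):

using System;

class Publisher
{
    // Field-like event: the compiler generates a private delegate field
    // plus add_Changed/remove_Changed accessors.
    public event EventHandler Changed;

    public void Raise()
    {
        // Inside the declaring type this reads the backing delegate field
        // directly (ldfld), not the accessors.
        Changed?.Invoke(this, EventArgs.Empty);
    }
}

class Demo
{
    static void Main()
    {
        var p = new Publisher();
        // Outside the declaring type, += compiles to a call to add_Changed.
        p.Changed += (s, e) => Console.WriteLine("changed");
        p.Raise();
    }
}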

@alrz
Member

alrz commented Dec 2, 2016

Also, for read-write auto-properties the compiler could directly write to the underlying field, if accessible.

EDIT: Even if the property isn't auto-implemented but its accessor is a simple field access, this could still be done.
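A hypothetical sketch of both cases (names invented for illustration):

class Counter
{
    // Auto-implemented read-write property: today Increment compiles to
    // call get_Count / call set_Count; the suggestion is that, inside this
    // type, it could become ldfld/stfld of <Count>k__BackingField instead.
    public int Count { get; set; }

    // Not auto-implemented, but the accessors are simple field accesses,
    // so the same lowering could apply under the extended suggestion.
    private static int s_total;
    public static int Total
    {
        get { return s_total; }
        set { s_total = value; }
    }

    public void Increment()
    {
        Count = Count + 1;
        Total = Total + 1;
    }
}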

@alrz
Member

alrz commented Dec 2, 2016

By the way, I wonder how much the compiler currently inlines on its own rather than depending on the JIT. For example, variables that are referenced only once, with no side effects in between, could be inlined. Also, it seems there is no optimization of loops. Are any of these being considered for the future, @stephentoub?

@gafter
Member

gafter commented Dec 2, 2016

The language specifies the behavior here, and it is visible in the debugger.

@alrz
Member

alrz commented Dec 2, 2016

debugger

Perhaps this should be done only in optimized builds.

@gafter
Member

gafter commented Dec 2, 2016

We do not generally do optimizations in the front end.

@stephentoub
Member Author

Of course we do; otherwise what's the difference between debug and release? And we should do more.

@gafter
Member

gafter commented Dec 2, 2016

The difference is that we omit debug information. And yes, I agree, we should omit more of it.

@stephentoub
Member Author

We do more than just that. We avoid hoisting locals that aren't needed. We use structs instead of classes where possible. We do const folding. We avoid boxing in some situations. Etc. If all of that is "debug info" then I'm not sure how a method call instead of a field access is any less "debug info".

@gafter
Member

gafter commented Dec 3, 2016

@stephentoub "avoiding hoisting" is avoiding a deoptimization that we sometimes perform to support debugging. Constant folding is required by the language specification, not something that we do for performance. Etc.

@stephentoub
Member Author

avoiding hoisting" is avoiding a deoptimization that we sometimes perform to support debugging

Oh come on. In C# 5, async methods lifted all locals to the state machine; then in C# 6 we got smarter and avoided lifting the ones we could get away without. You're saying that what we did in C# 6 was actually just undoing a purposeful deoptimization, not applying an optimization? :)

Etc

We use Array.Empty instead of new T[]. We replace char consts with string consts to choose a better overload of string.Concat. Those are both cases of avoiding deoptimizations for debugging?
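A sketch of the two codegen improvements being referenced (whether they kick in depends on compiler version, target framework, and /optimize; names are invented for illustration):

using System;

class Examples
{
    static void Log(params object[] items) { }

    static void Main()
    {
        // An empty params call can be lowered to Array.Empty<object>()
        // instead of allocating new object[0].
        Log();

        // The char constant can be treated as the string "!" so that a
        // string-only string.Concat overload is chosen, avoiding boxing.
        string s = "done" + '!';
        Console.WriteLine(s);
    }
}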

@gafter
Member

gafter commented Dec 3, 2016

We actually had to go out of our way to reintroduce the deoptimization around lifting in Roslyn when we noticed the degradation of the debug experience. The "optimization" was the natural behavior of lambda lowering.

If you write string.Concat with a character constant, we do not replace it with a different overload. You're thinking of the string + operator, whose translation is not specified.

We changed the language spec to permit the use of Array.Empty. It is not an optimization of the specified behavior.

@stephentoub
Member Author

You're thinking of the string + operator.

Yes, and I never said otherwise. That's what I was talking about. I'm the one who contributed that:
b932a6d

We actually had to go out of our way to reintroduce the deoptimization

As we do with most optimizations the compiler applies. They're disabled in debug, enabled in release.

The "optimization" was the natural behavior of lambda lowering.

That "natural behavior" required a lot of additional code. Are you saying that writing that extra code wasn't an act of adding an optimization?

We changed the language spec to permit the use of Array.Empty. It is not an optimization of the specified behavior.

So the optimization included updating the spec.

I'm really not sure at this point what we're discussing. The C# compiler gets code added to it to generate better IL than it did before. The process of adding such code is adding optimizations. You said that the C# compiler generally doesn't do optimizations, and I was countering that. Now, maybe you meant to refer to a specific class of optimizations; if so, that may well be true, but we should lift that arbitrary restriction.

@gafter
Member

gafter commented Dec 3, 2016

You're thinking of the string + operator.

Yes, and I never said otherwise. That's what I was talking about. I'm the one who contributed that:
b932a6d

Because it directly implements the specified language semantics, it isn't an optimization (except by comparison to a previous poor code gen strategy). It is just a better code-gen strategy. The optimization requested in this issue is to generate code different from that required by the language specification.

The "optimization" was the natural behavior of lambda lowering.

That "natural behavior" required a lot of additional code. Are you saying that writing that extra code wasn't an act of adding an optimization?

Yes, I'm saying that the act of writing the Roslyn C# compiler from scratch, including a straightforward lambda lowering based on the language specification's definition of lifted variable, was not an act of adding an optimization.

We changed the language spec to permit the use of Array.Empty. It is not an optimization of the specified behavior.

So the optimization included updating the spec.

I think we have to distinguish the kind of optimization requested in this issue, which is to generate code different from that required by the specification (even if it results in similar-enough semantics for most purposes), from the kinds of code improvements permitted by the specification. They are qualitatively different. I'm saying that we don't do the former.

@stephentoub
Member Author

Because it directly implements the specified language semantics, it isn't an optimization

So you're saying that something is only an "optimization" if it doesn't comply with the language spec? That doesn't make sense to me as part of the definition of the word.

(except by comparison to a previous poor code gen strategy). It is just a better code-gen strategy

Yes, that is a form of optimization.

I think we have to distinguish the kind of optimization requested in this issue

I agree with that. They're both "optimizations", and I believe you're just trying to draw a distinction between those optimizations permitted by the spec and those not permitted by the spec. I agree that if an optimization is explicitly prohibited by the spec, then we either shouldn't do it or should consider updating the spec to not prohibit it. There are plenty of possible optimizations that don't violate the spec, and I expect there are some which we should consider even if they do, and update the spec accordingly, on a case-by-case basis.

For reference, which part of the spec does this suggested change violate, and in what way does that violation cause problems if implemented only for release builds?

@CyrusNajmabadi
Member

Overall, I like this idea. My only concern would be if this were somehow observable, or could otherwise lead to a change in behavior that could reasonably affect someone. Say, for example, someone used some sort of IL rewriting tool to instrument their code or perform some other action. Now they'd see different behavior.

Of course, in that case, it still might be ok, with the caveat that that other tool might also need to be updated to handle this new behavior.

--

IMO, IL and the runtime are essentially implementation details. We use them to implement the semantics of the language. And as long as those semantics are preserved, we can do whatever we want there. For stuff that happens inside a method body, we have a lot of liberty in terms of what we generate. We will have to be paranoid in ensuring that this doesn't break some rule of the language**. But if it doesn't, then this seems like a good thing to do.

--

**
For example, if you read/wrote a property in an Expression<T>, we'd have to make sure that we didn't normalize that to a field read/write. In this case, the Expression<T> would be data describing your intent, usually for some other API to consume, so it would be inappropriate to change that.

These are the types of things I'd want us to be careful about.
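A minimal sketch of that concern: in an Expression<T> the property access is data, so it must survive as a MemberExpression over a PropertyInfo rather than being normalized to a field access:

using System;
using System.Linq.Expressions;
using System.Reflection;

class Test
{
    public static int SomeProp { get; } = 42;

    static void Main()
    {
        // The lambda body is a description of the access, not executable code.
        Expression<Func<int>> e = () => SomeProp;

        var member = (MemberExpression)e.Body;
        Console.WriteLine(member.Member is PropertyInfo); // True: still a property access
    }
}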

@mikedn

mikedn commented Dec 3, 2016

The primary benefit here would be alleviating a bit of work for the JIT, which would otherwise have to inline the accessor, and removing any concern on the developer's part about whether that inlining actually happens.

FWIW the JIT was/is considering adding some sort of fast path for trivial methods, that would probably take care of this. And of course, the JIT has the advantage that it can do it for fields that aren't normally accessible.

@gafter
Member

gafter commented Dec 5, 2016

The Roslyn compiler strives to produce good code, within the guidelines of the C# and VB language specifications. Optimizations that change the code to something different from the requirements of the specifications and that are visible in the IL are, as a matter of design policy, not done in the Roslyn front ends. While the suggested optimization is not observable to some kinds of observers (e.g. some runtime environments), it is visible to others (e.g. IL rewriters). We do not consider the requirements of the language specifications "arbitrary" constraints on the kinds of code improvements we permit.

It is possible that we will entertain optimizations beyond those permitted by this policy in the future, but at this time they are outside the scope of the Roslyn project.

@gafter gafter closed this as completed Dec 5, 2016
@gafter gafter added the Resolution-By Design label Dec 5, 2016
@mattwarren

FWIW the JIT was/is considering adding some sort of fast path for trivial methods, that would probably take care of this. And of course, the JIT has the advantage that it can do it for fields that aren't normally accessible.

@mikedn do you have a link to the issue that discusses that? I'd be interested to read more about it

@mikedn

mikedn commented Dec 12, 2016

@mattwarren See https://github.com/dotnet/coreclr/issues/7738 1.iii (and 1.ii to some extent)

@FransBouma

@gafter

It is possible that we will entertain optimizations beyond those permitted by this policy in the future, but at this time they are outside the scope of the Roslyn project.

It should be possible to do optimizations on a large(r) scale in the front end, but emit the result not as optimized IL but as hints for the JIT, so the JIT can make better decisions at runtime without having to do the analysis itself. In the front end one can afford to spend more time on deeper analysis of the code for optimization (much like a C++ compiler does), time that can't be spent at runtime when the JIT has to decide how to optimize a given piece of code. E.g. if deeper analysis showed that a given method is only ever called with a small range of values, the front end should be able to tell the JIT that. This requires that IL be adjusted to carry these hints, but it would give an opportunity to do more work up front to aid what happens at runtime: the IL emitted by the front end is the same, it just carries hints for how the JIT should optimize it.

C# compilation is fast, and that's nice, but when I ship code I wouldn't mind waiting 5 minutes for a deep analyzer to go over the IL and emit hints for the JIT so that the code runs much faster at runtime.

@gafter
Member

gafter commented Dec 12, 2016

@FransBouma Runtime compilers are better positioned to do that kind of deeper analysis and optimizations as well. One way that can occur, which is better than any hints from the C# front end, is a profile record from a "typical" run, which can be used to drive runtime optimizations at the most beneficial locations in the code.

@FransBouma

OK, like what HotSpot does, for example? You've got a point there :)

@migueldeicaza

@gafter It seems to me that this sort of optimization should be allowed and that any existing IL rewriter should already be able to cope with this sort of change. The requested change can clearly be expressed in C# using a different programming style, so IL rewriters should already handle this scenario.

class Test
{
    static internal int myBackingField = 42;
    public static int SomeProp { get { return myBackingField; } }
    public static int Main() => SomeProp;
}

@gafter
Member

gafter commented Dec 13, 2016

@migueldeicaza I don't know what you mean by "cope with". Do you mean that IL rewriters should assume that an access to the underlying field corresponds to an invocation of the accessor, and should therefore instrument it appropriately (e.g. by inserting code before and after it, for an AOP rewriter)? That seems particularly dangerous, so I suspect you mean something else, but I just don't know what it is you intend.

@migueldeicaza

I had not considered that scenario; my scenario of IL rewriting was something like a linker.

Perhaps the alternative is something like [NoInline] on the property when compiling with /optimize, for the benefit of AOP tools; I think those are a rare beast compared to the most common scenario, which is the JIT failing to inline a method due to the aggregate computed cost of the property access.
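The [NoInline] attribute above is hypothetical; the closest existing analog (at the JIT level rather than the IL level) is MethodImplOptions.NoInlining on the accessor, sketched here with invented names:

using System.Runtime.CompilerServices;

class Instrumented
{
    static int s_value = 42;

    public static int Value
    {
        // Keeps the JIT from inlining the getter; it does not change what
        // IL the C# compiler emits at call sites.
        [MethodImpl(MethodImplOptions.NoInlining)]
        get { return s_value; }
    }
}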

@gafter
Member

gafter commented Dec 14, 2016

@migueldeicaza I think requiring the user to add attributes to the program so that the compiler will obey the specification is backwards. The compiler should always* obey the specification. Moreover, the author of a piece of code doesn't always know how that code will be processed later.

*Plus or minus epsilon

@migueldeicaza

I was merely trying to accommodate people who are using external tools like the one you mention (an AOP weaver).

My take is that the job of our compilation tool chain is to take the code that the user wrote and execute it in the most optimal way. Some of the work needs to be done in the compiler, some in the JIT, some in the libraries.

I appreciate that Roslyn has a policy for what it should do for the sake of these external tools, but I believe this policy needs to be reversed to put reliability and performance at the front, with interop with other tools as a secondary objective, one where we could require either different compilation options (/optimize-) or the use of special compiler directives (like the attribute I proposed, which would be aligned with existing attributes such as AggressiveInline or NoInline).

And while I understand that AOP and assembly weaving are a thing, I do not believe they are mainstream, and they could work with the /optimize- flag.

@benaadams
Member

I don't know what you mean by "cope with"

I'd assumed it meant an optimization that can still be expressed in the original language rather than only in IL; therefore IL tooling should be able to deal with it as an expected output of code.

There does seem to be an optimization gap. The JIT can only apply so many optimizations, as it is time-constrained at runtime. The compile-to-IL compilers, which aren't as time-constrained, have a lot of optimizations considered out of scope.

Do we need a third compiler between Roslyn and the JIT that optimizes the IL as part of the regular compile or a "publish" compile, rather than as a specialist tool?

@gafter
Member

gafter commented Dec 15, 2016

@migueldeicaza We already put reliability as a top priority. The toolchain following the front end is currently responsible for any additional improved performance beyond the program as written.

@benaadams I think an "IL optimizer" would be a better solution to this problem than modifying every compiler.

@benaadams
Member

@gafter haven't checked, but I'd assumed all the language compilers flowed into the same il emit code rather than having different variants per language?

Opened issue #15929

@mikedn

mikedn commented Dec 15, 2016

and I believe that this policy needs to be reversed to put reliability and performance at the front

I don't quite understand what reliability has to do with this. You have the current C# compiler, which can have bugs; you have the current JIT compiler, which can also have bugs; the proposal here is to add more code to the C# compiler, code that overlaps with JIT code in terms of functionality but can have its own bugs. That doesn't sound like a recipe for reliability to me.

The jit can only apply so many optimizations as it is time constrained at runtime

As already mentioned before, the optimization discussed here can be done efficiently in the JIT. It would be pretty trivial for the JIT to recognize trivial property getter/setter method bodies. And it has to do it anyway to deal with cases that the C# compiler can't handle (properties outside the project, properties in other classes).

The compile to il compilers; which aren't as time constrained have a lot of optimizations considered out of scope.

And a lot of optimization that they can't do for various reasons - lack of access to all code, inability to express the result of optimizations in (verifiable) IL, lack of knowledge about the target architecture, possibility of producing IL code that the JIT handles poorly because it doesn't expect normal managed compilers to produce such code etc.

Do we need a 3rd compiler between roslyn and jit that optimizes the il as part of the regular compile or a "publish" compile; rather than as a specialist tool?

Feel free to write one and prove that it is needed 😄. There's some useful things such a tool could do (one was mentioned in past CoreCLR discussions - outlining throw blocks). But there's also a good chance that it will turn out to be a complete waste of time.

@benaadams
Member

possibility of producing IL code that the JIT handles poorly because it doesn't expect normal managed compilers to produce such code etc.

Which is why what's suggested would likely need to be a collaboration between the JIT and Roslyn teams in #15929.

And a lot of optimization that they can't do for various reasons - lack of access to all code

Feel free to write one and prove that it is needed 😄. There's some useful things such a tool could do (one was mentioned in past CoreCLR discussions - outlining throw blocks). But there's also a good chance that it will turn out to be a complete waste of time.

Mono/Xamarin linker might be a starting point...

@migueldeicaza

@gafter I did not mean to imply that Roslyn did not make reliability a top priority. When I was writing the sentence I thought it would make sense to list the two big ticket items, one clearly done, one being part of the policy that I want to alter.

@migueldeicaza

@mikedn said:

I don't quite understand what reliability has to do with this. You have the current C# compiler, which can have bugs; you have the current JIT compiler, which can also have bugs; the proposal here is to add more code to the C# compiler, code that overlaps with JIT code in terms of functionality but can have its own bugs. That doesn't sound like a recipe for reliability to me.

Reliability is just another attribute of what I expect in a compiler; as stated in my previous comment, Roslyn does quite well in this regard.

There is no "recipe" for reliability, and most definitely there is not one for a codebase with four million lines of code, so I am at a loss for what to respond to this.

What I can say is that testing compilers is a much simpler problem than testing interactive applications, so I am confident that whatever change we make to the compiler to produce better code can be adequately tested to guarantee the reliability of the optimization.

@mikedn

mikedn commented Dec 15, 2016

There is no "recipe" for reliability, and most definitely there is not one for a codebase with four million lines of code, so I am at a loss for what to respond to this.

Sure there is. Don't add more code that duplicates existing functionality.

What I can say is that testing compilers is a much simpler problem than testing interactive applications, so I am confident that whatever change we make to the compiler to produce better code can be adequately tested to guarantee the reliability of the optimization.

Having just run into an optimization bug that's present in both the JIT and the VS 2015 C# compiler, I'm not so sure about that. More generally, having seen all kinds of bugs in various compilers over time, I can only guess that you either have no clue what you are talking about or you're simply too optimistic.

@gafter
Member

gafter commented Dec 16, 2016

@benaadams Re "I'd assumed all the language compilers flowed into the same il emit code rather than having different variants per language?"

That is not correct. The bound nodes from which we emit are language-specific, and so the emit phase is language-specific. We emit directly into byte arrays. So optimization at the IL level would not benefit from placement in the Roslyn front ends.
