-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Alternative System.Lazy implementation #8963
Conversation
Hi @manofstick, I'm your friendly neighborhood .NET Foundation Pull Request Bot (You can call me DNFBOT). Thanks for your contribution! TTYL, DNFBOT; |
@manofstick, Thanks for signing the contribution license agreement so quickly! Actual humans will now validate the agreement and then evaluate the PR. |
@manofstick - I have updated the charts to include your updated https://gist.github.com/mrange/aa82d33e94ad76d0f33ed86e704d7492 |
src/mscorlib/src/System/Lazy.cs
Outdated
|
||
private static class CreateInstance | ||
{ | ||
private static T construct() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comment: Construct
?
src/mscorlib/src/System/Lazy.cs
Outdated
/// </summary> | ||
private static object GetObjectFromMode(LazyThreadSafetyMode mode) | ||
/// <param name="factory">The object factory used to create the underlying object</param> | ||
/// <param name="forceMemoryBarrier">true when called with ExecutionAndPublication, false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: No parameter named forceMemoryBarrier
?
src/mscorlib/src/System/Lazy.cs
Outdated
private T CreateValuePublicationOnly(Func<T> factory, PublicationOnly comparand) | ||
{ | ||
var possibleValue = factory(); | ||
lock (comparand) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I find a bit confusing that the lock for PublicationOnly
happens inside the Create*
method but for ExecutionAndPublication
the caller holds the lock. Is there a reason the lock can't be moved into CreateValueExecutionAndPublication
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I agree with the asymmetry with the lock, but I need to call TakeFactory to null out the factory object in the case of ExecutionAndPublication. Anyway I have a plan to remove the lock from PublicationOnly, so that should make it better!
src/mscorlib/src/System/Lazy.cs
Outdated
/// <returns>The underlying object</returns> | ||
private T CreateValuePublicationOnly(Func<T> factory, PublicationOnly comparand) | ||
{ | ||
var possibleValue = factory(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was hoping that PublicationOnly
only could get away for with an Interlock.CompareAndExchange
over a full lock. What are the reasons a full lock is needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You've shamed me into it! I've thought of a cunning plan... soon, soon...
src/mscorlib/src/System/Lazy.cs
Outdated
|
||
// Fall through to the slow path. | ||
return LazyInitValue(); | ||
var implementation = m_implementation; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you think it's worth to add aggressive inlining to the performance critical property getters?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm... that ones above my pay grade :-) I'm guessing that the JIT is probably inlining the calls to Value anyway? Maybe you could check?
src/mscorlib/src/System/Lazy.cs
Outdated
protected Func<T> TakeFactory() | ||
{ | ||
var factory = Factory; | ||
if (!ReferenceEquals(CreateInstance.Factory, factory)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this special handling needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TakeFactory is required to handle re-enterency. In the case of recursive initialization it throws an exception. Needed to replicate existing functionality.
src/mscorlib/src/System/Lazy.cs
Outdated
m_valueFactory = ALREADY_INVOKED_SENTINEL; | ||
if (factory == null) | ||
throw new InvalidOperationException(Environment.GetResourceString("Lazy_Value_RecursiveCallsToValue")); | ||
Factory = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I right in that this handling is only correct if only one thread is working or we have execution and publication lock?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's on the money.
src/mscorlib/src/System/Lazy.cs
Outdated
|
||
public override T Value | ||
{ | ||
get { return Owner.CreateValuePublicationOnly(Factory, this); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No TakeFactory
here I guess because it's not safe for concurrent calls? How is reentrancy detection handled for PublicationOnly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It isn't. It hangs! But same functionality as previous implementation.
src/mscorlib/src/System/Lazy.cs
Outdated
var possibleValue = factory(); | ||
lock (comparand) | ||
{ | ||
if (ReferenceEquals(m_implementation, comparand)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason we are comparing against a comparand is that during evaluation of the lazy value we might get an exception? That is captured in an object that is referenced by m_implementation
. If we had no need to capture exceptions we could have compared against null?
src/mscorlib/src/System/Lazy.cs
Outdated
[NonSerialized] | ||
private object m_threadSafeObj; | ||
// m_implementation, a volatile reference, is set to null after m_value has been set | ||
private volatile ILazyItem m_implementation; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the current implementation either holds a lock when updating m_implementation
or we have no protection ==> Lazy<'T>
can't be used concurrently then the only place where volatile
matters is in the .ctor
to ensure all writes to ILazyItem
objects are visible to other threaads when they read it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure of your comment there? (Or is it a comment that is meant to be a comment? even then I don't think it reads very well?)
The volatile matters in all places where m_implementation is accessed. i.e. it is the gatekeeper to ensure that m_value has a valid value before it is accessed. As reference type writes are atomic, we're guaranteed that we'll either have a valid ILazyItem implementation (from the .ctor) or null, after m_value has been written. (I'm about to make that a little more complex...)
src/mscorlib/src/System/Lazy.cs
Outdated
@@ -50,63 +42,22 @@ internal static class LazyHelpers | |||
[ComVisible(false)] | |||
[DebuggerTypeProxy(typeof(System_LazyDebugView<>))] | |||
[DebuggerDisplay("ThreadSafetyMode={Mode}, IsValueCreated={IsValueCreated}, IsValueFaulted={IsValueFaulted}, Value={ValueForDebugDisplay}")] | |||
public class Lazy<T> | |||
public class Lazy<T> : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @manofstick, I have checked the code and added a few observations/questions. Overall I see no obvious weakness. You said there was no test-coverage for Lazy<'T>
today. Can that be added (damn difficult for these kinds of classes but at least to cover the functionality)?
Hi @manofstick, I have checked the code and added a few observations/questions. Overall I see no obvious weakness. You said there was no test-coverage for Lazy<'T> today. Can that be added (damn difficult for these kinds of classes but at least to cover the functionality)? |
I think the reason the |
From my performance testing the read overhead of a |
I've not reviewed these changes yet, but there are functional tests for Lazy; as with most of the types in mscorlib, the full set of tests is in corefx, e.g. |
This involves a little bit of trickery. In avoiding the lock we need to go from m_implementation as PublicationOnly to null in a single step. We can't do this though without another object, so we reuse out Lazy object, which basically just spins its wheels until the value is ready. This shouldn't be a very common scenerio (i.e. hitting that code path) but it is possible.
OK; with a bit of trickery I restored the CompareExchange in PublicaitonOnly without having to create a new object. In my testing with lots of threads this performed very well. Can you update your numbers to see if you get the previous performance? |
Hiya @stephentoub ! Are you happy for me to dump all of my tests into the existing LazyTest file? |
Extra tests would be excellent, thanks! I'd ask, though, that you de-dup with the existing ones so that we're only adding new tests for behaviors that don't already have them. |
Threading bug. TakeFactory() would crash on second call that was banked up behind the lock.
Storing factory in Lazy object, and passing Lazy as a parameter to ILazyItem functions. This means that None & PublicationOnly now need no creation of secondary object.
@mrange sorry dude, I've modified things again, but I think I'm now at absolute minimum. I think None and PublicationOnly should show some improvements, as they no longer create any other objects other than the Lazy object itself (and as per previous comment, PublicationOnly's use of CompareExchange was restored) @stephentoub hopefully I'll get a bit of chance to review tests tests within the next week (and hopefully no world war will begin within this time... hmmm...) Anyway, I think I'm pretty happy with the whole thing now, I've minimized object creation and all paths seem pretty optimal. I do hope serialization doesn't prove the show stopper... |
Before this optimization, there are 3 generic types that will be created per After this throughput optimization, there are 9 generic types that will be created per The generic types take time and memory to create too, and if T is valuetype there is code JITed per T too. It would interesting to look at whether the ~5x+ increase in startup cost and footprint per T pays off in typical programs. For example, if I run https://github.com/aspnet/MusicStore app, how many different Lazy types are there and how many times is each one created? |
Hi @jkotas I must admit I am ignorant in the ways of the JIT. but is it really creating all those types, of are they just potentially created (dare I say it, lazily created)? And if they are created eagerly is that just because they are inner classes, in which case they could be moved out and just the each other with internal scoped members? I.e. following the happy path of creation would be, 'Lazy', 'ILazyItem', 'ExecutionAndPublication' and that's it. So still just 3. That would be most uses. You should need 4 for 'PublicationOnly' and 3 for ’None'. In the case of serialisation of exceptions an extra one each is required, but they are heavy operations anyway. Anyway, let the knowledgeable people speak, I'll move back to my corner of ignorance! |
Inner class relationship does not matter for performance. What matters is how the implementation is interconnected. For example, execution of The idea about making the value a direct field and use auxiliary state object is good, ie this is the best part of the change:
However the number of extra generic plumbing types makes it hard to tell whether it is obvious good trade-off for programs out there. Could you please try to refactor the implementation to avoid the generic types, and use non-generic types instead? The existing implementation is not perfect in this regard either - you should be able to beat it on the startup performance (=number of generic types) as well. Feel free to bring up any issues you may run into. I have created a small perf test that creates a lot of different Lazy instantiations that you can use to measure: https://gist.github.com/jkotas/9c492e62c59cb5e7220daeed587d254a . On my machines, it gives 3953ms without your changes, 5734ms with your current changes. You can compile it directly against corelib for convenience:
Thank you for your work on this so far! |
Fixing startup performance concerns
Howdy @jkotas, OK, well after a little playing around I have found that the main performance issue that raised its head in regards to your concerns was not so much the creation of generics, but rather that a Anyway, I have made some modification to remove this, and the numbers appear a fair bit better to me. I ran some tests with 32 & 64-bit and Let me know what you think. |
Could you please still share some data about where it wins and loses? I still believe it would be beneficial to make all the plumbing types non-generic. I think you may need to add extra cast from |
I don't think it's as simple as that. Maybe with a completely different implementation, but here I need to be able to call back into the 'Lazy' object from the ILazyItem implementation functions so they therefore need to be generic. Anyway, I'll have a think. Not sure if I'll get much time to play this week, but I'll try to do a quick run to print timings from your testb at some stage. |
src/mscorlib/src/System/Lazy.cs
Outdated
} | ||
} | ||
|
||
/// <summary> | ||
/// Gets a value indicating whether this instance may be used concurrently from multiple threads. | ||
/// </summary> | ||
internal LazyThreadSafetyMode Mode | ||
internal LazyThreadSafetyMode? Mode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: All these forwarders can be written as one-liners:
internal LazyThreadSafetyMode? Mode => LazyHelper.GetMode(_state);
internal bool IsValueFaulted => LazyHelper.GetIsValueFaulted(_state);
public bool IsValueCreated => _state == null;
Race condition was introduced when state wasn't passed to LazyGetValue. Also just made LazyGetValue into void CreateValue, as it was internally recursively calling back to Value, so I think this is easier to grok threading implications (i.e. why it wasn't just returning _value.)
OK, well I think that things are good to progress. There had been a race condition introduced when I have removed the state from being passed to LazyGetValue, but I believe we're all good again now. I'm getting "GitHub pull request #8963 of commit 77d3a7f, has merge conflicts." so I think that's why the build is failing? Anyway, all over to you guys. |
@manofstick Could you please merge current CoreCLR master and resolve the conflicts? |
src/mscorlib/src/System/Lazy.cs
Outdated
@@ -322,147 +528,16 @@ public T Value | |||
{ | |||
get | |||
{ | |||
Boxed boxed = null; | |||
if (m_boxed != null ) | |||
while (true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having a while loop here will make the Value property to be a bad inlining candidate - bad for steady state perf. Could you please change CreateValue to return the value instead so that it can be return (_state == null) ? _value : CreateValue();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or it can be - without changing CreateValue
to return the value:
if (_state != null)
CreateValue();
return _value;
This one is slightly better because it has smaller IL, and thus it will make it better inlining candidate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't just return _value without check if _state (volatile semantics). I'm not sure I can guarantee that it would always be valid (although I'm sure it would be in tests, but that's not the same thing is it 😀) I'll return it to how it was.
src/mscorlib/src/System/Lazy.cs
Outdated
@@ -18,6 +18,7 @@ | |||
using System.Runtime; | |||
using System.Runtime.InteropServices; | |||
using System.Security; | |||
using System.Security.Permissions; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should not need to add using System.Security.Permissions
back
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, sorry about that I'm bashing around trying to control git (I'm used to mercurial at work, and I must say, beyond rudimentary commands git drives me insane!)
Anyway, I'll get there...
Very well done - thanks a lot! I will take care of porting this to CoreRT repo for you. |
Sorry @jkotas I just noticed that in my shameful use of git merging I splatted over your previous task which removed the ComVisible attribute. Can you just remove it; or do you want me to create a new PR? |
Create a new PR for it please |
I have just moved Lazy.cs to a different directory, so make sure to sync to latest master before creating the PR. |
@jkotas do you never sleep/watch a movie/go to pub/etc :-) :-) Anyway, I'll do it tomorrow. Have a good weekend! |
I went to the pub twice this week, and watched iMax movie today :-) Thanks for submitting the PR. |
@jkotas obviously you're just superhuman then! :-) No worries for PR; anyway, I can pack up my bags and return to the F# playground, I've left it rotting for a couple of months with an alternative linq-to-objects (F#'s |
* Remove use of lock for PublicationOnly This involves a little bit of trickery. In avoiding the lock we need to go from m_implementation as PublicationOnly to null in a single step. We can't do this though without another object, so we reuse out Lazy object, which basically just spins its wheels until the value is ready. This shouldn't be a very common scenerio (i.e. hitting that code path) but it is possible. * Minimize additional object creation Storing factory in Lazy object, and passing Lazy as a parameter to ILazyItem functions. This means that None & PublicationOnly now need no creation of secondary object. * Remove Func<T> for default constructor invoking Fixing startup performance concerns * Moved non-generic functionality out of Lazy ...and into a helper class * Expression-bodied functions
* System.Lazy regression tests This is a prelude to an alternative Lazy implementation in a PR at dotnet/coreclr#8963 Commit migrated from dotnet/corefx@029f4c4
* Remove use of lock for PublicationOnly This involves a little bit of trickery. In avoiding the lock we need to go from m_implementation as PublicationOnly to null in a single step. We can't do this though without another object, so we reuse out Lazy object, which basically just spins its wheels until the value is ready. This shouldn't be a very common scenerio (i.e. hitting that code path) but it is possible. * Minimize additional object creation Storing factory in Lazy object, and passing Lazy as a parameter to ILazyItem functions. This means that None & PublicationOnly now need no creation of secondary object. * Remove Func<T> for default constructor invoking Fixing startup performance concerns * Moved non-generic functionality out of Lazy ...and into a helper class * Expression-bodied functions Commit migrated from dotnet/coreclr@2453adf
2017-21-01: I have modified the implementation a little, so updated the text to reflect the changes
2017-28-01: Updated text to represent current state of implementation
The current version of System.Lazy isn't particularly performant. @mrange had some analysis in an article called On the cost of being lazy.
Ultimately this stemmed back to FSharp's implementations of
Seq.init(Infinite)?
which wrapped each call toIEnumerator<T>.MoveNext()
in alazy
that was only evaluated onCurrent
. (Which is one of the issues that I have address when composingSeq
s in my (incomplete) PR Seq Composer - although it still remains in there under some circumstances)This version:
ExecutionAndPublish
(2 objects created, vs 3),None
(1 vs 2) &PublicationOnly
(1 vs 2)Still to do:
Lazy<T>
to corefx repo. (System.Lazy regression tests corefx#15577)Lazy<T>
are passing