Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Track last call and specification in thread-safe manner #264

Closed

Conversation

zvirja
Copy link
Contributor

@zvirja zvirja commented Nov 22, 2016

Hi guys,

The intention from this PR was to better support concurrency in NSubsitute. Currently NSubstitute is very weak when you configure return values for call. If it happens that any invocation in background happens on same substitute, specification might become randomly broken.
This PR fixes all the tests in this file (except test with async - it didn't fail).

I do understand that current weakness is likely "by design". However, concurrency is present everywhere in code and sometimes it isn't possible to avoid usage of substitute in background thread during the setup. That leads to very tricky issues that happen from time to time only.
Also I understand that it might be not possible to cover all the scenarios. But if something could be easily fixed - it's better to support that.

Changes in PR:

  1. Use ThreadLocal for IPendingSpecification.
  2. Use ILastCallTracker to track last call in RecordReplay route using ThreadLocal.
  3. Use ILastCallTracker to get information about last call, rather than ICallStack.
  4. Rename ICallStack to ICallCollection. That is because the Pop() method is not required more (and was deleted), and Delete() method was added instead. That is much more collection that stack.

P.S. Sorry for the large PR. I haven't found a way to split that to smaller set of changes - all them are related to each other.

@zvirja zvirja force-pushed the thread-local-last-call-info branch from 82a39e4 to 447df7a Compare November 22, 2016 16:29
@zvirja
Copy link
Contributor Author

zvirja commented Nov 22, 2016

Why does the build fails? :( It should be something wrong with CI because I haven't touched the build scripts somehow.

@alexandrnikitin
Copy link
Member

@zvirja It failed on macOS due to an OpenSSL issue https://github.com/dotnet/cli/issues/1161. I fixed it in #265

@zvirja zvirja force-pushed the thread-local-last-call-info branch from 447df7a to d9ab496 Compare November 23, 2016 11:23
@zvirja
Copy link
Contributor Author

zvirja commented Nov 23, 2016

@alexandrnikitin Wow, thank you! Rebased my code, so build is green now. Will be waiting for the code review now.

Assert.That(subs.Echo(42), Is.EqualTo("42"));
}

#if (NET45 || NET4 || NETSTANDARD1_5)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NET4 doesn't have async/await keywords support.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. Fixed test to avoid async usage, because test should be .NET4 compatible.

@@ -57,6 +62,7 @@ public IRoute RecordCallSpecification(ISubstituteState state)
{
return new Route(new ICallHandler[] {
new RecordCallSpecificationHandler(state.PendingSpecification, state.CallSpecificationFactory, state.CallActions)
, new ClearLastTrackedCallHandler(state.LastCallTracker)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused by this line. Why do you clear last call in RecordCallSpecification?

Copy link
Contributor Author

@zvirja zvirja Nov 25, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only reason why we track information about the last call is to later resolve specification for that "last call". During the "specifying call" we record call specification, which should be used for the subsequent Returns() and information about "last call" (which is now the call before the last) should not be used.

Consider the following code sample:

subs.MethodA(42); //register in LastCallTracker 
subs.MethodB(Arg.Any<int>()); //invoke `RecordCallSpecification` route.

As you can see, after the second invocation the LastCallTracker will hold MethodA, that is definitely obsoleted.
Even given that the GetCallSpec ignore last call information if pending specification is present (see code), I'd prefer to clear obsoleted data to avoid potential issues later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, agree, even if GetCallSpec has the check it's better to clear it explicitly :)

@alexandrnikitin
Copy link
Member

Everything looks good to me. I'm happy to see this PR merged. @dtchepak WDYT?

@dtchepak
Copy link
Member

dtchepak commented Nov 27, 2016

@alexandrnikitin: Code looks good to me. Just running against some test suites before merging, getting an out-of-memory exception that I need to track down (passes with previous NSub, fails with a local build of this branch).

var callWrapper = _callWrappers.FirstOrDefault(w => w.Call.Equals(call));
if (callWrapper == null) throw new InvalidOperationException("Collection doesn't contain the call.");

callWrapper.IsDeleted = true;
Copy link
Member

@alexandrnikitin alexandrnikitin Nov 27, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dtchepak the only principal thing that was changed is this: we don't delete the "call" objects as we did in CallStack. We just mark it as deleted. In case of huge amount of specifications it could case OOM. I think it's better to change that to the previous behavior.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zvirja it's worth to add that test too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dtchepak I can think of the following test (a lot of heavy specs):

[Test]
[Ignore]
public void Call_between_invocation_and_received_doesnt_cause_issue()
{
    var subs = Substitute.For<ISomething>();
    for (int i = 0; i < 1000000; i++)
    {
        subs.Echo(0).Returns(new string('a', 100000) + i);
    }
}

Do you have a different case?

Copy link
Contributor Author

@zvirja zvirja Nov 27, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed implementation so now it doesn't leak the memory more. Good catch :)

It appeared that this test still fails because there is a source of memory leak:

->  00000004005d64c8 NSubstitute.Core.CallResults
->  00000004005d64e8 System.Collections.Concurrent.ConcurrentQueue`1[[NSubstitute.Core.CallResults+ResultForCallSpec, NSubstitute]]
->  00000004005d6518 System.Collections.Concurrent.ConcurrentQueue`1+Segment[[NSubstitute.Core.CallResults+ResultForCallSpec, NSubstitute]]
->  000000027fff36c8 System.Collections.Concurrent.ConcurrentQueue`1+Segment[[NSubstitute.Core.CallResults+ResultForCallSpec, NSubstitute]]
->  000000027fff37b8 NSubstitute.Core.CallResults+ResultForCallSpec[]
->  000000027fffc550 NSubstitute.Core.CallResults+ResultForCallSpec
->  000000027fffc238 NSubstitute.Core.ReturnValue
->  0000000576801020 System.Byte[]

By the way, CallCollection cannot a be a source of memory leak for such a scenario, because it stores passed arguments, not configured results.

However, I've added another test, that should fail if a memory leak is present in my code. I was forced to fix another memory leak to actually see that my code is the culprit now :)

Nevertheless, now a couple of memory leaks are fixed, so it's time for another test.

@zvirja zvirja force-pushed the thread-local-last-call-info branch from 77e8c15 to b8c7da7 Compare November 27, 2016 17:17
@zvirja
Copy link
Contributor Author

zvirja commented Nov 27, 2016

@dtchepak I've changed the implementation and fixed the spotted place of leak. Could you try again?

Additionally, a couple of questions:

  1. Do you observe that leak for each time?
  2. Have you tried to run tests using the latest master? So we can be sure that issue is because of these changes, not the PRs before.
  3. Do you run tests using x86 or x64 runner?

@zvirja zvirja force-pushed the thread-local-last-call-info branch from 579d7d0 to 97c2a45 Compare November 27, 2016 19:04
@dtchepak
Copy link
Member

Hi, sorry to comment and run last night (GMT+11).
I'll spend more time on this as soon as I can, but so far can confirm the following:

  • doesn't happen with master
  • happens with the most recent checkins to this PR
  • happens in a huge, private test suite I have handy, haven't been able to find an isolated repro yet. Will take me a while to unpick this.
  • I haven't had a chance yet, but it is worth running these long-running thread tests over this PR to see if that shows any problems.

@zvirja
Copy link
Contributor Author

zvirja commented Nov 30, 2016

I've found the reason of the OutOfMemory exception. That is not because of memory leak, but just because of ThreadLocal<> intensive usage (one per each substitution). During the stress-testing it leaded to about 2GB overhead, but that value could be even larger on large solutions.

Let me redesign implementation a bit because that could be omitted without degrading the functionality.
I'll follow up soon.

@dtchepak
Copy link
Member

@zvirja : nice work! Thanks a lot for all your work on this.

@zvirja
Copy link
Contributor Author

zvirja commented Dec 2, 2016

Hi guys,

I've refactored my implementation. A few words about new design:

  1. SubstitutionContext how stores IPendingSpecifictionInfo in thread local storage. That is information about either last call or pending specification. I found that it's OK to store that globally rather than per substitute, because last call is useful only. Calls about previous invocations are not relevant more.

  2. IPendingSpecification now holds both information about previous call and pending specification. That is encapsulated to the PendingSpecificationInfo.

    The reasons why I added one more responsibility to this class are following:

    • We always either use pending specification or information about previous call. This data is mutually exclusive (we delete other one when store current one). Therefore, I decided that information about previous call is also a specification to some degree. If you wish, I can rename IPendingSpecification, but I wasn't able to find more proper name.
    • It's simpler to track sources of specifications because now we have a single place only. In my previous implementation we needed to clear pending specification and information about last call - more chances to fail somewhere.
    • Data is stored in SubstitutionContext, so we have single field instead of two (one for pending specification, second for last call information).
  3. GetCallSpec now uses IPendingSpecification only to resolve specification. Therefore, I renamed the FromLastCall method to the FromPendingSpecification because it better reflects the method implementation.

Please review and test it and let me know about your observations :)

{
return _pendingSpec != null;
var info = _substitutionContext.ThreadLocalPendingSpecificationInfo as PendingSpecificationInfo;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to cast as PendingSpecificationInfo rather than lifting the required members to the interface?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've decided to do that for encapsulation. The consumers of PendingSpecificationInfo doesn't care about owner and they need target specification or a call information only. However, to make code more robust I've added check for ownership. Currently that check is hidden as an implementation detail and isn't exposed publicly. Later it could be safely removed without worrying about potential usages.

In my original design I didn't create interface and used PendingSpecificationInfo directly, but API looked more polluted.

If you feel this interface is redundant, I could remove it.

: FromCall(_callStack.Pop(), matchArgs);
if (!_pendingSpecification.HasPendingCallSpecInfo())
{
throw new InvalidOperationException("No pending specification or previous call info.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few throw new InvalidOperationException(...) calls like this. If these are things that can happen in the normal course of use I think it would be better to use/reuse a custom SubstituteException with an explanation of what has gone wrong, and how the person using the library can modify their code to fix it.

Copy link
Contributor Author

@zvirja zvirja Dec 3, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, these are just guards to ensure that developer didn't make a mistake. They should never happen during normal usage unless NSubstitute code is modified. However, I wasn't comfortable to just keep potential places for NullReferenceExceptions and decided to throw something meaningful.

I've already followed the approach you described and introduced CouldNotSetReturnDueToMissingInfoAboutPreviousCallException for places that potentially could happen.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that method is a bit difficult to read. The logic stays the same. That's the case when type system should help with types like Either

Copy link
Contributor Author

@zvirja zvirja Dec 9, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep! Don't like to have API when either field should be present - they are not verbose enough. You should still worry what to do if both or none values are present - which should never happen logically. But in C# we don't have too many options. Therefore, I'd keep this as it is if you don't wish opposite :)

@dtchepak
Copy link
Member

dtchepak commented Dec 3, 2016

I've just had a quick look so far, will have a test and more detailed look as soon as I can.

@zvirja
Copy link
Contributor Author

zvirja commented Dec 3, 2016

Replied :)
Would be waiting for your test session results to see whether it helped with memory exceptions. Fingers crossed 😉

@dtchepak
Copy link
Member

dtchepak commented Dec 5, 2016

@zvirja : I'm pleased to report it now works on the large test suite where it was previously failing! 😄 👍

I'd like to spend a bit more time going through the code, but I think this will be good to go in. Are you happy with this + the latest changes @alexandrnikitin ?

@@ -45,6 +45,17 @@ public class CouldNotSetReturnDueToNoLastCallException : CouldNotSetReturnExcept
#if NET35 || NET4 || NET45
[Serializable]
#endif
public class CouldNotSetReturnDueToMissingInfoAboutPreviousCallException : CouldNotSetReturnException
{
public CouldNotSetReturnDueToMissingInfoAboutPreviousCallException() : base("Could not find information about the previous call to return from.") { }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's already an exception that meant for that: CouldNotSetReturnDueToNoLastCallException
And most probably users want to see a better error message. Something like here:

protected const string WhatProbablyWentWrong =

Also "the previous" call is a bit misleading for me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They only reason why this exception is thrown is incorrect usage of the NSubstitute or bugs, which we are not aware of. At this moment I'm not absolutely sure whether it's even possible to get situation when we have last call router captured, but no pending specification is present. However, I'd assume that it might happen in some exotic scenarios. Therefore this basic warning to use NSubstitute correctly is still relevant and I want to provide it as a main suggestion.

Idea is to distinguish between different situations to improve diagnostics, therefore I don't like idea to throw CouldNotSetReturnDueToNoLastCallException for both cases.

If you want to have a different message, clarify which one you would like to have. For now I see that current message is quite correct: there is information about who made the previous call (substitution), but no information which specs were captured.

I could suggest the following
Exception type: CouldNotSetReturnDueToMissingPendingSpecificationException
Message: Could not resolve specification for the previous call. + Standard suggestions

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that CouldNotSetReturnDueToMissingInfoAboutPreviousCallException class extends CouldNotSetReturnException so the standard info will present there. My bad.
I'm OK to have a separate exception type but just think that it's useless for end user. As for NSub developer then it's easy for find out the source of it in case of repro.

One thing I would change for sure is "previous" to "last". We call it "last" call everywhere, so that it will be consistent. There are more lines with "previous": https://github.com/nsubstitute/NSubstitute/pull/264/files#diff-eddc8d190e75c5fd320ea3c3c777480bR25

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed by renaming exception (and related tests) to use "Last" word instead of "Previous".
For stable scenario that of course doesn't bring value, but for some weird concurrency issues like this one it's still better to have separate exceptions.

Let me know if you still want to change something after the latest adjustments.

@@ -67,8 +67,8 @@ public IRoute RecordCallSpecification(ISubstituteState state)
public IRoute RecordReplay(ISubstituteState state)
{
return new Route(RouteType.RecordReplay, new ICallHandler[] {
new ClearUnusedCallSpecHandler(state)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would keep ClearUnusedCallSpecHandler to explicitly clear the specs in case the implementations of TrackLastCallHandler changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, that makes the performance penalty because ThreadLocal works slower than [ThreadStatic] attribute, but I've changed that.

Copy link
Member

@alexandrnikitin alexandrnikitin Dec 8, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performance impact is negligible here. But the code lost the knowledge that it should clear specs, that's critical for me. Thanks for the change :)

public ICall LastCall { get; }
public IPendingSpecification Owner { get; }

public PendingSpecificationInfo(ICallSpecification callSpecification, ICall lastCall, IPendingSpecification owner)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to have the IPendingSpecification owner reference to the "higher" layer? In case of wrong owner it will just throw an exception, it won't handle such case. I mean we could introduce such "check" but in case we can react to that. I think that ThreadLocal value is enough here. Do you have a particular case in your mind?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason why I did that is because I'm uncertain in this code. That should never happen, but we are working with thread static scope and it could happen that we overlooked something.

In fact we can remove that checks at all - I see that it's quite impossible to have situation when last router points to one substitution, while thread local pending specs are from other substitution. But if it's possible to happen, we'll have a bug reported and could investigate that. Otherwise, we could just guess why sometimes behavior is strange.

If you feel that it's better to remove that check at all and assume that it should never happen - let me know and I'll remove it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to keep things simple. Thread local wrapper already improves the situation a lot. I think it will be sufficient for now. I don't remember any related bug report and cannot imagine a situation when it can happen. I know everything is possible but I'm for simplicity here :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Removed the IPendingSpecificationInfo and use PendingSpecificationInfo model instead.

@@ -9,6 +9,7 @@ public interface ISubstitutionContext
{
ISubstituteFactory SubstituteFactory { get; }
SequenceNumberGenerator SequenceNumberGenerator { get; }
IPendingSpecificationInfo ThreadLocalPendingSpecificationInfo { get; set; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to change naming from ThreadLocalPendingSpecificationInfo to PendingSpecificationInfo. It's an interface and doesn't know about implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed.

@alexandrnikitin
Copy link
Member

I left few minor comments I spotted. Overall everything looks good to me. It's a working solution 👍

@zvirja
Copy link
Contributor Author

zvirja commented Dec 9, 2016

Cool, thank you for your time! I've replied to them and put the appropriate changes.
Let me know if you wish more things to change.

@alexandrnikitin
Copy link
Member

👍 Thanks for the changes! LGTM.

@zvirja
Copy link
Contributor Author

zvirja commented Dec 11, 2016

That's great, thanks! So now we are waiting for the @dtchepak approval, right? :)

@dtchepak
Copy link
Member

Just letting y'all know I haven't forgotten about this. I'm aiming to merge in over the weekend (sorry for the delay, busy time of year :) ).

@zvirja
Copy link
Contributor Author

zvirja commented Dec 15, 2016

@dtchepak Thanks for the update, because I was monitoring this PR each day :)

Could you please also publish it as a Release Candidate after the merge? I'm almost done with AutoFixture implementation and now I want to fire a PR to AF project to start a discussion about implementation and breaking changes. Given the new governance model it might take quite a long for this to be merged and published (if that even happens..) 😢

@dtchepak
Copy link
Member

@alexandrnikitin : I've finished reviewing and testing this. I'm ready to push it to master, but just wanted to get your opinion first: do you prefer I rebase/squash this into one commit, or leave as separate?

@alexandrnikitin
Copy link
Member

@dtchepak I prefer squashing since it has several fixes/rework.

@zvirja
Copy link
Contributor Author

zvirja commented Dec 16, 2016

Let's squash - it contains too many commits. But deeply inside I'll be missing this nice history ;)

@zvirja
Copy link
Contributor Author

zvirja commented Dec 16, 2016

@dtchepak Do you want me to squash them or you will just press the "Squash and Merge" button in github UI?

@zvirja zvirja force-pushed the thread-local-last-call-info branch from 889dde3 to 94cb24b Compare December 16, 2016 20:05
@zvirja
Copy link
Contributor Author

zvirja commented Dec 16, 2016

I've squashed commits and added some description to its message to indicate the applied changes.
Now history will be partially preserved :)

@zvirja zvirja force-pushed the thread-local-last-call-info branch from 94cb24b to 0507eb8 Compare December 16, 2016 20:10
NSubstitute relies on information about last call to set return value.
In order to better support concurrent usage of substitution during its
configuration implementation was switched to use thread local storage for
last call info.

The following changes were made:
 - PendingSpecification is now stored in ThreadLocal context.
 - PendingSpecification stores both pending specification and information
   about the last call.
 - CallStack was reworked to be a CallCollection.
 - Add new kind of exception to indicate that last router is known, but
   information about last call is missing.
@zvirja zvirja force-pushed the thread-local-last-call-info branch from 0507eb8 to 2c23091 Compare December 16, 2016 20:12
dtchepak pushed a commit that referenced this pull request Dec 16, 2016
Rename ICallStack to ICallCollection. That is because the Pop()
method is not required more (and was deleted), and Delete()
method was added instead. That is much more collection that stack.

IPendingSpecification now holds both information about previous
call and pending specification. That is encapsulated to the
PendingSpecificationInfo. The reasons why I added one more
responsibility to this class are following:
- We always either use pending specification or information about
previous call. This data is mutually exclusive (we delete other
one when store current one). Therefore, I decided that information
about previous call is also a specification to some degree.
- It's simpler to track sources of specifications because now we
have a single place only. In my previous implementation we needed
to clear pending specification and information about last call - more
chances to fail somewhere.
- Data is stored in SubstitutionContext, so we have single field
instead of two (one for pending specification, second for last
call information).

PendingSpecificationInfo stores thread-local on SubstitutionContext
(i.e. per substitute).

GetCallSpec now uses IPendingSpecification only to resolve
specification. Therefore, I renamed the FromLastCall method
to the FromPendingSpecification because it better reflects the
method implementation.

TrackLastCallHandler introduced to set `IPendingSpecification`.
dtchepak added a commit that referenced this pull request Dec 16, 2016
@dtchepak
Copy link
Member

Merged in bb06d33.

дуже дякую @zvirja :)
Thanks @alexandrnikitin for your review feedback.

I will try to get a release out this weekend.

@dtchepak dtchepak closed this Dec 16, 2016
@zvirja zvirja deleted the thread-local-last-call-info branch December 17, 2016 01:03
@dtchepak
Copy link
Member

dtchepak commented Dec 17, 2016

@zvirja : Release 2.0.1-rc has this and the AutoFixture changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants