Issue 688 AsyncHelper.WaitForCompletion leaks unobserved exceptions #692

jm771 · 2020-08-14T14:54:40Z

Fix for #688

I have some comments which I'll leave inline on the code review

…d exceptions

dnfadmin · 2020-08-14T14:54:57Z

All CLA requirements met.

src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/SqlClient/SqlUtil.cs

jm771 · 2020-08-14T14:59:21Z

src/Microsoft.Data.SqlClient/tests/FunctionalTests/SqlHelperTest.cs

+        private void TimeOutATask()
+        {
+            TaskCompletionSource<bool> tcs = new TaskCompletionSource<bool>();
+            AsyncHelper.WaitForCompletion(tcs.Task, 1); //Will time out as task uncompleted


I don't love that this line causes the unit test to take a second to run - but that's the minimum timeout on WaitForCompletion

jm771 · 2020-08-14T15:04:41Z

src/Microsoft.Data.SqlClient/tests/FunctionalTests/SqlHelperTest.cs

+            EventHandler<UnobservedTaskExceptionEventArgs> handleUnobservedException = 
+                (o, a) => { unobservedExceptionHappenedEvent.Set(); };
+
+            TaskScheduler.UnobservedTaskException += handleUnobservedException;


I don't much like messing with global state from a unit test, but I'm not sure of what other way to test this. I had a look at TaskExceptionHolder (which is what is responsible for raising the event when finalized) but it looks anything along those lines would be messing with the internals of the Task system too much.

…ailures

jm771 · 2020-08-14T16:02:46Z

So here's something exciting

2020-08-14T15:43:06.0285145Z X Microsoft.Data.SqlClient.Tests.SqlHelperTest.WaitForCompletion_DoesNotCreateUnobservedException [1s 35ms] 2020-08-14T15:43:06.0285944Z Error Message: 2020-08-14T15:43:06.0286678Z Did not expect an unobserved exception, but found a System.InvalidOperationException with message "Fail requested"

This is not the unobserved exception that I'm creating, so some other unit test is leaking unobserved exceptions which my unit test is picking up on - would welcome advice on how to proceed.

Wraith2 · 2020-08-14T23:38:07Z

src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/SqlClient/SqlUtil.cs

@@ -204,6 +206,7 @@ internal static void WaitForCompletion(Task task, int timeout, Action onTimeout
            }
            if (!task.IsCompleted)
            {
+                task.ContinueWith(_ => { }); //Ensure the task does not leave an unobserved exception


Can you verify that this doesn't allocate a delegate and closure/state object on each invocation? I've done a lot of work to try and reduce allocations for async calls and I'd like to make sure things aren't regressed without good reason.

Shouldn't do - as there is nothing to capture in the closure - did this as an experiment to confirm:

private static Action<Task> GetLambdaNoClosure(int context) { return _ => { }; } private static Action<Task> GetLambdaWithClosure(int context) { return _ => { context++; }; } public static void Main() { var a = GetLambdaNoClosure(1); var b = GetLambdaNoClosure(2); Console.WriteLine(object.ReferenceEquals(a, b)); var c = GetLambdaWithClosure(1); var d = GetLambdaWithClosure(2); Console.WriteLine(object.ReferenceEquals(c, d)); }

Output:
True False

Ok. Correctness has priority before performance anyway so probably not is good enough for me.

jm771 · 2020-08-15T10:48:52Z

...icrosoft.Data.SqlClient/tests/FunctionalTests/BaseProviderAsyncTest/BaseProviderAsyncTest.cs

@@ -90,14 +95,14 @@ public static void TestDbCommand()
            {
                Fail = true
            };
-            commandFail.ExecuteNonQueryAsync().ContinueWith((t) => { }, TaskContinuationOptions.OnlyOnFaulted).Wait();


Setting TaskContinuationOptions.OnlyOnFaulted and not checking the exception leads to the exception being unobserved:

https://docs.microsoft.com/en-us/dotnet/api/system.threading.tasks.taskcontinuationoptions?view=netcore-3.1

"If you do not access the Exception property, the exception is unhandled."

I also was not a fan of the fact that if the command did not fail, the test would hand indefinitely, so I swapped to the current approach

Think I have found the source of these exceptions, and should be fixed in latest commit

jm771 · 2020-08-15T12:09:04Z

Trying to go through the test failures now - there's a bunch of manual test failures which it's hard to see how I'd have caused, some failures relating to encryption which I have no idea about, and then the one failure that is definitely me is the test I've added failing on build configuration Ubuntu 16 AnyCPU,Release,netcoreapp,true - the same test does not fail on build configuration Ubuntu 16 AnyCPU,Release,netcoreapp,false - but I don't have any idea as to what that bool parameter is referring to - I don't have permission to view the pipeline, and couldn't glean any difference from viewing the log files - can someone help me out?

Wraith2 · 2020-08-16T20:46:32Z

Most the AE test runs fail for configuration reasons. I ignore them, i really hope the SQL team aren't relying on them.

The only one i can find that's relevant to your PR is this: https://dev.azure.com/sqlclientdrivers-ci/sqlclient/_build/results?buildId=15188&view=logs&j=9d906afe-3eb5-5d79-316d-86f3f3fa360c&t=52375b9b-c5db-514a-5e98-413583db2186

2020-08-15T11:22:49.7596685Z [xUnit.net 00:00:17.23]     Microsoft.Data.SqlClient.Tests.SqlHelperTest.WaitForCompletion_DoesNotCreateUnobservedException [FAIL]
2020-08-15T11:22:49.7602056Z [xUnit.net 00:00:17.23]       Did not expect an unobserved exception, but found a System.TimeoutException with message "Dummy timeout exception"
2020-08-15T11:22:49.7611531Z [xUnit.net 00:00:17.23]       Expected: False
2020-08-15T11:22:49.7612839Z [xUnit.net 00:00:17.23]       Actual:   True
2020-08-15T11:22:49.7613874Z [xUnit.net 00:00:17.24]       Stack Trace:
2020-08-15T11:22:49.7614567Z [xUnit.net 00:00:17.24]         SqlHelperTest.cs(46,0): at Microsoft.Data.SqlClient.Tests.SqlHelperTest.WaitForCompletion_DoesNotCreateUnobservedException()
2020-08-15T11:22:49.7615169Z   âˆš Microsoft.Data.SqlClient.Tests.SqlConnectionBasicTests.RetrieveWorkstationId(workstation: null, withDispose: False, shouldMatchSetWorkstationId: False) [1ms]
2020-08-15T11:22:49.7615733Z   âˆš Microsoft.Data.SqlClient.Tests.SqlConnectionBasicTests.RetrieveWorkstationId(workstation: "RandomStringForTargetServer", withDispose: False, shouldMatchSetWorkstationId: True) [1ms]
2020-08-15T11:22:49.7616815Z   âˆš Microsoft.Data.SqlClient.Tests.SqlConnectionBasicTests.RetrieveWorkstationId(workstation: "RandomStringForTargetServer", withDispose: True, shouldMatchSetWorkstationId: False) [1ms]
2020-08-15T11:22:49.7617389Z   âˆš Microsoft.Data.SqlClient.Tests.SqlConnectionBasicTests.RetrieveWorkstationId(workstation: "", withDispose: False, shouldMatchSetWorkstationId: False) [1ms]
2020-08-15T11:22:49.7617837Z   âˆš Microsoft.Data.SqlClient.Tests.SqlConnectionBasicTests.ConnectionTest [652ms]
2020-08-15T11:22:49.7618230Z   âˆš Microsoft.Data.SqlClient.Tests.SqlConnectionBasicTests.ConnectionTestValidCredentialCombination [1ms]
2020-08-15T11:22:49.7618637Z   âˆš Microsoft.Data.SqlClient.Tests.SqlConnectionBasicTests.SqlConnectionEmptyParameters [1ms]
2020-08-15T11:22:49.7619256Z   âˆš Microsoft.Data.SqlClient.Tests.SqlConnectionBasicTests.ClosedConnectionSchemaRetrieval [1ms]
2020-08-15T11:22:49.7620040Z   âˆš Microsoft.Data.SqlClient.Tests.SqlConnectionBasicTests.IntegratedAuthConnectionTest [161ms]
2020-08-15T11:22:49.7620562Z   X Microsoft.Data.SqlClient.Tests.SqlHelperTest.WaitForCompletion_DoesNotCreateUnobservedException [1s 30ms]
2020-08-15T11:22:49.7621135Z   Error Message:
2020-08-15T11:22:49.7621966Z    Did not expect an unobserved exception, but found a System.TimeoutException with message "Dummy timeout exception"
2020-08-15T11:22:49.7622793Z Expected: False
2020-08-15T11:22:49.7623560Z Actual:   True
2020-08-15T11:22:49.7624319Z   Stack Trace:
2020-08-15T11:22:49.7625178Z      at Microsoft.Data.SqlClient.Tests.SqlHelperTest.WaitForCompletion_DoesNotCreateUnobservedException() in E:\ci-agent1\_work\1\s\src\Microsoft.Data.SqlClient\tests\FunctionalTests\SqlHelperTest.cs:line 46

jm771 · 2020-08-17T07:50:29Z

...icrosoft.Data.SqlClient/tests/FunctionalTests/BaseProviderAsyncTest/BaseProviderAsyncTest.cs

@@ -116,17 +121,17 @@ public static void TestDbCommand()
            source = new CancellationTokenSource();
            Task.Factory.StartNew(() => { command.WaitForWaitingForCancel(); source.Cancel(); });
            Task result = command.ExecuteNonQueryAsync(source.Token);
-            Assert.True(result.IsFaulted, "Task result should be faulted");
+            Assert.True(result.Exception != null, "Task result should be faulted");


Checking IsFaulted does not count as observing an exception, accessing the Exception property does

Wraith2 · 2020-09-01T09:05:58Z

The test suite got updated recently to make all the AE runs function. Can you merge or rebase to master and see if you get clean test runs?

cheenamalhotra · 2020-09-01T15:08:06Z

...icrosoft.Data.SqlClient/tests/FunctionalTests/BaseProviderAsyncTest/BaseProviderAsyncTest.cs

@@ -13,6 +13,11 @@ namespace Microsoft.Data.SqlClient.ManualTesting.Tests
 {
    public static class BaseProviderAsyncTest
    {
+        private static void AssertTaskFaults(Task t)
+        {
+            Assert.ThrowsAny<Exception>(() => t.Wait(TimeSpan.FromMilliseconds(1)));


Nit: Could you confirm if if providing timeout here is intentional?
Although it's a test, but it changes existing behavior.

Yes it's intentional - note that t.Wait returns false in the case of timeout (doesn't throw a TimeoutException) so if the task under test times out the test will fail, as no exception will be thrown. I thought this was preferable to the previous behaviour where if the Task hung the test would hang indefinitely (or until killed by the framework)

In this particular case all the Tasks under test are already completed ones generated by Task.FromResult, so we can get away with a 1ms timeout and know nothing will timeout.

However this is somewhat a matter of taste, and I don't know the details of whether your test suite is set up to handle timing out tests well. So if you'd rather have a longer timeout or no timeout I'm happy to change it.

The original design for this is to verify that the unhandled exception can be propagated through tasks. We don't want the tests to fail due to this 1 ms timeout although it should not happen according to your explanation. I personally would prefer not setting timeout here.

src/Microsoft.Data.SqlClient/tests/FunctionalTests/SqlHelperTest.cs

cheenamalhotra · 2020-09-01T15:40:42Z

src/Microsoft.Data.SqlClient/tests/FunctionalTests/SqlHelperTest.cs

+        }
+
+        [Fact]
+        public void WaitForCompletion_DoesNotCreateUnobservedException()


This test does need a bit of rework to handle randomness. It fails randomly in pipelines.

You'll notice if you run this test in loop, the second round does not pass ever. Any thoughts why can't we run this test in second round? It maybe a hint towards random errors.

Was just looking into this - it seems the fix I had written to WaitForCompletion wasn't actually a fix. I needed to make the continuation actually observe the exception, the behaviour I had lead to a race condition, where if the continuation hadn't been executed at the point we called GC.Collect, then there was still a reference to the original task and it wasn't Collected. Creating a false positive for me having fixed it.

I proved this by modifying WaitForCompletion to return the continuation, and waited for it to complete, at which point the error happened deterministically. Making the continuation observe the exception fixed the error. My bad, I misinterpreted the text here: https://docs.microsoft.com/en-us/dotnet/api/system.threading.tasks.taskcontinuationoptions?view=netcore-3.1 saying "If you do not access the Exception property, the exception is unhandled." To only apply to OnlyOnFaulted, when it applies to all TaskContinuationOptions.

I can't see a way to make the failure deterministic without passing some sort of signal back from the continuation, which doesn't fit the signature and isn't desirable.

This part is just a test so make it as messy as it needs to be to get a deterministic answer.

It does pass deterministically now with the fix in place - what I couldn't get deterministic was the failure without the fix in place.

jm771 · 2020-09-01T16:40:11Z

src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/SqlClient/SqlUtil.cs

@@ -204,6 +206,7 @@ internal static void WaitForCompletion(Task task, int timeout, Action onTimeout
            }
            if (!task.IsCompleted)
            {
+                task.ContinueWith(t => { var ignored = t.Exception; }); //Ensure the task does not leave an unobserved exception


N.B. I repeated the experiment for the new lambda to confirm it's still not creating lots of instances of the closure

Co-authored-by: Cheena Malhotra <v-chmalh@microsoft.com>

karinazhou · 2020-09-25T19:02:47Z

Hi @jm771 ,

Could you please merge master into your branch again so we can trigger the failed AzureSQLDB pipelines? The failure indicates some missing changes from the new test framework. Just to make sure your fixes working well on these pipelines before we can merge it. Sorry for the inconvenience.

Thanks,

jm771 · 2020-09-25T19:29:45Z

On holiday at the moment, I'll take a look when I get back next week.

karinazhou · 2020-10-20T21:46:05Z

Hi @jm771 , do you have any chance to update your branch so we can merge your changes soon?

Thanks,

…lient into UnobservedExceptionPR

jm771 · 2020-10-21T20:14:49Z

Hi @jm771 , do you have any chance to update your branch so we can merge your changes soon?

Thanks,

Sorry about the delay, I've merged just now. I'll have a look over all the results of the tests tomorrow.

karinazhou · 2020-10-22T18:17:23Z

@jm771 Thank you for the updates. I have checked the failed tests on the pipelines and they are known issues that are not introduced by your PR.

jm771 · 2020-10-22T21:58:59Z

@jm771 Thank you for the updates. I have checked the failed tests on the pipelines and they are known issues that are not introduced by your PR.

Excellent, thanks Karina!

Jack Mills added 3 commits August 13, 2020 11:07

Create an example of how TaskHelper.WaitForCompletion leaks unobserve…

ad763b8

…d exceptions

Use internalsvisibleto rather than making things public

c4d6ed2

Tidy up pre PR

2618c65

jm771 commented Aug 14, 2020

View reviewed changes

jm771 mentioned this pull request Aug 14, 2020

AsyncHelper.WaitForCompletion leaks unobserved exceptions #688

Closed

Jack Mills added 2 commits August 14, 2020 16:38

Add more detailed info on the unhandled exception to diagnose build f…

60cea01

…ailures

Patch fix to netfx version of SqlUtil too

4a8b3e9

Wraith2 reviewed Aug 14, 2020

View reviewed changes

Stop other unit tests from leaving unobserved exceptions

3b09392

jm771 commented Aug 15, 2020

View reviewed changes

Leak even fewer unobserved exceptions

a2ea136

jm771 commented Aug 17, 2020

View reviewed changes

cheenamalhotra requested review from cheenamalhotra, karinazhou, DavoudEshtehari, JRahnama and johnnypham August 27, 2020 19:19

cheenamalhotra assigned karinazhou Aug 27, 2020

Merge branch 'master' into UnobservedExceptionPR

fc09c96

cheenamalhotra reviewed Sep 1, 2020

View reviewed changes

Jack Mills added 2 commits September 1, 2020 17:23

View the exception from the continuation

08b54d4

Observe the exception from _the other_ SqlUtil

f3aed3b

jm771 commented Sep 1, 2020

View reviewed changes

Swap lambda to local function

07ad6ec

Co-authored-by: Cheena Malhotra <v-chmalh@microsoft.com>

karinazhou added the ⏳ Waiting for Customer Issues/PRs waiting for user response/action. label Oct 5, 2020

jm771 added 2 commits October 21, 2020 21:11

Merge https://github.com/dotnet/SqlClient into UnobservedExceptionPR

c350827

Merge branch 'UnobservedExceptionPR' of https://github.com/jm771/SqlC…

04ac97f

…lient into UnobservedExceptionPR

karinazhou approved these changes Oct 22, 2020

View reviewed changes

karinazhou removed the ⏳ Waiting for Customer Issues/PRs waiting for user response/action. label Oct 22, 2020

cheenamalhotra added this to the 2.1.0-preview2 milestone Oct 22, 2020

cheenamalhotra approved these changes Oct 22, 2020

View reviewed changes

cheenamalhotra merged commit a947ded into dotnet:master Oct 22, 2020

cheenamalhotra mentioned this pull request Oct 23, 2020

Fix | Fix Assembly signing issue due to InternalsVisibleTo attribute #773

Merged

cheenamalhotra mentioned this pull request Jul 16, 2021

Unkown exception after upgrading to .Net 5 #1171

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Issue 688 AsyncHelper.WaitForCompletion leaks unobserved exceptions #692

Issue 688 AsyncHelper.WaitForCompletion leaks unobserved exceptions #692

jm771 commented Aug 14, 2020

dnfadmin commented Aug 14, 2020 •

edited

Loading

jm771 Aug 14, 2020

jm771 Aug 14, 2020

jm771 commented Aug 14, 2020 •

edited

Loading

Wraith2 Aug 14, 2020

jm771 Aug 15, 2020 •

edited

Loading

Wraith2 Aug 15, 2020

jm771 Aug 15, 2020

jm771 Aug 15, 2020

jm771 commented Aug 15, 2020

Wraith2 commented Aug 16, 2020

jm771 Aug 17, 2020

Wraith2 commented Sep 1, 2020

cheenamalhotra Sep 1, 2020 •

edited

Loading

jm771 Sep 1, 2020

karinazhou Sep 24, 2020

cheenamalhotra Sep 1, 2020

jm771 Sep 1, 2020

Wraith2 Sep 1, 2020

jm771 Sep 1, 2020

jm771 Sep 1, 2020

karinazhou commented Sep 25, 2020

jm771 commented Sep 25, 2020

karinazhou commented Oct 20, 2020

jm771 commented Oct 21, 2020

karinazhou commented Oct 22, 2020

jm771 commented Oct 22, 2020

Issue 688 AsyncHelper.WaitForCompletion leaks unobserved exceptions #692

Issue 688 AsyncHelper.WaitForCompletion leaks unobserved exceptions #692

Conversation

jm771 commented Aug 14, 2020

dnfadmin commented Aug 14, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jm771 commented Aug 14, 2020 • edited Loading

Choose a reason for hiding this comment

jm771 Aug 15, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jm771 commented Aug 15, 2020

Wraith2 commented Aug 16, 2020

Choose a reason for hiding this comment

Wraith2 commented Sep 1, 2020

cheenamalhotra Sep 1, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

karinazhou commented Sep 25, 2020

jm771 commented Sep 25, 2020

karinazhou commented Oct 20, 2020

jm771 commented Oct 21, 2020

karinazhou commented Oct 22, 2020

jm771 commented Oct 22, 2020

dnfadmin commented Aug 14, 2020 •

edited

Loading

jm771 commented Aug 14, 2020 •

edited

Loading

jm771 Aug 15, 2020 •

edited

Loading

cheenamalhotra Sep 1, 2020 •

edited

Loading