-
Notifications
You must be signed in to change notification settings - Fork 4.9k
Conversation
catch (InvalidOperationException) | ||
{ | ||
// Ignore. | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like it's hiding a symptom. Is there an underlying problem we should be fixing instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could be wrong but I don't think so. We initialize instances when the class loads. And then we skip the test and we immediately hit Dispose() Now, on busy/slow system this happens before the task runs end executes the listener call. That can either lead to timeout, object disposed exception or invalid operation based on exact timing. In that case I saw on ARM32 the actual test was skipped but hit "Error: (not Fail) just loading and unloading the test class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on busy/slow system this happens before the task runs end executes the listener call
What's the stack of the exceptions you're seeing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is from my x64 VM Machine
Condition(s) not met: "IsSelectedSitesTestEnabled"
[Test Class Cleanup Failure (System.Net.Http.Functional.Tests.NtAuthTests)] System.AggregateException
System.AggregateException : One or more errors occurred. (Cannot access a disposed object.
Object name: 'listener'.)
---- System.ObjectDisposedException : Cannot access a disposed object.
Object name: 'listener'.
Stack Trace:
/root/coreclr/src/System.Private.CoreLib/src/System/Threading/Tasks/Task.cs(2843,0): at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
/root/coreclr/src/System.Private.CoreLib/src/System/Threading/Tasks/Task.cs(2714,0): at System.Threading.Tasks.Task.Wait()
/home/furt/git/wfurt-corefx/src/System.Net.Http/tests/FunctionalTests/NtAuthTests.cs(57,0): at System.Net.Http.Functional.Tests.NtAuthServer.Dispose()
/home/furt/git/wfurt-corefx/src/System.Net.Http/tests/FunctionalTests/NtAuthTests.cs(95,0): at System.Net.Http.Functional.Tests.NtAuthServers.Dispose()
----- Inner Stack Trace -----
/home/furt/git/wfurt-corefx/src/System.Net.HttpListener/src/System/Net/Managed/ListenerAsyncResult.Managed.cs(197,0): at System.Net.ListenerAsyncResult.GetContext()
/home/furt/git/wfurt-corefx/src/System.Net.HttpListener/src/System/Net/Managed/HttpListener.Managed.cs(347,0): at System.Net.HttpListener.EndGetContext(IAsyncResult asyncResult)
/home/furt/git/wfurt-corefx/src/System.Net.HttpListener/src/System/Net/HttpListener.cs(290,0): at System.Net.HttpListener.<>c.<GetContextAsync>b__44_1(IAsyncResult iar)
/root/coreclr/src/System.Private.CoreLib/src/System/Threading/Tasks/FutureFactory.cs(529,0): at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
--- End of stack trace from previous location where exception was thrown ---
/home/furt/git/wfurt-corefx/src/System.Net.Http/tests/FunctionalTests/NtAuthTests.cs(34,0): at System.Net.Http.Functional.Tests.NtAuthServer.<.ctor>b__4_0()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, ok. GetContext documents both InvalidOperationException and ObjectDisposedException as occurring in cases like this, where the HttpListener may not be listening or has been disposed of already, so... ok. Subsequently it might be nice to try to restructure the tests to avoid this, but I agree your tactical fix is good to go ahead with now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I originally started on path to add synchronization.
But than it looked like unnecessary complexity for cases when we do not run the test anyway.
@dotnet-bot test Packaging All Configurations x64 Debug please |
// Set timeout limit on the connect phase. | ||
SetCurlOption(CURLoption.CURLOPT_CONNECTTIMEOUT_MS, int.MaxValue); | ||
// Set timeout limit on the connect phase. curl has bug on ARM so use max - 1s. | ||
SetCurlOption(CURLoption.CURLOPT_CONNECTTIMEOUT_MS, int.MaxValue - 1000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably worth opening a bug against curl, even if we have a work around. Their bug tracker is here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about it. I would need to probably isolate C repo and show behavior with some public server.
@dotnet-bot test this please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
* arm fixes * use PassingTestTimeoutMilliseconds instead of hardcoded number Commit migrated from dotnet/corefx@4eccf3f
fixes #33991 [arm32/Linux] System.Net.Http.Functional.Tests failures on arm32 linux
The change has three distinct parts. When I ewas investigating this on machine given to me, there were some tests always failing like this:
Increasing times did not help and we clearly timed out before we should. I also try same tests on same OS/Curl version on x64 and everything was working fine. I think this is bug in Curl as I verified that we pass proper values and types in debugger. I don't know if this is because
time_t
is 32bit or if this is ARM architecture but when passing in infinite (int.MaxValue) timeout value it somehow overflows. I was ready to disable offending tests but I found out that if I pass infinity - 1s it works as expected. This is clearly workaround for curl but it makes curl work fine on ARM32.After that I was getting random timeouts when running tests in parallel. They were not specific one and the failure was always timeout. I bumped timeout to 20s and it was OK for inner loop with occasional timeouts when running outerloop. With 30, I did no see any single failure with many runs. There may be better ways how to do that is this clearly seems to be related to host CPU performance.
But for now using the ARCH is easy and solves the problem without counting cortes or trying to compute system computational power.
Last part of this change is improvement to NtAuthServer.
I have seen occasional failures on other platforms as well but on my ARM it was failing consistently. There seems to be race condition in Dispose() when test was skipped and we are still trying to start task. This change allows to swallow two more exception caused by early dispose/late task start and it also increases timeout on Wait()
With this did many successful outerloop runs on ARM32.