Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[arm32/Linux] System.Net.Http.Functional.Tests failures on arm32 linux #28142

Closed
joperezr opened this issue Dec 11, 2018 · 5 comments · Fixed by dotnet/corefx#34559
Closed

[arm32/Linux] System.Net.Http.Functional.Tests failures on arm32 linux #28142

joperezr opened this issue Dec 11, 2018 · 5 comments · Fixed by dotnet/corefx#34559
Assignees
Labels
arch-arm32 area-System.Net.Http os-linux Linux OS (any supported distro) test-run-core Test failures in .NET Core test runs
Milestone

Comments

@joperezr
Copy link
Member

Running the arm32 tests on linux for System.Net.Http.Functional.Tests produced the following results: passed 5815/5833 tests.

18 tests failed for different reasons. Attaching the testResults.xml or a better analysis.

testResults.zip

@danmoseley
Copy link
Member

danmoseley commented Dec 11, 2018

Many of these are timeouts in WhenAllOrAnyFailed where there's a hard coded 3 second timeout. A few don't match that pattern but may be timeouts elsewhere.

Suggest changing the code here to be something larger when running on ARM. (10x maybe?)

If that helps, then we will need to track down little timeouts sprinkled all over. And perhaps change them all to be adjusted by something centralized like TestUtilities.MyMachineIsSlowMultiplier.

@karelz
Copy link
Member

karelz commented Dec 12, 2018

cc @wfurt

@wfurt
Copy link
Member

wfurt commented Dec 12, 2018

should we use the overload with time instead and use PassingTestTimeout (or equivalent)
I think we should be careful about architecture vs actual computation machine power.

@wfurt
Copy link
Member

wfurt commented Dec 28, 2018

I'm able to reproduce this even with running -parallel none e.g. no concurrent tests.
I tried to increase timeouts but that did not fix it.
Also it fails faster then 3s when running single test:

System.Net.Http.Functional.Tests Total: 1, Errors: 0, Failed: 1, Skipped: 0, Time: 1.000s

I will investigate.

@wfurt wfurt self-assigned this Dec 28, 2018
@wfurt
Copy link
Member

wfurt commented Dec 28, 2018

Here is relevant trace for System.Net.Http.Functional.Tests.PlatformHandler_HttpCookieProtocolTests.GetAsyncWithRedirect_SetCookieContainer_CorrectCookiesSent

[pid  3413] 01:28:14 socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 508 <0.000044>
[pid  3413] 01:28:14 setsockopt(508, SOL_TCP, TCP_NODELAY, [1], 4) = 0 <0.000027>
[pid  3413] 01:28:14 setsockopt(508, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0 <0.000029>
[pid  3413] 01:28:14 setsockopt(508, SOL_TCP, TCP_KEEPIDLE, [60], 4) = 0 <0.000021>
[pid  3413] 01:28:14 setsockopt(508, SOL_TCP, TCP_KEEPINTVL, [60], 4) = 0 <0.000019>
[pid  3413] 01:28:14 fcntl64(508, F_GETFL) = 0x2 (flags O_RDWR) <0.000020>
[pid  3413] 01:28:14 fcntl64(508, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000026>
[pid  3413] 01:28:14 connect(508, {sa_family=AF_INET, sin_port=htons(43953), sin_addr=inet_addr("127.0.0.1")}, 16 <unfinished ...>
[pid  3414] 01:28:14 <... epoll_wait resumed> {{EPOLLIN, {u32=0, u64=0}}}, 32, -1) = 1 <0.073055>
[pid  3413] 01:28:14 <... connect resumed> ) = -1 EINPROGRESS (Operation now in progress) <0.000235>
[pid  3413] 01:28:14 poll([{fd=508, events=POLLOUT|POLLWRNORM}], 1, 0) = 1 ([{fd=508, revents=POLLOUT|POLLWRNORM}]) <0.000325>
[pid  3413] 01:28:14 getsockopt(508, SOL_SOCKET, SO_ERROR, [0], [4]) = 0 <0.000026>
Process 3415 attached
[pid  3413] 01:28:14 getpeername(508, {sa_family=AF_INET, sin_port=htons(43953), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0 <0.000028>
[pid  3413] 01:28:14 getsockname(508, {sa_family=AF_INET, sin_port=htons(54608), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0 <0.000028>
[pid  3415] 01:28:14 mmap2(NULL, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0 <unfinished ...>
[pid  3413] 01:28:14 send(508, "GET /foo HTTP/1.1\r\nHost: 127.0.0.1:43953\r\nAccept: */*\r\nCookie: cookie1=value1\r\nConnection: close\r\n\r\n", 100, MSG_NOSIGNAL) = 100 <0.000091>
[pid  3413] 01:28:14 poll([{fd=508, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 0) = 0 (Timeout) <0.000021>
[pid  3415] 01:28:14 <... mmap2 resumed> ) = 0xaa123000 <0.004785>
[pid  3413] 01:28:14 poll([{fd=508, events=POLLIN}, {fd=503, events=POLLIN}], 2, 0) = 0 (Timeout) <0.000055>
[pid  3414] 01:28:14 epoll_wait(505,  <unfinished ...>
[pid  3413] 01:28:14 poll([{fd=508, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 0 <unfinished ...>
[pid  3403] 01:28:14 accept4(502,  <unfinished ...>
[pid  3413] 01:28:14 <... poll resumed> ) = 0 (Timeout) <0.000087>
[pid  3403] 01:28:14 <... accept4 resumed> {sa_family=AF_INET, sin_port=htons(54608), sin_addr=inet_addr("127.0.0.1")}, [16], SOCK_CLOEXEC) = 509 <0.000080>
[pid  3413] 01:28:14 poll([{fd=508, events=POLLIN}, {fd=503, events=POLLIN}], 2, 188 <unfinished ...>
[pid  3403] 01:28:14 mmap2(NULL, 65536, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xa8afe000 <0.000028>
[pid  3403] 01:28:14 setsockopt(509, SOL_TCP, TCP_NODELAY, [1], 4) = 0 <0.000033>
[pid  3403] 01:28:14 fcntl64(509, F_GETFL) = 0x2 (flags O_RDWR) <0.000019>
[pid  3403] 01:28:14 fcntl64(509, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000018>
[pid  3403] 01:28:14 recvmsg(509, {msg_name(0)=NULL, msg_iov(1)=[{"GET /foo HTTP/1.1\r\nHost: 127.0.0.1:43953\r\nAccept: */*\r\nCookie: cookie1=value1\r\nConnection: close\r\n\r\n", 1024}], msg_controllen=0, msg_flags=0}, 0) = 100 <0.000028>
[pid  3403] 01:28:14 mmap2(NULL, 65536, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xa8aee000 <0.000026>
[pid  3403] 01:28:14 mmap2(NULL, 65536, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xa8ade000 <0.000025>
[pid  3403] 01:28:14 sendmsg(509, {msg_name(0)=NULL, msg_iov(1)=[{"HTTP/1.1 302 Found\r\nDate: Fri, 28 Dec 2018 01:28:14 GMT\r\nContent-Length: 0\r\nLocation: /bar\r\nConnection: close\r\n\r\n", 113}], msg_controllen=0, msg_flags=0}, 0 <unfinished ...>
[pid  3413] 01:28:14 <... poll resumed> ) = 1 ([{fd=508, revents=POLLIN}]) <0.069520>
[pid  3413] 01:28:14 poll([{fd=508, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 0) = 1 ([{fd=508, revents=POLLIN|POLLRDNORM}]) <0.000019>
[pid  3413] 01:28:14 recv(508, "HTTP/1.1 302 Found\r\nDate: Fri, 28 Dec 2018 01:28:14 GMT\r\nContent-Length: 0\r\nLocation: /bar\r\nConnection: close\r\n\r\n", 16384, 0) = 113 <0.000024>
[pid  3403] 01:28:14 <... sendmsg resumed> ) = 113 <0.004359>
[pid  3403] 01:28:14 shutdown(509, SHUT_WR) = 0 <0.000071>
[pid  3403] 01:28:14 close(509)         = 0 <0.000038>
[pid  3413] 01:28:14 close(508)         = 0 <0.000125>

this is first request & response. 508 is client side socket and 509 is socket accepted by loopback server. Now after redirect this happens:

[pid  3413] 01:28:14 socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 508 <0.000038>
[pid  3413] 01:28:14 setsockopt(508, SOL_TCP, TCP_NODELAY, [1], 4) = 0 <0.000028>
[pid  3413] 01:28:14 setsockopt(508, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0 <0.000027>
[pid  3413] 01:28:14 setsockopt(508, SOL_TCP, TCP_KEEPIDLE, [60], 4) = 0 <0.000026>
[pid  3413] 01:28:14 setsockopt(508, SOL_TCP, TCP_KEEPINTVL, [60], 4) = 0 <0.000026>
[pid  3413] 01:28:14 fcntl64(508, F_GETFL) = 0x2 (flags O_RDWR) <0.000027>
[pid  3413] 01:28:14 fcntl64(508, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000025>
[pid  3413] 01:28:14 connect(508, {sa_family=AF_INET, sin_port=htons(43953), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress) <0.000136>
[pid  3414] 01:28:14 <... epoll_wait resumed> {{EPOLLIN, {u32=0, u64=0}}}, 32, -1) = 1 <0.116215>
[pid  3414] 01:28:14 epoll_wait(505,  <unfinished ...>
[pid  3413] 01:28:14 close(508)         = 0 <0.000104>
[pid  3403] 01:28:14 accept4(502, {sa_family=AF_INET, sin_port=htons(54609), sin_addr=inet_addr("127.0.0.1")}, [16], SOCK_CLOEXEC) = 508 <0.000044>

new 508 socket is created and it attempts to connect. However, it is immediately closed by curl before loopback server accepts. When it does it gets same socket number but since the client already closed it will read EOF and exception is thrown.

      System.Net.Http.Functional.Tests.PlatformHandler_HttpCookieProtocolTests.GetAsyncWithRedirect_SetCookieContainer_CorrectCookiesSent [FAIL]
        System.AggregateException : One or more errors occurred. (A task was canceled.) (One or more errors occurred. (Unexpected EOF trying to read request header))
        ---- System.Threading.Tasks.TaskCanceledException : A task was canceled.
        ---- System.AggregateException : One or more errors occurred. (Unexpected EOF trying to read request header)

This looks like a bug in platform curl as we do not manage connections in this case.

@msftgits msftgits transferred this issue from dotnet/corefx Feb 1, 2020
@msftgits msftgits added this to the 3.0 milestone Feb 1, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 14, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
arch-arm32 area-System.Net.Http os-linux Linux OS (any supported distro) test-run-core Test failures in .NET Core test runs
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants