-
Notifications
You must be signed in to change notification settings - Fork 29.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test: fix flaky test-tls-wrap-timeout #7857
Conversation
The fix for this test is in io.js 1.3 and absent in io.js 1.2. So I modified the test a bit so that it would work on those versions to confirm that the test still passes in 1.3 and fails in 1.2. That form of the test can be found at https://gist.github.com/Trott/e4110c64314b9666185c3018dbb33b5c |
Hm... doesn't really LGTM. The test was about server timeout, not client's. |
Sorry! |
@indutny I don't understand. Both this version of the test and the current one check the same object ( The old version made sure that It was introduced in 9b6b0555. Prior to that, the timeout was firing (incorrectly) because This new proposed version removes the race condition by checking that Maybe it would be better to also check that |
(Failure that this PR is supposed to address happened again on FreeBSD in CI: https://ci.nodejs.org/job/node-test-commit-freebsd/3427/nodes=freebsd10-64/console ) |
@indutny I put back the timer (now |
New version of test is failing on some platforms. More work to be done... |
OK, let's try again: https://ci.nodejs.org/job/node-test-pull-request/3437/ EDIT: OK, that's looking much better so far. |
Previous CI failures look CI infra related and not an issue with this test, but uh, let's run it again anyway: https://ci.nodejs.org/job/node-test-pull-request/3450/ |
Ci again again: https://ci.nodejs.org/job/node-test-pull-request/3452/ |
|
@jasnell All the CI failures are build related. None are related to this test. But I do love green, so let's do CI again again again: https://ci.nodejs.org/job/node-test-pull-request/3455/ |
I am eager to get this fix in for this test because the version currently on master is failing a fair amount on FreeBSD. Here's another one from this morning:
|
LGTM with green-ish CI then ;-) |
Last CI is green! ✅ |
Another failure on FreeBSD with current master. Definitely eager to land this fix. https://ci.nodejs.org/job/node-test-commit-freebsd/3506/nodes=freebsd10-64/console
|
var tsocket = tls.connect({ | ||
lastIdleStart = socket._idleStart; | ||
|
||
const tsocket = tls.connect({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe adding here
assert.notStrictEqual(socket._idleTimeout, -1);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@santigimeno Good idea. Added!
I like that with this change we don't rely on timers. I'm not too sold on relying on implementation details (i.e: |
Competing timers were causing a race condition and thus the test was flaky. Instead, we check an object property on process exit. Fixes: nodejs#7650
@santigimeno Yeah, I'd welcome a better way. For what it's worth, there is at least one other test where we do something similar ( |
@Trott FWIW I gave this a try and found a way that, at least, seems less flaky than the current one: santigimeno@3df3f47. I run a couple of stress tests and didn't fail:
whereas current master did fail:
The problem is that still relies on timers and the number of tests run doesn't seem representative enough to consider it non-flaky. |
Also, I looked to the original issue: nodejs/node-v0.x-archive#9242, and by the looks of the sample code pasted there, it seems that the timer associated with the socket is unref-ed as it doesn't keep the loop running even though the timer is already running. This test would have been so much easier to implement if the timer were ref-ed... |
@santigimeno I'm inclined to think that using |
@indutny @santigimeno @jasnell Maybe something like d76bb07a19d890d95e71329da504187f3e4462f2 is sufficiently minimal a change to have everyone feel better about it? (Haven't actually stress tested or run it on the old versions and so on, but you get the idea of the approach.) |
Gosh, looks like I was thinking about different test all the time. LGTM. Sorry! |
@Trott, after your explanation, I'm fine with this PR, and I think I prefer this to the new one. LGTM |
Competing timers were causing a race condition and thus the test was flaky. Instead, we check an object property on process exit. Fixes: nodejs#7650 PR-URL: nodejs#7857 Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com> Reviewed-By: Fedor Indutny <fedor.indutny@gmail.com> Reviewed-By: jasnell - James M Snell <jasnell@gmail.com>
Landed in bc464a8. Thanks, everyone! |
Competing timers were causing a race condition and thus the test was flaky. Instead, we check an object property on process exit. Fixes: #7650 PR-URL: #7857 Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com> Reviewed-By: Fedor Indutny <fedor.indutny@gmail.com> Reviewed-By: jasnell - James M Snell <jasnell@gmail.com>
Competing timers were causing a race condition and thus the test was flaky. Instead, we check an object property on process exit. Fixes: nodejs#7650 PR-URL: nodejs#7857 Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com> Reviewed-By: Fedor Indutny <fedor.indutny@gmail.com> Reviewed-By: jasnell - James M Snell <jasnell@gmail.com>
Competing timers were causing a race condition and thus the test was flaky. Instead, we check an object property on process exit. Fixes: #7650 Backport-PR-URL: #12567 PR-URL: #7857 Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com> Reviewed-By: Fedor Indutny <fedor.indutny@gmail.com> Reviewed-By: jasnell - James M Snell <jasnell@gmail.com>
Checklist
make -j4 test
(UNIX), orvcbuild test nosign
(Windows) passesAffected core subsystem(s)
test tls
Description of change
Competing timers were causing a race condition and thus the test was
flaky. Instead, we check an object property on process exit.
Fixes: #7650
/cc @indutny @nodejs/testing @SomeKittens