-
Notifications
You must be signed in to change notification settings - Fork 29.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test: fix flaky test-dgram-exclusive-implicit-bind #10212
Conversation
Sample CI failure indicating this test was flaky: https://ci.nodejs.org/job/node-test-commit-linux/6627/nodes=fedora24/console not ok 1295 parallel/test-dgram-exclusive-implicit-bind
---
duration_ms: 60.62
severity: fail
stack: |-
timeout |
Could try a stress test, but I don't think this happens with enough frequency for that to be useful without a very large number of runs. Here's master on fedora24 run 9999 times. We'll see if it fails at all. If so, we can do another run with this PR and see if it comes up clean. https://ci.nodejs.org/job/node-stress-single-test/1070/nodes=fedora24/console |
@@ -96,4 +99,12 @@ if (process.env.BOUND === 'y') { | |||
source.unref(); | |||
} | |||
|
|||
source.send(Buffer.from('abc'), 0, 3, common.PORT, '127.0.0.1'); | |||
function send() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could all of this just be simplified to:
setInterval(() => {
source.send(Buffer.from('abc'), 0, 3, common.PORT, '127.0.0.1');
}, 1).unref();
I think you shouldn't have to worry about clearing the interval if you unref it. That would reduce some complexity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was concerned that competing 1ms interval timers on some operating systems could result in increased flakiness, but I didn't actually test it, so yeah, let's try that. :-D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I don't keep the timer-clearing logic, this happens:
Error: Not running
at Socket._healthCheck (dgram.js:527:11)
at Socket.send (dgram.js:347:8)
at Timeout.setInterval [as _onTimeout] (/Users/trott/io.js/test/parallel/test-dgram-exclusive-implicit-bind.js:104:10)
at ontimeout (timers.js:365:14)
at Timer.unrefdHandle (timers.js:471:5)
dgram.js:527
throw new Error('Not running'); // error message from dgram_legacy.js
^
test-dgram-exclusive-implicit-bind is written assuming that dgram messages are received with 100% reliability. While missing a dgram message sent to localhost is rare, we do see it as evidenced by CI failures from time to time. The test has been rewritten to send dgram messages over and over until the test requirements have been met. Additional incidental refactoring includes: * var -> const * use of common.mustCall() instead of exit listener + boolean
Updated to use setInterval() per suggestion from @cjihrig. CI again: https://ci.nodejs.org/job/node-test-pull-request/5352/ |
|
||
source.on('close', function() { | ||
clearTimeout(interval); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a previous comment I'm not sure this was a problem but can't you get rid of this as the interval is already unrefed? At least locally it works for me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I remove that line, the test still passes, but it produces a messy output that looks like it's failing:
$ ./node test/parallel/test-dgram-exclusive-implicit-bind.js
dgram.js:527
throw new Error('Not running'); // error message from dgram_legacy.js
^
Error: Not running
at Socket._healthCheck (dgram.js:527:11)
at Socket.send (dgram.js:347:8)
at Timeout.setInterval [as _onTimeout] (/Users/trott/io.js/test/parallel/test-dgram-exclusive-implicit-bind.js:104:10)
at ontimeout (timers.js:365:14)
at Timer.unrefdHandle (timers.js:471:5)
dgram.js:527
throw new Error('Not running'); // error message from dgram_legacy.js
^
Error: Not running
at Socket._healthCheck (dgram.js:527:11)
at Socket.send (dgram.js:347:8)
at Timeout.setInterval [as _onTimeout] (/Users/trott/io.js/test/parallel/test-dgram-exclusive-implicit-bind.js:104:10)
at ontimeout (timers.js:365:14)
at Timer.unrefdHandle (timers.js:471:5)
$
I'd prefer the processes clean up after themselves and not generate irrelevant errors like that, even if the errors don't cause the test to fail, so I'd prefer to keep the clearInterval()
call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with a suggestion / question
Landed 3ddbea6 |
test-dgram-exclusive-implicit-bind is written assuming that dgram messages are received with 100% reliability. While missing a dgram message sent to localhost is rare, we do see it as evidenced by CI failures from time to time. The test has been rewritten to send dgram messages over and over until the test requirements have been met. Additional incidental refactoring includes: * var -> const * use of common.mustCall() instead of exit listener + boolean PR-URL: #10212 Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com> Reviewed-By: Italo A. Casas <me@italoacasas.com>
test-dgram-exclusive-implicit-bind is written assuming that dgram messages are received with 100% reliability. While missing a dgram message sent to localhost is rare, we do see it as evidenced by CI failures from time to time. The test has been rewritten to send dgram messages over and over until the test requirements have been met. Additional incidental refactoring includes: * var -> const * use of common.mustCall() instead of exit listener + boolean PR-URL: nodejs#10212 Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com> Reviewed-By: Italo A. Casas <me@italoacasas.com>
test-dgram-exclusive-implicit-bind is written assuming that dgram messages are received with 100% reliability. While missing a dgram message sent to localhost is rare, we do see it as evidenced by CI failures from time to time. The test has been rewritten to send dgram messages over and over until the test requirements have been met. Additional incidental refactoring includes: * var -> const * use of common.mustCall() instead of exit listener + boolean PR-URL: #10212 Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com> Reviewed-By: Italo A. Casas <me@italoacasas.com>
test-dgram-exclusive-implicit-bind is written assuming that dgram messages are received with 100% reliability. While missing a dgram message sent to localhost is rare, we do see it as evidenced by CI failures from time to time. The test has been rewritten to send dgram messages over and over until the test requirements have been met. Additional incidental refactoring includes: * var -> const * use of common.mustCall() instead of exit listener + boolean PR-URL: #10212 Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com> Reviewed-By: Italo A. Casas <me@italoacasas.com>
test-dgram-exclusive-implicit-bind is written assuming that dgram messages are received with 100% reliability. While missing a dgram message sent to localhost is rare, we do see it as evidenced by CI failures from time to time. The test has been rewritten to send dgram messages over and over until the test requirements have been met. Additional incidental refactoring includes: * var -> const * use of common.mustCall() instead of exit listener + boolean PR-URL: #10212 Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com> Reviewed-By: Italo A. Casas <me@italoacasas.com>
test-dgram-exclusive-implicit-bind is written assuming that dgram messages are received with 100% reliability. While missing a dgram message sent to localhost is rare, we do see it as evidenced by CI failures from time to time. The test has been rewritten to send dgram messages over and over until the test requirements have been met. Additional incidental refactoring includes: * var -> const * use of common.mustCall() instead of exit listener + boolean PR-URL: #10212 Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com> Reviewed-By: Italo A. Casas <me@italoacasas.com>
Checklist
make -j4 test
(UNIX), orvcbuild test nosign
(Windows) passesAffected core subsystem(s)
test dgram
Description of change
test-dgram-exclusive-implicit-bind is written assuming that dgram
messages are received with 100% reliability. While missing a dgram
message sent to localhost is rare, we do see it as evidenced by CI
failures from time to time.
The test has been rewritten to send dgram messages over and over until
the test requirements have been met.
Additional incidental refactoring includes: