-
Notifications
You must be signed in to change notification settings - Fork 29.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test: improve test-cluster-disconnect-suicide-race #4739
Conversation
Previously, test-cluster-disconnect-suicide-race had two issues: * Magic numbers: How many times to spawn a worker was determined through empirical experimentation. This means that as new platforms and new CPU/RAM configurations are tested, the magic numbers require more and more refinement. This brings us to... * Non-determinism: The test seems to fail all the time when the bug it tests for is present, but it's really a judgment based on sampling. "Oh, with 8 workers per CPU, it fails about 80% of the time. Let's try 16..." This revised version of the test takes a different approach. The fix for the bug that the test was written for means that the disconnect event will fire on a subsequent tick. So we check for that and the test still fails when the fix is not in the code base and succeeds when it is. Advantages of this approach include: * The test runs much faster. * The test should be reliable on any new platform regardless of CPU and RAM. Ref: nodejs#4674 cc @santigimeno @iwuzhere
CI with this new test but the code to https://ci.nodejs.org/job/node-test-commit/1811/ Red is good in this case. |
CI for this PR: https://ci.nodejs.org/job/node-test-pull-request/1286/ |
cluster.worker[process.env.action](); | ||
cluster.on('exit', (worker, code) => { | ||
if (code) | ||
common.fail('worker exited with error'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not assert.notEqual(code, 0)
?
LGTM with a suggestion. Unrelated failures on the OS X buildbot. |
LGTM. Thanks again! |
CI failure on OSX with test-cluster-disconnect-leak => https://ci.nodejs.org/job/node-test-commit-osx/1847/nodes=osx1010/console |
OS X issue does indeed look unrelated to this (and would be partially or perhaps even completely fixed by #4736). But, you know, just to make sure there's not some weird interaction that wouldn't be suspected (because when has that ever happened, amirite?!), another CI run: https://ci.nodejs.org/job/node-test-commit/1822/ All green! \o/ Although, uh, now I wish I had thrown in Ben's suggested change before running the test... OK, so, now with that change: https://ci.nodejs.org/job/node-test-commit/1825/ All green again! \o/ |
LGTM :-) |
Landed in 44aba1a |
Previously, test-cluster-disconnect-suicide-race had two issues: * Magic numbers: How many times to spawn a worker was determined through empirical experimentation. This means that as new platforms and new CPU/RAM configurations are tested, the magic numbers require more and more refinement. This brings us to... * Non-determinism: The test seems to fail all the time when the bug it tests for is present, but it's really a judgment based on sampling. "Oh, with 8 workers per CPU, it fails about 80% of the time. Let's try 16..." This revised version of the test takes a different approach. The fix for the bug that the test was written for means that the disconnect event will fire on a subsequent tick. So we check for that and the test still fails when the fix is not in the code base and succeeds when it is. Advantages of this approach include: * The test runs much faster. * The test should be reliable on any new platform regardless of CPU and RAM. PR-URL: #4739 Ref: #4674 Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl> Reviewed-By: James M Snell <jasnell@gmail.com>
Two cluster tests have recently changed so that they are no longer resource intensive. Move them back to parallel. Ref: nodejs#4736 Ref: nodejs#4739
Two cluster tests have recently changed so that they are no longer resource intensive. Move them back to parallel. Ref: nodejs#4736 Ref: nodejs#4739 PR-URL: nodejs#4774 Reviewed-By: Johan Bergström <bugs@bergstroem.nu> Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Previously, test-cluster-disconnect-suicide-race had two issues: * Magic numbers: How many times to spawn a worker was determined through empirical experimentation. This means that as new platforms and new CPU/RAM configurations are tested, the magic numbers require more and more refinement. This brings us to... * Non-determinism: The test seems to fail all the time when the bug it tests for is present, but it's really a judgment based on sampling. "Oh, with 8 workers per CPU, it fails about 80% of the time. Let's try 16..." This revised version of the test takes a different approach. The fix for the bug that the test was written for means that the disconnect event will fire on a subsequent tick. So we check for that and the test still fails when the fix is not in the code base and succeeds when it is. Advantages of this approach include: * The test runs much faster. * The test should be reliable on any new platform regardless of CPU and RAM. PR-URL: #4739 Ref: #4674 Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl> Reviewed-By: James M Snell <jasnell@gmail.com>
Previously, test-cluster-disconnect-suicide-race had two issues: * Magic numbers: How many times to spawn a worker was determined through empirical experimentation. This means that as new platforms and new CPU/RAM configurations are tested, the magic numbers require more and more refinement. This brings us to... * Non-determinism: The test seems to fail all the time when the bug it tests for is present, but it's really a judgment based on sampling. "Oh, with 8 workers per CPU, it fails about 80% of the time. Let's try 16..." This revised version of the test takes a different approach. The fix for the bug that the test was written for means that the disconnect event will fire on a subsequent tick. So we check for that and the test still fails when the fix is not in the code base and succeeds when it is. Advantages of this approach include: * The test runs much faster. * The test should be reliable on any new platform regardless of CPU and RAM. PR-URL: #4739 Ref: #4674 Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl> Reviewed-By: James M Snell <jasnell@gmail.com>
Previously, test-cluster-disconnect-suicide-race had two issues: * Magic numbers: How many times to spawn a worker was determined through empirical experimentation. This means that as new platforms and new CPU/RAM configurations are tested, the magic numbers require more and more refinement. This brings us to... * Non-determinism: The test seems to fail all the time when the bug it tests for is present, but it's really a judgment based on sampling. "Oh, with 8 workers per CPU, it fails about 80% of the time. Let's try 16..." This revised version of the test takes a different approach. The fix for the bug that the test was written for means that the disconnect event will fire on a subsequent tick. So we check for that and the test still fails when the fix is not in the code base and succeeds when it is. Advantages of this approach include: * The test runs much faster. * The test should be reliable on any new platform regardless of CPU and RAM. PR-URL: #4739 Ref: #4674 Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl> Reviewed-By: James M Snell <jasnell@gmail.com>
Previously, test-cluster-disconnect-suicide-race had two issues: * Magic numbers: How many times to spawn a worker was determined through empirical experimentation. This means that as new platforms and new CPU/RAM configurations are tested, the magic numbers require more and more refinement. This brings us to... * Non-determinism: The test seems to fail all the time when the bug it tests for is present, but it's really a judgment based on sampling. "Oh, with 8 workers per CPU, it fails about 80% of the time. Let's try 16..." This revised version of the test takes a different approach. The fix for the bug that the test was written for means that the disconnect event will fire on a subsequent tick. So we check for that and the test still fails when the fix is not in the code base and succeeds when it is. Advantages of this approach include: * The test runs much faster. * The test should be reliable on any new platform regardless of CPU and RAM. PR-URL: nodejs#4739 Ref: nodejs#4674 Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl> Reviewed-By: James M Snell <jasnell@gmail.com>
Previously, test-cluster-disconnect-suicide-race had two issues: * Magic numbers: How many times to spawn a worker was determined through empirical experimentation. This means that as new platforms and new CPU/RAM configurations are tested, the magic numbers require more and more refinement. This brings us to... * Non-determinism: The test seems to fail all the time when the bug it tests for is present, but it's really a judgment based on sampling. "Oh, with 8 workers per CPU, it fails about 80% of the time. Let's try 16..." This revised version of the test takes a different approach. The fix for the bug that the test was written for means that the disconnect event will fire on a subsequent tick. So we check for that and the test still fails when the fix is not in the code base and succeeds when it is. Advantages of this approach include: * The test runs much faster. * The test should be reliable on any new platform regardless of CPU and RAM. PR-URL: nodejs#4739 Ref: nodejs#4674 Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl> Reviewed-By: James M Snell <jasnell@gmail.com>
Previously, test-cluster-disconnect-suicide-race had two issues: * Magic numbers: How many times to spawn a worker was determined through empirical experimentation. This means that as new platforms and new CPU/RAM configurations are tested, the magic numbers require more and more refinement. This brings us to... * Non-determinism: The test seems to fail all the time when the bug it tests for is present, but it's really a judgment based on sampling. "Oh, with 8 workers per CPU, it fails about 80% of the time. Let's try 16..." This revised version of the test takes a different approach. The fix for the bug that the test was written for means that the disconnect event will fire on a subsequent tick. So we check for that and the test still fails when the fix is not in the code base and succeeds when it is. Advantages of this approach include: * The test runs much faster. * The test should be reliable on any new platform regardless of CPU and RAM. PR-URL: nodejs#4739 Ref: nodejs#4674 Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl> Reviewed-By: James M Snell <jasnell@gmail.com>
Previously, test-cluster-disconnect-suicide-race had two issues: * Magic numbers: How many times to spawn a worker was determined through empirical experimentation. This means that as new platforms and new CPU/RAM configurations are tested, the magic numbers require more and more refinement. This brings us to... * Non-determinism: The test seems to fail all the time when the bug it tests for is present, but it's really a judgment based on sampling. "Oh, with 8 workers per CPU, it fails about 80% of the time. Let's try 16..." This revised version of the test takes a different approach. The fix for the bug that the test was written for means that the disconnect event will fire on a subsequent tick. So we check for that and the test still fails when the fix is not in the code base and succeeds when it is. Advantages of this approach include: * The test runs much faster. * The test should be reliable on any new platform regardless of CPU and RAM. PR-URL: nodejs#4739 Ref: nodejs#4674 Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl> Reviewed-By: James M Snell <jasnell@gmail.com>
Two cluster tests have recently changed so that they are no longer resource intensive. Move them back to parallel. Ref: nodejs#4736 Ref: nodejs#4739 PR-URL: nodejs#4774 Reviewed-By: Johan Bergström <bugs@bergstroem.nu> Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Previously, test-cluster-disconnect-suicide-race had two issues:
empirical experimentation. This means that as new platforms and new
CPU/RAM configurations are tested, the magic numbers require more
and more refinement. This brings us to...
it tests for is present, but it's really a judgment based on sampling.
"Oh, with 8 workers per CPU, it fails about 80% of the time. Let's try
16..."
This revised version of the test takes a different approach. The fix
for the bug that the test was written for means that the disconnect
event will fire on a subsequent tick. So we check for that and the test
still fails when the fix is not in the code base and succeeds when it
is.
Advantages of this approach include:
RAM.
Ref: #4674
cc @santigimeno @iwuzhere