Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: prevent workers outliving parent #9257

Merged
merged 1 commit into from
Oct 27, 2016

Conversation

sam-github
Copy link
Contributor

Checklist
  • make -j8 test (UNIX), or vcbuild test nosign (Windows) passes
  • tests and/or benchmarks are included
  • commit message follows commit guidelines

Waiting for confirmation on CI, which was leaking child processes.

Affected core subsystem(s)

test

Description of change

test-child-process-pass-fd.js parent can exit with an error on failure
to fork, in which case it will leak child processes. Limit child
lifetime to that of parent.

Fix #9255

/to @mhdawson

@nodejs-github-bot nodejs-github-bot added the test Issues and PRs related to the tests. label Oct 24, 2016
Copy link
Member

@gibfahn gibfahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mscdex mscdex added the child_process Issues and PRs related to the child_process subsystem. label Oct 24, 2016
@gibfahn
Copy link
Member

gibfahn commented Oct 24, 2016

@sam-github
Copy link
Contributor Author

Thanks @gibfahn , did you trigger that yourself? I've been out of touch with node core for a while, I can't remember what the process is for triggering a PR to build in CI.

@sam-github
Copy link
Contributor Author

Btw, I think this test is safe, in that it will fail if fork fails, and cleanup the children.

It won't fix whatever might have called fork to fail, perhaps transient out of memory, or limits on number of processes, but it might help narrow down the root cause to something other than this test.

And at least it won't make things worse, because after failing due to failure-to-fork, it won't leave dozens of processes around, causing fork to be even more likely to fail.

@gibfahn
Copy link
Member

gibfahn commented Oct 24, 2016

@sam-github Yes I did, you just go to the Jenkins job (test-pull-request) and run it against the PR number.

@mhdawson
Copy link
Member

LGTM provided CI is ok.

@gibfahn
Copy link
Member

gibfahn commented Oct 24, 2016

@sam-github
Copy link
Contributor Author

parallel/test-dgram-send-empty-array failed on freebsd, I can't see how that is related.

arm-fanned died with fatal: Not a git repository (or any of the parent directories): .git, is it healthy?

@sam-github
Copy link
Contributor Author

ci 3: https://ci.nodejs.org/job/node-test-pull-request/4652/

Seems to be getting worse, how do I compare to expected platform build results?

@gibfahn
Copy link
Member

gibfahn commented Oct 25, 2016

@sam-github Looking at the failures in other recent PRs usually helps. I'm not sure there's an expected version to compare against, a lot of these failures are intermittent.

Unless your test is doing something really crazy, any failures that aren't test-child-process-pass-fd.js should be safe to ignore though.

@sam-github
Copy link
Contributor Author

The test failures are unrelated to the test I modified. I'll merge this tomorrow.

Copy link
Contributor

@cjihrig cjihrig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Is it possible to run a stress test on AIX and check for left over processes?

@sam-github
Copy link
Contributor Author

@cjihrig Was that question for me? I don't know - you certainly know more about node's CI setup than I do!

@cjihrig
Copy link
Contributor

cjihrig commented Oct 26, 2016

It was more of a question for anyone listening. We do have a stress test CI job, but someone with an account on the machine would have to log in to check for leftover processes. Alternatively, someone with an AIX machine could run tools/test.py --repeat=N sequential/test-child-process-pass-fd, where N is a large number.

@gibfahn
Copy link
Member

gibfahn commented Oct 26, 2016

@cjihrig Will do both (community and our own machines) tomorrow

@gibfahn
Copy link
Member

gibfahn commented Oct 27, 2016

@georgeadams95 has run the test a few hundred times on one of our AIX boxes, and there were no child processes left behind.

test-child-process-pass-fd.js parent can exit with an error on failure
to fork, in which case it will leak child processes. Limit child
lifetime to that of parent.

Fixes: nodejs#9255
PR-URL: nodejs#9257
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: Michael Dawson <michael_dawson@ca.ibm.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
@sam-github sam-github merged commit 4d896c4 into nodejs:master Oct 27, 2016
@sam-github sam-github deleted the cp-test-die-on-loss-of-parent branch October 27, 2016 16:54
evanlucas pushed a commit that referenced this pull request Nov 3, 2016
test-child-process-pass-fd.js parent can exit with an error on failure
to fork, in which case it will leak child processes. Limit child
lifetime to that of parent.

Fixes: #9255
PR-URL: #9257
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: Michael Dawson <michael_dawson@ca.ibm.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
MylesBorins pushed a commit that referenced this pull request Nov 17, 2016
test-child-process-pass-fd.js parent can exit with an error on failure
to fork, in which case it will leak child processes. Limit child
lifetime to that of parent.

Fixes: #9255
PR-URL: #9257
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: Michael Dawson <michael_dawson@ca.ibm.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
MylesBorins pushed a commit that referenced this pull request Nov 19, 2016
test-child-process-pass-fd.js parent can exit with an error on failure
to fork, in which case it will leak child processes. Limit child
lifetime to that of parent.

Fixes: #9255
PR-URL: #9257
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: Michael Dawson <michael_dawson@ca.ibm.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
@MylesBorins MylesBorins mentioned this pull request Nov 22, 2016
@sam-github
Copy link
Contributor Author

@MylesBorins why don't land on v4?

@sam-github
Copy link
Contributor Author

I see, it conflicts, I can backport, do we care? The reason to backport would be for other PRs to land better, but it doesn't seem to have been a problem. Also, for ci stability.

@gibfahn
Copy link
Member

gibfahn commented Feb 24, 2017

@sam-github I think CI stability is a compelling reason to backport.

@sam-github
Copy link
Contributor Author

@MylesBorins if you will accept, I will backport, but I need some encouragement, I don't want to waste my time

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
child_process Issues and PRs related to the child_process subsystem. test Issues and PRs related to the tests.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants