Skip to content
This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

Allow child.kill() and exec() to kill child's children #1811

Closed
arturadib opened this issue Oct 1, 2011 · 12 comments
Closed

Allow child.kill() and exec() to kill child's children #1811

arturadib opened this issue Oct 1, 2011 · 12 comments

Comments

@arturadib
Copy link

A situation frequently encountered when running a script via exec() is that the script itself spawns other child processes. Specifying the {timeout:...} option to exec() kills only the script but not its children, and it seems desirable to allow Node to kill a child's children for a clean timeout kill.

Would it make sense to add an option to exec(), say e.g. {killSubchildren:true}, to handle such cases?

PS: I poked around the code and noticed /bin/kill is not being used, and a C-version is used instead (in process_wrap.cc). With /bin/kill, we can simply do e.g. kill -TERM -PARENT_ID to kill all children processes of parent PARENT_ID. Not sure how to go about this in process_wrap.cc.

@arturadib
Copy link
Author

It wasn't explicit above, but this will probably require recursively traveling down the child's process tree and killing all found processes, not just the grandchildren.

@bnoordhuis
Copy link
Member

You can accomplish this by passing {setsid:true} to child_process.spawn() so the child and its children run in a new process group and then later on kill said process group with process.kill(-child_pid, signal).

I'm in two minds about adding this to core: I kind of see why it'd be useful but how often will people really need it? It's probably a pain to support on Windows too.

@arturadib
Copy link
Author

I must be missing something trivial, but I can't make setsid spawn a new group ID. Whether or not I set it to true, my child processes all get the same pgid of the node process. For example, here are the processes running after spawning a bash script that calls sleep:

  PID  PPID   GID  RGID  PGID COMM
24449 21855    20    20 24449 node
24451 24449    20    20 24449 /bin/bash
24452 24451    20    20 24449 sleep

So e.g. if I kill process group 24449 it will kill the parent process node too, which I don't want. Tried Node v0.4.7 and latest master branch, doesn't matter.

Any ideas?

PS: Could this be because of the new child_process_uv? I can't seem to track down where setsid() is called from there; neither process_wrap.cc nor process.c seem to call it. node_child_process.cc does seem to call setsid(), but I can't seem to trace it back to child.spawn().

@arturadib
Copy link
Author

I just ran a grep -R 'setsid' * in test/ and nothing came up.

Should we have a unit test for setsid?

@bnoordhuis
Copy link
Member

PS: Could this be because of the new child_process_uv?

Yes, you need to use 0.4.12 or the --use-legacy switch with master. The libuv back end doesn't support setsid.

@dominictarr
Copy link

there is a subtle difference between how PGID and PPID kill -$SIG -$PID (negative pid) kills the process group, not all the children of the PID.
negative pid means PGID, not PPID.

this problem is confusing because exec actually creates a sh which parses the command, and runs it.

this is misleading in the docs...

I encountered this problem the other day, but I made a module to find all the child pids.
https://github.com/dominictarr/ps-tree

@eladb
Copy link

eladb commented Mar 8, 2012

Related to joyent/libuv#338

@jasnell
Copy link
Member

jasnell commented May 15, 2015

@chrisdickinson was a decision ever made to actually close this one?

@jasnell
Copy link
Member

jasnell commented Jun 22, 2015

Closing due to lack of activity.

@jasnell jasnell closed this as completed Jun 22, 2015
richardlau pushed a commit to ibmruntimes/node that referenced this issue Sep 18, 2015
PR-URL: nodejs/node#1996

Notable changes

* module: The number of syscalls made during a require() have been
  significantly reduced again (see nodejs#1801 from v2.2.0 for previous
  work), which should lead to a performance improvement
  (Pierre Inglebert) nodejs#1920.
* npm:
  - Upgrade to v2.11.2 (Rebecca Turner) nodejs#1956.
  - Upgrade to v2.11.3 (Forrest L Norvell) nodejs#2018.
* zlib: A bug was discovered where the process would abort if the
  final part of a zlib decompression results in a buffer that would
  exceed the maximum length of 0x3fffffff bytes (~1GiB). This was
  likely to only occur during buffered decompression (rather than
  streaming). This is now fixed and will instead result in a thrown
  RangeError (Michaël Zasso) nodejs#1811.
seansfkelley pushed a commit to seansfkelley/yerna that referenced this issue Jan 30, 2017
Okay, this is a little shaky, but here's the idea: when doing things like
`yerna run foo`, if `foo` spawned child processes of its own, Yerna is
not able to see that. It successfully kills its immediate children, but
the grandchildren (etc.) keep chugging along. We use a cute hack from
http://azimi.me/2014/12/31/kill-child_process-node-js.html (and also
suggested by a node maintainer a long time ago at
nodejs/node-v0.x-archive#1811 (comment))
to make sure that we kill all of this processes' dependents. The downside
is that rogue children are more likely with severe failures, because the
OS won't auto-kill them when the parent exits.

This change is part of the "always use exitCode instead of exit" work,
otherwise we could just violently process.exit(1) and be done with it
(which I'm not opposed to, but avoiding exit helps track down latent
issues like the exact one that this commit fixes).
@davidrapin
Copy link

davidrapin commented Mar 1, 2017

For those (like me) we arrive late in this discussion and are looking for an answer:

  • for a process to kills it's children when it dies, it has to be the leader of its process group
  • to make a process the leader of its process group, the only option in nodejs (at least since v4) is to use the {detached: true} option in spawn or exec 

see https://nodejs.org/api/child_process.html#child_process_options_detached:

On non-Windows platforms, if options.detached is set to true, the child process will be made the leader of a new process group and session. Note that child processes may continue running after the parent exits regardless of whether they are detached or not. See setsid(2) for more information.

@chesucr
Copy link

chesucr commented Apr 5, 2017

I am using exec() to run a command from shell and I had to face this problem. So, in order to kill the child process of sh I had to install the module ps-tree.

@davisjam
Copy link

davisjam commented May 14, 2017

Late to the party, but here's a recipe for spawning children and killing all of their descendants after a timeout. This particular function is intended to work with an async.parallel call. Works on Linux.

/* Input: (repo, id, timeout, cb)
 * Output: ()
 * Calls cb(obj)
 * obj: keys repo, logdir
 * logdir: name of logdir, 'ERROR' if get-re fails, 'TIMEOUT' if it takes too long
 */
function doStuff (stuff, id, timeout, cb) {
  var pref = 'doStuff ' + id + ': ';

  var resultObj = { "stuff": stuff, "logdir": "ERROR" };

  /* There's two ways to call cb:
   *  - via successful child process completion (perhaps with an error, NBD)
   *  - via timeout
   * Each must prevent the other from calling cb.
   */

  var timeout_timer = null;
  var timed_out = false;

  /* Start a child. */
  var spawnOpts = { 'detached' : true // Allows killing of all of child's descendants.
                                      // cf. http://azimi.me/2014/12/31/kill-child_process-node-js.html
                                      //     https://github.com/nodejs/node-v0.x-archive/issues/1811
                  }
  log(0, pref + 'child_process.spawn: get_stuff <' + get_stuff + '> args [\'' + stuff + '\']');
  var child = child_process.spawn(stuff, [stuff], spawnOpts);

  /* Handle ongoing progress from child. */
  var stdout = '';
  var stderr = '';

  child.stdout.on('data', function (data) { stdout += data; });
  child.stderr.on('data', function (data) { stderr += data; });
  child.on('close', function (rc) {
    /* If we timed_out already, nothing to do. */
    if (timed_out)
      return; // timeout_timer has already called cb

    /* Cancel the timeout_timer, then get logdir and call cb. */

    // Cancel the timeout timer (defined below).
    if (timeout_timer)
      clearTimeout(timeout_timer);

    // Parse stdout for logdir
    var match = /Log files are in (.+)/.exec(stdout);
    if (match)
      resultObj.logdir = match[1];
    else
      log(0, pref + 'couldn\'t parse stdout ' + stdout);

    log(0, pref + 'repo <' + resultObj.stuff + '> logdir <' + resultObj.logdir + '>');
    return cb(resultObj);
  });

  /* Start a timeout timer. */
  timeout_timer = setTimeout(function() {
    /* Kill the child and its descendants, call cb with TIMEOUT. */
    log(0, pref + 'timed out');
    timed_out = true;
    try {
      // Kill child and all its descendants.
      process.kill(-child.pid, 'SIGTERM');
      process.kill(-child.pid, 'SIGKILL');
    } catch (e) {}

    resultObj.logdir = 'TIMEOUT';
    cb(resultObj);
  }, 1000*timeout);

  return;
};

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

9 participants