Skip to content
This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

Cluster worker.kill(signal) does not pass the signal parameter #6042

Closed
BorePlusPlus opened this issue Aug 12, 2013 · 10 comments
Closed

Cluster worker.kill(signal) does not pass the signal parameter #6042

BorePlusPlus opened this issue Aug 12, 2013 · 10 comments
Labels

Comments

@BorePlusPlus
Copy link

When calling worker.kill(signal) and passing a signal, the signal does not get sent to worker process as per documentation.

http://nodejs.org/api/cluster.html#cluster_worker_kill_signal_sigterm

Example:

var cluster = require('cluster');

if (cluster.isMaster) {
    cluster.fork();

    cluster.on('exit', function(worker, code, signal) {
        console.log('Worker exited:', worker.process.pid);
    });

    setTimeout(function () {
        for (var id in cluster.workers) {
            var worker = cluster.workers[id];
            worker.kill('SIGUSR2');
        }
    }, 2000);
} else {
    process.on('SIGUSR2', function() {
        //WHY DOES THIS NOT GET CALLED? http://nodejs.org/api/cluster.html#cluster_worker_kill_signal_sigterm
        console.log('SIGUSR2 called', process.pid);
    });
    console.log('Worker listening for SIGUSR2', process.pid);
}
@sam-github
Copy link

I think this is a bug.

worker.kill() sets .suicide in the master, but it doesn't send suicide to the worker, so master/worker opinion of suicide is not in sync.

Since master does a process.disconnect() before it does a process.kill(), the workers sees an unexpected disconnect, and exits before the signal gets a chance to be delivered.

Probably fix is for master to send a suicide message before doing a worker.process.disconnect(), similar to what the worker does when kill is called in the child.

@bnoordhuis
Copy link
Member

Been talking it over with Sam and I'm leaning towards removing Worker#kill(). It's an alias for Worker#destroy() right now but it:

  • doesn't really work on Windows due to the lack of signals (libuv emulation notwithstanding),
  • has the overloaded connotations of kill(2), yet
  • is documented as "This function will kill the worker, and inform the master to not spawn a new worker."

Which is weird because e.g. worker.kill('SIGUSR1') doesn't actually kill the process. I propose we retire it and make Worker#destroy() the officially sanctioned method.

Other committers, thoughts?

@trevnorris
Copy link

@bnoordhuis +1 from me.

@sam-github
Copy link

@bnoordhuis, since worker.destroy === worker.kill right now, all the problems you enumerate exist for destroy as much as for kill. Did you intend to change it so it can only be used to terminate the worker?

I suggest changing worker.kill() to be an alias to worker.process.kill(), and removing worker.destroy(). This would parallel how worker.send() is an alias to worker.process.send().

I'm also OK with simply removing both .kill and .destroy, though I think leaving the documented .kill around is a bit friendlier to existing code.

Reasons:

  • I would leave a .kill(), because I've read lots of cluster code (and have my own) that calls kill, which is the currently documented API. .kill() is usually called, as recommended by the docs, some time after .disconnect() is called but the 'exit' event hasn't occurred, indicating the worker is borked, and needs to be killed in a less-graceful manner than disconnect does. As such, a reasonable thing is to call it with SIGTERM, then call it a bit later with SIGKILL if the term didn't work.
  • Having events and functions with the exact same name in worker and child_process is helpful, but only when they do the same thing. Send is like this. Kill is not. This is confusing, and it is not enormously obvious when something is exactly the same, or only similar. Specifically here, it was not obvious to the bug reporter that worker.kill() can't be used to send signals that don't kill the child, something that is fully supported by child_process.kill().
  • Cluster-specific support for signals as IPC mechanism to worker should be removed. It is complex, broken, and unnecessary on every platform, since there is already a builtin .send() for communicating with the worker. And if insist on using signal for IPC to your worker, you can do that already, use worker.process.kill(). It has no cluster-specific magic. It just sends a signal (maybe, depending on platform, but that is fully documented elsewhere, no need to add cluster-specific wrinkles to it).

I'm happy to implement this (yeah, its only 2 lines of code, but then there are all the docs and tests...), if you all want it.

@sam-github
Copy link

/cc @isaacs cf. e428bb7

@jasnell
Copy link
Member

jasnell commented May 26, 2015

@sam-github @bnoordhuis ... is this resolved?

@sam-github
Copy link

Not resolved. cluster.destroy() getting renamed to kill() still makes no sense. It disconnects before sending the signal to its child, pretty much guaranteeing the signal won't get delivered, or at least will be racy.

@diegoperini
Copy link

Regarding a workaround, how safe is it to kill a child process via worker.process.kill()?

@sam-github
Copy link

doing so just sends the signal. whether that is "safe" or not just depends on whether the signal does what you want, which is up to your code. worker.process and its methods are public APIs of worker/cluster, and can be used as doced

@diegoperini
Copy link

Thank you for the quick response @sam-github

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

7 participants