simple/test-cluster-worker-death.js fails #3198
The event on the Worker should be named 'exit'. The cluster module has been basically rewritten, so breaking backwards compatibility is ok in this case. I think @bnoordhuis had pointed out that the worker argument added in 5f08c3c is unnecessary, since it's already
So, patch welcome:
- changed worker `death` to `exit`.
- corrected argument type expected by worker `exit` handler.
worker 'exit' event now emits arguments consistent with the corresponding event in child_process module.
Commits are attached. CLA is signed. In addition to the points @isaacs identified, I've also included a couple of new tests that focus on the behavior of cluster's worker processes when they are killed and/or exited.
this.process.once('exit', function(exitCode, signalCode) {
prepareExit(self, 'dead');
self.emit('exit', exitCode, signalCode);
cluster.emit('exit', self);
This is so confusing: why does worker.on('exit') have exitCode and signalCode, but cluster.on('exit') doesn't? But fixing that would be confusing too, since you would have two syntaxes:
worker.on('exit', function (exit, signal) {});
cluster.on('exit', function (worker, exit, signal) {});
Personally I would like to see this left as is, because userland can access this information using worker.process.exitCode and worker.process.signalCode. However, we should probably extend an example so it shows how to obtain exitCode and signalCode.
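A minimal sketch of what such an example could look like, assuming the worker keeps exposing the underlying ChildProcess as worker.process. The helper name describeExit is hypothetical and not part of node:

```javascript
// Hypothetical helper (not part of node): summarise how a worker ended,
// reading only the ChildProcess exposed as worker.process.
function describeExit(worker) {
  var code = worker.process.exitCode;
  var signal = worker.process.signalCode;
  if (signal) {
    return 'killed by signal ' + signal;
  }
  return 'exited with code ' + code;
}

// Intended use in the master (left as a comment so nothing is forked here):
// cluster.on('exit', function (worker) {
//   console.log(describeExit(worker));
// });
```

The listener receives only the worker, and the exit details are looked up on worker.process when they are needed.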
Other than @bnoordhuis's already-stated reason, I think another benefit of this change is that it makes worker.on('exit') match child_process.on('exit'). This makes sense to me -- ideally, anything a developer learns about using the child_process module should be transferable to using the workers produced by the cluster module as well:
using child_process:
var cp = child_process.fork(process.argv[1]);
cp.on('exit', function(code, signal) {
// ...
});
using cluster:
var worker = cluster.fork();
worker.on('exit', function(code, signal) {
// ... works the same as child_process
});
@isaacs if you feel that the
@AndreasMadsen I was thinking the same thing. I actually started, last night, to eliminate the
While we're on the subject of confusing :) -- I'd like to offer a thought / make a suggestion: I think it's confusing to have the child_process instance exposed as part of the worker (
more examples of similar conundrums are possible. There are so many different possibilities here that it would be very difficult to get test coverage for them all. In my opinion, it would be better if the cluster module were designed so that
If we did this, here's how I imagine it might turn out:
var worker = cluster.fork();
if( cluster.isMaster ) {
// in the Master process...
// `worker.process` would be removed, and all the `ChildProcess`
// api is exposed as part of `Worker`. so `Worker` *is a* `ChildProcess`...
assert.ok( worker.process === undefined );
// properties of child_process are available directly
var child_pid = worker.pid;
// methods too...
worker.kill('SIGHUP');
// methods can be overridden with cluster-specific logic when needed
worker.disconnect()
// Worker can add new methods/properties too, as needed
console.log("Worker ID = %s", worker.uniqueID);
console.log("Suicide? %s", worker.suicide);
worker.destroy();
// the same things are true of events...
// events inherited from child_process
worker.on('exit', function(code,signal) { ... });
worker.on('close', function() { ... });
worker.on('disconnect', function() { ... });
// worker defines additional events as needed...
worker.on('message', function() { ... });
worker.on('online', function() { ... });
worker.on('listening', function() { ... });
}
if( cluster.isChild ) {
// in the Child process...
// There is no special `Worker` object instance. Instead, let the
// Cluster module "extend" the `process` object itself -- the same
// way the child_process module already does. Keep `cluster.worker`
// for convenience and clarity - but it would just point to the existing
// `process` object.
assert.ok( cluster.worker === process );
// child_process's existing functions (eg: `process.send()`,
// `process.disconnect()`, etc) could be overridden by cluster-specific
// functions as needed. Allowing cluster-specific logic to be injected, while
// preserving the pre-existing interface from child_process.
// calls like `process.disconnect()` would invoke a function from Cluster,
// which would delegate internally to the corresponding child_process function.
// this is the *only* version of `disconnect()` available in the public api
// whether you're coding inside a worker process, or a plain old child_process.
process.disconnect()
// of course, `cluster.worker.*` is equivalent to `process.*`
// so, instead, we could have spelled it like this...
cluster.worker.disconnect();
// Worker adds the `worker.destroy()` method, which is
// not present in child_process.
process.destroy();
// the 'exit' event is already defined by `process`. no changes needed.
process.on('exit', function() { ... });
// child_process adds a few events...
process.on('message', ...);
process.on('disconnect', ...);
// cluster could add specific events too, if needed.
// ...but I don't think there are any additional ones
// for the child at this time (?)
}
@AndreasMadsen, I think your recent changes add valuable and much-needed functionality to the Cluster module. Thank you for taking the time and effort to contribute that. I also think that, with a little additional effort (which I'm happy to contribute) -- cleaning up, and unifying with
I'm interested to hear thoughts, discussion, etc. (Thanks for reading this far).
Hi @coltrane, funny, I too was in the process of writing a long comment about this before I knew you had done so. I have read your comment twice, but it is too late here, so I won't give you a quick response today; I will try to give you a well-deserved response tomorrow. However, in the interest of objectivity I will still give you what I wrote before reading yours. This is mostly taken from my original discussion with myself when I designed the new cluster API. I’m sorry if there is too much irrelevant stuff; I have tried to sort most of it out.
Is the above correct? I don’t know. But trust me, I have rewritten that cluster module 4 times and my conclusion so far is that
tl;dr I hope you didn't start here :) I must admit I never thought about the exitCode and signalCode arguments; if those are important and API bloat is not an issue, then sure, add them. But make it consistent and include them in both the worker and cluster objects, so userland can switch between cluster.on and worker.on without worrying about argument positions shifting. In the end, if userland has to do:
var signalCode = this === cluster ? arguments[0].process.signalCode : arguments[1];
they will probably just do:
var worker = (this === cluster ? arguments[0] : this);
var signalCode = worker.process.signalCode;
And then why not just allow them to do:
var signalCode = worker.process.signalCode;
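To make the argument-shifting concern concrete, here is a rough sketch of one listener attached to both emitters. It assumes plain EventEmitter stand-ins (fakeCluster and fakeWorker are hypothetical names) so that nothing is forked; the real cluster and Worker objects are not used:

```javascript
var EventEmitter = require('events').EventEmitter;

// Stand-ins for the cluster and a worker (assumption: bare emitters are
// enough to show the argument handling).
var fakeCluster = new EventEmitter();
var fakeWorker = new EventEmitter();
fakeWorker.process = { exitCode: null, signalCode: 'SIGTERM' };

var seen = [];

// One listener shared by both emitters. It ignores the positional
// arguments and always goes through worker.process.
function onExit() {
  var worker = (this === fakeCluster ? arguments[0] : this);
  seen.push(worker.process.signalCode);
}

fakeCluster.on('exit', onExit);
fakeWorker.on('exit', onExit);

fakeWorker.emit('exit', null, 'SIGTERM'); // worker-style arguments: (code, signal)
fakeCluster.emit('exit', fakeWorker);     // cluster-style arguments: (worker)
```

Both calls record the same signal, which is the point being made: once the listener has the worker, the extra positional arguments add nothing it cannot already reach.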
@coltrane Landed in a62dd44. Thanks!
@AndreasMadsen I think your points are valid. But it's a bit odd to emit the object itself as an argument to its own event listener, where
@isaacs today you really surprised me and made me sad at the same time, congratulations. The reason is not just that you closed this without letting the discussion finish, or at least giving me some time to evaluate
And please also remember to remove it from https://github.com/joyent/node/pull/3198/files#r773549
@coltrane as promised I will give you my thoughts on your proposed API. To make everything clear, let's set this up in a pros/cons matrix :)
In the design process I did actually consider your API; it is clearer but lacks the ability to go beyond the immediate capabilities of the given API. However with a simple
As for your concerns about test cases, I don’t see your point. A worker is simply a child process with a different API surface because the needs aren’t the same. If you call worker.process.disconnect it will disconnect the child process itself, however if you call
@isaacs today we are discovering new land; “nothing else works this way” because there is nothing in node that directly translates to this case. And isaacs, there are cases where we send extra arguments to a handler that really aren’t needed. The buffer argument in
@AndreasMadsen I saw your comment. Yes, failing to remove the worker arg from worker.on('online') and 'listening' was an oversight on my part. That will be fixed shortly, thanks for the reminder :) The .process member is still there as a reference to the underlying ChildProcess object, however, so I'm not sure what you mean when you say it "lacks the ability to go beyond the immediate capabilities of the given API". The bottom line is that it's better to reduce redundancy and make the API surfaces more consistent throughout node.
Regarding discussion in nodejs#3198. Passing the worker as an argument to an event emitted on the worker is redundant, and an unnecessary break in consistency vs the events on the ChildProcess objects. It was removed from 'exit', but 'listening' and others were overlooked. This corrects that oversight.
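The redundancy described in this commit message can be illustrated with a plain EventEmitter, since listeners registered as ordinary functions are called with the emitter as `this`. This is only a sketch: fakeWorker is a hypothetical stand-in, not a real Worker.

```javascript
var EventEmitter = require('events').EventEmitter;

// Stand-in for a Worker (assumption: a bare emitter, nothing is forked).
var fakeWorker = new EventEmitter();
var sameObject = false;

fakeWorker.on('listening', function () {
  // `this` is the emitter the listener was registered on, so passing the
  // worker again as an argument would only repeat it.
  sameObject = (this === fakeWorker);
});

fakeWorker.emit('listening');
```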
@isaacs that was only a response to the API proposed by coltrane; it has nothing to do with the current API in node/master.
@AndreasMadsen, thanks for your reply. A few thoughts and questions...
Can you explain this further? What do you mean by "go beyond the edge of the cluster module"?
I will try to explain: For example: inside a "regular" child_process, calling
This is the very kind of inconsistency that leads me to suggest hiding
In either case, it is not sufficient that we have tested the "child_process" module on its own; to get full coverage, we would need to re-test the ChildProcess functions via the Worker. This quickly becomes a complicated task, if you really try to test all the different combinations of methods and events on
Sure, in many ways it is like the HTTP module. If you create a normal http server:
http.createServer(function (req, res) { });
The given API surface is
It is a known issue, and is caused by this code: https://github.com/joyent/node/blob/master/lib/cluster.js#L499-503
I agree, we should definitely test as many use cases as possible, however
To translate to the
@AndreasMadsen that makes sense, thank you for explaining. I'll think about this some more with the new perspective you've provided.
There are two problems with this test:
1- The test fails initially because it expects worker.on('exit', ...) to send a statusCode, but as of 5f08c3c, that event sends an instance of the new Worker class.
2- After fixing the first problem, the test fails because it expects to receive cluster.on('death', ...), but that event name was changed to 'exit' via the combined effects of 5f08c3c and 90ce5b3.
These are both breaks in backward compatibility to 0.6.x. I'm guessing that #1 is not a big deal. The following thoughts pertain to #2.
I suspect that the intent of 90ce5b3 (@isaacs, please correct me if I'm wrong) was only to rename the event that's emitted by the worker, while leaving the cluster's event alone, thus restoring backward compatibility; but 5f08c3c had previously unified that logic so that the event names were tied together.
I'll be glad to prepare a fix, but I need some feedback on how to proceed:
exit event from cluster instead of the death event.
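For code that has to run both on 0.6.x (which emits 'death') and after the rename to 'exit', one possible workaround is to register the same listener under both names. This is only a sketch under stated assumptions: onWorkerGone is a hypothetical helper, not part of node, and fakeCluster is a plain EventEmitter standing in for the cluster object.

```javascript
var EventEmitter = require('events').EventEmitter;

// Hypothetical helper (not part of node): register one listener under both
// the 0.6.x event name ('death') and the new event name ('exit').
function onWorkerGone(emitter, listener) {
  emitter.on('death', listener);
  emitter.on('exit', listener);
}

// Stand-in for the cluster object so the sketch runs without forking.
var fakeCluster = new EventEmitter();
var calls = 0;
onWorkerGone(fakeCluster, function () { calls += 1; });

fakeCluster.emit('exit', {}); // the name used after 5f08c3c and 90ce5b3
```

Only one of the two names fires in any given version, so the listener still runs once per worker exit.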