Investigate cluster/child_process oddity on macOS #14828

Closed
Trott opened this issue Aug 14, 2017 · 1 comment
Labels
child_process Issues and PRs related to the child_process subsystem. cluster Issues and PRs related to the cluster subsystem. flaky-test Issues and PRs related to the tests with unstable failures on the CI. macos Issues and PRs related to the macOS platform / OSX.

Comments

Member

Trott commented Aug 14, 2017

Version: 9.0.0-pre
Platform: macOS 10.x (seen on 10.10, 10.11, and 10.12)
Subsystem: cluster child_process

A workaround/fix was introduced to test-cluster-send-handle-large-payload in #14780 so that it wouldn't time out on macOS from time to time.

It's not clear if this behavior is a bug in the test, a bug in Node.js, a bug in macOS, or simply not a bug at all.

dtruss -f output

Additional info from #14747

When this times out, the process.send() in the subprocess is getting called, but the message event on worker is not being emitted (or at least the listener is not being invoked).

If I add a callback to process.send(), it never indicates an error in this test, whether the test succeeds or times out.

Judging from #6767, process.send() may be fire-and-forget. Maybe the message gets received and maybe not.
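
For reference, a minimal sketch of the callback check described above. `sendWithStatus` is a hypothetical helper, not part of the test. It assumes only the documented behavior that the `.send()` callback receives `null` on success or an `Error` on failure, and that it may run before the other side has received anything. So a `null` here says the send did not fail locally; it does not say the `message` listener on the other end ran.

```javascript
'use strict';

// Hypothetical helper (not from the test): calls target.send() and reports
// what the callback saw. `target` is anything with a Node-style
// send(message, handle, callback) method, e.g. `process` inside a forked
// worker, or a cluster Worker object in the master.
function sendWithStatus(target, message, handle, done) {
  target.send(message, handle, (err) => {
    // err is null on success, or an Error if the send failed locally.
    // Success does not prove the receiver's 'message' listener was invoked.
    done({ sent: !err, error: err || null });
  });
}
```

In the worker branch of the test this could be called as `sendWithStatus(process, { payload }, handle, (status) => console.log(status))`. In the runs described here, that would be expected to log `sent: true` whether or not the test later times out.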

I don't know why macOS would be more susceptible to missing the message than anything else. If this is not-a-bug behavior, I guess we can add a retry. If this is a bug... ¯\_(ツ)_/¯
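
If a retry turns out to be acceptable, one possible shape is sketched below. Everything in it is an assumption for illustration: `sendWithRetry`, its options, and the `{ ack: true }` reply convention are hypothetical, and the current test has no acknowledgement message at all. Handles are deliberately left out, since it is not clear that a handle already passed to `.send()` can safely be sent again. The timer functions are injectable only so the logic can be exercised without waiting.

```javascript
'use strict';

// Hypothetical sketch: resend a plain message until the receiver
// acknowledges it or the attempts run out. Not an existing Node.js API.
function sendWithRetry(target, message, options, done) {
  const {
    attempts = 3,                // total number of sends before giving up
    intervalMs = 1000,           // wait between sends
    isAck = (reply) => Boolean(reply) && reply.ack === true,
    setTimer = setTimeout,       // injectable for testing
    clearTimer = clearTimeout,
  } = options || {};

  let remaining = attempts;
  let timer = null;
  let settled = false;

  function finish(err) {
    if (settled) return;
    settled = true;
    clearTimer(timer);
    target.removeListener('message', onMessage);
    done(err);
  }

  function onMessage(reply) {
    if (isAck(reply)) finish(null);
  }

  function attempt() {
    if (settled) return;
    if (remaining === 0) {
      finish(new Error(`no acknowledgement after ${attempts} attempts`));
      return;
    }
    remaining--;
    target.send(message);
    // Schedule the next attempt unless an ack already arrived.
    if (!settled) timer = setTimer(attempt, intervalMs);
  }

  target.on('message', onMessage);
  attempt();
}
```

With this shape the receiving side would have to reply with something like `{ ack: true }` from its own `message` listener. A retry of this kind would only paper over the missing message; it would not explain why macOS drops it.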

The .send() from the parent process to the subprocess never seems to fail.

Code from version of test without the fix/workaround:

'use strict';
const common = require('../common');
const assert = require('assert');
const cluster = require('cluster');
const net = require('net');

const payload = 'a'.repeat(800004);

if (cluster.isMaster) {
  const server = net.createServer();

  server.on('connection', common.mustCall((socket) => socket.unref()));

  const worker = cluster.fork();
  worker.on('message', common.mustCall(({ payload: received }, handle) => {
    assert.strictEqual(payload, received);
    assert(handle instanceof net.Socket);
    server.close();
    handle.destroy();
  }));

  server.listen(0, common.mustCall(() => {
    const port = server.address().port;
    const socket = new net.Socket();
    socket.connect(port, (err) => {
      assert.ifError(err);
      worker.send({ payload }, socket);
    });
  }));
} else {
  process.on('message', common.mustCall(({ payload: received }, handle) => {
    assert.strictEqual(payload, received);
    assert(handle instanceof net.Socket);
    process.send({ payload }, handle);

    // Prepare for a clean exit.
    process.channel.unref();
    handle.unref();
  }));
}

/cc @addaleax

@Trott Trott added test Issues and PRs related to the tests. cluster Issues and PRs related to the cluster subsystem. macos Issues and PRs related to the macOS platform / OSX. child_process Issues and PRs related to the child_process subsystem. and removed test Issues and PRs related to the tests. labels Aug 14, 2017
Member Author

Trott commented Aug 14, 2017

/cc @nodejs/platform-macos

@maclover7 maclover7 added the flaky-test Issues and PRs related to the tests with unstable failures on the CI. label Dec 25, 2017
santigimeno added a commit to santigimeno/libuv that referenced this issue Feb 13, 2018
On OSX when sending handles via `sendmsg()` it can return `EMSGSIZE` if
there isn't room in the socket output buffer to store the whole message.
If that's the case, return control to the loop and try again in the next
iteration.

Fixes: nodejs/node#14828
santigimeno added a commit to santigimeno/libuv that referenced this issue Feb 19, 2018
On OSX when sending handles via `sendmsg()` it can return `EMSGSIZE` if
there isn't room in the socket output buffer to store the whole message.
If that's the case, return control to the loop and try again in the next
iteration.

Fixes: nodejs/node#14828