Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

child_process 'spawn' event is emitted too soon #37782

Open
JakobJingleheimer opened this issue Mar 17, 2021 · 17 comments
Open

child_process 'spawn' event is emitted too soon #37782

JakobJingleheimer opened this issue Mar 17, 2021 · 17 comments
Labels
child_process Issues and PRs related to the child_process subsystem. confirmed-bug Issues with confirmed bugs. esm Issues and PRs related to the ECMAScript Modules implementation. module Issues and PRs related to the module subsystem.

Comments

@JakobJingleheimer
Copy link
Member

JakobJingleheimer commented Mar 17, 2021

Version: 15.1.0 thru 15.11.0 (see screenshot below)
Platform: Darwin CALLMT20389 19.6.0 Darwin Kernel Version 19.6.0: Thu Oct 29 22:56:45 PDT 2020; root:xnu-6153.141.2.2~1/RELEASE_X86_64 x86_64

What steps will reproduce the bug?

Run the following simple repro example:

$> node ./parent.mjs
parent.mjs
import { fork } from 'child_process';


const subprocess = fork('./child.mjs');

subprocess.on('close', (...args) => { console.error('[PARENT] received', ...args) });
subprocess.on('disconnect', (...args) => { console.error('[PARENT] received', ...args) });
subprocess.on('error', (...args) => { console.error('[PARENT] received', ...args) });
subprocess.on('exit', (...args) => { console.error('[PARENT] received', ...args) });
subprocess.on('message', (...args) => { console.log('[PARENT] received', ...args) });
subprocess.on('spawn', () => {
  console.log({
    'parent::subprocess.send': subprocess.send({ hello: 'child' })
  });
});
child.mjs
process.on('message', (...args) => { console.log('[CHILD] received', ...args) });

process.send({ hello: 'parent' });

How often does it reproduce? Is there a required condition?

100% of the time (dozens of executions)

What is the expected behavior?

child's process.on('message') should be triggered (parent's message should be received and logged to console).

What do you see instead?

Only parent's subprocess.on('message') is triggered:

$> node ./parent.mjs
{ 'parent::subprocess.send': true }
[PARENT] received { hello: 'parent' }

image

Additional info

I installed node via nvm. I verified the version of node actually running is truly 15.11.0 by console logging process.version in both parent.mjs and child.mjs (both output v15.11.0)

@JakobJingleheimer
Copy link
Member Author

JakobJingleheimer commented Mar 17, 2021

Interestingly, if I wrap the subprocess.send() (still inside on('spawn')) in a setTimeout with at least a 29 millisecond delay, it almost always works (~80–90% success rate).

The setTimeout trick outside on('spawn') works but needs a much larger number, like 3 seconds.

@Linkgoron
Copy link
Member

This looks like a race condition where children .mjs take a bit longer to start, and the spawn is called "too early".

Look at the following change and console logs:

parent.mjs:

import { fork } from 'child_process';

const subprocess = fork('./tst/child.mjs');
subprocess.on('message', (...args) => { console.log('[PARENT] received', ...args) });
subprocess.on('spawn', () => {
  console.log('spawn was called');
  console.log({
    'parent::subprocess.send': subprocess.send({ hello: 'child' })
  });
});

child.mjs:

console.log('pre-message-handler');
process.on('message', (...args) => { console.log('[CHILD] received', ...args) });

process.send({ hello: 'parent' });

It prints:

spawn was called
{ 'parent::subprocess.send': true }
pre-message-handler
[PARENT] received { hello: 'parent' } undefined

The parent sends the message before the child adds the message handler.

@JakobJingleheimer
Copy link
Member Author

Yep, that's what I surmised.

The docs for Event: 'spawn' say:

the 'spawn' event is emitted once the child process has spawned successfully.

It would appear the doc is either incorrect or (more likely) it's a bug.

@JakobJingleheimer
Copy link
Member Author

JakobJingleheimer commented Mar 17, 2021

As a workaround, I set up a handshake wherein the child immediately sends a "READY" message to its parent, resolving a promise in the parent:

parent.mjs

const settleReadyState = {};

const subprocess = fork('./child.mjs');

subprocess.ready = new Promise((resolve, reject) => Object.assign(settleReadyState, { resolve, reject }));

subprocess.once('message', () => {
  settleReadyState.resolve();
  subprocess.on('message', console.log);
});

subprocess.ready.then(() => subprocess.send({ hello: 'child' });
});

child.mjs

process.on('message', console.log);

process.send('READY');
process.send({ hello: 'parent' });

@JakobJingleheimer JakobJingleheimer changed the title subprocess.send() fails silently child_process 'spawn' event is emitted too soon Mar 17, 2021
@benjamingr benjamingr added child_process Issues and PRs related to the child_process subsystem. confirmed-bug Issues with confirmed bugs. esm Issues and PRs related to the ECMAScript Modules implementation. module Issues and PRs related to the module subsystem. labels Mar 18, 2021
@schamberg97
Copy link
Contributor

schamberg97 commented Mar 18, 2021

What is interesting is that I found out that it only reproduces, if the child script's extension is .mjs, even if import { fork } from 'child_process'; is used in the parent script. Probably ESM is a bit slower at Node startup and revealed a race condition elsewhere, only in the child itself.

child.js:

nikolayshamberg@m1001 /tmp % node ./parent.mjs
{ 'parent::subprocess.send': true }
[PARENT] received { hello: 'parent' } undefined
[CHILD] received { hello: 'child' } undefined

child.Mjs:

nikolayshamberg@m1001 /tmp % node ./parent.mjs
{ 'parent::subprocess.send': true }
[PARENT] received { hello: 'parent' } undefined

What is even weirder is that subprocess.connected is always true, unless disconnect was called. From documentation:

The subprocess.connected property indicates whether it is still possible to send and receive messages from a child process. When subprocess.connected is false, it is no longer possible to send or receive messages.

Obviously, true by default... Is just not true in our case ;)

Otherwise, if subprocess.connected was meaningful, we could just check in JS whether the child is already available to interact with. Perhaps subprocess.connected should be set to true only after the check akin to that of @JakobJingleheimer's is done first? I think that would be an elegant solution and we wouldn't need to deal with ESM issue

@JakobJingleheimer
Copy link
Member Author

JakobJingleheimer commented Mar 18, 2021

I wonder if it's because ESM loader is always async?

Update: The handshake workaround above seems to be working pretty well so far, and is backwards compatible (since the spawn event was added in 15.1, but child_process has existed since well before then).

I'd be happy to add it to the docs.

@JakobJingleheimer
Copy link
Member Author

Yes and yes

I would guess subprocess.connected is also a bug—possibly with the same cause?

But the 'spawn' event is still pretty necessary (otherwise, without the handshake workaround, we'd have to poll subprocess.connected til it switches to true 😱).

I think subprocess.send() returning true here is also a bug. If not a bug, I can't think of a use-case for it (as-is, it seems like a lie).

@Linkgoron
Copy link
Member

Linkgoron commented Mar 18, 2021

I disagree, because the connected property and spawn event are not Node cp specific. If there's an issue here, and I'm not 100% sure that there is one (except maybe better spawn documentation), it's (IMO) in the Node child-process that should probably be able to receive the messages that were sent before a listener could be added, and not in the parent.

@JakobJingleheimer
Copy link
Member Author

If the spawn event is not specific to child_process, do the others maybe also suffer this problem?

IMHO, the spawn event etc should have the same behaviour for both commonjs and esm.

@schamberg97
Copy link
Contributor

schamberg97 commented Mar 18, 2021

I think subprocess.send() returning true here is also a bug. If not a bug, I can't think of a use-case for it (as-is, it seems like a lie).

As far as I understand documentation, it only shows, whether comm channel was terminated or not. If channel is terminated, it becomes false. It will always show true, even before IPC is established in the first case.

@schamberg97
Copy link
Contributor

schamberg97 commented Mar 18, 2021

I disagree, because the connected property and spawn event are not Node cp specific. If there's an issue here, and I'm not 100% sure that there is one (except maybe better spawn documentation), it's (IMO) in the Node child-process that should probably be able to receive the messages that were sent before a listener could be added, and not in the parent.

I think it's impossible to listen for events retroactively. However, AFAIK connected property is specific to only child_process and cluster module (since the latter depends on the former). Otherwise, it is likely I misunderstood you somewhere, @Linkgoron

@Linkgoron
Copy link
Member

I disagree, because the connected property and spawn event are not Node cp specific. If there's an issue here, and I'm not 100% sure that there is one (except maybe better spawn documentation), it's (IMO) in the Node child-process that should probably be able to receive the messages that were sent before a listener could be added, and not in the parent.

I think it's impossible to listen for events retroactively. However, AFAIK connected property is specific to only child_process and cluster module (since the latter depends on the former). Otherwise, it is likely I misunderstood you somewhere, @Linkgoron

I made a mistake thinking that the connected property isn't special to a child-process which is a Node process, but reading the documentation more carefully, I'm wrong about that one. So I take back what I've stated previously. The spawn event, however, should emit for any type of child-process, and not just Node child-process.

So, I agree that the connected true in the parent seems like a bug, however I still think that the bug is in the child-process, and not the parent itself. I think that the Node process itself, which was already spawned and I assume really is connected to the parent, should buffer the messages until the "main" starts (so it can feed the message to a synchronously added listener in the main), but I'm not sure if that's possible or feasible.

@schamberg97
Copy link
Contributor

schamberg97 commented Mar 18, 2021

@Linkgoron I have been checking the source code in /lib/internal/child_process.js and I also made a mistake, apparently connected should only be set to true when IPC is setup. So you are right, it is also a bug and not a feature, as I thought previously.

I've also been trying different versions of Node.js, the bugs seem to have been present since major 12. Regarding buffering the messages, I agree with your idea, but in a reverse. The messages are getting sent into nowhere, since IPC appears to be unavailable. However, the messages can be potentially buffered in the parent process ;) But I think it'd be too complicated and not really worth the effort

@schamberg97
Copy link
Contributor

I think I've identified the issue. Recompiling node. Let's hope I return with the good news :D

@Linkgoron
Copy link
Member

Linkgoron commented Mar 18, 2021

@Linkgoron I have been checking the source code in /lib/internal/child_process.js and I also made a mistake, apparently connected should only be set to true when IPC is setup. So you are right, it is also a bug and not a feature, as I thought previously.

I've also been trying different versions of Node.js, the bugs seem to have been present since major 12. Regarding buffering the messages, I agree with your idea, but in a reverse. The messages are getting sent into nowhere, since IPC appears to be unavailable. However, the messages can be potentially buffered in the parent process ;) But I think it'd be too complicated and not really worth the effort

@schamberg97 An ipc channel is automatically created when fork is called. The message is received by the child process, but is emitted "too fast", before the listener is added in child.mjs.

I've added console.log('emitted', process.pid, event, message); into function emit(event, message, handle) { in lib/internal/child_process.js (the function above handleMessage).

This is the parent file:

import { fork } from 'child_process';

console.log('parent pid', process.pid)
const subprocess = fork('./child.mjs');
subprocess.on('spawn', () => {
  console.log('spawn was called');
  console.log({
    'parent::subprocess.send': subprocess.send({ hello: 'child' })
  });
});

This is the child:

console.log('child pid:', process.pid);

this is what was printed:

parent pid 93789
spawn was called
{ 'parent::subprocess.send': true }
emitted 93790 message { hello: 'child' }
child pid: 93790

@ErickWendel
Copy link
Member

This problem was solved on #41221

@QNimbus
Copy link

QNimbus commented May 23, 2022

public spawn = (): Promise<void> => {
    return new Promise((resolve, reject) => {
      const cp = spawn(cmd, [], {
        cwd: process.cwd(),
        env: process.env,
        stdio: ['pipe', 'pipe', process.stderr ],        
      });
  
      // Register listeners once the child has spawned
      cp.on('spawn', function () {
        // Resolve promise
        return resolve();
      });

      cp.on('error', (err: Error) => {
        // Reject promise
        return reject(err);
      });
    });
  };

I seem to experience this issue too - the 'error' event never fires and my child process seems to spawn succesfully (judging by the output on my console) but the 'spawn' event also never fires. Or maybe it fires before the event handler is registered, but either way - the above code does not work for me.

I've managed to 'fix' this issue with a bit of a hack - With the absence of the 'error' event getting thrown, I'm running a while-loop until cp.connected == true after which I return the resolved promise:

while (!this._childProcess.connected); return resolve();

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
child_process Issues and PRs related to the child_process subsystem. confirmed-bug Issues with confirmed bugs. esm Issues and PRs related to the ECMAScript Modules implementation. module Issues and PRs related to the module subsystem.
Projects
None yet
Development

No branches or pull requests

6 participants