Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

domains: fix handling of uncaught exceptions #3654

Conversation

misterdjules
Copy link

Fix node exiting due to an exception being thrown rather than emitting
an 'uncaughtException' event on the process object when:

  1. no error handler is set on the domain within which an error is thrown
  2. an 'uncaughtException' event listener is set on the process

Also fix an issue where the process would not abort in the proper
function call if an error is thrown within a domain with no error
handler and --abort-on-uncaught-exception is used.

Finally, change the behavior of --abort-on-uncaught-exception so that,
if the domain within which the error is thrown has no error handler, but
a domain further up the domains stack has one, the process will not
abort.

Fixes #3607 and #3653.

@misterdjules misterdjules added the domain Issues and PRs related to the domain subsystem. label Nov 4, 2015
@misterdjules
Copy link
Author

This PR is based on top of #3640. We might want to consolidate both PRs into one, but since this one adds a commit with more substantial changes, I thought I would submit it separately.

Landing #3640 and not this PR creates the problem where the following code:

const domain = require('domain');
d.run(function() { throw new Error('boom'); });

would not make node abort when using --abort-on-uncaught-exception.

So depending on how urgent it is to fix #3607 which #3640 fixes, and how long it would take to review the second commit in this PR, we could either land both commits at once into one commit, or land #3640 first and then this PR.

Once the review for this PR targeted to master is done, I will back port it to the v0.10 and v0.12 branches.

/cc @nodejs/tsc @nodejs/lts @nodejs/collaborators

// let the process know we're using domains
const _domain_flag = process._setupDomainUse(_domain);
const _domain_flag = process._setupDomainUse(_domain, stack);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use camel case, right?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, yes, although it seems it's not enforced by the linter. Are you saying that I should fix _domain_flag's name to be camel cased? This is not an addition from this PR, so i'd rather focus on the changes that this PR makes and fix that later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @misterdjules

return false;

Local<Value> domain_error_listeners_v =
domain_event_listeners_o->Get(env->error_string());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style nit: these cases should be indented 4 spaces. the linter not catching this?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@trevnorris It seems the linter is not catching this, but I've updated the PR as suggested. Thanks!

@misterdjules misterdjules force-pushed the fix-abort-on-uncaught-domain-error-handlers branch from 00eea60 to 8c835d2 Compare November 4, 2015 20:34
@misterdjules misterdjules force-pushed the fix-abort-on-uncaught-domain-error-handlers branch from 8c835d2 to c850253 Compare November 4, 2015 20:48
@jasnell
Copy link
Member

jasnell commented Nov 5, 2015

hmm... the change looks good but given that this changes the error handling, this would need to be semver-minor or semver-major wouldn't it? @nodejs/ctc

@misterdjules
Copy link
Author

@jasnell The issue that the first commit in this PR fixes is a regression from node v4.1.2. That is, #3607 cannot be reproduced with node v4.1.2, but can be reproduced since node v4.2.0.

The first commit in this PR fixes that issue, but it also exposed a bug in the way we handle uncaught exceptions within domains when --abort-on-uncaught-exception is used on the command line. The second commit fixes that bug.

The change in behavior is in the following situation:

var domain = require('domain');
var d1 = domain.create();
var d2 = domain.create();

d1.on('error', function() {
});

d1.run(function() {
  d2.run(function() {
    throw new Error('boom');
  }
}

Previously, when using --abort-on-uncaught-exception, this code would abort. Now, it doesn't because we consider that the domain d1 can handle the error.

Note that in the case of an async operation's callback throwing the error:

var domain = require('domain');
var d1 = domain.create();
var d2 = domain.create();

d1.on('error', function() {
});

d1.run(function() {
  d2.run(function() {
    setImmediate(function() {
      throw new Error('boom');
    });
  }
}

running this code with --abort-on-uncaught-exception still aborts because the domains stack at the time the setImmediate callback is called only contains the domain d2, which doesn't have any error handler set.

I agree that this specific behavior change could be a breaking change, but this PR can be easily changed to not change this behavior and still fix the rest of the issues it fixes.

@jasnell
Copy link
Member

jasnell commented Nov 6, 2015

hmm ok. we may need to do split it up that way then... which I know is a pain but we need to be very careful about potentially breaking changes landing in LTS.

@misterdjules
Copy link
Author

@jasnell No worries and thank you for discussing how to come up with a good solution, I don't consider it as being a pain.

Is:

  1. Submitting a PR targeted to v5.x without this behavior change
  2. Cherry-picking the commit in v4.x when it lands in v5.x
  3. Keeping this PR on master so that it can actually land on master with the behavior change (to eventually be in v6.x)

the right way to move forward?

@jasnell
Copy link
Member

jasnell commented Nov 13, 2015

Sorry @misterdjules ... just now saw your response on this one. We might be able to get away with being more lenient on landing the behavior change in v5.x. I would say land in master, cherry pick to v5.x, then cherry pick a subset (without the behavior change) to v4.x. Then, in the next v4.x release, we can document the issue in the known issues and indicate that it's been fixed fully in v5.x and master.

@rvagg @Fishrock123 ... does that seem reasonable?

@misterdjules
Copy link
Author

@jasnell OK, so I will:

  • Submit a new PR targeted to v4.x-staging without the behavior change.
  • Update the PRs on v0.12 and v0.10 to not contain the behavior change.
  • Keep the PR as it is on master (except for squashing the two commits into one).

Then I'll tag that PR on master as "land-on-v5.x" (whatever the proper spelling for that tag is), and we can decide how we want to land it there. I personally don't see any need to land it with the behavior change on v5.x, as long as it lands with the behavior change on master.

The goal being that eventually the behavior of domains used with --abort-on-uncaught-exception is more consistent with when --abort-on-uncaught-exception is not used, but it doesn't need to happen in v5.x.

@jasnell
Copy link
Member

jasnell commented Nov 13, 2015

+1

@misterdjules
Copy link
Author

@jasnell @nodejs/ctc @nodejs/collaborators Squashed into one commit, updated commit message and original description of the PR. Please take a look.

@@ -27,8 +27,13 @@ Object.defineProperty(process, 'domain', {
}
});

// it's possible to enter one domain while already inside
// another one. the stack is each entered domain.
var stack = [];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use const and properly capitalize the comment while you're here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix!

@misterdjules
Copy link
Author

@DonutEspresso Going through the comments I actually found a few typos that I will fix shortly. Thank you, very much appreciated!

@chrisdickinson
Copy link
Contributor

With the current state of this PR, it would be caught by the tests suite: at least one test would fail if EventEmitter.prototype._events was changed in a way that breaks uncaught exceptions handling. I have the feeling this is enough, and that we would probably need to add a lot of similar comments elsewhere if we add one there. But I don't have a strong opinion, so if you'd still like me to add a comment I'll be happy to do so!

That's totally fair! If the tests break then that should be a good enough signal to go on :)

Other than the missing test case, LGTM!

@misterdjules
Copy link
Author

@chrisdickinson Added a second commit that adds a test in test/parallel/test-domain-throw-error-then-throw-from-uncaught-exception-handler.js. Since it needs to check if a node process actually aborts in one case, and this also needs to be done in other tests, I factored that logic in a new nodeProcessAborted function defined in test/common.js.

A third commit also fixes some typos in the tests' comments.

Can you please take a look, and if these changes look good to you, I'll squash all commits in this PR into one commit.

@chrisdickinson
Copy link
Contributor

LGTM!

@misterdjules misterdjules force-pushed the fix-abort-on-uncaught-domain-error-handlers branch from 375d3a8 to 28ef2a1 Compare December 11, 2015 07:17
@misterdjules
Copy link
Author

@chrisdickinson I've actually changed test/parallel/test-domain-throw-error-then-throw-from-uncaught-exception-handler.js a bit to avoid creating core files, which is what's done in any other similar test.

I took that opportunity to check that the error messages on stderr were consistent with what's expected. It doesn't change the nature/implementation of the test too much though, so I'll consider your LGTM still stands and land that once CI tests results are OK.

Thank you once again!

@misterdjules
Copy link
Author

Tagging as semver-minor as per a previous discussion that happened a while ago in this PR.

@misterdjules misterdjules added the semver-minor PRs that contain new features and should be released in the next minor version. label Dec 11, 2015
@misterdjules
Copy link
Author

Previous CI tests results had a failure on Windows in test/parallel/test-domain-throw-error-then-throw-from-uncaught-exception-handler.js. I pushed an additional commit that fixes the issue, and restarted CI tests.

Fix node exiting due to an exception being thrown rather than emitting
an `'uncaughtException'` event on the process object when:
1. no error handler is set on the domain within which an error is thrown
2. an `'uncaughtException'` event listener is set on the process

Also fix an issue where the process would not abort in the proper
function call if an error is thrown within a domain with no error
handler and `--abort-on-uncaught-exception` is used.

Finally, change the behavior of --abort-on-uncaught-exception so that,
if the domain within which the error is thrown has no error handler, but
a domain further up the domains stack has one, the process will not
abort.

Fixes nodejs#3607 and nodejs#3653.
@misterdjules misterdjules force-pushed the fix-abort-on-uncaught-domain-error-handlers branch from 46f36ac to c76959a Compare December 11, 2015 22:31
@misterdjules
Copy link
Author

@nodejs/ctc CI tests all passed, except for existing flaky tests. Will land asap.

misterdjules pushed a commit that referenced this pull request Dec 11, 2015
Fix node exiting due to an exception being thrown rather than emitting
an `'uncaughtException'` event on the process object when:
1. no error handler is set on the domain within which an error is thrown
2. an `'uncaughtException'` event listener is set on the process

Also fix an issue where the process would not abort in the proper
function call if an error is thrown within a domain with no error
handler and `--abort-on-uncaught-exception` is used.

Finally, change the behavior of --abort-on-uncaught-exception so that,
if the domain within which the error is thrown has no error handler, but
a domain further up the domains stack has one, the process will not
abort.

Fixes #3607 and #3653.

PR: #3654
PR-URL: #3654
Reviewed-By: Chris Dickinson <chris@neversaw.us>
@misterdjules
Copy link
Author

Landed in 425a354.

misterdjules pushed a commit that referenced this pull request Dec 15, 2015
Fix node exiting due to an exception being thrown rather than emitting
an `'uncaughtException'` event on the process object when:
1. no error handler is set on the domain within which an error is thrown
2. an `'uncaughtException'` event listener is set on the process

Also fix an issue where the process would not abort in the proper
function call if an error is thrown within a domain with no error
handler and `--abort-on-uncaught-exception` is used.

Finally, change the behavior of --abort-on-uncaught-exception so that,
if the domain within which the error is thrown has no error handler, but
a domain further up the domains stack has one, the process will not
abort.

Fixes #3607 and #3653.

PR: #3654
PR-URL: #3654
Reviewed-By: Chris Dickinson <chris@neversaw.us>
cjihrig added a commit that referenced this pull request Dec 15, 2015
Notable changes:

* domains:
  - Fix handling of uncaught exceptions.
    (Julien Gilli) #3654.
* https:
  - Added support for disabling session caching.
    (Fedor Indutny) #4252.
* repl:
  - Allow third party modules to be imported using
    require(). This corrects a regression from 5.2.0.
    (Ben Noordhuis) #4215.
* deps:
  - Upgrade libuv to 1.8.0.
    (Saúl Ibarra Corretgé) #4276.

PR-URL: #4281
cjihrig added a commit that referenced this pull request Dec 16, 2015
Notable changes:

* buffer:
  - Buffer.prototype.includes() has been added to keep parity
    with TypedArrays. (Alexander Martin) #3567.
* domains:
  - Fix handling of uncaught exceptions.
    (Julien Gilli) #3654.
* https:
  - Added support for disabling session caching.
    (Fedor Indutny) #4252.
* repl:
  - Allow third party modules to be imported using
    require(). This corrects a regression from 5.2.0.
    (Ben Noordhuis) #4215.
* deps:
  - Upgrade libuv to 1.8.0.
    (Saúl Ibarra Corretgé) #4276.

PR-URL: #4281
cjihrig added a commit that referenced this pull request Dec 16, 2015
Notable changes:

* buffer:
  - Buffer.prototype.includes() has been added to keep parity
    with TypedArrays. (Alexander Martin) #3567.
* domains:
  - Fix handling of uncaught exceptions.
    (Julien Gilli) #3654.
* https:
  - Added support for disabling session caching.
    (Fedor Indutny) #4252.
* repl:
  - Allow third party modules to be imported using
    require(). This corrects a regression from 5.2.0.
    (Ben Noordhuis) #4215.
* deps:
  - Upgrade libuv to 1.8.0.
    (Saúl Ibarra Corretgé) #4276.

PR-URL: #4281

Conflicts:
	src/node_version.h
@saper
Copy link

saper commented Jan 6, 2016

Everyone: do you think sass/node-sass#1283 is instance of this bug?

@misterdjules
Copy link
Author

@saper Do you mean:

  1. Was Exit status 3221225477 returned from node scripts/build.js / Windows XP sass/node-sass#1283 fixed by this PR?
  2. Was Exit status 3221225477 returned from node scripts/build.js / Windows XP sass/node-sass#1283 introduced by this PR?

I don't see how it could be 1), but it could be 2), although it seems unlikely. Ideally we would have a callstack at the time the process aborted. But that's a discussion that should happen in sass/node-sass#1283 first.

@saper
Copy link

saper commented Jan 6, 2016

I haven't checked yet which versions of node this patch is in, so I don't know whether it's 1 or 2. We started getting those reports only very recently (and it seems we have great deal of Windows XP users #3604 ). I will try to reproduce that myself on the XP box.

@saper
Copy link

saper commented Jan 6, 2016

Sorry for noise, it looks like it is a regression in our code.

scovetta pushed a commit to scovetta/node that referenced this pull request Apr 2, 2016
Fix node exiting due to an exception being thrown rather than emitting
an `'uncaughtException'` event on the process object when:
1. no error handler is set on the domain within which an error is thrown
2. an `'uncaughtException'` event listener is set on the process

Also fix an issue where the process would not abort in the proper
function call if an error is thrown within a domain with no error
handler and `--abort-on-uncaught-exception` is used.

Finally, change the behavior of --abort-on-uncaught-exception so that,
if the domain within which the error is thrown has no error handler, but
a domain further up the domains stack has one, the process will not
abort.

Fixes nodejs#3607 and nodejs#3653.

PR: nodejs#3654
PR-URL: nodejs#3654
Reviewed-By: Chris Dickinson <chris@neversaw.us>
scovetta pushed a commit to scovetta/node that referenced this pull request Apr 2, 2016
Notable changes:

* buffer:
  - Buffer.prototype.includes() has been added to keep parity
    with TypedArrays. (Alexander Martin) nodejs#3567.
* domains:
  - Fix handling of uncaught exceptions.
    (Julien Gilli) nodejs#3654.
* https:
  - Added support for disabling session caching.
    (Fedor Indutny) nodejs#4252.
* repl:
  - Allow third party modules to be imported using
    require(). This corrects a regression from 5.2.0.
    (Ben Noordhuis) nodejs#4215.
* deps:
  - Upgrade libuv to 1.8.0.
    (Saúl Ibarra Corretgé) nodejs#4276.

PR-URL: nodejs#4281

Conflicts:
	src/node_version.h
@misterdjules misterdjules deleted the fix-abort-on-uncaught-domain-error-handlers branch July 24, 2017 17:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain Issues and PRs related to the domain subsystem. semver-minor PRs that contain new features and should be released in the next minor version.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Domains inhibit uncaughtException error handling
10 participants