Skip to content
This repository has been archived by the owner on Sep 26, 2019. It is now read-only.

RFC: adding built-in error handling support to zones #9

Open
domenic opened this issue Apr 14, 2016 · 58 comments
Open

RFC: adding built-in error handling support to zones #9

domenic opened this issue Apr 14, 2016 · 58 comments

Comments

@domenic
Copy link
Owner

domenic commented Apr 14, 2016

This is a request for comments, especially from the Node community (/cc @bmeck @trevnorris; please bring in others). Our original plan of leaving error handling out of the base zone primitive, and letting it be handled entirely by host environments or future spec extensions, is starting to show some cracks. This is mainly because it impacts the recommendations around how to "wrap" functions for zones. We would like the base zones proposal to have a strong recommendation: "to properly propagate zones while doing async work, do X."

The problem

Currently our recommendation for how to propagate zones is "use Zone.current.wrap", but @mhevery has shown me how that doesn't quite work in certain scenarios once error handling is introduced. In most cases of user-mode queuing, you want wrapped functions to send any errors to the relevant zone. However, in some cases, mainly when building control flow abstractions like promise libraries, you want to handle errors yourself.

So, if we later introduced error handling, some fraction of the uses of Zone.current.wrap would be wrong.

Similar problems apply to zone.run, since one of the important uses of run for library authors is to manually do wrapping (i.e., store the callback and current zone, then do storedZone.run(storedCallback), instead of storing the wrapped callback).

The solution

The way to fix this is to make the two use cases explicit. Instead of wrap and run, we have wrap, wrapGuarded, run, and runGuarded, where the non-guarded variants are used when the library explicitly wants to handle thrown exceptions: like for implementing promises or observables or similar, where you transform thrown exceptions into a different form.

However, it's pretty pointless to introduce these two functions if the zones proposal doesn't actually have error handling built-in. So this takes us down the path of introducing error handling into the base zone primitive, instead of saving it for a future extension.

In other words, keeping things in a future extension is fine, unless that requires you to change your code now. If you require people to change their code now, you might as well give them the benefits (zone error handling) that you're asking them to pay for.

Details of error handling proposal

The zone fork options get a handleError option, which takes a thrown error object:

const z = Zone.current.fork({
  name: 'http request zone',
  handleError(e) {
    sendAnalytics(e);
    return true; // error handled. Or, should we just make you rethrow to indicate not-handled?
  }
});

This is pretty simple. The trick is then figuring out when and how we should route errors to the error handler. To discuss that, we need to talk about the “guarded” functions introduced above.

Details of {wrap,run}{Guarded}

Currently, we have (essentially)

Zone.prototype.run = function (f) { // current
  const oldZone = Zone.current;
  setCurrentZone(this); // privileged API
  try {
    return f();
  } finally {
    setCurrentZone(oldZone);
  }
};

Zone.prototype.wrap = function (f) {
  const thisZone = this;
  return function () {
    return thisZone.run(() => f.apply(this, arguments));
  };
};

We would then introduce:

Zone.prototype.runGuarded = function (f) { // new
  const oldZone = Zone.current;
  setCurrentZone(this); // privileged API
  try {
    return f();
  } catch (e) {
    // actually stored in an internal slot, not a _-prefixed property
    if (!this._handleError || !this._handleError(e)) {
      throw e;
    }
  } finally {
    setCurrentZone(oldZone);
  }
};

Zone.prototype.wrapGuarded = function (f) {
  const thisZone = this;
  return function () {
    return thisZone.runGuarded(() => f.apply(this, arguments));
  };
};

How to think about these

The TL;DR is: most libraries use wrapGuarded. Most apps use run.

In a bit more detail: the user code and the library collaborate to figure out how errors are handled, with the following inputs:

  • Does the library know how to handle errors? If so, it will use wrap (or run), in combination with a try/catch at the call site. Otherwise (most cases, like setTimeout or event listeners), it will use wrapGuarded (or runGuarded), to say: “I don’t know how to handle the error, and if I let it propagate up the call site it would just immediately reach top level, so instead let’s route errors to the right zone.”
  • Does the user code want the zone to handle errors? If so, it will use runGuarded to kick things off. Otherwise, it will use run, and let the error propagate up the call stack as usual. Using runGuarded in this way is fairly unusual; it’s a kind of “async try/catch” and programs are generally better served by letting the error bubble.

Example in action

This example shows how, if you follow the TL;DR above, everything “just works”:

/////// LIBRARY CODE

sql.addListener = function (f) {
  this.storedFunction = Zone.current.wrapGuarded(f);
};

sql.doStuff = function () {
  this.storedFunction();
};

/////// APP CODE

const rootZone = Zone.current;

const zone1 = rootZone.fork({
  handleError: handleError1
});
const zone2 = rootZone.fork({
  handleError: handleError2
});

zone1.run(function a() {
  sql.addListener(function b() {
    rootZone.run(function c() {
      throw new Error("boo");
    });
  });
});

zone2.run(function d() {
  setTimeout(function e() {
    sql.doStuff();
  }, 0);
});

At the time the error is thrown, the call stack is:

  • c (top)
  • rootZone.run
  • b
  • this.storedFunction (wrapper around b to run it in zone1)
  • sql.doStuff
  • `e
  • wrapper around e to run it in zone2 (generated by setTimeout)
  • setTimeout task (bottom)

Error propagation and handling then occurs like so:

  • The error is not caught by rootZone.run (run does not handle errors at all).
  • The error is next caught by this.storedFunction, which is a wrapper around b to run it guarded in zone1. That sends it to errorHandler1.
  • If errorHandler1 doesn't return true, the error is next caught by the wrapper around e to run it in zone2. So it's sent to errorHandler2.
  • If even errorHandler2 doesn't try it, we call the error unhandled, and it goes to window.onerror or "uncaughtException" as usual.

Issues to discuss

Most importantly: does this sound like something that is acceptable to potential zone-using communities? We’d like to have everyone on board, and we spent a lot of time trying to get the details right here (drawing on things like the domain module postmortem from Node.js), so hopefully it’s not that bad.

Less important issues:

  • How should unhandled/handled be done? I did returning true for handled, false for unhandled, but I think another plausible design is that it’s handled by default, and you rethrow the error if you want it to be unhandled. (Assume for the purposes of discussion that we properly specify Error.prototype.stack as being captured at error creation time, not at error throwing time, so rethrowing does not hurt stack information.)
  • Should we consider only adding run + wrapGuarded, since those are the “happy path”? I think it’s weird to have an asymmetric API like that, and there are definitely use cases for the other two. But on the other hand, you can derive runGuarded from wrapGuarded, and derive wrap from run, so it’s not necessary.
@indexzero
Copy link

This is a lot to digest on a Friday afternoon (when I saw this). I'm back in NYC now, perhaps discuss next week IRL? As you mentioned this sounds a lot like domains which basically amounts to:

morpheus-dream

... hoping we can avoid those pitfalls here.

@trevnorris
Copy link

Landed the "official" domain postmortem today (nodejs/node@4a74fc9). It has additional information and examples. I'm in the middle of writing a post for this issue that has Zone specific usage. There are some key differences that avoid some of the pitfalls of domains, and I'm still investigating how other different features would be effected.

@trevnorris
Copy link

trevnorris commented Apr 25, 2016

Been writing up examples to make sure I understand the implications of the API, and have some questions before I can give a full assessment.

How are implementers supposed to know whether to run the callback in run() or runGuarded()? Currently propagating the correct Zone and then calling the associated callback within the zone is the responsibility of the implementer. But there's no indication I can see that will tell the user how the callback is supposed to be run. For example:

function DoStuff() {
  this._queue = [];
}

DoStuff.prototype.addone = function addone(fn) {
  this._queue.push({ zone: Zone.current, callback: fn });
};

DoStuff.prototype.runall = function runall() {
  while (this._queue.length > 0) {
    const item = this._queue.shift();
    const zone = item.zone;
    const callback = item.callback;
    zone.run(callback);  // Or should this be runGuarded?
  }
};

const ds = new DoStuff();

Zone.current.fork({ name: 'z1', handleError: (e) => console.error(e.message) })
    .run(() => ds.addone(() => console.log('d1')));

Zone.current.fork({ name: 'z2' })
    .runGuarded(() => ds.addone(() => console.log('d1')));

Here z1 has a handleError, but only calls using run(). So how is the implementer to know which to use? I can't see this defined anywhere.

@trevnorris
Copy link

It should also be mentioned that error routing is one reason why domains failed. Users are allowed to create a private object that prevents bubbling of other's exceptions.

///// LIBRARY /////

const libZone = new Zone({
    name: 'libZone',
    handleError(e) {
      // Log error and continue happily.
      return true;
    },
});

function ConnectDB() {
  // Make connection to database.
}

ConnectDB.prototype.query = function query(options, callback) {
  const obj = { options, callback, zone: Zone.current };
  libZone.fork().run(() => makeRequest({ obj, completed }));
};

function completed(obj, data) {
  const result = parseData(data);  // Uh oh. Bad code here that throws.
  obj.zone.run(() => obj.callback(result));
}

///// USER CODE /////

const zone = Zone.current.fork({ /* handleError 'n stuff */ });
const db = new ConnectDB();

zone.run(() => db.query({ /* options */ }, callback));

The difference between this and adding try/catch is that calling the user's callback from the handleError() is probably impossible (or even wanted?). Meaning the callback will never be called to let the user know their query completed. This leads to frustrating debugging scenarios.

You may say that the implementer is Doing It Wrong, and I'd agree with you. But the fact remains that it will be used like this, as were domains, and may possibly leave a bad taste in the developer's mouth preventing them from using it in the future, as it happened with domains.

In short, there are a lot of ways error routing can be abused by the user, and it's impossible to create an API that prevents it. Theoretically error routing is a great idea, but domains already went through that gauntlet and failed.

@bmeck
Copy link

bmeck commented Apr 27, 2016

@domenic just to be clear guarded functions are not applied to CallInZone? So promise reactions won't add a try/catch?

@domenic
Copy link
Owner Author

domenic commented May 11, 2016

@trevnorris @bmeck sorry for the massive delay in response. I will be a lot more responsive going forward; many apologies.

How are implementers supposed to know whether to run the callback in run() or runGuarded()?

Basically: are you going to handle errors thrown by your callback with a try/catch, like a promise or observable library? Use run(). Otherwise, if your errors are just going to go to uncaughtException anyway, you should use runGuarded(), to let the appropriate zone get a crack at them.

But there's no indication I can see that will tell the user how the callback is supposed to be run. For example:

Whether you should use zone.run() or zone.runGuarded() here depends on how you expect ds.runall() to be used, but I would expect it to be runGuarded().

The situation where run() is appropriate is if you think callers of runall expect:

  • Any errors in any of the queued callbacks should stop all future callbacks in that runall from executing
  • And such errors will be propagated, so you can do try { ds.runall(); } catch (e) { ... } to catch that first-thrown-error.

Given that "DoStuff" is not a very real-world class, I can't be sure that your intention isn't the above, but I'd doubt it. More likely, runGuarded() is appropriate, as callers would not be expected to try/catch the runall() call, and callers probably prefer if execution continued without stopping on first throwing callback.

Here z1 has a handleError, but only calls using run(). So how is the implementer to know which to use? I can't see this defined anywhere.

Sorry if this wasn't clear. I tried to make it so, but it probably got lost in the wall of text.

You may say that the implementer is Doing It Wrong, and I'd agree with you. But the fact remains that it will be used like this, as were domains, and may possibly leave a bad taste in the developer's mouth preventing them from using it in the future, as it happened with domains.

In short, there are a lot of ways error routing can be abused by the user, and it's impossible to create an API that prevents it. Theoretically error routing is a great idea, but domains already went through that gauntlet and failed.

Yes, I'm aware of this concern, and that's the primary reason I wanted to ask Node people to review it. My position is that doing error routing correctly is indeed tricky and, in subtle cases, can definitely be done wrong. But I hope that simple advice will help, and coupled with fixing other issues domains have, it would still be worthwhile.

The alternative is to abandon error routing in zones entirely, which seems like a missed opportunity. If we want to add zones for all the other reasons (zone-local storage, async stack trace marking, async task tracking) then ideally we should be able to also solve the "something better than uncaughtException" problem at the same time, given all the machinery already in place. The problem is the extra cost of 4 methods (run/runGuarded + wrap/wrapGuarded) instead of 2, which I think is manageable, but wanted to see what others thought.


@bmeck

domenic just to be clear guarded functions are not applied to CallInZone? So promise reactions won't add a try/catch?

I don't really understand this question. What does "applied to CallInZone" mean? What does it promise reactions adding a try/catch mean?

@bmeck
Copy link

bmeck commented May 12, 2016

@domenic in your example function e is guarded by zone2 even though it is invoked via setTimeout, however I don't see how:

Zone.prototype.runGuarded = function (f) { // new
  const oldZone = Zone.current;
  setCurrentZone(this); // privileged API
  try {
    return f();
  } catch (e) {
    // actually stored in an internal slot, not a _-prefixed property
    if (!this._handleError || !this._handleError(e)) {
      throw e;
    }
  } finally {
    setCurrentZone(oldZone);
  }
};

Zone.prototype.wrapGuarded = function (f) {
  const thisZone = this;
  return function () {
    return thisZone.runGuarded(() => f.apply(this, arguments));
  };
};

Is enforcing the guard. Does this mean, all async queued tasks would fire CallInZone with a guarded vs unguarded flag like I mocked in https://gist.github.com/bmeck/05957f8721e9f41039fbb0f321fe943a ?

Essentially the question is, does the following code get a guard?

const zone1 = rootZone.fork({
  handleError: handleError1
});
zone1.run(function a() {
  Promise.resolve(0).then(function b() {
    // am I guarded by rootZone?
    throw Error();
  })
});

And does the following continue to get a guard?

const zone1 = rootZone.fork({
  handleError: handleError1
});
zone1.runGuarded(function a() {
  Promise.resolve(0).then(function b() {
    // am I guarded by zone1?
    throw Error();
  });
});

@domenic
Copy link
Owner Author

domenic commented May 12, 2016

in your example function e is guarded by zone2 even though it is invoked via setTimeout, however I don't see how:

setTimeout uses the equivalent of wrapGuarded (or, stores the current zone at the time it's called, then uses runGuarded in that zone).

Essentially the question is, does the following code get a guard?

setTimeout and Promise.prototype.then are different, so this is a different question than your above one.

Promise.prototype.then knows that when the passed function is called, the promise machinery will handle the errors. So (essentially) it uses wrap, not wrapGuarded.

This the key difference I'm trying to get across in the post. If your library wants to catch errors and process them specially, like a promise implementation does, then use wrap/run. If you don't, then use wrapGuarded/runGuarded.

@mhevery
Copy link

mhevery commented May 12, 2016

The question of run/wrap{Guarded} is simply: Can the code do anything useful with the exception?

In the case of the library, only Promise and Observable have special semantics for exceptions and so they are the only libraries I can think of which should use wrapUnGuarded all other use cases should use wrapGuarded

In the case of application code entering zone, the application code should handle/fail fast on exception, and so there is almost never a reason to use runGuarded. All cases I can think of should be runUnGuarded.

So the rule of thumb is wrapGuarded / runGuarded for libraries, and run for application code entering zone. (Only exception to it is Promise and Observable)

@bmeck
Copy link

bmeck commented May 12, 2016

setTimeout uses the equivalent of wrapGuarded (or, stores the current zone at the time it's called, then uses runGuarded in that zone).

Should be using wrap, it is lib code to my understanding.

This the key difference I'm trying to get across in the post. If your library wants to catch errors and process them specially, like a promise implementation does, then use wrap/run. If you don't, then use wrapGuarded/runGuarded.

This is very confusing, use wrapGuarded if you want errors handled, but not specially?

The fact that setTimeout magically gets guarded can lead to more swallowed errors that are unsafe since the code being swallowed may have turned into corrupted internal state. This corrupted state generally causes a throw but could then be silenced accidentally. I think a flag is needed somehow to delineate what a behavior vs system error is for this not to have the flaws of domains.

I see no reason this cannot be implemented later? TC39 has this marked as ready to move to stage 1, this feature of error handling is not ready to be stage 1 as it basically is domains including the flaws of them (infectious behavior is a big one).

@domenic
Copy link
Owner Author

domenic commented May 12, 2016

Should be using wrap, it is lib code to my understanding.

No; I'm not sure where you got this "library code should be using wrap" idea from. wrapGuarded is often more appropriate.

This is very confusing, use wrapGuarded if you want errors handled, but not specially?

No. Use wrapGuarded if you don't want to handle errors with try/catch (the guard will handle them).

The fact that setTimeout magically gets guarded can lead to more swallowed errors that are unsafe since the code being swallowed may have turned into corrupted internal state.

This is not true. The behavior of setTimeout is exactly the same here as it is in the existing specs (where setTimeout calls to report the exception). That is all that wrapGuarded does: it says, invoke the thing, but if it throws an error, hand it to the current zone for exception handling.

The root zone's exception handling is just HTML's "report the exception" (or Node's uncaughtException), so unless you introduce intermediate zones with their own error handling, that will always be triggered. There is no swallowing.

I see no reason this cannot be implemented later?

Hmm, I thought I tried to make that extremely clear in the OP. It needs to be implemented now since it impacts how people will use zones:

So, if we later introduced error handling, some fraction of the uses of Zone.current.wrap would be wrong.

TC39 has this marked as ready to move to stage 1, this feature of error handling is not ready to be stage 1 as it basically is domains including the flaws of them (infectious behavior is a big one).

Well, this RFC is trying to get some buy-in on the error handling feature being ready. I think you must still be misunderstanding it if you think it is basically domains, and that it has infectious behavior. Please continue asking questions until I can make it clearer.

@trevnorris
Copy link

I think you must still be misunderstanding it if you think it is basically domains, and that it has infectious behavior.

TBH this has worse implications in some cases than domains, if I'm understanding the spec correctly. With domains if I require within a domain it's still possible to escape. e.g.

var my_module;
domain.create().run(() => my_module = require('./my_module'));

// my_module.js
if (process.domain)  // wtf?
  process.domain = null;

Whereas I can't see a way to get around this with Zones. Nor would I be able to see that I'm not using the root Zone (meaning there'd be no .parent). e.g.

// Don't want modules knowing this isn't the root zone.
const z = new Zone({
  name: '(root zone)',
  handleError(e) {
    // Nothing should die
    return true;
  }
});

var my_module;
z.runGuarded(() => my_module = require('./my_module'));

Now, the real problem here is that a module is only ever evaluated once. So if my_module was required somewhere else after this there'd be unexpected side effects.

@domenic
Copy link
Owner Author

domenic commented May 12, 2016

Hmm, maybe I am not understanding. You can just check if Zone.current.parent === null or not.

It's true that if your caller opts you in to running in a zone, you can't escape. But this is similar to any of the other ways in which your caller can mess up your environment (messing with globals or require.cache; try/catching around the code and doing weird things with the exceptions; etc.). I don't think zones (or domains) were designed to be used in a combative environment.

@bmeck
Copy link

bmeck commented May 12, 2016

In the example given we have:

    rootZone.run(function c() {
      throw new Error("boo");
    });
  1. rootZone has no parent
  2. the rethrow is still caught by zone1

The more I look at this the more I think the problem is the opt-out requirement for unsafe errors. By default errors tend towards unsafe. The example here shows a potentially unsafe error being swallowed. In order to opt-out (saying your code is not 100% safe), you should use zone.run, but here we see an example that your code is treated as safe still.

I personally would think errors default to unsafe to catch, and you would need to be very specific about what errors are able to be handled. Though lets set that aside for a bit.

An error is being caught, implicitly that was not marked as being safe to catch; this is mentioned in the domains post mortem. That is an infection. JS has a problem with try{}catch(e){} being a catch all, where you must opt-in to rethrow an error if it is not handled. This problem was highlighted in an async environment like the ones Zone/Domains aim to work with since shared mutable state is commonplace in JS. Having the implicit catch be a catch all rather than the throw location denoting safety to catch makes this worse.

@domenic
Copy link
Owner Author

domenic commented May 12, 2016

the rethrow is still caught by zone1

This is just not true...

@bmeck
Copy link

bmeck commented May 12, 2016

Error propagation and handling then occurs like so:

The error is caught by rootZone.run. rootZone has no error handler, so rethrow the error.
The error is next caught by this.storedFunction, which is a wrapper around b to run it guarded in zone1. > That sends it to errorHandler1.

@domenic
Copy link
Owner Author

domenic commented May 12, 2016

Yeah, looks like that is a typo. It should be "The error is not caught by rootZone.run". I will correct the OP.

@domenic
Copy link
Owner Author

domenic commented May 12, 2016

Looking at the rest of your post, there still seems to be a misconception. You think the try/catch is infectious and catches things unsafely. As I tried to explain, all it does is send things to the zone's error handler---which in the default case is just uncaughtException.

Zones allow overriding that default behavior inside a zone, so that instead of one global mutable state (the uncaughtException handler chain) you can scope it to smaller parts of your program which know how to deal with the errors in a more natural way. Of course, someone could write a pathological handler that ignores all errors---but they could do that for process.on("uncaughtException" too. The difference is with zones the damage they do would be limited to code that they themselves run, instead of global to the whole program.

@bmeck
Copy link

bmeck commented May 12, 2016

I am not so concerned with pathological ignoring of errors, but accidental errors that cause invalid state to be treated as safe.

@bmeck
Copy link

bmeck commented May 12, 2016

@domenic
Copy link
Owner Author

domenic commented May 12, 2016

Sure. As I said, all zones do is give people a solution that is less footgun prone since it's scoped to their own code, instead of globally shared by the entire program. This seems like a good thing?

@bmeck
Copy link

bmeck commented May 12, 2016

I don't see how it is confined to their own code. Right above we see zone1 start to handle function c, so how is that isolated?

Anything on the stack can catch it still. No concurrency/memory isolation. In your example, programmers need to carefully disseminate in handleError1 and handleError2 if the error is expected; they do not know if function c is in a good state / what it has done. They also do not know if that state is going to affect other concurrent things that are accessing variables that function c may have left in a bad state.

In the example, function c is isolated as best it can be with rootZone.run but zone1 may have no care if it is in a right state. That knowledge is left up to the programmer who may not even own the code for function c knowing that function c is still in a good state, even though it just threw an Error.

In synchronous situations we have the same problem using try{}catch(e){}. Domains and Zones try to introduce an implicit behavior that is not present in those cases by following the async stacks. Once you are in a zone, you are basically there for life. Just like with Promises, safely bailing from unexpected errors means: wrap a try{}catch(e){} around things and abort if the error is a completely unexpected. Rethrow if expected.

@trevnorris :

In order to escape to let unexpected errors escape, you actually need to do:

const z = new Zone();
// this will be like z.runGuarded(c)
z.run(()=>setImmediate(c));

This only works because Zones are not doing stack propagation like domains. I don't really see a way to get to "unguarded" guaranteed except this. If this is done however, you will need to wrap c in a try catch if there are handle-able errors that you wish to not abort.

@trevnorris
Copy link

trevnorris commented May 12, 2016

@domenic Could you clarify one thing for me:

const zone1 = Zone.current.fork();
zone1.run(() => net.createServer((c) => {
  const zone2 = Zone.current.fork();
  zone2.run(() => {
    c.on('data', () => {
      // Zone.current == zone1 or zone2?
    });
  });
}).listen(8080));

@domenic
Copy link
Owner Author

domenic commented May 12, 2016

@bmeck will try to respond to the rest of your comment, but again you seem to misunderstand the proposal.

In order to escape to let unexpected errors escape, you actually need to do:

This is just false. z.run(c) suffices. Errors escape. Look at the implementation of z.run. It does not have any catch blocks.

@trevnorris

Could you clarify one thing for me that's unclear:

In that example Zone.current is zone2. (Unlike domains, there is no special idea of event emitters being bound to a zone. .on stores the passed callback alongside the zone that is current when .on is called, which is zone2.)

@bmeck
Copy link

bmeck commented May 12, 2016

@domenic I can see that rootZone.run(function c() ... does not introduce a try catch. I, however, do see that zone1 catches function c's throw. That is the problem, that is the implicit behavior mentioned in the post mortem. Please reread the post mortem. Focus on Implicit Behavior and Resource Cleanup on Exception.

An alternative is not to re-use the throw statement. Me and @trevnorris were talking about an opt-in "this is safe to handle" mechanism. I am not entirely sure what this would look like, Zone.current.throw however does look sane to avoid the implicit catching of unexpected errors if it were to only propagate through that zone's .parent chain.

@bmeck
Copy link

bmeck commented May 13, 2016

@domenic this may be clearer https://github.com/bmeck/zomains/blob/master/example/accidental-guard.js , backing implementation should match my understanding of Zone's behavior.

@mhevery
Copy link

mhevery commented May 13, 2016

@bmeck let me see if I can shine some light.

'use strict';
require('../');
const UnexpectedError = require('./COMMON').UnexpectedError;
let tocall = 1;
process.on('exit', ()=>{
  if (tocall !== 0) {
    process.reallyExit(1);
  }
});
const child = Zone.current.fork({
  handleError() {
    tocall--;
  }
});
child.run(()=>{
  setTimeout(()=>{
    throw Error('Now handled?!');
  })
});

I think the confusion is about who handles the Error. So if setTimeout throws an error synchronously then it would be not handled by child because child got entered by run. setTimeout could throw synchronously if for example the arguments to it would be incorrect.

(SIDENOTE, There is some other zone above the child. In this example we can assume that it is the root zone, so the synchronous error would propagate to root zone which would forward to uncaughtException. Why? because each VM turn is wrapped in runGuarded)

But in the above example the callback of setTimeout is invoked asynchronously. This means that internally when setTimeout stored the callback it used wrapGuarded, which means that when the callback is invoked it will be invoked as child.runGuarded(() => throw Error(...)). This means that the error will be presented to the child handleError which will tocall-- (and swallow the error. because it did not rethrow)

I believe that is the correct behavior?

@bmeck
Copy link

bmeck commented May 13, 2016

@mhevery it actually does not swallow since it doesn't return true. The code does run in node v4+, and you can see the propagation mechanism in https://github.com/bmeck/zomains/blob/master/example/stop-propagation.js

@domenic
Copy link
Owner Author

domenic commented May 13, 2016

(and swallow the error. because it did not rethrow)

This is still under debate; the OP's semantics are that you have to return true (or maybe a truthy value?) to mark the error "handled." I had a question under "Issues to discuss" to ask whether we should use true, or rethrow.

Otherwise, @mhevery's explanation is on point. I'm not sure what the linked example was meant to be illustrating, though. I'm also not sure what function c @bmeck's previous post was referring to.

I certainly have read the postmortem, multiple times. I am not sure what your contention is. Is it that neither zones nor domains are perfect for resource cleanup? Yes, this is pretty clear; you can mess up with both (and with promises). A scenario like in the postmortem is complex, and cleaning up after it is complex. The "implicit behavior" section is also fairly inapplicable. There is no global mutable domain.enter() that overrides for all future code; you have to explicitly run code in a zone with zone.run or zone.runGuarded, and as explained upthread, the impact is limited.

@bmeck
Copy link

bmeck commented May 14, 2016

@domenic the fact that Zones are re-enabling guards whenever async thresholds are crossed, even when the original stack was not placing guards is an issue. Another issue is that, when these guards are in place there is not a concept of an unexpected error vs expected error. If the guards could be limited in some way to handle these I would not really have complaints.

I'm going to change from examples to tables perhaps as another means of discussion:

Zones Domains
Notification Sync Sync
Guarding Automatic Automatic
Escape Partial [1] Full [2]
Default w/o explicit Guard Throw Throw
Default w/ explicit Guard Throw Swallow
Propagation Interruptible All Handlers

[1] - cannot escape parent Scope's guard synchronously
[2] - dispose, manual EE removal, cannot manually remove callbacks

It is the automatic guarding without a full escape mechanism that seriously hoses my concept of this being a good idea. There is no means to skip the zones and propagate directly to the environment synchronously. There is a severe distrust of things that take this approach after the advice and documentation of Domains having little to no effect on people accidentally catching errors. If there was a more robust error handling mechanism / routing I would be less worried. Right now it is better than domains, but still suffers from one of the biggest troubles caused by them.

On a personal note, I think guarding should not be automatic and people should need to guard at task they queue due to concurrency and shared access concerns.

@domenic
Copy link
Owner Author

domenic commented May 14, 2016

It may help to recall that this is a language feature and needs to be able to serve use cases in many environments---both Node and the browser. Being maximally inclusive is the best path for foundational language features in this way.

@bmeck
Copy link

bmeck commented May 14, 2016

I think error handling needs a closer look than merely enabling all use cases. However, I don't think discussion of domains or other existing attempts and the experience of problems they introduce is going to be relevant here if they are seen as paternalistic arguments.

If the goal is to change how you can contain errors, and encourage behavior similar to uncaughtException in all possible situations, I agree that this would enable such behavior. However, my experiences would lead me to avoid this attempt when defaulting the behavior as described and not having an escape hatch.

If such experience is irrelevant, I don't really have anything to say / am not sure what I was being asked to come and look at?

@mhevery
Copy link

mhevery commented May 16, 2016

@bmeck I think your experience with Domains is valuable, and we do want your input. It feels to me that we don't understand each other proposals and shortcomings. Is there any way we could have a face to face / Video meeting?

@trevnorris
Copy link

As I said, this is no different from today's uncaughtException; it simply allows tighter scoping so that you don't catch and prevent propagation of all exceptions.

It should be clarified for posterity that Zones and uncaughtException are different in important ways. First is our ability to be notified when an uncaughtException handler has been set:

process.on('newListener', (e) => print(e, (new Error()).stack.substr(5)));

process.on('uncaughtException', () => {});

While with Zones there is no mechanism that allows observation into Zone creation. The second important distinction is that all uncaughtException handlers can be escaped at any time with a simple

process.removeAllListeners('uncaughtException');

The third difference is that uncaughtException is a global toggle. Unlike Zones which propagate via API calls and asynchronous call stacks. So as far as debugging an application in development, the observability (via public API) of what's happening with uncaughtException make it easier to reason with.

Just like anything else a caller can do to control your execution (modify globals, try/catch, etc.), the way to avoid such things if your code doesn't want to run under them is...

To quickly address the later part of this statement I'd like to point out this demonstrates avoidance of the reality of node's ecosystem. If I don't want to use a Promise-based API, that's easy enough, but Zones propagate and attach themselves to everything automatically. First I'll take the former remark about try/catch as an example:

// mine.js
const Theirs = require('./theirs');
const t = new Theirs();
t.on('complete', () => {});
t.getData();

// theirs.js
const EE = require('events');
const req_caller = require('req_caller');
class Theirs extends EE {
  constructor() { super() }
  getData() {
    try {
      req_caller(info, (data) => this.emit('complete', data));
    } catch (e) {
      // oh no!!! something went wrong in req_caller()
    }
  }
};
module.exports = Theirs;

Here we can see that the try/catch wrapping req_caller() in case a bad argument was passed, or some such. Allowing the execution of the complete callback to happen outside its confines. Unfortunately a Zone error handler would propagate through the req_caller() call and potentially swallow any errors thrown by the application. In the case of domains and uncaughtException it's possible to synchronously escape any error handling. I don't see how that'd be possible with Zones.

I understand what you're saying about each Zone error handler needing to explicitly return true, thus disregarding the potentially problematic behavior of indefinitely propagating Zones through async calls. I'd like to convey that history has shown developers aren't rigorous enough to properly clean up and needing to place a return true; at the bottom of the function isn't enough to make them start.

Combine that with Zone's design to scope functionality (generally a good thing) removes the ability to observe how my own errors are being handled. Going back to the comment about just not using code that uses Zones, I assume you are proposing this in order for as much code as possible to use them, and in the not so uncommon case of having 50+ dependencies/sub-dependencies it will complicate debugging when one of those thinks it's clever to do:

// a.js
const z = new Zone();
let modb;
// First module to require('b'), now a owns it.
z.run(() => modb = require('b'));

When the application uses b it probably expects there's nothing in between it and the module it's using. Unfortunately that may not be the case, leading to confusion:

// mycode.js
const modb = require('b');
const z = Zone.current.fork();
z.run(() => {
  getResources(info, (data) => {
    // I'd expect if modb has an exception that's unhandled by
    // its own handler then it'd immediately propagate to me.
    modb.processData(data, oncomplete);
  });
});

// Here a.js was able to intercept my exception b/c it is in modb's
// Zone propagation chain.
function oncomplete(result) {
  if (result === -1)  // Something bad happened
    throw new Error('ah crap!');
}

The above may only be a hypothetical now, but please believe me that this will happen in the wild, and as module dependencies get deeper and as applications get more complex it may be a while before this would even be noticed, and even harder to find.

An aside, the error handling mechanics proposed, down to returning true to indicate the exception was handled, was implemented in my original AsyncListener API during node v0.11 over two years ago (note: was removed in Dec '14, but added in Sep '13). When I implemented this it seemed like a great idea, but in the end the entire thing was removed because it was found to add too much complexity and made reasoning about error handling difficult (ref: nodejs/node@0d60ab3e). I have been down this road, 3 months of work to get an API very similar to this into node, so believe me when I tell you that a large part of my opposition is from having seen this fail in practice.

@mhevery
Copy link

mhevery commented May 16, 2016

@trevnorris i don't think we are on the same page. I left some comments in the code marked with MISKO

// mycode.js
const modb = require('b');
const z = Zone.current.fork();
z.run(() => {
  getResources(info, (data) => {
    // I'd expect if modb has an exception that's unhandled by
    // its own handler then it'd immediately propagate to me.
    // MISKO: Why would you think that zones would prevent that?
    modb.processData(data, oncomplete);
  });
});

// Here a.js was able to intercept my exception b/c it is in modb's
// Zone propagation chain.
function oncomplete(result) {
  if (result === -1)  // Something bad happened
    throw new Error('ah crap!');
}

Zone's proposal deals with what happens with errors which are unhandled (which get passed to uncaughtException). Zone's do not influence errors which are handled in the library.

This is why I think we are not on the same page and we should step back and better understand what each side is proposing.

Just an FYI, we have been using Zone's in Angular in the browser, and in Dart both in browser and server, where we have had very good experience with them. I understand that dart and browser is not node, but given that we both have had such different experiences, tells me that there is something we are overlooking about the other. I would like to get to that nugget which resulted in such different experiences.

@mhevery
Copy link

mhevery commented May 16, 2016

And as as side note.

in your example:

// a.js
const z = new Zone();
let modb;
// First module to require('b'), now a owns it.
z.run(() => modb = require('b'));

Because your zone does not specify an error handler, then such a zone would be indistinguishable from no zone. My point is that you really have to go out of your way to mess up error handling. It's not something which happens because one forgets a wrong return value.

@trevnorris
Copy link

Because your zone does not specify an error handler

Sorry, was being lazy. I'm aware. Will be more explicit in the future.

// MISKO: Why would you think that zones would prevent that?

Brain puttered out near the end of last post. Let me attempt this again. Each step is numbered via (N):

// main.js
// (1) Require 'modb'.
const modb = require('modb');
// (5) Also require sutil for my own needs.
const sutil = require('sutil');
// (6) Fork off a new Zone with error handler.
const z1 = Zone.current.fork({ handleError(e) { } });

// (7) Run command in scope of z2.
z1.run(() => {
  // (8) getResources is defined somewhere in main.js. Is an async
  //     call to retrieve something. If something goes wrong I want it
  //     to be caught by "z1".
  getResources(info, (data) => {
    // (9) Make async call to processData().
    sutil.processData(data, oncomplete);
  });
});

function oncomplete(result) {
  // (11) This callback has been called from "z4" in "sutil.js" below.
  //      Because "z2", created in "modb.js", is in the .parent chain,
  //      the exception will propagate to, and be handled, there before
  //      it hits "z1".
  if (result === -1) throw new Error('ah crap!');
}
// modb.js
// (2) Create, don't fork, a new Zone with error handler.
const z2 = new Zone({ errorHandler() { return true; }, name: '(root zone)' });
let sutil;
// (3) Require sutil in scope of z2, so if sutil uses Zone.current.fork() the
//     .parent chain propagates exceptions to modb.js.
z2.run(() => sutil = require('sutil'));
// sutil.js
// (4) Fork from current zone for error cleanup/logging.
const z3 = Zone.current.fork({ errorHandler(e) { }});

module.exports.processData = function processData(data, cb) {
  // (10) Fork from z3 to funnel error logging, and use this for
  //      data propagation.
  const z4 = z3.fork();
  z4.data = data;
  z4.run(() => yetAnotherAsyncCall((r) => cb(r)));
};

Hopefully this is explicit enough to demonstrate the scenario I described in the previous post. In this case a seemingly unrelated module was able to inject itself info the .parent path of exception bubbling.

My point is that you really have to go out of your way to mess up error handling.

I understand this argument, and to some point it's valid. But in an ecosystem where applications have 50+ dependencies it only takes one to really screw everyone up. Unlike uncaughtException and domains, Zone's scoping will make it difficult to see who's swallowing the error. I can of course inspect Zone.current in all of my callbacks, but I can't force what is Zone.current (can do with domains) nor is there an API call to see where that one is being invoked (can do with uncaughtException).

Also I think you give module authors too much credit in how vigorous they'll be cleaning up resources. If they're using domains today then they'll be using errorHandler() tomorrow, and adding return true at the bottom isn't going to phase them. Hence one reason why we have Warning: Don't Ignore Errors! in big bold letters at the top of the domains doc, and XXX WARNING! BAD IDEA! at the top of the first example.

I would like to get to that nugget which resulted in such different experiences.

Thanks. One failure of this type of error handling is the user's ability to properly clean up after themselves. I don't say this hypothetically. This comes from companies complaining about processes with run-away resource usage because they weren't able to handle error properly.

Here's a simple resource tree scenario for incoming requests:

  • New http request triggers (1) opening a new file; (2) begin a series of async requests to backend; (3) transform stream to put together the response message.
  • (2) the async requests are not all done in parallel, since subsequent requests require the response from the previous.
  • (3) some of the data returned from responses is placed in the transform stream to generate the client response.
  • (1) as data is written out it's also forked to disk for caching.

Now, to handle this properly any failure must propagate to all other resources to cleanly shutdown. e.g. if writing to the file fails then we need to tell requests to stop, the transform stream needs to be cleaned up and the connection then needs to be sent the error message and closed.

Doing this sort of cleanup reliably, so the application doesn't hold on to gigabytes of Buffers or breaks from EMFILE, etc., is difficult to do. One big difference between node's existing async error handling methods and Zones is that in node every error needs to be explicitly handled. Whereas Zones allow a blanket to cover all of this. Hypothetically this is great, but I've seen too many side effects that leave the application in a bad state. One of the most dramatic I've heard is from @bmeck where the wrong data was sent to the wrong connection. This wouldn't even be that hard to replicate.

Anyway, hope we can talk more about this. Thanks for responding.

@mhevery
Copy link

mhevery commented May 17, 2016

// (1) Require 'modb'.

You bring up an excellent point here. We should make sure that require always executes code in root zone. This way it does not matter who, or which order you got loaded.

// (2) Create, don't fork, a new Zone with error handler.

I see you are trying to make a second root zone. Not sure that this should be allowed since there could only be one root zone. But I don't think being a child of root would change your example.

// (3) Require sutil in scope of z2, so if sutil uses Zone.current.fork() the
// .parent chain propagates exceptions to modb.js.

See point 1. require should load in root zone, so the parent zone would be irrelevant.

// (8) getResources is defined somewhere in main.js. Is an async
// call to retrieve something. If something goes wrong I want it
// to be caught by "z1".

I think this is where we diverge. In order for this to work few things have to be true

  1. Zone forking should be done as a local variable in the stack, not as a top level file variable. Only current stack frame has the sufficient contextual information where forking makes sense. Fork should be once per request, not once per file. Yes if you fork once at root, weird things can happen.
  2. In my mind setting up a zone with errorHandler which swallows is no different from try{}catch(){} which swallows. One just happens to by async the other sync.

I have a hunch that you are looking at Zones as the thing you set up for a file and all exceptions will be funneled to that Zone, where as I think of it in the context of stack frames and who called who. Where a particular function/method is declared should not have any relevance to what zone it will be in, just like a method can't assume what stack frames will be above it.

Method/function does not live in a zone, it is executed in a zone

Resource release and proper cleanup, can't be done with Zones in the current form. In order to do that, we need to introduce a concept of Tasks, (which we have in Zone.js implementation) but which we don't want to bring to TC39, as it would greatly complicate the discussion.

The way to think about tasks is that each stack frame is associated with a zone. In addition the top most frame in a stack is associated with a Task, which is similar to Zones, but it deals with VM turns rather than Zones. Zones can enter and leave at any point, but Tasks can't. Once you have a concept of a task, then a Zone can know how many outstanding Tasks there are which could have access to a particular resource. When the number of outstanding tasks goes to zero, it should be safe to clean up. The above is not fool proof, but it is better then the ad-hoc solution that we have above. But I digress.

@trevnorris
Copy link

We should make sure that require always executes code in root zone. This way it does not matter who, or which order you got loaded.

Cool. So take const global_zone = Zone.current; at bootstrap then do global_zone.run() in require()?

I see you are trying to make a second root zone. Not sure that this should be allowed since there could only be one root zone.

I'm not sure how to determine if a Zone is the root other than by name. Does this mean creating a root with the name '(root zone)' would throw, or there some other mechanism to determine this?

In my mind setting up a zone with errorHandler which swallows is no different from try{}catch(){} which swallows. One just happens to by async the other sync.

Async, and automatically propagates. After N async calls an exception could propagate back through that many stacks, and a number of modules, to be handled. This can cause a debugging nightmare. Here's a simpler example of how a library can intercept an application's exception:

// main.js
const mlib = require('mlib');
const net = require('net');
net.createServer((c) => {
  mlib.doStuff(c, (data) => {
    if (data === null)  // ah crap
      throw new Error('goodbye world');
  });
}).listen(8080);

// mlib.js
module.exports.doStuff = function doStuff(c, cb) {
  const z1 = Zone.current.fork({ errorHandler() { return true } });
  z1.req = { zone: Zone.current, cb };
  z1.runGuarded(() => doSomeAsyncStuff(c, doneWithAsync));
};
function doneWithAsync(data) {
  const req = Zone.current.req;
  req.zone.run(() => req.cb(data));
}

As the application author I'd expect the throw in the doStuff() callback to crash the application. Since I haven't explicitly setup up an exception handler. Instead the module I used will intercept that exception. There's never a good reason for a library to handle an application's exception, where the application hasn't explicitly told the library to do so. The only way I can see to get around this at the moment is:

// mlib.js
module.exports.doStuff = function doStuff(c, cb) {
  const z1 = Zone.current.fork({ errorHandler() { return true } });
  z1.req = { zone: Zone.current, cb };
  z1.runGuarded(() => doSomeAsyncStuff(c, doneWithAsync));
};
function doneWithAsync(data) {
  const req = Zone.current.req;
  (new Zone()).run(() => setImmediate(() => {
    req.zone.run(() => req.cb(data));
  }));
}

Domains had a hard enough time getting developers to do this properly, and all libraries had to do was call .exit() before running the callback.

@mhevery
Copy link

mhevery commented May 17, 2016

Cool. So take const global_zone = Zone.current; at bootstrap then do global_zone.run() in require()?

Zone.current.root gives you root.

I'm not sure how to determine if a Zone is the root other than by name. Does this mean creating a root with the name '(root zone)' would throw, or there some other mechanism to determine this?

Each Zone has a parent. The root zone is the one which has no parent, and behaves indistinguishable from the current platform behavior. Only the platform is allowed to create a root zone, everyone else must fork.

I am a bit confused about your first example. You set up z1 which swallows exceptions, and then you are concerned that exceptions get swallowed? How is this different from setting uncaughtException to ignore errors?

Why do you want mlib.js to create zones in first place? Is your goal to do error logging? (I guess my assumption is that forking zones should be pretty rare occurrence, which I don't think you share, and I am trying to understand this.)

In my mind the only time a library should fork or capture a zone is if it does user queue operations. (Such as implementing work queue)

Could we move this discussion from hypothetical here-is-how-i-can-break-zones to concrete use cases where zones fails to perform as expected. Sorry, it is hard for me to follow, and I think the discussion would be more productive.

@bmeck
Copy link

bmeck commented May 18, 2016

@mhevery Zone.current.root does not currently give you access to the Realm's "root" zone. Spec might need to be amended?

@bmeck
Copy link

bmeck commented May 18, 2016

@mhevery code examples were given that show what were problems for domains; there is a distinct difference in:

  1. behavioral expectations
  2. use case goals
  3. necessary first steps

We should setup a call sometime next week. I setup a doodle to schedule a time:

http://doodle.com/poll/e2m2aezi3egzqvrt#table

@domenic
Copy link
Owner Author

domenic commented May 18, 2016

You can tell if a zone is the root zone via z.parent === null. You don't need to check the name.

Currently the spec does not prevent creating new root zones via the Zone constructor. Maybe that should be prevented, although in general giving the UA magic powers (i.e. the ability to create a zone without using the constructor) is always a bit weird to me. But we do it all over the web platform so it's probably fine.

You can find the root zone by climbing the Zone.current.parent.parent.parent... chain. A convenience accessor (Zone.root) seems like a reasonable addition, although it's not clear that it's something we should make convenient. (The only use case presented so far is require, which is specialized enough it can do its own tricks.)

Next week is TC39 so I will not be available most days. I'll fill out the doodle but @mhevery can probably substitute.

@bmeck
Copy link

bmeck commented May 18, 2016

@domenic that only tells if the zone has a parent, zones created like https://domenic.github.io/zones/#sec-zone with a parent option of null also have .parent == null. I think having .root defer to a Realm root zone would be a preferred way to ensure the root zone is still in place even when you want to escape the current stack of zones.

@domenic
Copy link
Owner Author

domenic commented May 18, 2016

Right, as I said, there can currently be multiple root zones, and maybe we should disallow that.

@bmeck
Copy link

bmeck commented May 18, 2016

@domenic is there a case for multiple root zones? I would consider root zones as a realm intrinsic personally.

@domenic
Copy link
Owner Author

domenic commented May 18, 2016

Yeah, there probably isn't; it fell out naturally of the specification (and from my desire not to allow the user agent to do magic things). Happy to fix.

@trevnorris
Copy link

@mhevery Despite my best efforts, I've failed to communicate my concerns. I'll try again, and be as clear and concise as I can be (which doesn't say much). First I'd like to address a few points:

I am a bit confused about your first example. You set up z1 which swallows exceptions, and then you are concerned that exceptions get swallowed?

This must be my failing. When I stated "the module I used" and "[t]here's never a good reason for a library to handle an application's exception" thought it would be apparent that mlib.js was a third-party module, and one I had control over. Apologies.

In my mind the only time a library should fork or capture a zone is if it does user queue operations. (Such as implementing work queue)

That's great you have an idea of how they should be used, but that says little of how they will be used. Unless you enforce behavior via the syntax, users will do many strange things. For example, tj/co uses generators to achieve async/await like behavior. Did anyone on the standards body consider this as a use case for generators? (this isn't rhetorical, I am curious)

Could we move this discussion from hypothetical here-is-how-i-can-break-zones to concrete use cases where zones fails to perform as expected.

Thus far nothing I've said has been simply an attempt to break the spec. It is all based on use cases I've seen in the wild. Please take my examples from this point-of-view. If I'm trying to break the spec with edge case or unrealistic scenarios I'll explicitly say so.

Now for another code example that'll hopefully explain some of my reservations. The following spec assumptions were made:

  1. Users can no longer use new Zone().
  2. Zone.root points to the root Zone.
  3. Zones propagate through calls to .on() (as clarified by @domenic in RFC: adding built-in error handling support to zones #9 (comment))

Now the example code:

// module.js
module.exports = getFile;

function getFile(path, callback) {
  const gfz = Zone.root.fork({
    name: 'module forked Zone.root',
    handleError: () => true,
  });
  gfz.callback = callback;
  gfz.zone = Zone.current;
  gfz.runGuarded(() => {
    require('fs').readFile(path, moduleCallback);
  });
}

function moduleCallback(er, data) {
  const gfz = Zone.current;
  gfz.zone.run(() => {
    gfz.callback(er, data);
  });
}


// main.js
const getFile = require('./module');

require('net').createServer((c) => {
  const nz = Zone.root.fork({ name: 'app forked Zone.root' });
  nz.connection = c;
  nz.run(() => {
    c.on('data', connectionData);
  });
}).listen(8080);

function connectionData(chunk) {
  getFile(chunk.toString(), fileGotten);
}

function fileGotten(er, data) {
  if (er) throw er;

  const nz = Zone.current;
  nz.connection.end(data);
}

main.js is my application code. It forks from Zone.root on every 'data' event and makes a call to getFile(). main.js uses Zones to propagate the connection object thereby removing the need for nested functions. It does not wish to handle exceptions, and wishes for those exceptions to bring down the application.

module.js is third-party library code that my application is using. It uses Zones as a way to handle errors from any system calls it performs (note: the blatant error handling behavior is only for demonstration). It has no need of handling anyone else's exceptions. module.js is a good citizen and makes sure to call the fileGotten() callback within the Zone getFile() was invoked under.

Execution steps:

  • connectionData() is called from main.js, within a Zone created for the unique connection, where getFile() is called from module.js.
  • getFile() creates a new Zone for the individual request in order to handle any exceptions thrown while performing system calls.
  • When the operation completes moduleCallback() is called with the results.
  • Using the same Zone that was used to call getFile(), moduleCallback() uses Zone.current.zone to run() the callback. In order to restore the previous Zone state as when the callback was made.
  • main.js's callback, fileGotten(), is called with the results. If there was an error then main.js will rethrow.
  • Any exceptions from fileGotten() will bubble, unhandled, to its root Zone and then be rethrown.
  • The exception will then bubble to the errorHandler() defined in module.js and be silenced.

Scenario: I know my application is throwing, but can't find where the exception is being swallowed. Causing me angst.

Challenge 1: Find where the exception is being swallowed, only being allowed to touch the code in main.js. This is meant to be a very simplified situation where usually an application would have many modules, and viewing/editing dependencies could prove to be overly difficult.

Challenge 2: How can the author of module.js synchronously run the application's callback in a way that removes its own Zone from the stack. Such that getFile()'s errorHandler won't be invoked if the application's callback throws? (note: synchronous part is key if EventEmitter is to support Zones)

Point: If a third-party module was swallowing my exceptions with uncaughtException it could be easily enough found with the following snippet:

process.on('newListener',
           (n) => n === 'uncaughtException' && console.log((new Error()).stack));

If a third-party module wants complete error handling for only the duration of its execution the module can call process.removeListener('uncaughtException', fn) before calling the user's callback in order to prevent swallowing errors it doesn't care about. i.e. removing this error handling is trivial.

While debugging my application or troubleshooting a module all usage uncaughtException can be subverted by simply including process.on('uncaughtException', (e) => { throw e }) at the top of my application.

Simply put, if I as the application's author don't want any module to silence my exceptions I have the APIs to make sure that happens. As far as I understand the current Zone's spec, a third-party module could swallow my exception into the abyss. Making it near impossible for me to find or debug.

@bmeck
Copy link

bmeck commented May 20, 2016

Mmmmm our meeting schedule doesn't fully align. I can say we either split it into 2 meetings, or wait another week.

@mhevery
Copy link

mhevery commented May 27, 2016

I am sorry about a delayed response.

My understanding of your concern is that exceptions can be swallowed by an uncooperating library.

function libA() {
  var logExceptionsZone = Zone.current.fork({
    handleError: (e) => console.log(e)
  })
  logExceptionsZone.runGuarded(() => {
    libB() => throw new Error('What will happen to me?'));
  });
}


function libB(callback) {
  var eatExceptions = Zone.current.fork({
    handleError: () => true
  })
  eatException.runGuarded(callback);
}


libA();

In the above case libA wants to log exceptions, but when it calls into libB all exceptions get swalowed. This danger already exists today with try catch. Becasue try catch is already awailable, from our point of view Zones don't introduce anything new.

try-catch is scoped to stack frames, just like zones are scoped to stack frames. So unlike Domains it is not possible to enter but forget to exit a zone. Also zones are nested like stack frames so exeception handling unwraps like stack as well. Very analogous.

Same thing using try-catch

function libA() {
  try {
    libB() => throw new Error('What will happen to me?'));
  } catch (e) {
    consol.error(e);
    throw e;
  }
}


function libB(callback) {
  try {
    callback();
  } catch (e) {
  }
}


libA();

Yes, libB swalows exceptions and there is no way that libA could know about it.

A better way to write this would be:

function libA() {
  var logExceptions = Zone.current.fork({
    handleError: (e) => console.log(e)
  })
  logExceptions.runGuarded(() => {
    libB() => throw new Error('What will happen to me?'));
  });
}


function libB(callback) {
  var storeZoneBoundCallback = Zone.current.wrapGuarded(callback);
  var eatExceptions = Zone.current.fork({
    handleError: () => true
  })
  eatException.runGuarded(storeZoneBoundCallback);
}


libA();

Notice: that libB uses wrapGuarded and then just invokes it.

But I think don't think that is right either. I think there should only be run and wrapGuarded (There should not be runGuarded and wrap). So let's rewritie one more time.

function libA() {
  var logExceptions = Zone.current.fork({
    handleError: (e) => console.log(e)
  })
  logExceptions.run(() => {
    libB() => throw new Error('What will happen to me?'));
  });
}


function libB(callback) {
  var storeZoneBoundCallback = Zone.current.wrapGuarded(callback);
  var eatExceptions = Zone.current.fork({
    handleError: () => true
  })
  eatException.run(storeZoneBoundCallback);
}

libA();

The reason for this change is that code should be devided into two categories. My code, and callbacks I got from someplace else (not my code). When executing code execeptions should be handled using try-catch, because the code should handle its own execeptions. It is only when executing other code the exceptions have no meaning to my code and should be handled by the zone.

In the above example when callback got passed to libB and it is at that that libB should say, "Hey this is not my code, let me wrap to insulate myself from its exceptions", hence wrapGuarded.

Also becasue we have removed the runGuarded what we are saying is that synchronous exceptions should be handlede synchronously and only async (the ones which have wrapGuarded should have the exceptions handled.) The above example is strange because we are wrapping callback but executing in sync.

The rules are:

  1. runGuarded does not exist, only run which means that entering zone does not mess with synchronous execeptions (regardless of wether the zone spec swalows exceptions). run only means enter a zone, don't catch exceptions.

  2. wrap does not exist, only wrapGuarded. Call wrapGuarded only if you will store the callback for later invocation (async). As long as Zone.current.wrapGuarded(callback) gets called as soon as external callback enters a library then the correct zone will handle the callback. Doing so will not only be wrong with respect to exceptions handling, but also wrong in semantics as in the code will restore the wrong zone.

@trevnorris
Copy link

@mhevery

This danger already exists today with try catch. Becasue try catch is already awailable, from our point of view Zones don't introduce anything new.

I'm missing how it's not plain to see that try catch isn't the same. Take the following example:

try {
  fs.writeFile(path, data, err => {
    throw new Error('Nothing gonna catch me!');
  });
} catch (e) { }

No automated catch propagation. Can we all agree on this important difference?

Also zones are nested like stack frames so exeception handling unwraps like stack as well. Very analogous.

Sure it's analogous, but in practical terms it's very different. The Zone will automatically propagate the try catch like behavior everywhere, through all asynchronous time.

Yes, libB swalows exceptions and there is no way that libA could know about it.

And you don't see a problem with that?

Finally, neither of my challenges were addressed. 1) How am I supposed to locate where an exception of mine is being swallowed by a module. 2) How can I synchronously execute a callback where the root Zone is the only Zone in the stack. Example for asynchronous callback execution:

fs.writeFile(path, data, err => {
  // Say "callback" is a user supplied callback in an
  // above scope.
  Zone.root.fork().run(() => setImmediate(() => callback(err)));
});

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants