-
Notifications
You must be signed in to change notification settings - Fork 27
Post Mortem Analysis with Async APIs #114
Comments
The bottleneck is disk I/O, creating a core dump is basically the kernel copying the raw memory of a process to a file. (I would be really surprised if there were OSes that supported copy on write for coredumps, but I didn’t check that.) |
Do you mean |
@jclulow Nice catch 🙏! I started as copying it from the (then proposed) |
There essentially isn't a solution to this problem. Promises by design swallow all of the synchronous exceptions that the interpreter emits to signal programmer errors, and it seems you cannot have a compliant Promises implementation that drops that unfortunate behaviour. If your Promises implementation isn't compliant, presumably it will break a bunch of software that depends on the way it is specified, which doesn't help users either. I understand that advocates of this misguided model will cry foul and say that errors are not "swallowed", merely redirected into one asynchronous channel -- but that's actually at the heart of what's so abhorrent. By design, the system jams the synchronous exceptions that represent a logical inconsistency in the program, after which it cannot possibly be working as designed (e.g., At the end of the day, it just isn't a good idea. There doesn't appear to be a compromise that lets people use Promises as specified (and thus, presumably, as implemented in other venues like the browser), while retaining first-class post mortem debugging support for programmer errors. This is also true of indiscriminate wrapping of your entire program in a series of |
What about the suggested CLI flag? it might change the behaviour, but it might help fix a problem. Similar to Debug/Release builds. |
If you're going to change the behaviour, it may as well be a totally different interface. It will require testing your entire dependency tree in the compliant Promises environment as well as in the non-compliant environment -- and it's reasonably clear that isn't going to happen for the majority of software in the NPM registry. |
I think |
We are talking "post-mortem" something has already broken for the user. We should give them way to figure out what happened. Even if it's incomplete (as in they might find false positive in the way). |
I'd rather we avoid that sort of tone.
This is incorrect, there are a lot of asynchronous programmer errors and a lot of synchronous operational errors. If you get passed incorrect input from an API and you (Another good example is |
@jclulow That depends on how different the behaviour is. @refack This is specifically about post-mortem, do you mind removing off-topic comments? |
@addaleax edited. |
Wasn't that what |
I'm going to repeat something that @mikeal said on the other thread. Promises are here. Whether you want them or not, whether you use them in your code or not. A rapidly growing number of users, libraries, and open source projects are using Promises. You can't deny that. Promises are trivial to implement on your own in under 200 LoC, so you can't really build an automatic tool that would check whether some code uses Promises or not. So you're left with manually code reviewing every 3rd party library you might want to use in your code. Promises and async/await are a part of the language. It's the core dump that's the implementation detail, and I think that it isn't the job of V8 to yield to our needs, it's our job to yield to our users' needs. |
What was so serendipitous about By contrast, with promises, you basically can't ever know a priori whether a rejection will subsequently be handled -- at least not until you've already had to unwind the stack. To do otherwise would seem to be non-compliant with the specification, and will presumably break at least some software based on that specification. While I obviously think Promises are a bad idea, I also have extreme empathy for users of the things -- if you're going to use them, they should work the way they're documented and specified to work. I also understand that they're a part of the language, and here to stay. So is the |
I saw there are at least some products that do this like https://code.google.com/archive/p/google-coredumper/ |
@benjamingr Yeah, that looks like a straightforward userland implementation (for Linux/ELF), and Node could probably use that. It doesn’t do CoW, either, though (I would have no idea how that would be implementable in userland, and for the kernel it’s probably too much housekeeping overhead, too). |
Meet-me-halfway solution
|
/cc @rnchamberlain, who was recently explaining how one way to get core dumps reasonably portably is to |
@sam-github Forking is dicey in multi-threaded programs, particularly if one component is not responsible for all of the threads. It's also dicey in a context where you know the program has encountered a non-operational error (i.e., a bug). You could cause the program to hang instead of either dumping core or proceeding on. Plus, important state about the program can change in the child process. This approach might be fine in some specific programs, but I wouldn't recommend it as a general approach. |
P.S. @benjamingr I fixed a typo in the 1st comment |
side note: please don't name it "exception", as |
@addaleax I misunderstood your request. In my comment:
I meant Windows has a way to suspend a crashing process and trigger a "post-mortem" analysis while the process is still alive, it's called "JIT debugging". So there is less need for core dumps. This API can also be triggered programmatically like a reverse process.on('rejectionHandled uncaughtException unhandledRejection', () => {
native-win32.callJITDebugger()
}) Which is a simple solution. [editorial]Please take any comment of mine with an emoji as good-spirited, tongue-in-cheek, me trying to be funny or lighten up the conversation[/editorial] |
For additional context around the challenges, there's a thread in the postmortem WG - nodejs/post-mortem#16 - whilst a little dated it's summarises the problems with postmortems and promises well. To go through the points
This could lead to a lot of overhead, also when do you decide a rejection is handled or not? handling can be async, so you have to wait a predefined amount of time, depending on how code is written this may var, (some kind of flag like
This ignores the ability to async attach a catch handler... that would have to be well documented, also, won't we lose the call stack (similar to aborting in a setImmediate)?
If we can do this in simple way without getting in the weeds with promises I'd be +1 on doing this as soon as possible and recommend async/await syntax in docs (also, from a syntactical/best practices position this is a general win). Not sure how this works with something like
Very much against this, it could introduce breaking changes, it's can cause confusion, all my -1
last resort (but a better option if combined with async/await core dumps) |
Here is the meat of the problem:
Some thoughts on the direction for the fixes, not sure if right. Looking for technical corrections if I got anything wrong:
Some additional ideas for performance:
|
As mentioned earlier in the thread I think promises are here so the task it figuring out the way forward given that reality. If enough of the people on this thread are going to be at the collaborator summit next week it might make sense to schedule a session to discuss in person. |
@mhdawson I thought I could attend but work did not approve my days off nor did it budget time for it - so I'll be missing the coming summit :( I would love to remote-attend the said session or if we could do a hangouts I think it could be great. |
@benjamingr if enough people are going to do that we'll try to add a session and get you connected in. If not I'm thinking we might want to schedule a hangout to get the interested parties together soon after. |
The more I think about it, the more this double-run idea sounds like a simple (if slightly inconvenient) solution to this whole problem:
|
Probably doesn't even need direct support from V8. The postmortem metadata that is part of the UNIX binaries is rich enough to decode JS stack frames and the objects in them. |
Saving the stack would be a good first step and the double-run approach does sound like something we should take a look at. It does mean it might take longer to get the full debug info, but if is the only way then it much better than not being able to get it at all. |
What's the status of async_hooks with promises? |
@benjamingr @matthewloring has a patch available that should work for making them compatible, so I think it’s mostly just a question of async_hooks getting ready. :) |
FYI - work to investigate issues discussed here: nodejs/post-mortem#45 |
Going to close this now since nodejs/post-mortem#45 is happening. One last call for @misterdjules and anyone else from @nodejs/post-mortem to participate there if you'd like to help make promises work smoothly with post-mortem tools. |
Currently, one workflow to debug NodeJS problems it to generate a core dump when an unhandled exception occurs.
The flow goes something like this:
--abort-on-uncaught-exception
However, there are currently a lot of issues with the said flow and modern NodeJS that have not been addressed. I think post-mortem analysis based on core dumps is problematic with async APIs and I think we should solve that in order to ship a lot of things.
nextTick
, at which point stack traces are invalid (see Support awaiting values throughutil.awaitable
#12 (comment)).I think we're at a point where we're not giving users a very great experience for their tools (especially ones that are not experts).
Now, I'm not sure we can do much about the callback APIs, but before we have a real promised core we should address these concerns.
I'd also like to point out that this is specifically about state changes that happen within a microtask and that no I/O is processed in that time anyway today. So no I/O handling effects your core dump anyway.
Possible solutions for promises and core dumps:
Take a core dump when a rejection happens in your code, remove it when the rejection is handled.[1]
Abort synchronously with a flag if there is no catch handler and the promise was created more than a microtask ago.
Solve the problem only for async functions - since those must declare catch handlers synchronously and syntactically, and recommend that people who want to use promises use them.
Expose a flag that violates the ECMAScript standard and schedules promises synchronously.
Document that promises do not work with post-mortem debugging tools.
I'm going to ping @nodejs/post-mortem and current and previous people involved in promises work to discuss. I'd like to sort this once and for all and give users a better experience.
@misterdjules @mikeal @chrisdickinson @addaleax @Fishrock123 @petkaantonov @spion @thefourtheye @vkurchatkin @refack
[1]
throw
s, but since it is established thatthrow
does not indicate a programmer error and aerr
callback does not indicate an operational one - I'm not sure if that makes sense.The text was updated successfully, but these errors were encountered: