Feature: Support error causes #61
Comments
I've adapted the project's current regex into a lexer that I think will work:

```js
const moo = require("moo");

let lexer = moo.compile({
  Whitespace: { match: /[ \t\n]/, lineBreaks: true },
  CausedBy: /^\s*Cause(?:d by)?:/,
  // Any line ending with file:line:col is a stack frame.
  // file:line:col may be wrapped in parens, or may be the literal 'native'.
  StackFrame: /^.*?(?:native|:\d+:\d+|\((?:.+?:\d+:\d+|native)\))$/,
  FreeText: /.+$/,
});
```

Stack frame lines would be subject to further parsing. In particular, the current big stack-frame-line regex will need to be broken down in order to support embedding URI syntax and fix #60. |
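For reference, a minimal sketch of how tokens from this lexer could be collected with moo's API; the sample input here is a made-up stand-in, not the full example from the issue description:

```js
// Feed some error text through the lexer above and collect { type, text }
// pairs, similar to the condensed summary in the next comment.
const input = [
  "SomeError: oops",
  "    at foo.js:1:2",
  "Caused by: TypeError: bad thing",
  "    at bar.js:3:4",
].join("\n");

lexer.reset(input);

const tokens = [];
let tok;
while ((tok = lexer.next())) {
  tokens.push({ type: tok.type, text: tok.text });
}
console.log(tokens);
```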
Here's a condensed summary of the tokens emitted from that lexer when run on the example input above:

```js
[
  { type: 'FreeText', text: 'SomeError: The grumblecakes could not be found' },
  { type: 'Whitespace', text: '\n' },
  { type: 'StackFrame', text: 'at grumblecake-locator.js:12:2' },
  { type: 'Whitespace', text: '\n' },
  { type: 'CausedBy', text: 'Caused by:' },
  { type: 'Whitespace', text: ' ' },
  { type: 'FreeText', text: 'QueryError: Invalid query `SELECT <grumblecakes>`' },
  { type: 'Whitespace', text: '\n' },
  { type: 'StackFrame', text: 'at db.js:51:4' },
  { type: 'Whitespace', text: '\n' },
  { type: 'CausedBy', text: 'Caused by:' },
  { type: 'Whitespace', text: ' ' },
  { type: 'FreeText', text: 'ParseError: Unexpected character `<` at position 8' },
  { type: 'Whitespace', text: '\n' },
  { type: 'StackFrame', text: 'at db.js:1233:10' }
]
```
|
Oh boy this project actually already supports "nesting" of error stacks through string concatenation as done by nested-error-stacks. Unfortunately the behavior of nested-error-stacks is fundamentally not interoperable with errors that have causes. After all, to print errors stacked using cause you need access to their unstacked stack traces, which are destroyed by nested-error-stacks. In practice |
Oh and I think that answers another question I had. I was wondering why the It was my intention anyway, but I'd say my plan is to make breaking changes and publish my fork. Once I have something that works and I can actively maintain we can discuss whether the fork should merge back into this package. |
Hmm,

```js
// If the first line isn't a stack frame but the second line is, drop the first line.
if (!(/^\s*at /.test(stack[0])) && (/^\s*at /.test(stack[1]))) {
  stack = stack.slice(1);
}
```

If a line doesn't begin with `at ` … |
I'm very tempted to rip out support for method and file names which support |
Similarly, I think it's probably safe to assume that most method names won't have … That would allow us to handle some pretty weird stuff without doing anything all that weird, e.g.:

```js
const obj = {
  'weird fn name()': () => {
    throw new Error('message');
  },
};

obj['weird fn name()']();
```

which results in a stack frame whose function name itself contains `()`. |
I could also stipulate that results may be bad if file names contain mismatched parens. Then I could just go back from the end of the frame string until I find the nearest balanced paren, slice that part out, and feed it back into a path parser that could differentiate between URIs and different kinds of paths. I see from testing that Windows and Unix path styles are sometimes mixed in a single trace, which is just a pain.
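A minimal sketch of that idea, assuming the leading `at ` has already been stripped from the frame; `extractLocation` and its return shape are illustrative names, not part of this codebase:

```js
// Walk backwards from the end of the frame text to find the trailing balanced
// (...) group, e.g. the "(/some/file.js:1:2)" part of
// "Object.foo (/some/file.js:1:2)". Returns null if no balanced group ends the
// frame, which the caller can treat as "the whole remainder is the location".
function extractLocation(frame) {
  if (!frame.endsWith(")")) return null;
  let depth = 0;
  for (let i = frame.length - 1; i >= 0; i--) {
    if (frame[i] === ")") depth++;
    else if (frame[i] === "(") {
      depth--;
      if (depth === 0) {
        return {
          prefix: frame.slice(0, i).trimEnd(), // usually the function name
          location: frame.slice(i + 1, frame.length - 1), // fed to a path/URI parser
        };
      }
    }
  }
  return null; // mismatched parens: bail out, results may be bad
}
```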
|
It may be that the easiest way to deal with this is to always tokenize parens. This will sometimes split apart parts of a single token like |
My new lexer is:

```js
const moo = require("moo");

let lexer = moo.states({
  error: {
    Newline: { match: "\n", lineBreaks: true },
    Whitespace: /[ \t]/,
    CausedBy: { match: /^\s*[cC]ause(?:d [bB]y)?:/, value: () => "Caused by:" },
    // A line starting with "at " switches into the frame state; the frame's
    // file:line:col may be wrapped in parens, or may be the literal 'native'.
    At: { match: /^[ \t]*at /, push: "frame" },
    FreeText: /.+$/,
  },
  frame: {
    Newline: { match: "\n", lineBreaks: true, pop: true },
    "(": "(",
    ")": ")",
    Whitespace: /[ \t]/,
    FrameFragment: /[^() \t\n]+/,
  },
});
```
|
And here is a complete parser. It's still missing some of the more minor features, but it proves that everything works together. Try pasting it into the nearley playground |
Aight I'm out for the day. Signing off. No more work. Thanks for your help everyone! Errors are gonna be great! |
In the end everything I was afraid I might not support works great. This parser will be even more flexible and powerful than the current one. |
Ah ok now we get into more fun stuff. The more we add into these stack lines the more we're going to hit a very particular problem: the grammar is fundamentally ambiguous. |
I think what this means is that I need two parsers, one for errors, and one for lines. That way the line parser can be ambiguous by design but for each line we'll eliminate any ongoing ambiguity. The only question is how much overlap should there be? If stack lines are any lines that begin with |
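A rough sketch of that split, assuming a per-line frame parser (here a stand-in called `parseFrameLine`) that resolves its own ambiguity before the next line is read:

```js
// Outer, unambiguous pass: classify each line of the raw stack text and hand
// frame lines to a separate line parser one at a time. `parseFrameLine` is a
// placeholder for the (possibly ambiguous) nearley-based frame parser.
function parseErrorText(text, parseFrameLine) {
  const errors = [];
  let current = null;
  for (const line of text.split("\n")) {
    if (/^[ \t]*at /.test(line) && current) {
      // Ambiguity is settled per line, inside parseFrameLine.
      current.stack.push(parseFrameLine(line));
    } else if (/^\s*cause(?:d by)?:/i.test(line)) {
      current = { message: line.replace(/^\s*cause(?:d by)?:\s*/i, ""), stack: [] };
      errors.push(current);
    } else if (!current) {
      current = { message: line, stack: [] };
      errors.push(current);
    } else {
      // Free text after the first line is folded into the current message.
      current.message += "\n" + line;
    }
  }
  return errors;
}
```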
Do we still want a scanner for the frame parser? I could certainly simplify the frame parser code a lot if I just trashed it. Is it doing anything for me? I suppose since the frame parser is fundamentally ambiguous it might help with speed: when you're doing a permutations problem it helps to have less stuff to permute. I guess I'll leave the tokenizer for now, but maybe once this is all working I'll put together a benchmark and see what happens to perf if I toss it. |
I'm happy to report that it's all coming together. My new implementation now supports all the syntax that this parser supports as well as some new things like |
The new …

My initial idea was to parse into a structure like this:

```ts
type StackFrame = {
  type: 'Frame';
  file: string;
  function?: string;
  // ...
} | {
  type: 'Text';
  text: string;
};

type ParsedError = {
  name: string;
  message: string;
  stack: Array<StackFrame>;
  cause: ParsedError;
};
```

The weird thing with this design is that causes are, from a syntactic perspective, a subset of freetext stack frames, yet this structure treats them in completely different ways. The other drawback is that it's actually fairly valuable, I think, to have the ability to reverse iterate the cause stack, i.e. going from most specific to least specific as error handling does in real code. Essentially, chaining cause through a nested property is a linked list that only walks in one direction. If I do use an array I'd probably need a new type to store it in. I'd have something like:

```ts
type ErrorChain = {
  chain: Array<Error>;
};

type Error = {
  name: string;
  message: string;
  stack: Array<StackFrame>;
  isCause: boolean;
};
```

This could work much better because there'd be no more need for "freetext frames". Instead the freetext would just become the message in a new error, and the stack would always be guaranteed to be an array of frames, which seems useful from the perspective of writing code to filter out spurious frames. I think the combined advantages of the chain approach are worth making it my new design. |
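As an illustration of why the chain shape is easier to work with, here is a hypothetical helper (not part of any existing API) that flattens the nested `cause` shape into the proposed chain, which can then be walked in reverse from most to least specific:

```js
// Flatten a ParsedError-style object (nested via `cause`) into the chain
// representation sketched above. The outermost error comes first; reverse the
// array to iterate from most specific to least specific.
function toChain(parsedError) {
  const chain = [];
  for (let err = parsedError; err; err = err.cause) {
    chain.push({
      name: err.name,
      message: err.message,
      stack: err.stack,
      isCause: err !== parsedError,
    });
  }
  return { chain };
}
```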
Hmm thinking about this some more and I don't like To generalize that functionality I could:
I'm tempted to do the second thing. It would be easier to implement, test, and document, and it does all it needs to do. The purpose of this tool has never been (in my mind) to "understand" errors. Someone could build that on top. This just needs to be able to tweak and reprint. |
Completely revamping the representation of stack to make it not-megamorphic and easier to document/extend. |
Well I've got pretty much a full implementation of the ambiguous parser approach, but while I initially thought there would be a little ambiguity, now that I've implemented everything it's clear that there's a LOT. I'm on the fence about what really makes sense to do in this case. The only way to cut down on the ambiguity really is to cut down on the stupid stuff that we currently make our best effort to handle. One option might be to have two versions of the parser, one with no ambiguity (or significantly reduced ambiguity) that can be run as a first pass. In the common case this would be all that is needed. But if that fails, we could incur the cost of running a more permissive parser. The main question that arises from this is: is it even worth having the backup parser? Well, that depends on what it costs. The nearley API allows me to specify a rule as |
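A minimal sketch of that strict-first, permissive-fallback idea, assuming two separately compiled nearley grammars; the `./frame-strict.js` and `./frame-loose.js` modules and the `parseLine` wrapper are hypothetical:

```js
const nearley = require("nearley");

// Hypothetical compiled grammars: a strict one covering the common case and a
// permissive one that tolerates (and therefore multiplies) ambiguity.
const strictGrammar = require("./frame-strict.js");
const looseGrammar = require("./frame-loose.js");

function parseLine(line) {
  // First pass: the cheap, (mostly) unambiguous grammar.
  try {
    const strict = new nearley.Parser(nearley.Grammar.fromCompiled(strictGrammar));
    strict.feed(line);
    if (strict.results.length > 0) return strict.results[0];
  } catch (err) {
    // The strict grammar rejected the line; fall through to the backup parser.
  }

  // Second pass: pay the cost of the permissive grammar only when needed.
  const loose = new nearley.Parser(nearley.Grammar.fromCompiled(looseGrammar));
  loose.feed(line);
  return loose.results[0]; // possibly one of several ambiguous parses
}
```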
Yeah I think that'll work. I can reuse most of the parser. I'll factor out the |
Also it might be pointless to save the lexer's output. It takes up resources in the common case, and in the less common case of real ambiguity lexing again will still likely be very fast compared to the cost of the ambiguous parsing. |
Ah and most of the rules would need to change, so the optimizations are bunk. But even so, I think the two parser method is probably still worth it. I think the average frame results in at least seven ambiguous parsings internally, which stem from the format of |
K, got that working. Time to fix the test suite again. You know, at one point all the tests in the suite passed, and I honestly have no idea why, because there were definite problems with the parser. |
I need to think about how to handle spaces. Right now I've treated spaces the way JSON does, for example: collapse multiple spaces down into one token and basically ignore spaces anywhere that isn't inside a literal. Of course here it's a bit fuzzier exactly what a literal is, but anyway. The current codebase tests this case:
In the current parser the file part is expected to be |
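As an aside on the collapsing approach described above, a small moo sketch (not the project's actual rules) showing a run of spaces becoming a single token:

```js
const moo = require("moo");

// A run of spaces/tabs becomes one Whitespace token instead of many, so the
// parser above the lexer can mostly ignore spacing outside of "literals".
const lexer = moo.compile({
  Newline: { match: "\n", lineBreaks: true },
  Whitespace: /[ \t]+/,
  Word: /[^ \t\n]+/,
});

lexer.reset("at   foo.js:1:2");
// -> Word "at", Whitespace "   ", Word "foo.js:1:2"
```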
I've added tests which validate that I don't fall out of the strict parser's grammar for frames that don't do anything weird. It's already caught some issues. To be able to write the test I had to introduce a |
Oh god, I'm testing my work on stack-utils using AVA, which formats its errors with stack-utils. I tried to stop the test on an uncaught exception and I was confused as to why the failing code was dealing with stack-utils but it didn't seem to be the one I was working on. In fact I've just discovered yet another deficiency in the current stack-utils: it can't parse the frame |
I can't figure out if the first |
It has come to my attention that this is a parser for V8 stack traces, i.e. Node, Chromium, Chrome, and I guess now Edge. I think this answers the question of what I'll do when I'm done with this work. I don't really want to have to ask permission to publish something that I wrote, but I also don't want to have to name the package something silly. So I can be |
Site will now have a |
I'm really getting towards the finish line now. I think I'm content with the API I've created, and I'm working on the test suite. |
If you read this far I commend you! If you didn't read everything above I don't commend you, but it's OK I was just talking to myself. I started describing a feature I wanted and ended up building a whole replacement library. It's called |
This is a proposal to support printing stacks of errors nested with `error.cause`. The JS spec allows `error.cause` to be any value, but this proposal is mainly concerned with what happens when `cause` is present as a string or another error.

I've written up the most descriptive form of this proposal as a proposed Jest feature. If you don't mind I'll avoid copying it all out here. Jest uses `stack-utils` under the hood to reformat errors.

The TC39 committee has so far indicated that the language will not provide any official implementation of printing nested errors. Their hands are tied to some extent until they finish specifying the semantics of `error.stack`. But until then it will be very useful to have best-effort printing and formatting for errors printed in the most common way.

I think errors with causes should be printed more or less like this:
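(The example below is pieced together from the lexer token summary in the comments above; indentation is approximate.)

```
SomeError: The grumblecakes could not be found
    at grumblecake-locator.js:12:2
Caused by: QueryError: Invalid query `SELECT <grumblecakes>`
    at db.js:51:4
Caused by: ParseError: Unexpected character `<` at position 8
    at db.js:1233:10
```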
Currently there is no standard for this. I think `stack-utils` would be an excellent library in which to create an implementation which formats, prints, and loosely parses errors with multiple messages and stack traces.

I'm willing to build it. Is there interest from the maintainers?