Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error handling for VS Code #6146

Closed
dberenbaum opened this issue Jun 9, 2021 · 17 comments
Closed

Error handling for VS Code #6146

dberenbaum opened this issue Jun 9, 2021 · 17 comments
Assignees
Labels
awaiting response we are waiting for your reply, please respond! :) p1-important Important, aka current backlog of things to do product: VSCode Integration with VSCode extension

Comments

@dberenbaum
Copy link
Collaborator

Extracted from #5984 (comment).

How can the granular error info from that PR be consumed by VS Code (and possibly other apps)? Need a way to pass that info in a machine-readable format. Also need to consider whether this is generally useful information for all users who want machine-readable output, or is this a special format for VS Code and maybe other internal applications?

cc @shcheklein @rogermparent @mattseddon

@dberenbaum dberenbaum added p1-important Important, aka current backlog of things to do product: VSCode Integration with VSCode extension labels Jun 9, 2021
@pared
Copy link
Contributor

pared commented Jun 9, 2021

Related #6039

EDIT
sorry not that issue

@mattseddon
Copy link
Member

This comment is related (please hit me up for more context if needed).

@rogermparent
Copy link
Contributor

rogermparent commented Jun 10, 2021

Also need to consider whether this is generally useful information for all users who want machine-readable output, or is this a special format for VS Code and maybe other internal applications?

I think any API-via-CLI we add to DVC should be in a format that's usable by as many applications as possible, even if VS Code is going to be the only one using it for a while.

Formatting Errors

I think a simple, flexible way to convey errors in JSON is a consistent minimal error interface like { type: string, ...rest } with rest's shape (if any) being defined by each type. This allows for errors to be able to translate any pieces of information that would differ between messages of the same type (filenames, branch names, pids, anything that would be string interpolated). rest can be as complex as any error needs, as long as it's consistent between types, and consumption of this format is particularly easy in Node with rest/spread operators.

I can't see any alternatives that provide the same functionality, so for the most part I'm assuming that solutions will output errors in this format: Error.

Formatting --show-json responses

Two possible implementations were brought up in the cross-team meeting, which I'll bring up with some extra detail here. While detailing them, I also thought of a third solution that I find most preferable.

Wrapping

Adding a consistent top-level { errors?: Error[], data?: Data } structure where Data is the original --show-json data. Simple and predictable, everything is in the same JSON blob read by stdout.

There's a big consideration for --show-json consumers other than VS Code, so I wouldn't be comfortable just overriding all --show-json commands with wrapping behavior. We could still do it, however, by adding the behavior under a separate flag (--show-json-with-errors?) for 2.0 and maybe deprecating original errorless --show-json in DVC 3.0.

Error[] in "error" key in original JSON

This is the idea of just tossing an errors key in existing --show-json outputs without wrapping, similarly to how exp show --show-json puts a workspace key among the git branches. This would be backwards compatible, but there's also the possibility of this being an issue with any --show-json output that allows for arbitrary keys at the top level. exp show could handle it since we'll never have a branch sha of "errors", but I can't say the same for all other existing and future commands.

Individual Error objects delivered via stderr

Since stdout and stderr can be handled separately, we could just have --show-json act the same with stdout and output Errors on stderr. I think the best solution here would be to output whole Error objects on stderr as they happen, one object per line; this could work well if we implement --show-json on long-running commands like exp run (as opposed to something like an Array that requires errors to be collected before output).

While this implementation more difficult to consume, I think it makes up for its drawbacks by flexibly handling all the complex situations where we may want to report errors in DVC across all its commands.

@dberenbaum
Copy link
Collaborator Author

Great points!

this could work well if we implement --show-json on long-running commands like exp run (as opposed to something like an Array that requires errors to be collected before output)

Won't this same concern will apply for stdout? Providing no output that things are working during long-running commands might be problematic if we have a --show-json option.

I think the best solution here would be to output whole Error objects on stderr as they happen, one object per line

Is the main difference from current behavior that errors are formatted as one line per error?

@rogermparent
Copy link
Contributor

rogermparent commented Jun 11, 2021

Won't this same concern will apply for stdout? Providing no output that things are working during long-running commands might be problematic if we have a --show-json option.

While a bit outside of this error-focused ticket, I agree. Even machine-readable output can benefit from continuous updates.

Is the main difference from current behavior that errors are formatted as one line per error?

Current DVC, if it errors, errors normally with non-json output so the new behavior would basically change everything. I haven't actually run #5984, so I can't yet comment on the difference between its implementation and what I'm thinking.

I would imagine my envisioned behavior is very similar to standard output printing, but all the human-readable messages are machine-readable JSON instead.

@dberenbaum
Copy link
Collaborator Author

Okay, I misunderstood. So you are suggesting that when an error occurs, information about the error is put into a standalone json object that is output to stderr? So each error outputs its own json object?

It looks like the approach in #5984 for dvc exp show is to note in the table where errors occurred. If errors are reported separately to stderr, how will they be tied back to specific experiments or files in the experiments table?

@rogermparent
Copy link
Contributor

It looks like the approach in #5984 for dvc exp show is to note in the table where errors occurred.

That's a fine approach, but embedding the errors in output depends on the command being finished and outputting, which is what I was attempting to address specifically for long-running commands like exp run where errors can happen well before the command finishes. As discussed in the meeting, this approach also has drawbacks in planning on where to insert the errors for each error type when trying to co-locate them with the data like that.

If errors are reported separately to stderr, how will they be tied back to specific experiments or files in the experiments table?

This is where the ...rest metadata comes in: if an error can be specifically tied to a commit or file, it should have a sha or file field that denotes that is where it originated from. It is then up to the application consuming the JSON to display these errors in the most sensible way- this could change heavily from application to application. Due to the nature of --show-json, this sort of transformation must happen anyway for each command run, so collecting errors from stderr and connecting them with the output, given the pieces of associating data (foreign keys, so to speak), should be pretty trivial.

That all said, the VS Code extension can adapt to whatever data shape the DVC team decides on, we just need the data to be there and know how to get at it.

@shcheklein
Copy link
Member

It feels that for a command like dvc exp show errors should be part of the regular output. It's a different use case from when a command fails completely (exit code non 0, DVC prints ERROR, etc). In this case these errors serve as some additional information even, rather than a command status/output.

@rogermparent
Copy link
Contributor

rogermparent commented Jun 14, 2021

@shcheklein Good point, and I'd also like to bring up the fact that it seems exp show --show-json is closely tied with the output on non---show-json, such that the former is the input for the latter.

In this case, the VSCode extension can adapt to whatever solution works for displaying these errors in exp show, and take cues from the CLI table in how and where to display these errors.

If we do decide to display errors differently between commands (which certainly has its merits), we should make sure to document that this is the case on each command that does so, not just for VS Code but for all integrations, especially ones that don't original internally.

@shcheklein
Copy link
Member

It feels that for a command like dvc exp show errors should be part of the regular output.

To clarify here a bit. By errors here I mean only those specific to it not being able to parse specific items in the table. It means that, for example, lock error even for dvc exp show should be handled the same regular way as for the other commands.

@Suor
Copy link
Contributor

Suor commented Jun 15, 2021

Related #6042

@pared
Copy link
Contributor

pared commented Jul 20, 2021

@rogermparent #5984 has been merged, it introduces "error" key in original json for --show-json, please check it out and lets continue this issue. I guess what I need now is information what (if?) vscode needs more info after #5984 and what should we add to error serialization.

@rogermparent
Copy link
Contributor

rogermparent commented Jul 20, 2021

@rogermparent #5984 has been merged, it introduces "error" key in original json for --show-json, please check it out and lets continue this issue. I guess what I need now is information what (if?) vscode needs more info after #5984 and what should we add to error serialization.

@mattseddon discovered this and took it on yesterday with iterative/vscode-dvc#660, which may be the best place to continue discussion

@mattseddon
Copy link
Member

@pared I've raised iterative/vscode-dvc#662 to deal with the new data and how we are going to display it. I'll be back in touch as soon as I have more information 👍🏻 .

@Suor
Copy link
Contributor

Suor commented Jul 21, 2021

I would say there should be some callback one may setup to get errors. So that API users might see those too.

@pared
Copy link
Contributor

pared commented Jul 21, 2021

@Suor
One can provide onerror, if nothing is porivded to show methods, we will use onerror_collect which provides default behaviour of filling result dict with error field.

@efiop efiop added the awaiting response we are waiting for your reply, please respond! :) label Jul 27, 2021
@dberenbaum
Copy link
Collaborator Author

@mattseddon Can we close this issue and continue discussion in iterative/vscode-dvc#662?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting response we are waiting for your reply, please respond! :) p1-important Important, aka current backlog of things to do product: VSCode Integration with VSCode extension
Projects
None yet
Development

No branches or pull requests

7 participants