Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add -diagnosticsFormat command-line flag #2363

Merged
merged 16 commits into from
Jul 21, 2022

Conversation

cpitclaudel
Copy link
Member

@cpitclaudel cpitclaudel commented Jul 5, 2022

This PR adds -diagnosticsFormat:json to the CLI. The output looks like this:

{"range":{"filename":"diagnosticsFormats.dfy","range":{"start":{"line":8,"character":0}}},"severity":2,"message":"module-level const declarations are always non-instance, so the \u0027static\u0027 keyword is not allowed here","source":"Parser","relatedInformation":[]}
{"range":{"filename":"diagnosticsFormats.dfy","range":{"start":{"line":9,"character":17}}},"severity":1,"message":"Error: result of operation might violate newtype constraint for \u0027byte\u0027","source":"Verifier","relatedInformation":[]}
{"range":{"filename":"diagnosticsFormats.dfy","range":{"start":{"line":12,"character":16}}},"severity":1,"message":"Error: A precondition for this call might not hold.","source":"Verifier","relatedInformation":[{"location":{"filename":"diagnosticsFormats.dfy","range":{"start":{"line":11,"character":35}}},"message":"Related location: This is the precondition that might not hold."}]}

Dafny program verifier finished with 1 verified, 2 errors

This format is consistent with a bunch of other tools (see below). The intent is that client can look for { in column 0 and parse one object per such line.

Design notes:

  • Many compilers have this option, but it's not named consistently. Other tools
    use -fdiagnostics-format=json, --json, --outputjson, --formatter=json,
    --format=json, --error-format=json, --output-format=json,
    --message-format=json, etc.
  • Output format is one message per line. This makes it easy to parse output
    even if it's mixed with other non-JSON output. This is a common
    approach (e.g. rustc does this), but it's not universal (GCC groups all
    messages into an array). Streaming results allows clients to consume them as
    they come.

Implementation:

  • Source/Dafny/DafnyOptions.cs
    (DiagnosticsFormats): New enum.
    (DiagnosticsFormat): Default to PlainText.
    (ParseOption): Recognize new -diagnosticsFormat flag.
    (HelpBody): Document it.
  • Source/DafnyDriver/DafnyDriver.cs
    (ThreadMain): Delay initialization of ErrorReporter.
  • Source/Dafny/JsonDiagnostics.cs: New file. Capture Dafny and boogie message
    and print them as single-line JSON objects.
  • Test/cli/diagnosticsFormats.dfy: New test.
  • docs/DafnyRef/UserGuide.md: Document -diagnosticsFormat.

Closes #2084 (by adressing the underlying use case), closes #2229 (by superseding it). CC @mschlaipfer

Design notes:

- Many compilers have this option, but it's not named consistently.  Other tools
  use `-fdiagnostics-format=json`, `--json`, `--outputjson`, `--formatter=json`,
  `--format=json`, `--error-format=json`, `--output-format=json`,
  `--message-format=json`, etc.
- Output format is one message per line.  This makes it easy to parse output
  even if it's mixed with other non-JSON output.  This is a common
  approach (e.g. `rustc` does this), but it's not universal (GCC groups all
  messages into an array).  Streaming results allows clients to consume them as
  they come.

Implementation:

* `Source/Dafny/DafnyOptions.cs`
  (`DiagnosticsFormats`): New enum.
  (`DiagnosticsFormat`): Default to `PlainText`.
  (`ParseOption`): Recognize new `-diagnosticsFormat` flag.
  (`HelpBody`): Document it.
* `Source/DafnyDriver/DafnyDriver.cs`
  (`ThreadMain`): Delay initialization of `ErrorReporter`.
* `Source/Dafny/JsonDiagnostics.cs`: New file.  Capture Dafny and boogie message
  and print them as single-line JSON objects.
* `Test/cli/diagnosticsFormats.dfy`: New test.
* `docs/DafnyRef/UserGuide.md`: Document `-diagnosticsFormat`.
@MikaelMayer
Copy link
Member

This is a great addition to the Dafny ecosystem which will surely make Dafny more modular.
A few remarks before I dive in the PR review:

  1. I would like you to detect if the token is a RangeToken, and in this case add an "end" field to the range above, so that the range has both "start" and "end"
  2. Since it's API, I think we should also offer
    *. The absolute offset of the position. Necessary so that programmatic API consumers don't need to think of lines/column when parsing our error messages.
    *. The file name. There might be several files and just the position is not enough. There are also included files, and related errors.
  3. Related positions should be, as for the Language Server, nested under the main error, not a top-level error. In your example, the message containing "Related location: " is a top-level one... Look at how LSP does it:
    public void ReportBoogieError(ErrorInformation error) {
    var relatedInformation = new List<DiagnosticRelatedInformation>();
    foreach (var auxiliaryInformation in error.Aux) {
    if (auxiliaryInformation.Category == RelatedLocationCategory) {
    relatedInformation.AddRange(CreateDiagnosticRelatedInformationFor(auxiliaryInformation.Tok, auxiliaryInformation.Msg));
    } else {
    // The execution trace is an additional auxiliary which identifies itself with
    // line=0 and character=0. These positions cause errors when exposing them, Furthermore,
    // the execution trace message appears to not have any interesting information.
    if (auxiliaryInformation.Tok.line > 0) {
    Info(VerifierMessageSource, auxiliaryInformation.Tok, auxiliaryInformation.Msg);
    }
    }
    }
    var uri = GetDocumentUriOrDefault(error.Tok);
    var diagnostic = new Diagnostic {
    Severity = DiagnosticSeverity.Error,
    Message = error.Msg,
    Range = error.Tok.GetLspRange(),
    RelatedInformation = relatedInformation,
    Source = VerifierMessageSource.ToString()
    };

@cpitclaudel
Copy link
Member Author

cpitclaudel commented Jul 18, 2022

I would like you to detect if the token is a RangeToken, and in this case add an "end" field to the range above, so that the range has both "start" and "end"

Done, can you give me an example of code with a RangeToken so I can test the code?

Since it's API, I think we should also offer. The absolute offset of the position.

I don't think we should deviate from LSP here, but I can do it if you want.

The file name.

Yep, good catch. Done.

Related positions should be, as for the Language Server, nested under the main error, not a top-level error.

I will look into that.

@cpitclaudel
Copy link
Member Author

 I will look into that.

Done.

@MikaelMayer
Copy link
Member

Can you update the PR description?

@cpitclaudel
Copy link
Member Author

Since it's API, I think we should also offer. The absolute offset of the position.

I don't think we should deviate from LSP here, but I can do it if you want.

I changed my mind. I'd love to do that, but we don't have the data. Tokens have a UTF-16 offset, which is not very usable by clients. Until we move away from that, I don't think we should include that info in the output.

@cpitclaudel
Copy link
Member Author

Can you update the PR description?

With the example output? I can! It's also in the diff of the PR (there's a test)

@cpitclaudel
Copy link
Member Author

Ready for review, I think

@MikaelMayer
Copy link
Member

I changed my mind. I'd love to do that, but we don't have the data. Tokens have a UTF-16 offset, which is not very usable by clients. Until we move away from that, I don't think we should include that info in the output.

An UTF-16 offset? I don't think so. I have been consistently using pos to extract strings from program and it works well, either reading a buffer or the C# string. See #2434

Copy link
Member

@MikaelMayer MikaelMayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few items, otherwise, great start to make Dafny become more library-friendly.

RELEASE_NOTES.md Show resolved Hide resolved
Source/Dafny/DafnyOptions.cs Outdated Show resolved Hide resolved
Source/Dafny/JsonDiagnostics.cs Show resolved Hide resolved
diagnosticsFormats.dfy(11,35): Related location: This is the precondition that might not hold.

Dafny program verifier finished with 1 verified, 2 errors
{"range":{"filename":"diagnosticsFormats.dfy","range":{"start":{"line":8,"character":0}}},"severity":2,"message":"module-level const declarations are always non-instance, so the \u0027static\u0027 keyword is not allowed here","source":"Parser","relatedInformation":[]}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To test for the "end", you have to provide the argument name /showSnippets:1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand: /showSnippets is for showing the error in context, and we don't want to do that in JSON mode, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/showSnippets enables Dafny to create RangeTokens instead of just Tokens.
For the JSON mode, if /showSnippets is activated, it would be reasonable to me that we don't add the snippet as part of the error message (but we could... that could be useful), but that way we can report error using ranges.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, I understand now :) Is there a strong reason to not create RangeTokens all the time? This way we could report ranges on the command line even when /showSnippets isn't specified, not just line-column pairs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well the only reason was that I'm too lazy (or don't think it's worth it?) to change the 1200+ files of our integration tests And it could be a breaking change (however, I don't think this would be a big deal)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, good point regarding the test files.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added -showSnippets:1 to the test and documented it in UserGuide.md

diagnosticsFormats.dfy(11,35): Related location: This is the precondition that might not hold.

Dafny program verifier finished with 1 verified, 2 errors
{"range":{"filename":"diagnosticsFormats.dfy","range":{"start":{"line":8,"character":0}}},"severity":2,"message":"module-level const declarations are always non-instance, so the \u0027static\u0027 keyword is not allowed here","source":"Parser","relatedInformation":[]}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
{"range":{"filename":"diagnosticsFormats.dfy","range":{"start":{"line":8,"character":0}}},"severity":2,"message":"module-level const declarations are always non-instance, so the \u0027static\u0027 keyword is not allowed here","source":"Parser","relatedInformation":[]}
{"location":{"filename":"diagnosticsFormats.dfy","range":{"start":{"line":8,"character":0}}},"severity":2,"message":"module-level const declarations are always non-instance, so the \u0027static\u0027 keyword is not allowed here","source":"Parser","relatedInformation":[]}

I suggest we use "range" only for the start and the end, and "location" for something that holds a file name and a range, like LSP is doing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, will fix, that was the intended thing and I got it wrong :'(

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@cpitclaudel
Copy link
Member Author

An UTF-16 offset? I don't think so. I have been consistently using pos to extract strings from program and it works well, either reading a buffer or the C# string. See #2434

Yep, strings are UTF-16 in C#, so if you use these offsets into strings in C# it will work, but it wouldn't work in other languages that use bytes or utf-8. It would be weird to leak that.

cpitclaudel and others added 2 commits July 19, 2022 14:04
Co-authored-by: Mikaël Mayer <MikaelMayer@users.noreply.github.com>
Co-authored-by: Mikaël Mayer <MikaelMayer@users.noreply.github.com>
@MikaelMayer
Copy link
Member

Yep, strings are UTF-16 in C#, so if you use these offsets into strings in C# it will work, but it wouldn't work in other languages that use bytes or utf-8. It would be weird to leak that.

But isn't "col" already leaking this information?

@cpitclaudel
Copy link
Member Author

But isn't "col" already leaking this information?

😱 you're right :'(
I will add pos then, it doesn't make things worse.

@cpitclaudel
Copy link
Member Author

I will add pos then, it doesn't make things worse.

Done

Copy link
Member

@MikaelMayer MikaelMayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, almost there.

RELEASE_NOTES.md Show resolved Hide resolved
Source/Dafny/JsonDiagnostics.cs Show resolved Hide resolved
@@ -74,7 +75,7 @@ record DiagnosticMessageData(MessageSource source, ErrorLevel level, IToken tok,
var auxRelated = related?.SelectMany(SerializeAuxInfo) ?? Enumerable.Empty<JsonNode>();
var innerRelated = SerializeInnerTokens(tok);
return new JsonObject {
["range"] = SerializeToken(tok),
["location"] = SerializeToken(tok),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few CI failures line 93?

MikaelMayer
MikaelMayer previously approved these changes Jul 21, 2022
Copy link
Member

@MikaelMayer MikaelMayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job

@cpitclaudel cpitclaudel enabled auto-merge (squash) July 21, 2022 19:01
@cpitclaudel cpitclaudel merged commit 9253265 into master Jul 21, 2022
@cpitclaudel cpitclaudel deleted the cpitclaudel_jsonDiagnostics branch July 21, 2022 20:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Simplify using Dafny as a library
2 participants