-
Notifications
You must be signed in to change notification settings - Fork 108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: implement sarif format reporter #1047
base: master
Are you sure you want to change the base?
Conversation
@@ -316,6 +317,8 @@ fn run_checks(args: &args::Args) -> proc_exit::ExitResult { | |||
if status_reporter.errors_found() { | |||
errors_found = true; | |||
} | |||
|
|||
status_reporter.finalize().unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to find something better to do than unwrap
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just have no idea, ? can not cast error automatic, should i attach a map_err for everyone?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this marked as resolved when nothing has changed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgeted to update this one, will fixed in the new patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
log::error!("Error finalizing: {}", err);
Is not an appropriate way of handling this
c22c553
to
50550f1
Compare
@epage Could you please approve the workflow run to see if anything broken by this? |
Pull Request Test Coverage Report for Build 9723606862Details
💛 - Coveralls |
let column_start = | ||
unicode_segmentation::UnicodeSegmentation::graphemes(start.as_ref(), true).count() + 1; | ||
let column_end = | ||
unicode_segmentation::UnicodeSegmentation::graphemes(msg.typo, true).count() + column_start; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Looking at the spec, we should be reporting code points, not graphemes
- We need to specify the column unit
If a SARIF producer processes text artifacts and theRun.results (§3.14.23) is non-empty, the run object SHALL contain a property named columnKind whose value is a string that specifies the unit in which the analysis tool measures columns. If a SARIF producer processes text artifacts and theRun.results is empty, columnKind MAY be present.
...
- "unicodeCodePoints": Each Unicode code point (abstract character) is considered to occupy one column. This means that even a character that is represented in UTF-16 by a surrogate pair is considered to occupy one column.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I set the columnKind as unicodeCodePoints
. So the graphemes().count()
value should represents the code point correctly, since every call to next
should get a new character.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't a char
a code point while a grapheme can be made of multiple multiple char
s?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As https://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries described, UnicodeSegmentation::graphemes
should mean to be unicodeCodePoints
in SARIF
, because they all represent a user-readable character, even though the underlying layer consists of multiple code points.
SARIF: This means that even a character that is represented in UTF-16 by a surrogate pair is considered to occupy one column.
Unicode: For example, “G” + grave-accent is a user-perceived character: users think of it as a single character, yet is actually represented by two Unicode code points. These user-perceived characters are approximated by what is called a grapheme cluster, which can be determined programmatically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SARIF: This means that even a character that is represented in UTF-16 by a surrogate pair is considered to occupy one column.
From my understanding, UTF-16 surrogate pairs are a way to represent code points that can't be represented otherwise. Grapheme's are more than just surrogate pairs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I switch to utf16CodeUnits
1d021ad
to
f0a456a
Compare
@epage hello, any updates? |
Pull Request Test Coverage Report for Build 10045532744Warning: This coverage report may be inaccurate.This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
Details
💛 - Coveralls |
match status_reporter.finalize() { | ||
Ok(_) => {} | ||
Err(err) => { | ||
log::error!("Error finalizing: {}", err); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error
would be redundant with log showing ERROR
. finalizing
has no user-facing meaning in this context
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
generate final result
still doesn't have much meaning to the user and reads oddly within context of an error.
Likely we should directly print the error and have error_to_io_error
change the error message to format!("failed to generate SARIF output: {err}")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not seeing this updated
if let Err(err) = status_reporter.generate_final_result() { | ||
log::error!("generate final result: {}", err); | ||
return proc_exit::Code::FAILURE.ok(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has us printing one SARIF json object per argument.
This should either be moved out of the loop or this is jsonlines and we shouldn't use pretty
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's too bad I didn't notice that typos created a reporter for each file. i need to keep elevating the status of the reporter to make sure that the sarif file contains information from all the files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please take a look at 7a834d0
match status_reporter.finalize() { | ||
Ok(_) => {} | ||
Err(err) => { | ||
log::error!("Error finalizing: {}", err); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not seeing this updated
I squashed the commits again to clean up changes not related to this PR. |
f9be168
to
5fa2d29
Compare
The conflict has been resolved. |
@epage Can we move forward with this patch? |
Closes: #594
I asked earlier if anyone had already worked on this feature and got no reply #594 (comment) , so I added this implementation myself.
But as I declared, I'm not a professional rust developer, so this patch obviously contains a lot of problems caused by a lack of understanding of rust, such as
.unwrap
abuse, since I'm not sure how to properly convertError
toio::Error
.I enabled
Allow edits by maintainers
, so if you have any suggestions for syntax changes, please push them directly to the corresponding repository branch, thanks!