feat: migrate `wdl-gauntlet` to the new parser implementation. #76
This commit removes the use of the existing parser implementation in favor of the new implementation in `wdl-gauntlet`.

As a result, `Arena.toml` and `Gauntlet.toml` have been refreshed; in the case of `Arena.toml`, there's a ~10X increase in logged diagnostics because the new parser implementation successfully parses files for which the existing parser implementation incorrectly reported parse errors.

Also added wall-clock timing for the time spent analyzing each source file, plus an aggregate duration displayed at the end of gauntlet's output. From the before and after comparisons, the new parser implementation appears to be at least 20X faster at analyzing files, likely because it uses `logos` for lexing and does not backtrack like `pest` does.

Also added configuration for ignoring specific files so that `task-templates.wdl` can be excluded, as it isn't expected to be valid WDL (it contains non-WDL placeholders so that it can serve as a template).

Once this merges and we're in consensus, removal of the existing parser implementation can begin.
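For readers who want a picture of the timing and ignore-list behavior described above, here is a minimal sketch using only the standard library. It is not the actual `wdl-gauntlet` code: `RunSummary`, `run`, and the `analyze` callback are hypothetical names invented for illustration, and the real configuration format for ignored files is not shown here.

```rust
use std::collections::HashSet;
use std::time::{Duration, Instant};

/// Hypothetical summary of one run; not the real `wdl-gauntlet` types.
#[derive(Debug, Default)]
pub struct RunSummary {
    /// Paths that were analyzed, with the wall-clock time spent on each.
    pub timings: Vec<(String, Duration)>,
    /// Paths that were skipped because they appear in the ignore list.
    pub skipped: Vec<String>,
}

impl RunSummary {
    /// Aggregate duration across all analyzed files.
    pub fn total(&self) -> Duration {
        self.timings.iter().map(|(_, elapsed)| *elapsed).sum()
    }
}

/// Analyzes every path that is not ignored, timing each call to `analyze`.
pub fn run<F>(paths: &[&str], ignored: &HashSet<&str>, mut analyze: F) -> RunSummary
where
    F: FnMut(&str),
{
    let mut summary = RunSummary::default();
    for &path in paths {
        // Files such as `task-templates.wdl` are skipped before any parsing.
        if ignored.contains(path) {
            summary.skipped.push(path.to_string());
            continue;
        }
        let start = Instant::now();
        analyze(path);
        summary.timings.push((path.to_string(), start.elapsed()));
    }
    summary
}
```

A caller would pass the list of discovered files, the set of ignored paths, and a closure that parses and lints one file, then print `summary.total()` once the loop finishes.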
The `ENCODE-DCC/chip-seq-pipeline2` repo has too many lint rule violations to be useful to track in `Arena.toml`. This also skips over logging diagnostics for files that fail to parse when arena mode is turned on; this removes the duplicated diagnostics from `Arena.toml` and `Gauntlet.toml`.
This commit removes the suffix `[rule: <name>]` from lint diagnostics in favor of just using the "code" in the codespan diagnostic, which will instead use:

```
<severity>[<code>]: message
```
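To make the new header format concrete, the sketch below renders `<severity>[<code>]: message` with a hand-written `Display` implementation. It deliberately does not use the codespan API; `Severity`, `Diagnostic`, and the sample message in the usage note are hypothetical stand-ins for illustration only.

```rust
use std::fmt;

/// Hypothetical severity levels, rendered in lowercase.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Error,
    Warning,
    Note,
}

impl fmt::Display for Severity {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let text = match self {
            Severity::Error => "error",
            Severity::Warning => "warning",
            Severity::Note => "note",
        };
        f.write_str(text)
    }
}

/// Hypothetical diagnostic; the lint rule name travels in `code`
/// rather than in a `[rule: <name>]` suffix on the message.
#[derive(Debug, Clone)]
pub struct Diagnostic {
    pub severity: Severity,
    pub code: Option<String>,
    pub message: String,
}

impl fmt::Display for Diagnostic {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match &self.code {
            // With a code: `<severity>[<code>]: message`
            Some(code) => write!(f, "{}[{}]: {}", self.severity, code, self.message),
            // Without a code: `<severity>: message`
            None => write!(f, "{}: {}", self.severity, self.message),
        }
    }
}
```

Under these assumptions, a warning with code `PreambleComments` and a made-up message `example message` renders as `warning[PreambleComments]: example message`.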
Was kicking tires. Noticed a regression in the Can we update the
Another very similar case found in this file's preamble: https://github.com/biowdl/tasks/blob/develop/deconstructsigs.wdl We should similarly be reporting a single multiline span instead of one diagnostic per line.
@a-frantz good catch. That should be easily fixed. Do you want me to include the fix in this PR?
Sure, if you say it's an easy one!
Question:
# BAD
# BAD
# BAD
Is that one diagnostic about the incorrect comments or two? Put another way, are comments consecutive only when a single newline separates them, or also when any number of blank lines sit in between?
IMO one per "chunk" of trivia. Does that make sense?
Last comment I'll make 😅
This fixes the `PreambleComments` rule to emit a single diagnostic for any number of consecutive (excluding whitespace) comments.
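The grouping behavior can be sketched as a line-based pass that merges consecutive offending comments into one span, letting blank lines sit inside a run without ending it. This is an illustration only, not the rule's real implementation (which works on the parser's syntax tree): `bad_comment_runs` is a hypothetical helper, and it assumes an offending comment is one that starts with a single `#` rather than `##`.

```rust
/// Returns one `(first_line, last_line)` pair (1-based, inclusive) per run
/// of consecutive offending comments. Whitespace-only lines do not end a run.
///
/// Assumption for this sketch: a comment is offending when it starts with
/// `#` but not with `##`.
pub fn bad_comment_runs(source: &str) -> Vec<(usize, usize)> {
    let mut runs = Vec::new();
    let mut current: Option<(usize, usize)> = None;

    for (index, line) in source.lines().enumerate() {
        let line_no = index + 1;
        let trimmed = line.trim();

        // Blank lines are skipped so they cannot split a run.
        if trimmed.is_empty() {
            continue;
        }

        if trimmed.starts_with('#') && !trimmed.starts_with("##") {
            // Start a new run or extend the current one to this line.
            current = Some(match current {
                Some((start, _)) => (start, line_no),
                None => (line_no, line_no),
            });
        } else if let Some(run) = current.take() {
            // Any other non-blank line closes the run in progress.
            runs.push(run);
        }
    }

    // Flush a run that reaches the end of the input.
    if let Some(run) = current {
        runs.push(run);
    }
    runs
}
```

Each returned pair corresponds to one diagnostic with a multiline span, so three offending comments separated only by blank lines produce a single entry.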
@a-frantz pushed up both changes.
Awesome work!