-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Implement basic linting #282
Open
wxsBSD
wants to merge
8
commits into
VirusTotal:main
Choose a base branch
from
wxsBSD:check_lint
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
+803
−41
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This implements a basic linter into the compiler that can check for required metadata fields and types. I chose to implement it into the compiler so that we can get these checks for free across any tool/binding that uses the compiler directly. Also, it was easier to get nicer reporting by building it into the compiler. This is an optional feature that you can turn on in the compiler using the "required_metadata" method on the compiler. If you do this the compiler will issue warnings if any metadata is missing or if the type is incorrect. I've provided all the basic types (string, int, bool, float) and also some convenience types like md5, sha1, sha256 and hash to make it a bit more flexible for users. Extending these with more convenience types is an easy exercise too.
Prior to this change if you had an invalid config file (ie: using the wrong type for an option) we would just silently replace it with the default config, which made debugging way more difficult than it should have been. We now pass errors back up to the caller if the error is a because the config file can not be parsed. Errors now look like this: ``` Error: invalid type: found signed int `0`, expected a boolean for key "default.fmt.patterns.align_values" in .yara-x.toml TOML file ``` Also, because we now display errors like the above it doesn't make sense to have each command have a -C option. If the config file in ${HOME}/.yara-x.toml is improperly formatted we would try to parse it and display an error even before we called the subcommand. This meant if you had an invalid config file in the default location and used "yr fmt -C /tmp/foo ..." you would see the error from your default config file despite the -C argument. To deal with this the -C argument was removed from each command and moved into the main one. It now loads the default config if, and only if, you do not specify one with the -C argument.
Also a bunch of other smaller changes that got lumped in here.
Move file io out of config helper function and generate better error messages when the file can't be parsed.
xorhex
reviewed
Jan 9, 2025
rule_name_regexp now returns a Result. This makes error handling in the case of an invalid (or too large) regex easier.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Apologies as this PR is a little bigger than I wanted, partially due to fixing a bug in config file handling and partially because it implements two linter checks (required metadata and rule name regexp).
Config file handling improvements
Prior to this change if you had an invalid config file in your home dir we would just silently replace it with the default config, which is unexpected from a user standpoint (invalid configs should produce an error and refuse to run). To deal with this we now pass errors back up in the main command instead of using
.unwrap_or_default()
.Also, rather than have each sub-command implement config file handling I moved it into main, to remove duplicate code. Also, because we are now propagating errors if you have an invalid config in
${HOME}/.yara-x.toml
I improved error messages so it is clear what file failed to parse:Linting checks
To implement linting checks for required metadata and rule name format I chose to implement them in the compiler directly. I initially had an implementation with a stand-alone Linter type that worked on the AST from the parser. While this worked it turned out that getting it to use the existing Report/ReportBuilder mechanism was exceedingly intrusive as reporting is coupled too closely to the compiler (or at least I couldn't figure out how to make it work easily). Because of this I just went with implementing it directly in the compiler, which I think is also a good idea because we now get warnings anywhere the compiler is used, which can be quite beneficial for enforcing standards.
To use these linting checks I added a section in the config file and exposed the options for it there. See the config guide changes for details, but the TLDR is you can now specify a table (TOML term for a dictionary) where the key is the metadata identifier that is required and the value is the type that is required. Any rule that does not meet these requirements will get warnings from the compiler.
The other check is a regex that must match the rule name. Again, this is specified in the config file and warnings are issued by the compiler for any rule that does not match this regex.
The default value for both of these options is off so by default the compiler does not enforce any particular standards but you can opt in to it for linting if you want. The only place that currently sets these options is the
yr check
command.If you're happy with these changes I'd like to also expose them in the various bindings so users can get lint warnings in the language of their choice.