Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Implement basic linting #282

Open
wants to merge 8 commits into
base: main
Choose a base branch
from
Open

Conversation

wxsBSD
Copy link
Contributor

@wxsBSD wxsBSD commented Jan 9, 2025

Apologies as this PR is a little bigger than I wanted, partially due to fixing a bug in config file handling and partially because it implements two linter checks (required metadata and rule name regexp).

Config file handling improvements

Prior to this change if you had an invalid config file in your home dir we would just silently replace it with the default config, which is unexpected from a user standpoint (invalid configs should produce an error and refuse to run). To deal with this we now pass errors back up in the main command instead of using .unwrap_or_default().

Also, rather than have each sub-command implement config file handling I moved it into main, to remove duplicate code. Also, because we are now propagating errors if you have an invalid config in ${HOME}/.yara-x.toml I improved error messages so it is clear what file failed to parse:

Error: Unable to parse /Users/wxs/.yara-x.toml: invalid type: found signed int `0`, expected a boolean for key "default.fmt.patterns.align_values" in TOML source string

Linting checks

To implement linting checks for required metadata and rule name format I chose to implement them in the compiler directly. I initially had an implementation with a stand-alone Linter type that worked on the AST from the parser. While this worked it turned out that getting it to use the existing Report/ReportBuilder mechanism was exceedingly intrusive as reporting is coupled too closely to the compiler (or at least I couldn't figure out how to make it work easily). Because of this I just went with implementing it directly in the compiler, which I think is also a good idea because we now get warnings anywhere the compiler is used, which can be quite beneficial for enforcing standards.

To use these linting checks I added a section in the config file and exposed the options for it there. See the config guide changes for details, but the TLDR is you can now specify a table (TOML term for a dictionary) where the key is the metadata identifier that is required and the value is the type that is required. Any rule that does not meet these requirements will get warnings from the compiler.

The other check is a regex that must match the rule name. Again, this is specified in the config file and warnings are issued by the compiler for any rule that does not match this regex.

The default value for both of these options is off so by default the compiler does not enforce any particular standards but you can opt in to it for linting if you want. The only place that currently sets these options is the yr check command.

If you're happy with these changes I'd like to also expose them in the various bindings so users can get lint warnings in the language of their choice.

wxsBSD added 4 commits January 7, 2025 15:43
This implements a basic linter into the compiler that can check for required
metadata fields and types. I chose to implement it into the compiler so that we
can get these checks for free across any tool/binding that uses the compiler
directly. Also, it was easier to get nicer reporting by building it into the
compiler.

This is an optional feature that you can turn on in the compiler using the
"required_metadata" method on the compiler. If you do this the compiler will
issue warnings if any metadata is missing or if the type is incorrect.

I've provided all the basic types (string, int, bool, float) and also some
convenience types like md5, sha1, sha256 and hash to make it a bit more flexible
for users. Extending these with more convenience types is an easy exercise too.
Prior to this change if you had an invalid config file (ie: using the wrong
type for an option) we would just silently replace it with the default config,
which made debugging way more difficult than it should have been.  We now pass
errors back up to the caller if the error is a because the config file can not
be parsed.

Errors now look like this:

```
Error: invalid type: found signed int `0`, expected a boolean for key "default.fmt.patterns.align_values" in .yara-x.toml TOML file
```

Also, because we now display errors like the above it doesn't make sense to have
each command have a -C option. If the config file in ${HOME}/.yara-x.toml is
improperly formatted we would try to parse it and display an error even before
we called the subcommand. This meant if you had an invalid config file in the
default location and used "yr fmt -C /tmp/foo ..." you would see the error from
your default config file despite the -C argument.

To deal with this the -C argument was removed from each command and moved into
the main one. It now loads the default config if, and only if, you do not
specify one with the -C argument.
Also a bunch of other smaller changes that got lumped in here.
Move file io out of config helper function and generate better error messages
when the file can't be parsed.
@wxsBSD wxsBSD changed the title Implement basic linting feat: Implement basic linting Jan 9, 2025
docs/YARA-X Config Guide.md Outdated Show resolved Hide resolved
wxsBSD added 4 commits January 9, 2025 11:27
rule_name_regexp now returns a Result. This makes error handling in the case of
an invalid (or too large) regex easier.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants