-
Notifications
You must be signed in to change notification settings - Fork 905
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor rules loading result #2098
Conversation
fb55b0f
to
3ca9396
Compare
/milestone 0.33.0 |
3ca9396
to
80f96d9
Compare
One follow-on change I'd like to make is to extend the context struct to point inside a condition string/output string. I think that might require some libs changes such as annotating ASTs with yaml file positions. I see that the parser object has a position but I think for multi-line strings the yaml parsing changes the position. Let me know if I'm wrong on that and I can work on merging the positions. |
2ddb743
to
a0d4d11
Compare
Edit: rebased and dropped now. |
Here are some example outputs that use the result structure and print the contents in various ways: Validating some files, no errors:
Validating some files, some errors + warnings. This version of the output is terse, which matches old behavior:
If you add
And if you add
|
Loading rules is pretty much the same, although you only get a terse error message on error:
And human-readable details with
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this effort Mark! I tried my best to review this, left you some comments.
On top of that, I think we also need to change the documentation here:
Line 77 in 35db0b4
json_output: false |
json_output
now does more than just formatting rule outputs, so I think we should document it properly. Alternatively, we should further separate responsibilities and have a separate config field like json_load_result
.
userspace/engine/falco_load_result.h
Outdated
#include <nlohmann/json.hpp> | ||
|
||
// Represents the result of loading a rules file. | ||
class falco_load_result { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest renaming this into load_result
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I think we should put it in a falco namespace though as load_result by itself is generic. That brings up the question of whether we should add a namespace to falco_engine as well? The engine doesn't have any namespace at the moment, which is why I didn't use one for falco_load_result.
I don't want to get carried away with changes, so how about this--in this PR I simply add a falco namespace around load_result, and once this is merged we add a top level namespace to falco_engine in a follow-on PR? Maybe with an additional rename of the falco_engine class, we can bikeshed about it in the follow-on PR.
item_name: some macro | ||
code: FE_LOAD_ERR_VALIDATE | ||
message: "Undefined macro 'foo' used in filter." | ||
validate_warnings: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This probably connects to my other comment: here the warning is not meaningful. We should have failed at the first error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although I didn't want to tackle it in this PR, which is already quite big, I was hoping that someday rules loading would not always fail at the first error. The exception-based method is a good way to stop loading and unwind to a top-level location, but it will only show the first problem, not all of the potential problems in a set of rules.
With this version, the result contains any warnings up until the first error, or all warnings if there are no errors.
I think having the validate_warnings/validate_errors properties as a list will support a future change to return multiple errors/warnings if we decide to change the implementation. We'd just have to change the expected values for any tests that have multiple errors, or additional warnings after any errors.
Does that sound ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your point. I agree that validate_warnings
and validate_errors
must be lists to support multiple warnings and errors in the future, however I think changing our "fail at first error" model might be too much for a single PR. I'd suggest keeping the failure condition as-is for now, and eventually change it in a separate PR in which we document the rationale. Note that this would probably introduce user-facing breaking changes, so we might be to be careful about it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I agree on not changing the rules loading code, it will not strive to return all errors. But I think it's fine to keep the test code the same, right? That's only visible internally and end users won't see the test code.
for vres in vobj["falco_load_results"]: | ||
for warning in vres["warnings"]: | ||
if warning["code"] == warnobj["code"]: | ||
if ("message" in warnobj and warning["message"] == warnobj["message"]) or ("message_contains" in warnobj and warnobj["message_contains"] in warning["message"]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So basically message
and message_contains
seem to be mandatory, right? Should we make it optional? Should we document this somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
message_contains is only used for the test engine_version_mismatch, where the error message will be dependent on the current embedded falco engine version e.g. "Rules require engine version 9999999, but engine version is 14". If it's a big deal I could switch everything to be a substring match, but I thought the distinction was useful.
I believe all of the other error messages are static--they used to contain "in rule xxx" snippets but all of that info is in the context instead. The engine version is a strange dynamic thing that doesn't really fit into a context.
a0d4d11
to
0f601de
Compare
Related to the changes in falcosecurity/falco#2098, update the docs for the json_output config option to note that it controls both the output format for falco alerts as well as the format of rules loading/validation results. Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
LGTM label has been added. Git tree hash: 6fa1cdef4ca1b62680725be188d3a1a553cf5ff3
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @mstemm
Thank for this PR. It's a significant improvement.
Overall, it looks good to me. However, I've got a concern.
I don't see any real value in having non-verbose messages. The old behavior when loading rules was always to print verbose messages. It was advantageous. Think users that just copy/paste their Pod logs in a GitHub issue. Forcing them to redeploy Falco with -v
would waste time.
That being said, I approve this PR anyway because I think it's valuable. Yet, I'd invite all @falcosecurity/falco-maintainers to reconsider verbosity mode when loading/validating rules. I believe verbosity should be enabled by default in those cases, or the previous behavior should be kept.
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jasondellaluce, leogr, mstemm The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
You're right about verbose--I really thought that the old behavior was to only print full details on the error when verbose was true. But I double-checked and 0.32.1 does print details both for |
Thank you, Mark! 🙏 |
In #2098, we reworked how rules loading errors/warnings were returned to provide a richer set of information, including locations/context for the errors/warnings. That did *not* include locations within condition expressions, though. When parsing a condition expression resulted in a warning/error, the location simply pointed to the condition property of the rule. This commit improves this to handle parse errors: - When libsinsp::filter::parser::parse() throws an exception, use get_pos() to get the position within the condition string. - Add a new context() constructor that takes a filter pos_info instead of a YAML::Mark. Doing this required some small changes to the context struct, though. Previously, the name (e.g. yaml filename) was held separate from the list of locations. However, condition strings don't directly reflect any content in the yaml file. Yaml block scalars remove whitespace/unwrap strings. Macro references/list references are replaced with their contents. To handle this, move the "name" (e.g. filename) from the context into each location. This allows a chain of locations to start with file positions but transition to offsets within a condition expression. Also allow a context to contain an alternate content string which is used to build the snippet. For contexts related to condition strings, the content is the condition. Finally, when printing snippets that contain very long lines (> a static const 160 chars), instead of printing the entire line, print the 160 chars surrounding the position. Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
In #2098, we reworked how rules loading errors/warnings were returned to provide a richer set of information, including locations/context for the errors/warnings. That did *not* include locations within condition expressions, though. When parsing a condition expression resulted in a warning/error, the location simply pointed to the condition property of the rule. This commit improves this to handle parse errors: - When libsinsp::filter::parser::parse() throws an exception, use get_pos() to get the position within the condition string. - Add a new context() constructor that takes a filter pos_info instead of a YAML::Mark. Doing this required some small changes to the context struct, though. Previously, the name (e.g. yaml filename) was held separate from the list of locations. However, condition strings don't directly reflect any content in the yaml file. Yaml block scalars remove whitespace/unwrap strings. Macro references/list references are replaced with their contents. To handle this, move the "name" (e.g. filename) from the context into each location. This allows a chain of locations to start with file positions but transition to offsets within a condition expression. Also allow a context to contain an alternate content string which is used to build the snippet. For contexts related to condition strings, the content is the condition. Finally, when printing snippets that contain very long lines (> a static const 160 chars), instead of printing the entire line, print the 160 chars surrounding the position. Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
The latest released falco always prints full details on errors when used with -r (read rules)/-V (validate rules). However #2098 changed this to only print full details when verbose is true. Fix the regression by always printing full details regardless of verbose/-v. Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
In #2098, we reworked how rules loading errors/warnings were returned to provide a richer set of information, including locations/context for the errors/warnings. That did *not* include locations within condition expressions, though. When parsing a condition expression resulted in a warning/error, the location simply pointed to the condition property of the rule. This commit improves this to handle parse errors: - When libsinsp::filter::parser::parse() throws an exception, use get_pos() to get the position within the condition string. - Add a new context() constructor that takes a filter pos_info instead of a YAML::Mark. Doing this required some small changes to the context struct, though. Previously, the name (e.g. yaml filename) was held separate from the list of locations. However, condition strings don't directly reflect any content in the yaml file. Yaml block scalars remove whitespace/unwrap strings. Macro references/list references are replaced with their contents. To handle this, move the "name" (e.g. filename) from the context into each location. This allows a chain of locations to start with file positions but transition to offsets within a condition expression. Also allow a context to contain an alternate content string which is used to build the snippet. For contexts related to condition strings, the content is the condition. Finally, when printing snippets that contain very long lines (> a static const 160 chars), instead of printing the entire line, print the 160 chars surrounding the position. Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
In #2098, we reworked how rules loading errors/warnings were returned to provide a richer set of information, including locations/context for the errors/warnings. That did *not* include locations within condition expressions, though. When parsing a condition expression resulted in a warning/error, the location simply pointed to the condition property of the rule. This commit improves this to handle parse errors: - When libsinsp::filter::parser::parse() throws an exception, use get_pos() to get the position within the condition string. - Add a new context() constructor that takes a filter pos_info instead of a YAML::Mark. Doing this required some small changes to the context struct, though. Previously, the name (e.g. yaml filename) was held separate from the list of locations. However, condition strings don't directly reflect any content in the yaml file. Yaml block scalars remove whitespace/unwrap strings. Macro references/list references are replaced with their contents. To handle this, move the "name" (e.g. filename) from the context into each location. This allows a chain of locations to start with file positions but transition to offsets within a condition expression. Also allow a context to contain an alternate content string which is used to build the snippet. For contexts related to condition strings, the content is the condition. Finally, when printing snippets that contain very long lines (> a static const 160 chars), instead of printing the entire line, print the 160 chars surrounding the position. Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
In #2098, we reworked how rules loading errors/warnings were returned to provide a richer set of information, including locations/context for the errors/warnings. That did *not* include locations within condition expressions, though. When parsing a condition expression resulted in a warning/error, the location simply pointed to the condition property of the rule. This commit improves this to handle parse errors: - When libsinsp::filter::parser::parse() throws an exception, use get_pos() to get the position within the condition string. - Add a new context() constructor that takes a filter pos_info instead of a YAML::Mark. Doing this required some small changes to the context struct, though. Previously, the name (e.g. yaml filename) was held separate from the list of locations. However, condition strings don't directly reflect any content in the yaml file. Yaml block scalars remove whitespace/unwrap strings. Macro references/list references are replaced with their contents. To handle this, move the "name" (e.g. filename) from the context into each location. This allows a chain of locations to start with file positions but transition to offsets within a condition expression. Also allow a context to contain an alternate content string which is used to build the snippet. For contexts related to condition strings, the content is the condition. Finally, when printing snippets that contain very long lines (> a static const 160 chars), instead of printing the entire line, print the 160 chars surrounding the position. Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
The latest released falco always prints full details on errors when used with -r (read rules)/-V (validate rules). However #2098 changed this to only print full details when verbose is true. Fix the regression by always printing errors when loading rules. Warnings will be printed only with -v. Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
Related to the changes in falcosecurity/falco#2098, update the docs for the json_output config option to note that it controls both the output format for falco alerts as well as the format of rules loading/validation results. Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
The latest released falco always prints full details on errors when used with -r (read rules)/-V (validate rules). However #2098 changed this to only print full details when verbose is true. Fix the regression by always printing errors when loading rules. Warnings will be printed only with -v. Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
In #2098 and #2158, we reworked how rules loading errors/warnings were returned to provide a richer set of information, including locations/context for the errors/warnings. That did *not* include locations within condition expressions, though. When parsing a condition expression resulted in a warning/error, the location simply pointed to the condition property of the rule. This commit improves this to handle parse errors: - When libsinsp::filter::parser::parse() throws an exception, use get_pos() to get the position within the condition string. - Add a new context() constructor that takes a filter pos_info instead of a YAML::Mark. Now that positions aren't always related to the location of yaml nodes, Make up a generic "position" struct for locations and convert YAML::Mark and parser positions to a position struct. Also allow a context to contain an alternate content string which is used to build the snippet. For contexts related to condition strings, the content is the condition. Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
In #2098 and #2158, we reworked how rules loading errors/warnings were returned to provide a richer set of information, including locations/context for the errors/warnings. That did *not* include locations within condition expressions, though. When parsing a condition expression resulted in a warning/error, the location simply pointed to the condition property of the rule. This commit improves this to handle parse errors: - When libsinsp::filter::parser::parse() throws an exception, use get_pos() to get the position within the condition string. - Add a new context() constructor that takes a filter pos_info instead of a YAML::Mark. Now that positions aren't always related to the location of yaml nodes, Make up a generic "position" struct for locations and convert YAML::Mark and parser positions to a position struct. Also allow a context to contain an alternate content string which is used to build the snippet. For contexts related to condition strings, the content is the condition. Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
In #2098 and #2158, we reworked how rules loading errors/warnings were returned to provide a richer set of information, including locations/context for the errors/warnings. That did *not* include locations within condition expressions, though. When parsing a condition expression resulted in a warning/error, the location simply pointed to the condition property of the rule. This commit improves this to handle parse errors: - When libsinsp::filter::parser::parse() throws an exception, use get_pos() to get the position within the condition string. - Add a new context() constructor that takes a filter pos_info instead of a YAML::Mark. Now that positions aren't always related to the location of yaml nodes, Make up a generic "position" struct for locations and convert YAML::Mark and parser positions to a position struct. Also allow a context to contain an alternate content string which is used to build the snippet. For contexts related to condition strings, the content is the condition. Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
In #2098 and #2158, we reworked how rules loading errors/warnings were returned to provide a richer set of information, including locations/context for the errors/warnings. That did *not* include locations within condition expressions, though. When parsing a condition expression resulted in a warning/error, the location simply pointed to the condition property of the rule. This commit improves this to handle parse errors: - When libsinsp::filter::parser::parse() throws an exception, use get_pos() to get the position within the condition string. - Add a new context() constructor that takes a filter pos_info instead of a YAML::Mark. Now that positions aren't always related to the location of yaml nodes, Make up a generic "position" struct for locations and convert YAML::Mark and parser positions to a position struct. Also allow a context to contain an alternate content string which is used to build the snippet. For contexts related to condition strings, the content is the condition. Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
In #2098 and #2158, we reworked how rules loading errors/warnings were returned to provide a richer set of information, including locations/context for the errors/warnings. That did *not* include locations within condition expressions, though. When parsing a condition expression resulted in a warning/error, the location simply pointed to the condition property of the rule. This commit improves this to handle parse errors: - When libsinsp::filter::parser::parse() throws an exception, use get_pos() to get the position within the condition string. - Add a new context() constructor that takes a filter pos_info instead of a YAML::Mark. Now that positions aren't always related to the location of yaml nodes, Make up a generic "position" struct for locations and convert YAML::Mark and parser positions to a position struct. Also allow a context to contain an alternate content string which is used to build the snippet. For contexts related to condition strings, the content is the condition. Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
In #2098 and #2158, we reworked how rules loading errors/warnings were returned to provide a richer set of information, including locations/context for the errors/warnings. That did *not* include locations within condition expressions, though. When parsing a condition expression resulted in a warning/error, the location simply pointed to the condition property of the rule. This commit improves this to handle parse errors: - When libsinsp::filter::parser::parse() throws an exception, use get_pos() to get the position within the condition string. - Add a new context() constructor that takes a filter pos_info instead of a YAML::Mark. Now that positions aren't always related to the location of yaml nodes, Make up a generic "position" struct for locations and convert YAML::Mark and parser positions to a position struct. Also allow a context to contain an alternate content string which is used to build the snippet. For contexts related to condition strings, the content is the condition. Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
In #2098 and #2158, we reworked how rules loading errors/warnings were returned to provide a richer set of information, including locations/context for the errors/warnings. That did *not* include locations within condition expressions, though. When parsing a condition expression resulted in a warning/error, the location simply pointed to the condition property of the rule. This commit improves this to handle parse errors: - When libsinsp::filter::parser::parse() throws an exception, use get_pos() to get the position within the condition string. - Add a new context() constructor that takes a filter pos_info instead of a YAML::Mark. Now that positions aren't always related to the location of yaml nodes, Make up a generic "position" struct for locations and convert YAML::Mark and parser positions to a position struct. Also allow a context to contain an alternate content string which is used to build the snippet. For contexts related to condition strings, the content is the condition. Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
Related to the changes in falcosecurity/falco#2098, update the docs for the json_output config option to note that it controls both the output format for falco alerts as well as the format of rules loading/validation results. Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
What type of PR is this?
/kind feature
Any specific area of the project related to this PR?
/area engine
/area tests
What this PR does / why we need it:
When loading rules, return a structure with the load result and lists of errors/warnings + file locations.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?: