-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
graph-builder: add plugin to parse openshift/cincinnati-graph-data #231
graph-builder: add plugin to parse openshift/cincinnati-graph-data #231
Conversation
d99e81a
to
e50d406
Compare
e50d406
to
89cd657
Compare
|
||
/// If true causes the removal of all processed metadata from the releases. | ||
#[default(false)] | ||
pub remove_consumed_metadata: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to make this tuneable? Can we just always do it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can be helpful for debugging to leave them in, so see what annotations are on the release. Also for now it's still off by default, which is compatible with the existing deployments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be used to force the plugin to discard previous processed metadata and start with fresh metadata scanning ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This controls whether the metadata which is used by this plugin is removed after processing or not. As an example, if there is a previous.remove
with the value 1.2.3
metadata on a release, this plugin will remove the edge, in the graph. If remove_consumed_metadata
is true
in the plugin configuration, it will also remove the metadata entry with the key previous.remove
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, a few nitpicks and odd parts
@@ -11,4 +11,6 @@ if ! type -f "${YAML_LINTER}"; then | |||
exit 1 | |||
fi | |||
|
|||
find . -type f \( -name '*.yaml' -o -name '*.yml' \) -print0 | xargs -L 1 -0 "${YAML_LINT_CMD[@]}" | |||
find . \ | |||
-path "./graph-builder/src/plugins/openshift_secondary_metadata_parser/test_fixtures" -prune -o \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: define path as a var
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's only one usage of it though, and I could see us wanting to prune more than just this one. usually I'm all for constants if they benefit the DRY principle, but in this case it would just add indirection.
graph-builder/src/plugins/openshift_secondary_metadata_parser/plugin.rs
Outdated
Show resolved
Hide resolved
async fn process_channels(&self, io: &mut InternalIO) -> Fallible<()> { | ||
let channels_dir = self.settings.data_directory.join("channels"); | ||
let channels: Vec<graph_data_model::Channel> = | ||
deserialize_directory_files(&channels_dir, regex::Regex::new("ya+ml")?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps its worth sticking to yaml
only until file extensions go out of control :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is currently in line with #226. I think both (yml and yaml) are common extensions. Do you think we should make it configurable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are common, but I'd rather not leave a space for errors here.
@wking @LalatenduMohanty do you agree we should stick to ".yaml" and ignore (possible) ".yml" in graph-data repo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Standard practice for yaml file uses .yaml , .yml
. So we should only support these two extensions. Just supporting yaml
is fine too but then someone might request to add yml
as well. So supporting .yaml , .yml
makes us future proof.
@@ -53,7 +57,8 @@ impl PluginSettings for EdgeAddRemovePlugin { | |||
/// 2. next | |||
/// 2. *.remove | |||
/// 1. previous |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems both previous
and previous_regex
data would be applied. Shouldn't we prefer previous_regex
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the existing syntax can coexist with the additional one, so I don't see why we should remove the existing one. What makes you think we should?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd expect we use either previous
or previous_regex
in blocked-edges
, but current implementation allows both. This may lead to a few odd situations - lets throw error when both are specified?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see why it would be an error. They could be used in conjunction as well. Why do you want to restrict it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its not an error really, just feels that it might have undesirable consequences. Maybe I'm overreacting really as I can't come up with a potential issue on this one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it helps, I don't see a potential issue in this part of the code either, and I've thought about it quite a bit 🙂
&to.to_string() | ||
))? | ||
.insert( | ||
format!("{}.{}", self.settings.key_prefix, "previous.remove_regex"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not entirely sure this would be a regex - is it an exact version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If by this you mean blocked_edge.from
, it is defined to be here: https://github.com/openshift/cincinnati/blob/89cd657bd552efdd1f8bd2c3c57a038e6cfecf9d/graph-builder/src/plugins/openshift_secondary_metadata_parser/plugin.rs#L20-L22
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, correct, thanks!
89cd657
to
a44fe52
Compare
Thanks for the review! I addressed your comments either by code changes or answers. |
/retest |
a7768b3
to
4f3357f
Compare
Rebased and last TODO item completed! |
/retest |
@steveej Can we merge some of the commits with single line code changes still keep them logically separate? IMO all code changes related to the plugin should be just in one commit. |
4f3357f
to
1c35ec0
Compare
I suggest not squashing them, as it makes reverting commits more difficult in the future. Take ab718b4ab985601172e4345060990380693b2ba0 for example which exposes a constant. If this is squashed with the plugin commit, and we would want to revert the plugin commit, there is no guarantee the reversal won't break other code which in the meantime also depends on the public constant. |
…adata removal setting * Add `previous.remove_regex` metadata key to remove edges by regex matching. * Add a `remove_consumed_metadata` configuration option to control the removal of metadata after it has been processed.
Add openshift_secondary_metadata_parser plugin: * Implement plugin logic * Test using fixtures produced with a custom graph-builder to obtain a snapshot of the following triple: * a vanilla graph produced by the registry scraper plugin * a vanilla graph augmented by the quay secondary metadata plugin * the data from the openshift/cincinnati-graph-data repository The test asserts equality of two results of the EdgeAddRemovePlugin. The first result is obtianed by passing the raw graph through the new parser plugin and then through the EdgeAddRemovePlugin. The second passing the quay metadata graph fixture through the new parser plugin. This ensures compatibility between the new plugin and the QuayMetadataPlugin with respect to equality for subsequent processing of their produced metadata. Note: for this to work I had to add raw metadata to the fixtures, which represents metadata present on quay, but not present as part of the existing cincinnati-graph-data. This raw metadata has also been added to the upstream data repository.
1c35ec0
to
09fcff9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: steveeJ, vrutkovs The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Add openshift_secondary_metadata_parser plugin:
Implement plugin logic
Test using fixtures produced with a custom graph-builder to obtain a
snapshot of the following triple:
The test asserts equality of two results of the EdgeAddRemovePlugin.
The first result is obtianed by passing the raw graph through the new
parser plugin and then through the EdgeAddRemovePlugin.
The second passing the quay metadata graph fixture through the new
parser plugin.
This ensures compatibility between the new plugin and the
QuayMetadataPlugin with respect to equality for subsequent processing
of their produced metadata.
Note: for this to work I had to add raw metadata to the fixtures,
which represents metadata present on quay, but not present as part of
the existing cincinnati-graph-data. This raw metadata has also been
added to the upstream data repository.
Split off the original #226.
Tests which rely on the previous plugin?tokio::fs
functions or usespawn_blocking