graph-builder: add support for configuration file #69
Rebased to master and added graph-builder configuration options for #72 (comment).
@steveej I did a major rework on the PR in the direction of #69 (comment). This means that the While this definitely still needs some polishing, can you please do another pass to see if it suits your taste better than the previous iteration?
The overall approach looks very clean to me, especially the simplicity of the assemble function.
I would have liked some code generation (macros) for the merge implementations, but I won't block on that if no ready-made solution exists right now.
I have added a couple of comments to be addressed, and there are also some leftovers from the previous review. Thanks for exploring this so thoroughly!
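For readers without the diff at hand, the kind of hand-written merge being discussed might look roughly like the sketch below. Everything in it is hypothetical and for illustration only: the `MergeLayer` trait, the `ServiceOptions` fields, and the "earlier layers win" precedence are assumptions, not the PR's actual types or rules, and this `assemble` only borrows its name from the function mentioned above.

```rust
/// Hypothetical trait (not taken from the PR): fold a lower-priority
/// layer of options into `self`.
trait MergeLayer {
    fn merge(&mut self, lower: Self);
}

/// Hypothetical option set; the field names are illustrative only.
#[derive(Debug, Default, PartialEq)]
struct ServiceOptions {
    address: Option<String>,
    port: Option<u16>,
    pause_secs: Option<u64>,
}

impl MergeLayer for ServiceOptions {
    fn merge(&mut self, lower: Self) {
        // Values that are already set are kept; only gaps are filled
        // from the lower-priority layer.
        self.address = self.address.take().or(lower.address);
        self.port = self.port.or(lower.port);
        self.pause_secs = self.pause_secs.or(lower.pause_secs);
    }
}

/// Fold all layers into one option set. Layers are assumed to be
/// ordered from highest to lowest priority (for example CLI flags
/// first, then a configuration file).
fn assemble(layers: Vec<ServiceOptions>) -> ServiceOptions {
    let mut merged = ServiceOptions::default();
    for layer in layers {
        merged.merge(layer);
    }
    merged
}
```

Each field needs its own line in `merge`, which is the repetition that code generation would remove. Calling `assemble(vec![cli, file])` under these assumptions keeps any value set in `cli` and falls back to `file` for the rest.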
I added the final tests and documentation, completing the last TODOs. I dropped the WIP label; this is ready for the last round of review.
thanks for the nice rework!
there's a small doc change I'd like to make and a couple of questions/suggestions I added (please feel free to dismiss the suggestions).
This updates the cincinnati pod definition with the newer CLI option names.
This adds a user-doc listing the graph-builder TOML configuration options.
Pushed another update addressing the last comments, PTAL
🎉
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: lucab, steveeJ
The full list of commands accepted by this bot can be found here. The pull request process is described here
The default seems to have lived in the 30s range since at least 517af6e (graph-builder: add support for configuration file, 2019-03-12, openshift#69). But we can allow some latency here. On the cluster-version operator (CVO) side:

* cvo.New takes an argument for the poll interval [1].
* That is fed by resyncPeriod(o.ResyncInterval)() [2].
* ResyncInterval is set from minResyncPeriod [3].
* minResyncPeriod is 2 minutes [4].
* resyncPeriod adds some jitter (returning between 1x and 2x the input value) [5]. So the poll interval should be uniformly distributed between 2 and 4 minutes.

That puts a floor on latency in getting this information out to clusters.

It's also not a problem to add latency above the CVO-enforced floor. A delay of minutes in getting a new release out into a channel is not significant, because the biggest publication rush would be releasing a CVE fix into the stable channel. And that's already delayed by needing to cook in the candidate and fast channels, and eventually by the push to stable being a phased rollout. A delay in pulling an edge that has proven itself unstable is more serious, but we already have latencies in the minutes before alerts fire, and Telemetry/Insights that report issues to Red Hat add additional latencies in the minutes (Telemetry) to hours (Insights) range. Delaying the edge pull by a few more minutes is not going to greatly increase overall cluster-breaks -> edge-pulled latency here.

The above discussion hopefully shows that there is little downside to growing the latency up to several minutes. With this commit, the delay is 5 minutes plus the time it takes the plugins to execute (I'd initially argued for 10m, but was bargained back down to 5 [6]).

The upside is that it decreases our load on our external dependencies (quay.io, GitHub), reducing issues like:

  [2020-03-31T21:35:16Z DEBUG graph_builder::graph] graph update triggered
  [2020-03-31T21:35:20Z ERROR graph_builder::graph] Checking for new commit
  [2020-03-31T21:35:20Z ERROR graph_builder::graph] Parsing {"message":"API rate limit exceeded for user ID ...","documentation_url":"https://developer.github.com/v3/#rate-limiting"} to Vec<Branch>
  [2020-03-31T21:35:20Z ERROR graph_builder::graph] invalid type: map, expected a sequence at line 1 column 0
  [2020-03-31T21:35:50Z DEBUG graph_builder::graph] graph update triggered

It also reduces our CPU and network load, since the graph-builder pod spends more time resting in between recalculate-the-graph spikes. Just the load consideration is enough for me to want to tune this latency as high as we can while keeping it well below other edge-pulling latency components.

[1]: https://github.com/openshift/cluster-version-operator/blob/12623332615a0657a5468fae27cff1998a70bfef/pkg/cvo/cvo.go#L165
[2]: https://github.com/openshift/cluster-version-operator/blob/12623332615a0657a5468fae27cff1998a70bfef/pkg/start/start.go#L341
[3]: https://github.com/openshift/cluster-version-operator/blob/12623332615a0657a5468fae27cff1998a70bfef/pkg/start/start.go#L93
[4]: https://github.com/openshift/cluster-version-operator/blob/12623332615a0657a5468fae27cff1998a70bfef/pkg/start/start.go#L43
[5]: https://github.com/openshift/cluster-version-operator/blob/12623332615a0657a5468fae27cff1998a70bfef/pkg/start/start.go#L250-L251
[6]: openshift#264 (comment)
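The CVO-side jitter arithmetic in this commit message (a 2-minute minimum stretched by a factor between 1x and 2x) can be sketched as follows. This is a Rust illustration of the arithmetic only, not the CVO's Go code: the function names are hypothetical, and the random fraction is passed in by the caller so the sketch needs no external crate.

```rust
use std::time::Duration;

/// Stretch `base` by a factor between 1x and 2x, matching the jitter
/// described in the commit message. `fraction` stands in for a random
/// draw in [0, 1); out-of-range or NaN input is forced back into
/// bounds so the result always stays within [base, 2 * base].
fn jittered_period(base: Duration, fraction: f64) -> Duration {
    let fraction = if fraction.is_nan() {
        0.0
    } else {
        fraction.clamp(0.0, 1.0)
    };
    base.mul_f64(1.0 + fraction)
}

/// Lower and upper bounds of the jittered period for a given base.
fn resync_bounds(base: Duration) -> (Duration, Duration) {
    (jittered_period(base, 0.0), jittered_period(base, 1.0))
}
```

With `base = Duration::from_secs(120)`, `resync_bounds` yields 120 and 240 seconds, which is the 2-to-4-minute polling range the message derives.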
This reworks plugins setup and graph-builder CLI parser to allow for sourcing complex/structured options from configuration file(s). In particular, configuration logic now cares about the following entities:
Ref: https://jira.coreos.com/browse/CORS-967
/cc @steveej