graph-builder: use openshift/cincinnati-graph-data instead of quay labels #232

steveej · 2020-02-25T09:16:10Z

This configures the graph-builder binary to use the
GithubOpenshiftSecondaryMetadataScraperPlugin and the
OpenshiftSecondaryMetadataParserPlugin in conjunction to use
https://github.com/openshift/cincinnati-graph-data as the new source for
secondary metadata.

* Requires graph-builder: add plugin to fetch openshift/cincinnati-graph-data from github #226
* Requires graph-builder: add plugin to parse openshift/cincinnati-graph-data #231
* GitHub token support in scraper (graph-builder/plugins/github scraper: add token authorization handling #233)
* Configure the e2e test cluster to work with the new plugin (requires graph-builder: move plugins to cincinnati crate and catalog, and make plugins configurable #234, and possibly more)
* Based on openshift_secondary_metadata_parser: add 20200319.204124 test fixtures #250
* Based on Justfile: make running the graph-builder and e2e tests simpler #258

steveej · 2020-02-26T20:56:41Z

/retest

steveej · 2020-02-27T07:34:22Z

/retest

vrutkovs · 2020-02-27T09:42:21Z

We're now using prod metadata - but tests are still using channels "a" and "b"

steveej · 2020-02-27T10:12:30Z

> We're now using prod metadata - but tests are still using channels "a" and "b"

Note that we're only using prod secondary metadata, while using test primary metadata 😄

When I started working on a solution for this locally, I stumbled upon a real issue:

[2020-02-27T10:06:44Z ERROR graph_builder::graph] Parsing {"message":"API rate limit exceeded for 82.197.161.223. (But here's the good news: Authenticated requests get a higher rate limit. Check out the documentation for more details.)","documentation_url":"https://developer.github.com/v3/#rate-limiting"} to Vec<Branch>

Our default delay of 30 seconds is too short for GitHub's API. The good news is that we're not far off:

> For unauthenticated requests, the rate limit allows for up to 60 requests per hour. Unauthenticated requests are associated with the originating IP address, and not the user making requests.

I.e., raising the delay to 60 seconds, assuming our deployments have different IPs, will do it until we have authentication against the GitHub API. If we have consensus I will raise the delay to 60 in this PR as well.
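The arithmetic behind the 60-second figure can be sketched roughly as follows. This is only an illustration, not code from this PR: `min_delay_secs` and `requests_per_hour` are hypothetical helpers, and the sketch assumes that every scrape iteration issues a fixed number of API requests and that nothing else shares the IP address.

```rust
/// Smallest delay (in whole seconds) between scrape iterations that keeps
/// a client within an hourly request budget.
///
/// Hypothetical helper for illustration. Assumes each iteration issues
/// exactly `requests_per_scrape` requests against the rate-limited API.
fn min_delay_secs(limit_per_hour: u64, requests_per_scrape: u64) -> u64 {
    const SECS_PER_HOUR: u64 = 3600;
    assert!(limit_per_hour > 0, "the hourly limit must be positive");
    let needed = SECS_PER_HOUR * requests_per_scrape;
    // Ceiling division: round up so the budget is never exceeded.
    (needed + limit_per_hour - 1) / limit_per_hour
}

/// Number of requests per hour that a given delay produces.
fn requests_per_hour(delay_secs: u64, requests_per_scrape: u64) -> u64 {
    assert!(delay_secs > 0, "the delay must be positive");
    // Whole iterations that fit into one hour, times requests per iteration.
    3600 / delay_secs * requests_per_scrape
}
```

Under the one-request-per-iteration assumption, a 30-second delay produces 120 requests per hour, twice the unauthenticated limit of 60, while a 60-second delay lands exactly on it. If an iteration issues more than one request, the required delay grows proportionally.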

vrutkovs · 2020-02-27T10:22:04Z

> If we have consensus I will raise the delay to 60 in this PR as well.

SGTM, perhaps we should invest into token support (most likely we'll use that on stage / prod anyway)

steveej · 2020-02-27T10:25:50Z

> SGTM, perhaps we should invest into token support (most likely we'll use that on stage / prod anyway)

Definitely a mid-term goal, but I wouldn't block on it, as I think 60s, or even 120s is a small duration compared to the release frequency and the cluster sync frequency.

dist/openshift/cincinnati.yaml

steveej · 2020-02-27T11:46:47Z

As per @LalatenduMohanty's comment, I have changed my opinion on blocking on authentication. The comment reminded me of our intention to follow best continuous delivery practices, which means that we shouldn't compromise our quality in master (which feeds stage) if not necessary.
In this case it can be prevented.

I implemented authentication in a new commit. Now we need to get secrets for CI, staging and production and wire them through in the deployment templates.
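As a rough sketch of what token authorization against the GitHub API involves, the helper below builds the request headers with an optional token. It is not the plugin's actual code: `github_headers` is a hypothetical name, and the sketch relies on the `Authorization: token <TOKEN>` scheme and the mandatory `User-Agent` header that GitHub documents for its v3 API.

```rust
/// Build the HTTP headers for a GitHub API v3 request.
///
/// Hypothetical helper for illustration. When `token` is `None` or blank,
/// the request stays unauthenticated and is subject to the lower rate limit.
fn github_headers(user_agent: &str, token: Option<&str>) -> Vec<(String, String)> {
    let mut headers = vec![
        // GitHub rejects API requests that lack a User-Agent header.
        ("User-Agent".to_string(), user_agent.to_string()),
        ("Accept".to_string(), "application/vnd.github.v3+json".to_string()),
    ];

    // Trim the token (secrets read from files often end in a newline) and
    // ignore empty values so a blank secret does not produce a bad header.
    if let Some(token) = token.map(str::trim).filter(|t| !t.is_empty()) {
        headers.push(("Authorization".to_string(), format!("token {}", token)));
    }

    headers
}
```

Keeping the token optional means the same code path can serve local runs without a secret and deployments where CI, staging, or production inject one.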

graph-builder/src/plugins/github_openshift_secondary_metadata_scraper/plugin.rs

steveej · 2020-03-30T16:30:48Z

Amending my previous comment: we're not dealing with actual fluctuation of the CI, but I'm certain that this PR doesn't degrade the performance of the Cincinnati stack either. This PR does change the e2e deployment to use production data, which is the likely cause of performance changes.

steveej · 2020-03-30T17:42:55Z

/retest

vrutkovs · 2020-03-31T08:03:25Z

This LGTM, let's squash into a single commit

* GithubOpenshiftSecondaryMetadataScraperPlugin: fix path in debug output

  Unpacking happens to a tmpdir and not to the final output_directory. Reflect this in the debug message.

* GithubOpenshiftSecondaryMetadataScraperPlugin: add revision setting

  When `revision` is set, the plugin will try to download this revision from the configured repository instead of looking up the latest one.

* GithubOpenshiftSecondaryMetadataScraperPlugin: fix should_update evaluation

  Only download if the wanted and completed commits *do not* match. Previously this was mistakenly inverted.

* GithubOpenshiftSecondaryMetadataScraperPlugin: make directory handling safe and robust

  This changes the plugin to create and use two temporary directories inside the given output directory: one for extraction and one for the final output. The path of the final output directory is passed through to the following plugin via the IO parameters at the key defined by the public variable `GRAPH_DATA_DIR_PARAM_KEY`.

  With this change, the plugin only touches files within the configured output directory and never removes pre-existing files. Previously it violated both guarantees.

  The unit test is also changed to call the plugin twice, to ensure the plugin does not fail on subsequent runs.

* OpenshiftSecondaryMetadataParserPlugin: lookup data directory in parameters

  The data directory found in the IO parameters, if present, takes precedence over the configured data directory.
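The decision logic described in these commit messages can be sketched as three small functions. This is a simplified illustration under stated assumptions, not the plugins' real code: the function names, the signatures, and the value of `GRAPH_DATA_DIR_PARAM_KEY` are all made up here; only the key's name and the described behavior come from the commit messages.

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Key under which the scraper hands its output directory to the next plugin.
/// The constant's name comes from the commit message; this value is a placeholder.
pub const GRAPH_DATA_DIR_PARAM_KEY: &str = "example.graph-data.dir";

/// Pick the commit to fetch: a pinned `revision` wins over the latest commit.
fn wanted_commit<'a>(revision: Option<&'a str>, latest: &'a str) -> &'a str {
    revision.unwrap_or(latest)
}

/// Download only if the wanted commit differs from the last completed one.
/// Returning `true` on a match would be the inverted behavior the fix removes.
fn should_update(wanted: &str, completed: Option<&str>) -> bool {
    completed != Some(wanted)
}

/// Resolve the data directory for the parser: a directory passed through the
/// IO parameters takes precedence over the statically configured one.
fn resolve_data_dir(params: &HashMap<String, String>, configured: PathBuf) -> PathBuf {
    params
        .get(GRAPH_DATA_DIR_PARAM_KEY)
        .map(PathBuf::from)
        .unwrap_or(configured)
}
```

Treating "nothing completed yet" as `None` makes the first run download unconditionally, and subsequent runs with an unchanged commit become no-ops.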

…ondaryMetadataParser We want to use an OpenShift-specific default value for the key_prefix setting. However, a default value for the key_prefix string that differs from the string type's own default (the empty string) is not intuitive. This can be alleviated by making the default value a publicly exposed constant, which makes it transparent.
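The pattern described here, a non-empty default made visible through a public constant, might look roughly like the following. `ParserSettings` and `DEFAULT_KEY_PREFIX` are hypothetical names, and the prefix value is an assumption chosen for illustration, not taken from this PR.

```rust
/// Publicly exposed default so that callers can see and reuse the non-empty
/// value. The concrete string is an assumption made for this sketch.
pub const DEFAULT_KEY_PREFIX: &str = "io.openshift.upgrades.graph";

/// Hypothetical settings struct standing in for the parser plugin's settings.
#[derive(Debug, Clone, PartialEq)]
pub struct ParserSettings {
    pub key_prefix: String,
}

impl Default for ParserSettings {
    fn default() -> Self {
        Self {
            // Deliberately not `String::default()`: the constant documents
            // where the surprising non-empty default comes from.
            key_prefix: DEFAULT_KEY_PREFIX.to_string(),
        }
    }
}
```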

…bstract releases This switches to using the `Graph`'s implementation of `Eq` for comparison and excludes abstract releases in the edge debug output.
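A minimal sketch of this comparison approach follows. The `Release` and `Graph` types below are simplified stand-ins invented for illustration, not the cincinnati crate's actual types, and the derived `Eq` here is order-sensitive, which the real implementation may not be.

```rust
/// Simplified stand-in for a release node (hypothetical, for illustration).
#[derive(Debug, Clone, PartialEq, Eq)]
enum Release {
    /// Placeholder known only by version, without metadata.
    Abstract { version: String },
    Concrete { version: String, payload: String },
}

/// Simplified stand-in for the graph; edges index into `releases`.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Graph {
    releases: Vec<Release>,
    edges: Vec<(usize, usize)>,
}

fn concrete_version(release: &Release) -> Option<&str> {
    match release {
        Release::Concrete { version, .. } => Some(version.as_str()),
        Release::Abstract { .. } => None,
    }
}

impl Graph {
    /// Edges as "from -> to" strings, skipping edges that touch an abstract release.
    fn concrete_edges(&self) -> Vec<String> {
        self.edges
            .iter()
            .filter_map(|&(from, to)| {
                let from = concrete_version(self.releases.get(from)?)?;
                let to = concrete_version(self.releases.get(to)?)?;
                Some(format!("{} -> {}", from, to))
            })
            .collect()
    }
}

/// Compare via `Eq`; on mismatch, report only edges between concrete releases.
fn compare(expected: &Graph, actual: &Graph) -> Result<(), String> {
    if expected == actual {
        return Ok(());
    }
    Err(format!(
        "graphs differ\nexpected edges: {:?}\nactual edges: {:?}",
        expected.concrete_edges(),
        actual.concrete_edges()
    ))
}
```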

vrutkovs

/lgtm

🎉

LalatenduMohanty · 2020-03-31T10:53:02Z

/hold I am still going through the PR

I previously assumed the dev-dependency would refer to the version of the same dependency from the regular dependency list. According to this error my assumption was wrong.

```
cincinnati/Cargo.toml: dependency (prettydiff) specified without providing a local path, Git repository, or version to use. This will be considered an error in future versions
```

For now I don't see a better way than to manage these versions separately, while bumping them in lockstep.
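A manifest that avoids this warning might look roughly like the fragment below. The version number is purely illustrative and not taken from this PR; the point is only that each section carries its own version requirement.

```toml
[dependencies]
# Illustrative version; use whatever the crate actually pins.
prettydiff = "0.3"

[dev-dependencies]
# A dev-dependency does not inherit the version from [dependencies];
# it needs its own version, path, or git source. Bump both in lockstep.
prettydiff = "0.3"
```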

* graph-builder: by default use openshift/cincinnati-graph-data instead of quay labels

  This configures the graph-builder binary to use, by default, the GithubOpenshiftSecondaryMetadataScraperPlugin and the OpenshiftSecondaryMetadataParserPlugin in conjunction, with https://github.com/openshift/cincinnati-graph-data as the new source for secondary metadata. It also removes the NodeRemovePlugin because it's not envisioned to be used within the secondary metadata schema.

* tests/e2e: test with cincinnati-graph-data

  This switches the e2e test to verifying a Cincinnati stack which is configured to fetch metadata from a specific revision of the cincinnati-graph-data repository [0]. It also changes the test to reuse the cincinnati graph comparison functionality, which offers helpful human-readable failure output.

  [0]: https://github.com/openshift/cincinnati-graph-data

* dist/openshift: scrape secondary metadata from github

  This changes the OpenShift template plugin configuration to use the GitHub scraper plugins by default.

* Justfile: configure scraping from GitHub and add e2e recipes

* Justfile: refactor graph-builder arguments to global variables

  This makes it easier to override variables for the `run-graph-builder` and related recipes where it is a transitive dependency.

The previous change to using production data in the e2e graph-builder configuration had a significant impact on the load balancing test results. Until we have better tracing capabilities we simply lower the expectations.

steveej · 2020-03-31T13:02:28Z

> New changes are detected. LGTM label has been removed.

Just reworded a commit message, no content changes:

$ git diff 814bee1 a8cddc8 | wc -l
0

LalatenduMohanty · 2020-03-31T13:17:02Z

/hold cancel
/lgtm

openshift-ci-robot · 2020-03-31T13:17:28Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: LalatenduMohanty, steveeJ, vrutkovs

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [LalatenduMohanty,steveeJ,vrutkovs]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Feb 25, 2020

openshift-ci-robot requested review from LalatenduMohanty and rrati February 25, 2020 09:16

openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 25, 2020

steveej mentioned this pull request Feb 25, 2020

graph-builder: add plugin to fetch openshift/cincinnati-graph-data from github #226

Merged

7 tasks

steveej force-pushed the pr/graph-builder-secondary-metadata-from-github branch from 9c82118 to 7efc17b Compare February 26, 2020 19:38

steveej changed the title ~~[WIP/BLOCKED] graph-builder: use openshift/cincinnati-graph-data instead of quay labels~~ graph-builder: use openshift/cincinnati-graph-data instead of quay labels Feb 26, 2020

openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 26, 2020

steveej requested review from vrutkovs and removed request for rrati February 26, 2020 19:57

steveej force-pushed the pr/graph-builder-secondary-metadata-from-github branch from 7efc17b to d29f684 Compare February 27, 2020 07:37

LalatenduMohanty requested changes Feb 27, 2020

dist/openshift/cincinnati.yaml (outdated, resolved)

steveej force-pushed the pr/graph-builder-secondary-metadata-from-github branch from 4c4d0a0 to 0cba757 Compare February 27, 2020 12:00

openshift-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Feb 27, 2020

steveej force-pushed the pr/graph-builder-secondary-metadata-from-github branch 3 times, most recently from 27a69e4 to 6261258 Compare February 27, 2020 14:14

vrutkovs reviewed Mar 4, 2020

graph-builder/src/plugins/github_openshift_secondary_metadata_scraper/plugin.rs (outdated, resolved)

openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 4, 2020

steveej force-pushed the pr/graph-builder-secondary-metadata-from-github branch from b7287c4 to 374778c Compare March 11, 2020 08:54

openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 11, 2020

steveej added 3 commits March 31, 2020 11:37

cincinnati/testing/compare_graph: compare graph directly and ignore a…

b99a978

steveej force-pushed the pr/graph-builder-secondary-metadata-from-github branch from d900722 to 814bee1 Compare March 31, 2020 09:40

vrutkovs approved these changes Mar 31, 2020

openshift-ci-robot assigned vrutkovs Mar 31, 2020

openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Mar 31, 2020

openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 31, 2020

steveej added 7 commits March 31, 2020 14:59

e2e/Cargo/deps: add lazy_static, cincinnati

f677ee1

config: ensure path_prefix is parsed alike for cli and file config

f7014b1

graph-builder/Cargo: remove stale deps

c0a0443

e2e/load testing: reduce expected rate

a8cddc8

steveej force-pushed the pr/graph-builder-secondary-metadata-from-github branch from 814bee1 to a8cddc8 Compare March 31, 2020 13:00

openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Mar 31, 2020

openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 31, 2020

openshift-ci-robot assigned LalatenduMohanty Mar 31, 2020

openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Mar 31, 2020

openshift-merge-robot merged commit 90efacd into openshift:master Mar 31, 2020

steveej deleted the pr/graph-builder-secondary-metadata-from-github branch March 31, 2020 15:02

steveej mentioned this pull request Mar 31, 2020

e2e/load testing: use production data #261

Closed
