update: distribute cincinnati-graph-data as container images #310
Conversation
## Open Questions

### Cincinnati Git(Hub) scraper deprecation
We need to decide whether we want to maintain the Git(Hub) scraper plugin in the future or drop it altogether.
---
+1 to drop it. Customers can mount the volume with a cloned repo / extracted tarball and configure the operator to use it, in case they want to mutate the graph.
---
...and configure operator to use it...
Not clear to me how you'd do that with the current operator config, but sure, extending the config is possible. And yeah, with a Cincinnati operator to handle this sort of thing, I'd like to do it that way instead of having Git logic baked into Cincinnati.
---
I think discussing the operator config exposure is out of scope for this enhancement.
---
On the open question: I'll add the decision to deprecate the Git plugin to the proposal, mentioning that Cincinnati can be configured to read these files directly from a user-provided directory.
- Implementation details about the streamlining of the mirroring process for primary and secondary release metadata.

## Proposal
---
Some of your implementation-level wording from Goals should move down into here.
---
I'm entirely lost on the meaning of "goals" vs "proposal" in the context of an enhancement.
---
I'm entirely lost on the meaning of "goals" vs "proposal" in the context of an enhancement.
"goals" is what you'd pitch to project managers, end users, and other folks who are curious about how you're scoping the enhancement and what it will accomplish. "proposal" spells out the technical implementation to team members and other developers who will be reviewing the implementation. Might help to think about "non-goals", and try to have your goals match the level of detail you used for non-goals when saying "these are out of scope".
---
I've split this up. Please take another look to see if it matches your understanding of the two sections.
The content of the cincinnati-graph-data repository may be altered by the process of being packaged by CPaaS and distributed via the container image registry.

#### Mitigation
A mitigation strategy is to sign the resulting image in the CPaaS pipeline, publish the signatures, and have Cincinnati verify the image after download.
---
Maybe just "have Cincinnati verify the signature.", because ideally the sig-check happens before downloading much, to avoid tarbombs and huge-layer DoS sorts of things.
What is the plan for teaching Cincinnati about trusted keys? Do we have a config setting where we inline an ASCII-armored GPG key that should be trusted? And then reboot Cincinnati when rotating keys? Teach Cincinnati to be able to configure multiple trusted signers to allow more graceful key rotation? Which image-signature-distribution mechanisms do we support?
---
Maybe just "have Cincinnati verify the signature.", because ideally the sig-check happens before downloading much, to avoid tarbombs and huge-layer DoS sorts of things.
We want to ensure that the image is signed, which is what the previous part of the sentence is about. In my understanding, signature checking involves verifying the content integrity. Are you suggesting to download and verify the signatures first by themselves, and only on success proceed to download and verify the content?
What is the plan for teaching Cincinnati about trusted keys?
There's no plan, so we can discuss it here. Do you think adding the results under the mitigation section is appropriate?
Do we have a config setting where we inline an ASCII-armored GPG key that should be trusted? And then reboot Cincinnati when rotating keys? Teach Cincinnati to be able to configure multiple trusted signers to allow more graceful key rotation?
Having an array of ASCII-armored GPG pubkeys sounds like a viable plan. This would be a plugin setting.
Which image-signature-distribution mechanisms do we support?
I was thinking about shipping them in a container image as well. Just to be sure, it would be a single signature, right?
---
Are you suggesting to download and verify the signatures first by themselves, and only on success proceed to download and verify the content?
Yeah. The signature asserts a pullspec-digest association (e.g. "sha256:817a12c32a39bbe394944ba49de563e085f1d3c5266eb8e9723256bc4448680e is a valid digest for docker.io/library/busybox:latest"). If you lookup by tag, the tag also provides a pullspec-digest association. If you don't find any signatures you trust asserting the pullspec-digest pair claimed by the tag, then there's no need to download anything further. If you find a signature you trust that asserts the pullspec-digest pair claimed by the tag, you can safely proceed to download the by-digest image. And your pulling library needs to ensure that downloaded content matches the claimed digests (e.g. that the top-level manifest for the image hashes to sha256:817a12...), to safely transfer the trust established by the valid signature.
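As a concrete illustration of that pullspec-digest association, here is a minimal Go sketch (not from this PR) of the shape of a simple-signing claim, following the containers/image "atomic container signature" format; the digest and pullspec values are only illustrative:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// signatureClaim models the payload of a simple-signing ("atomic container
// signature") claim: it binds a pull spec (identity) to a manifest digest.
type signatureClaim struct {
	Critical struct {
		Type  string `json:"type"` // "atomic container signature"
		Image struct {
			DockerManifestDigest string `json:"docker-manifest-digest"`
		} `json:"image"`
		Identity struct {
			DockerReference string `json:"docker-reference"`
		} `json:"identity"`
	} `json:"critical"`
	Optional map[string]interface{} `json:"optional,omitempty"`
}

func main() {
	raw := `{"critical":{"type":"atomic container signature","image":{"docker-manifest-digest":"sha256:817a12c32a39bbe394944ba49de563e085f1d3c5266eb8e9723256bc4448680e"},"identity":{"docker-reference":"docker.io/library/busybox:latest"}}}`
	var claim signatureClaim
	if err := json.Unmarshal([]byte(raw), &claim); err != nil {
		panic(err)
	}
	// A verifier checks the GPG signature over this payload first, then pulls
	// by the asserted digest and confirms the downloaded manifest hashes to it.
	fmt.Println(claim.Critical.Identity.DockerReference, "->", claim.Critical.Image.DockerManifestDigest)
}
```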
Do you think adding the results under the mitigation section is appropriate?
I think there will need to be something in the proposal section about how users will tell Cincinnati which signers they trust, and about which protocol(s) Cincinnati will use to look up signatures. Then the mitigation section can reference that portion of the proposal and say:
We're mitigating this risk [with image signatures](#link-to-proposal-section-about-signatures).
Having an array of ASCII-armored GPG pubkeys sounds like a viable plan.
Might be easier to skip the array and accept an ASCII-armored keyring, since a keyring can already contain multiple keys.
I was thinking about shipping them in a container image as well.
Not possible, because the signature includes the hash of the image's top-level manifest, and unless you've broken the hash you cannot have a Merkle loop like:
1. Image layer, including a signature signing the manifest hash from (2).
2. Manifest, including a hash of the layer from (1).
Just to be sure, it would be a single signature, right?
Other schemes are possible (e.g. require both ops and prod-sec signatures). Formats like containers-policy.json(5) allow folks to declare policies like that. For Red Hat and Cincinnati though, I think "we can find at least one signature that is trusted by at least one of the signers in the configured keyring" is probably enough, at least for an initial implementation.
---
Not possible, because the signature includes the hash of the image's top-level manifest, and unless you've broken the hash you cannot have a Merkle loop like:
I think it's possible. What I mean is have the signatures in a separate repository, while you're discussing having them in a subsequent layer of an image in the same repository.
---
Yeah. The signature asserts a pullspec-digest association ...
That doesn't answer my question. I tried to ask whether you think trust in the signing certificate should be established before even proceeding to verify content integrity.
Might be easier to skip the array and accept an ASCII-armored keyring, since a keyring can already contain multiple keys.
I'm no expert here, but a keyring sounds like a more complex data structure than an array of keys. Do you think it's important to lay this implementation detail out here? I'm fine if it is, but I'd be surprised.
---
Using a keyring may make it easier to implement verification by re-using logic (ported to Rust, I suppose) already in use in oc and the CVO (https://github.com/openshift/cluster-version-operator/blob/e318b86d674ccc850266b74169a0a4403d3b633b/pkg/verify/verify.go#L227-L241). Not sure, but something to consider.
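For illustration only, a minimal Go sketch of what keyring-based verification could look like, using golang.org/x/crypto/openpgp (the library the linked CVO verify code builds on); the function name and the assumption that the signature is an OpenPGP-signed message wrapping the claim are mine, not Cincinnati's actual API:

```go
package main

import (
	"bytes"
	"fmt"
	"io/ioutil"
	"log"
	"os"
	"strings"

	"golang.org/x/crypto/openpgp"
)

// verifyWithKeyring checks an OpenPGP-signed message against an ASCII-armored
// keyring (which may contain several trusted keys) and returns the signed payload.
func verifyWithKeyring(armoredKeyring string, signature []byte) ([]byte, error) {
	keyring, err := openpgp.ReadArmoredKeyRing(strings.NewReader(armoredKeyring))
	if err != nil {
		return nil, fmt.Errorf("reading keyring: %v", err)
	}
	md, err := openpgp.ReadMessage(bytes.NewReader(signature), keyring, nil, nil)
	if err != nil {
		return nil, fmt.Errorf("parsing signed message: %v", err)
	}
	payload, err := ioutil.ReadAll(md.UnverifiedBody) // body must be drained before SignatureError is populated
	if err != nil {
		return nil, err
	}
	if md.SignedBy == nil || md.SignatureError != nil {
		return nil, fmt.Errorf("signature not made by a trusted key: %v", md.SignatureError)
	}
	return payload, nil
}

func main() {
	armored, err := ioutil.ReadFile(os.Args[1]) // path to ASCII-armored keyring
	if err != nil {
		log.Fatal(err)
	}
	sig, err := ioutil.ReadFile(os.Args[2]) // path to signature blob
	if err != nil {
		log.Fatal(err)
	}
	payload, err := verifyWithKeyring(string(armored), sig)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("verified payload: %s\n", payload)
}
```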
---
I mentioned the keyring as a configuration option to the new plugin. I marked it as optional to remain flexible to environments where the image is not signed.
### Increase OCP release latency
The enhancement introduces the additional latency of the CPaaS pipeline into the OCP release workflow.
Assuming that the pipeline is fully automated, the latency introduced is expected to be several orders of magnitude lower (minutes) compared to the OCP release cadence (hours-days).
---
My concern is about edge-pulling, where our latency is ideally a small fraction of our future phased-rollout duration. The failure detection timeline (in an ideal, future world):
1. Update becomes available to a cluster.
2. Cluster updates (could be ~minutes with a future auto-update knob, enhancements/update/automatic-updates: Propose a new enhancement #124, could be hours/days/never with manual updates).
3. Cluster begins to hit update-triggered issues (sometime between 0 and 3h, with a cap of 3h for "update taking surprisingly long" counting as a "failure" and contributing to inspection).
4. Cluster pushes Telemetry up describing its issue (pushes every ~5m).
5. Alerting trips over the concerning Telemetry and pings for review, while also automatically PRing and merging graph-data to remove the edge for safety during triage (~minutes).
6. CPaaS is informed of the merge, builds the image, and pushes to Quay (or wherever).
7. Cincinnati (operator?) notices the new image, verifies the sig, pulls it down, and starts serving it (~minutes).
8. Rest of fleet pings Cincinnati and notices that the edge is gone (~minutes).
There's an unavoidable ~hours delay in step 3, and smearing out the roll-out minimizes the number of clusters that initiate the update during that window. If step 6 is also ~hours, we'd need to ~double our roll-out durations to maintain the same level of protection. If step 6 is minutes, the duration would only have to increase slightly to preserve the same level of protection.
---
If step 6 is minutes, the duration would only have to increase slightly to preserve the same level of protection.
I assume it's minutes as the build job doesn't involve any IO or CPU heavy steps. The steps are: check out the git repo, build a container image with the files, sign the image, and publish the image and the signatures.
---
Minutes sounds reasonable if all those steps are automated, but it's not obvious to me that they are (signing might require a manual step?). There's also whatever logic informs CPaaS of the merge, since my impression (which may be wrong) is that CPaaS can't listen to GitHub merge notifications. But regardless, "the OCP release cadence" is not what the latency section should use as a benchmark; it should be talking about edge blocking. If you can show CPaaS builds with ~minutes latencies, that's great. If you can wave your hands around "should be ~minutes", that's probably fine too. But folks shouldn't have to drop into PR comments to find these arguments ;).
---
@LalatenduMohanty what's your impression on the CPaaS fully-automated build? Mine is that this is possible today.
---
It is not the CPaaS build I am worried about, but the whole process of signing the image, creating an errata and pushing errata to live would take hours for sure. If I am proven wrong I would be happy about it.
---
Here's an excerpt from a thread on the CPaaS Google Chat:
Eyal Edri, Jul 17, 4:24 PM:
@Stefan Junker look at this run, e.g. https://jenkins-cpaas-cnv.cloud.paas.psi.redhat.com/blue/organizations/jenkins/cnv-2.4.0%2FBuild-Pipeline/detail/Build-Pipeline/67/pipeline
it built multiple containers, RPMs, operator, attached to Errata and ran smoke tests, in 1.3 hours
All of this happens automatically, and our builds are likely faster than the CNV build as we don't invoke any toolchains and also don't build RPMs.
Assuming that the pipeline is fully automated, the latency introduced is expected to be several orders of magnitude lower (minutes) compared to the OCP release cadence (hours-days).

## Alternatives
*None, other than leaving things as they are.*
---
Shipping secondary-metadata into the cluster via an alternative mechanism (tarball, ConfigMap, custom resource, whatever), and having the local Cincinnati operator provide a mechanism for the admin to submit it for injection into the Cincinnati container (ConfigMap, URI pointing at a tarball on some internal HTTPS host, local Git host, whatever). Upside of those approaches is that they don't involve CPaaS, its latency, and image signing (which is not well standardized), and instead rely on transitive trust (which is also not well standardized, but which is up to the user and does not need to be explained to Cincinnati or its operator). Downside is that you need a channel for pushing that data into restricted-network clusters, and #188 (which was setting itself up to be that channel) was rejected in favor of the narrower #283 (which explicitly excludes itself from being that channel).
---
Note that this enhancement is not only about the network-restricted operation of Cincinnati. So I'm not sure which of the things you're mentioning is an alternative on the level of this enhancement. I see them as alternatives to replace specific aspects of it.
---
I tried to extract the essence of your comment into the alternatives section.
### Product name for the CPaaS integration
Which product name should this image be released under?
Related to this is that it's currently unclear if the Cincinnati image itself will be published under its own product name or not.
Further note: the repository for the graph-data images will not be semantically versioned.
---
The graph-data format is likely to evolve. Now that Cincinnati is consuming the format directly, we should add a version identifier to the repository so Cincinnati can fail fast on snapshots it does not understand. But that's independent of this enhancement, which is just about how we get the content from the graph-data repo over to Cincinnati.
---
Proposal for semantically versioning the graph-data openshift/cincinnati-graph-data#233 ;).
---
I intended to write about the release information rather than its data format. I do agree that versioning the data format, and possibly versioning the repository according to the data format changes would make sense to inform consumers of breaking changes. This assumes that the data itself will foreseeably never have to be versioned.
reviewers:
- @openshift/openshift-team-cincinnati-maintainers
- @openshift/openshift-team-cincinnati
- @openshift/team-cincinnati-operator
---
I do not think GitHub sends any kind of communication to the aliases we have mentioned here. Wondering whether, if we mention a single person's alias, the GitHub notification works or not.
---
I don't think so either. Once all of the reviewers' comments are resolved I'll ping the approvers in the PR description.
### Product name for the CPaaS integration
Which product name should this image be released under?
Related to this is that it's currently unclear if the Cincinnati image itself will be published under its own product name or not.
Further note: the repository for the graph-data images will not be semantically versioned.
---
Creating a new product is a big deal, as we need to create many things for a product. As an engineering organization we need to track each product. Product management also needs to be on board for any new product. So for all practical purposes we should assume it would use the same product as the Cincinnati image and Cincinnati-operator.
---
If that works in practical terms for setting up the fully automated image-creation pipeline, I'm fine with any option for product handling here.
---
These two things have completely different lifecycles and release gates. It could be difficult and counter-productive to meld them into one product.
Cincinnati and its operator will release when they have new features and/or bug fixes. Release gates include CI jobs and QE that ensure cincinnati itself works correctly.
The graph data will "release" based on OCP testing and releases that add or remove edges. Looking at the commit log for the repo, it seems to average more than one change per day. At this point, each of those changes is implicitly a "release".
It would be confusing and awkward to release cincinnati itself once or more per day, just because the graph data is changing. Completely different stakeholders should be involved in releasing cincinnati itself vs OCP's graph data. I think this would need to be a separate product from cincinnati.
@rthallisey WDYT?
---
Thanks for your thoughts on this @mhrivnak. I fully agree with the semantics you're describing. What I'm lacking is knowledge and experience about the implications of what it means to be a product, both practically as well as theoretically. I would appreciate any help on this matter as I want this enhancement to be realistic.
---
Ryan Hartman would be a good person to discuss that with.
---
@steveej Let's take this up with product management. We need to decide this ASAP.
---
@katherinedube please chime in here if you can 🙏
---
I wouldn't block this enhancement on the point of the graph data being its own product, because the pending result isn't a blocker for the downstream implementation. Like I mentioned in this thread, you can choose what containers end up in an errata, so there may be a way to ship with "just graph data" and "just OSUS containers". Therefore, I think it's a good idea to assume they will be the same product and the details can be sorted later.
---
I agree with @rthallisey . The decision about single vs multiple products is orthogonal to this enhancement.
---
Cincinnati graph data is definitely something that should be part of OpenShift Container Platform (it's only ever used within this context.) I don't want to bog down the enhancement with implementation details, but it's something that needs to be worked alongside the rest of the work around how best to implement this.
## Summary
The [cincinnati-graph-data GitHub repository][cincinnati-graph-data] will be continuously packaged and released in the form of a container image by means of integration with CPaaS.
This image will be available for public consumption, and will be used by the publicly available Cincinnati instances operated by Red Hat.
---
This also carries a risk that app-sre might not agree to run this image on stage and production. So we should communicate that.
---
When I inquired about this they agreed to run it; however, they do not have experience in doing so.
The registry, repository name, and a tag or a digest will be configurable.
If a tag is specified, the plugin will check if the specified tag changed on each plugin execution, and download the content if that check is positive.

2. The Git(Hub) scraper plugin will be deprecated in favor of the new plugin. Note that Cincinnati has support to use no cincinnati-graph-data scraper plugin at all, and can be provided the content on the filesystem directly by its operator.
---
We need a backup mechanism to feed the production Cincinnati with secondary metadata if the image-building pipeline runs into some issue, because the CPaaS pipeline is not under our control. So we need some backup mechanism to keep feeding the Cincinnati instances run in production by App-SRE.
If we deprecate the Git or GitHub scraper, what options do we have for an alternate mechanism?
---
The same option as the Cincinnati Operator uses today. Download the data from GitHub, extract it to the filesystem, mount the path to a volume in the Cincinnati Graph-Builder, and configure it to use that path directly and not scrape any secondary metadata source.
- [x] Graduation criteria
- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/)

## Open Questions
---
I would move the Open Questions section towards the end or at least somewhere after the enhancement itself has been described in detail. The questions will be easier to understand and assess with an understanding of the enhancement itself. Especially for someone who has not been involved in the many internal discussions had by the OTA team.
---
I see your point. I kept the template's order for the sections, and I do see an advantage of this order as well. It makes sense to resolve the open questions before reviewing the full enhancement, as that may change significantly after these questions have been resolved.
---
So the intent of the Open Questions section is to get the questions answered before the enhancement is reviewed externally - that is, by those outside of the OTA team - and then remove this section and update the enhancement per the question resolutions?
---
That's almost how I understand it. My interpretation differs insofar as it's not limited to those outside of OTA, but extends to anyone who can contribute to resolving it.
I could also see us leaving some open questions there if we decide they're not in scope.
---
I could also see us leaving some open questions there if we decide they're not in scope.
If we decide it's out of scope, we should have a Non-Goals paragraph about it. Open Questions, as described in the template, is just for:
...areas of the design that require closure before deciding to implement the design...
---
Thanks @wking, that makes sense. Still a few questions to resolve there!
1. Every change to the master branch of the cincinnati-graph-data results in a new container image containing the unmodified working tree files of the new revision.
This will allow the plugin to continuously grab the latest version without being informed that a latest version exists.

2. The image will be signed by an official Red Hat key, and a container image including the signature for the latest image will be produced.
---
It's not clear to me yet what this signature will look like. Maybe it's worth looking at how the OCP release images are currently signed.
3. The image will be pushed to a publicly available container image registry under a well-known repository name, and its `latest` tag will be updated to the latest revision.

4. The signature verification image will be pushed to the same repository similarly, under the tag `latest-signature`.
---
This is horrifying. But if your constraint is "everything we push has to land in the registry", it may be your only option. Are you trying to push it with some particular media type? Or are you dropping it into a layer in a shell container image (if so, you probably have to define a well-known path inside the layer filesystem)?
Hmm, also, this is going to be racy.

1. Push image A to `latest`.
2. Push signature A to `latest-signature`.
3. Push image B to `latest`.
4. Cincinnati pulls `latest` and `latest-signature`, but signature A does not match image B.
5. Push signature B to `latest-signature`.
6. Cincinnati tries again, and this time the image/signature pair matches.

Are we sure the delay between (4) and (5) will be manageable? Would we... stage the image somewhere else while we waited for a signature image, and then try to push them into `latest` and `latest-signature` as quickly as possible, hoping that nobody takes a snapshot for mirroring in the middle?
You could partially mitigate the race using `signature-${at_least_part_of_the_images_digest_hash}`.
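As a tiny illustrative sketch of that partial mitigation (the exact tag naming is an assumption, not something this proposal has settled): derive the signature tag from the image's manifest digest, so a given signature tag can never be paired with the wrong image.

```go
package main

import (
	"fmt"
	"strings"
)

// signatureTagFor derives a per-image signature tag from the image's manifest
// digest, e.g. "sha256:817a12..." -> "signature-817a12c32a39bbe3".
func signatureTagFor(manifestDigest string) string {
	hex := strings.TrimPrefix(manifestDigest, "sha256:")
	if len(hex) > 16 {
		hex = hex[:16]
	}
	return "signature-" + hex
}

func main() {
	fmt.Println(signatureTagFor("sha256:817a12c32a39bbe394944ba49de563e085f1d3c5266eb8e9723256bc4448680e"))
}
```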
---
Right, I had the race issue in my head at one point but forgot about it when writing the enhancement, thanks for bringing it up.
I recognize that the race condition exists, and I can see that it could practically be experienced. I think it's worth exploring options to solve it.
(a) Always update the `latest-signature` tag before the `latest` tag, maintain a `previous-signature` tag, and have Cincinnati try to verify with the latest and previous signature.
(b) Create a third image which merely references the data and signature images. E.g. without relying on specific container metadata, this third container image could contain two files named `image` and `signatures`, which contain the digests of these images.
---
Maybe just put a label on the first image that contains the tag you'll use for the signature image?
---
What type of labels are you referring to?
---
https://docs.docker.com/engine/reference/builder/#label
As seen, for example, if you run `skopeo inspect docker://quay.io/fedora/fedora:31-x86_64`.
---
Ah these, thanks! I think that'll work, as the information is updated atomically.
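To make the label idea concrete, here is a rough sketch (the label name and the use of a saved `skopeo inspect --config` output are illustrative assumptions, not settled details): the graph-data image would carry a label naming the tag of its signature image, and the plugin would read that label before fetching the signature.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io/ioutil"
	"os"
)

// imageConfig models the part of an OCI/Docker image config that carries labels,
// e.g. as printed by `skopeo inspect --config docker://<pullspec>`.
type imageConfig struct {
	Config struct {
		Labels map[string]string `json:"Labels"`
	} `json:"config"`
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: read-signature-tag <image-config.json>")
		os.Exit(1)
	}
	raw, err := ioutil.ReadFile(os.Args[1]) // path to a saved image config JSON
	if err != nil {
		panic(err)
	}
	var cfg imageConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		panic(err)
	}
	// Hypothetical label name; the enhancement would need to pick and document one.
	sigTag := cfg.Config.Labels["io.openshift.cincinnati.graph-data.signature-tag"]
	fmt.Println("signature image tag:", sigTag)
}
```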
*Not applicable*

### Version Skew Strategy
*Not applicable*
---
Probably talk about Cincinnati vs. image/signature formats that are newer/older than what it understands here. By linking here? If Cincinnati hits a version it doesn't understand, how would it inform its operator? Via a metric we can alert on?
---
I don't think this is in scope, as we're merely talking about changing the transport mechanism of the data here, not about the data itself.
A user might want to mirror all content which is required by Cincinnati.

1. The user sets up a container image registry in the target environment.
2. Optionally, if the target environment does not have internet access, the user downloads the release container images, the desired graph-data container image, and the images' verification signatures to a transfer medium.
---
By "release container images", do you mean for cincinnati itself? I suggest limiting this proposal to just the graph data, and letting the deployment of cincinnati remain a separate topic.
---
Yes, that's what I mean by this. I tried to explain how this proposal fits in and improves the overall network-restricted environment workflow. The surrounding workflow is only described to the extent to demonstrate the improvements, and is not meant to be changed by it in any other way.
#### Plugin implementation
A new Cincinnati plugin will be implemented to fetch the container image from a configured container image repository.
The plugin will have the following configuration options:
---
Also need optional credentials and trusted CA.
---
With the "trusted CA", do you mean for the TLS connection to the container image registry?
---
Yes.
---
Would you suggest an inlined version of the CA or a path to a file? I'm slightly in favor of inlining as it means fewer files to take care of.
Instead of proceeding with this enhancement, we could continue to work with, or rather around, the current implementation of secondary metadata distribution in Cincinnati.

Without any code changes, this would mean that we document the various ways in which the cincinnati-graph-data GitHub repository can be mirrored and provided to Cincinnati, and how to reconfigure the plugin settings to directly read the data from disk.
---
I think it's worth including more detail on the alternative that is currently implemented with the cincinnati-operator. I wouldn't say it's a better option than this proposal, but since it works and is even implemented, it's worth describing.
It only applies if cincinnati is being run in k8s, which could be a downside if there's desire to run it outside of k8s.
The container image that has the graph data can be used as an init container to cincinnati. So long as it has the `cp` binary or similar, it can copy the data into an `emptyDir` volume, from which cincinnati can read the data.
An advantage is that trust and integrity are handled by the cluster's normal process of running a container.
It's up to the cincinnati-operator in that scenario to restart the pod when a new graph data image should be used. Today, that's determined based on a field in the `Cincinnati` CRD Spec.
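For reference, a rough Go sketch of the Pod shape this pattern produces (types from k8s.io/api; the image pullspecs and paths are placeholders, not what the cincinnati-operator actually renders):

```go
package main

import (
	"encoding/json"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// graphDataPod sketches the init-container pattern: the graph-data image copies
// its files into an emptyDir volume that the graph-builder container then reads.
func graphDataPod() *corev1.Pod {
	return &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "cincinnati"},
		Spec: corev1.PodSpec{
			InitContainers: []corev1.Container{{
				Name:    "graph-data",
				Image:   "registry.example.com/cincinnati/graph-data:latest", // placeholder pullspec
				Command: []string{"cp", "-r", "/graph-data/.", "/var/lib/cincinnati/graph-data"},
				VolumeMounts: []corev1.VolumeMount{{
					Name:      "graph-data",
					MountPath: "/var/lib/cincinnati/graph-data",
				}},
			}},
			Containers: []corev1.Container{{
				Name:  "graph-builder",
				Image: "registry.example.com/cincinnati/cincinnati:latest", // placeholder pullspec
				VolumeMounts: []corev1.VolumeMount{{
					Name:      "graph-data",
					MountPath: "/var/lib/cincinnati/graph-data",
					ReadOnly:  true,
				}},
			}},
			Volumes: []corev1.Volume{{
				Name:         "graph-data",
				VolumeSource: corev1.VolumeSource{EmptyDir: &corev1.EmptyDirVolumeSource{}},
			}},
		},
	}
}

func main() {
	out, _ := json.MarshalIndent(graphDataPod(), "", "  ")
	fmt.Println(string(out))
}
```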
---
Thanks, I added some of this information to the enhancement.
Is it correct that the responsibility of building the init container image from the graph data on GitHub is with the customer in this case?
---
Yes, the current implementation requires the customer to build the image themselves whenever they want a new "snapshot" of the data. It would be an improvement if they could instead just retrieve that image from Red Hat.
---
This enhancement doesn't cover the init container image, but it does cover the graph data image, in fact that is the primary goal.
---
Right. The "Plugin implementation" section discusses one way that the container image can be accessed and consumed. I'm suggesting that in the Alternatives section there be at least some discussion of the init container pattern as a different way to access and consume the image.
---
I added https://github.com/openshift/enhancements/pull/310/files#diff-28b30f8a4731d3d193a2d4f521f39d71R191-R192 in response to your original comment here. PTAL and provide feedback if your comment is satisfied.
### Non-Goals
1. Changes to the GitHub repository workflow or data format.
2. Implementation details about the streamlining of the mirroring process for primary and secondary release metadata.
---
The `oc` command currently has features to mirror OCP releases and to mirror optional operators. It seems natural for it to also mirror these graph data images. Would that be a separate enhancement proposal? Or why not part of this one?
---
I purposely left this detail out of the proposal to keep the scope as small as possible. I do see this as a possible improvement in the future, just not critical to this enhancement.
---
Ok. We just need to ensure that there is some viable and supportable way for the customer to accomplish the mirroring. Perhaps `skopeo` would meet this need, but it's worth confirming and testing.
---
I agree. I'm not worried about this though, as anything which can pull an image to disk and push it from disk will do. `skopeo` seems to do so as well. But I think that some subcommand of `oc` can also copy images.
---
Do we need a ticket somewhere to track this? I wouldn't really know where that would be. Could it be an issue in this repository?
---
We need to create a Jira card to track this (to create an enhancement PR in the future) or create another enhancement PR around this.
Without any code changes, this would mean that we document the various ways in which the cincinnati-graph-data GitHub repository can be mirrored and provided to Cincinnati, and how to reconfigure the plugin settings to directly read the data from disk.

This alternative is currently implemented for Kubernetes deployments in the [Cincinnati Kubernetes operator](https://github.com/openshift/cincinnati-operator).
It ships the graph-data using the [init container pattern](https://www.vinsguru.com/cloud-design-pattern-kubernetes-init-container-pattern/).
---
s/ships/accesses/
---
Does the init container not contain the data?
---
It does. But the main difference is not in how the container is built or shipped. The difference is primarily in how it is used. It's a matter of using it as an init container vs. retrieving and extracting the data directly from a registry.
---
From the perspective of a user, building and shipping is the relevant action, as the rest is supposedly handled by the cincinnati-operator.
It's a matter of using it as an init container vs. retrieving and extracting the data directly from a registry.
"Retrieving" is just another word for "shipping" in this context.
I think that we don't disagree as we both seem to understand the workflows, but I'm not fully getting what you're missing in the proposal. I guessed it would be to explain the implementation details so I added these. Please resolve the thread if this does the trick for you.
### Non-Goals
1. Changes to the GitHub repository workflow or data format.
2. Implementation details about the streamlining of the mirroring process for primary and secondary release metadata.
---
We need to create a Jira card to track this (to create an enhancement PR in the future) or create another enhancement PR around this.
@steveej Once you resolve my comments I am ready to give a /lgtm.
---
/lgtm
## Release Signoff Checklist

- [ ] Enhancement is `implementable`
---
@steveej Can you mark this enhancement implementable?
---
@LalatenduMohanty done
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: sdodson, steveeJ.