-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OSDOCS#12209: Sigstore signature verification for core OCP images #81985
base: main
Are you sure you want to change the base?
Conversation
/retest |
🤖 Thu Sep 19 16:55:57 - Prow CI generated the docs preview: |
. The cluster compares the calculated digest of the image manifest with the known digest. | ||
If the digest values do not match, the image is rejected without further processing. | ||
|
||
. The cluster downloads an image config file, which contains metadata such as SHA256 digests for each part of the container image, and calculates the digest of the image config file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The digests in the config are not actually validated / used for anything in our implementation; the layer digests in the manifest are used to ensure integrity.
I rather wonder whether customers care to read the implementation details in this paragraph. Something conveying “If we know the image digest, integrity of the whole image follows from that and is enforced” might be sufficient? But I wouldn’t know what customers are actually asking our support, maybe this is necessary and valuable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be fair to trim out unnecessary details out. I think I added that part in to contextualize what an image config file even is, so it wouldn't just say "there's an image config file and that's part of the validation".
How about I change this line to "The cluster downloads an image config file and calculates its digest." to keep it simple?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing else in the article specifically discusses the image config, so I’m not sure it’s necessary to contextualize it.
… but I wouldn’t know what customers are asking about.
|
||
[NOTE] | ||
==== | ||
The image manifest is validated only when the manifest digest is known, which is true in the following cases: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this should go into that level of detail:
- Yes, if an image is being pulled by an unsigned tag, we don’t know the manifest digest, and we trust the registry to return the correct and unmodified manifest
- (But the config and layer integrity checks do happen.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any additional content you'd suggest to add this info in?
At the end of this note, should I add something like "Even if the image manifest is not validated in these cases, the image config and layers are still checked for integrity"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added my suggested sentence for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
“for integrity against the [untrusted] manifest”? I’m very unsure about the “untrusted” word — it’s correct in the security jargon, but it might carry the connotation of “specifically known to be dangerous”.
My general opinion is that none of the details of configs/layers need to be included, and then we don’t need to discuss this either; but if they are discussed, like that.
[id="security-understanding-image-authenticity_{context}"] | ||
== Image authenticity validation | ||
|
||
In addition to validating the integrity of images, {product-title} clusters also validate their authenticity by certifying that the images were approved by a trusted source, in this case Red{nbsp}Hat. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’d be inclined to quibble about the “approved” wording — approved for what?
The only thing the signature really provides is integrity (and name binding, explained below), i.e. that they were not modified since they were signed. Any semantics like “approval” can be imposed by the organization making the signature, but those would be up to the specific signer.
In the RH case, I think the signature basically means “was produced by a RH build system and not modified since”. (It does not mean “approved to run by a specific kind of customer”, and it definitely does not mean “without known vulnerabilities” — we don’t remove signatures after an image is released.)
The other major value of the signature is the name binding, i.e. that when the user refers to an image by tag (e.g. quay.io/openshift-release-dev/ocp-release:4.99.98
, the user does not just get any image authored by Red Hat, but an image Red Hat authored to be …ocp-release:4.99.98.
I.e. users start with (the public key of Red Hat) and an intent to use a specific product+release; the signature turns that into a specific manifest digest, and then integrity of the image is assured. At no point users need to manually handle the image digests, e.g. by copy&pasting them from a RH website, or by calling RH support; they just refer to products+versions in image names, and the signatures ensure they get what they asked for.
In the context of OpenShift installs/updates, right now, this version binding IIRC not used (@wking to keep me honest), because the oc
binary, and Cincinnati metadata includes an image digest already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the context of OpenShift installs/updates, right now, this version binding IIRC not used...
Correct. But the only images Red Hat is currently signing under quay.io/openshift-release-dev/ocp-release
are OCP release images. And when a version string is part of the update request, divergence between the requested version and the actual release image version is fatal since 2020's openshift/cluster-version-operator#431. So the "is the requested image the OCP release image you're hoping for?" are happening today, just not at the pullspec-asserting-signature level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original word used was "published", though @wking suggested to change it to "approved" with the following reasoning given:
The signer and publisher can be different, e.g. a customer's ProdSec team might sign some subset of Red Hat release images as acceptable, and then those releases would have both Red Hat and customer-ProdSec signatures
So I'm not sure where to proceed if "approved" is too ambiguous. Maybe some rewording of that paragraph to convey the point about "was produced by a RH build system and not modified since"? Maybe something like this:
In addition to validating the integrity of images, {product-title} clusters also validate their authenticity by certifying that the images were approved by a trusted source, in this case Red{nbsp}Hat. | |
In addition to validating the integrity of images, {product-title} clusters also validate their authenticity by certifying that the images have not been modified since being produced by a trusted source, in this case Red{nbsp}Hat. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to step back and suggest that:
In addition to validating the integrity of images, {product-title} clusters also validate their authenticity by certifying that the images were approved by a trusted source, in this case Red{nbsp}Hat.
This validation process includes checking the digital signature of the images.Cluster administrators can use
ClusterImagePolicy
resources to declare image policy for the cluster.
get reworded to something like:
Cluster administrators can use
ClusterImagePolicy
resources to declare image policy for the cluster, requiring signature validation and other policy checks in addition to image integrity.
Although other wording in the current pull is a bit mushy on whether "integrity" includes policy, or is limited to the internally-consistent digest-matching and excluding higher-level policy like sufficient signatures.
But this section is mainly about "ClusterImagePolicy exists, and you can use it!", and Red Hat using it in the openshift
policy for release images is a later addition, once readers have internalized what ClusterImagePolicy is about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I'll go with your suggested wording for now, and if need be it can be further refined here or in a v2 of this doc.
---- | ||
<1> Specifies OIDC issuer and the email of the Fulcio authentication configuration. | ||
|
||
.Example `ClusterImagePolicy` resource 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’d recommend leading with the PublicKey
example, that’s really more relevant to enterprises (and it’s how RH images are signed). Similarly the signedIdentity
section may be critically important for some customers but should not be necessary in the default deployments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I can switch the order of the examples.
fulcioSubject: <example_input_here> <1> | ||
rekorKeyData: dGVzdC1yZWtvci1rZXktZGF0YQ== | ||
fulcioSubject: | ||
oidcIssuer: https://OIDC.example.com |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I didn’t review the YAMLs line by line to ensure the syntax is correct.
In this case IIRC there is only one fulcioSubject
, so this is probably not quite correct.)
When image integrity is validated at pull time for images whose pullspecs have configured policies with matching `scopes` values, CRI-O will retrieve any Sigstore signature images associated with the image and check them against the configured policies. | ||
If signature verification fails for any configured policy, the associated pod will be rejected with <missing_info> conditions in Pod status and events in the associated namespace. | ||
//// | ||
FIXME: Should "associated pod" be "associated container"? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, “the image for the container in the pod”. And IIRC one pod may have multiple containers, so the container granularity may be relevant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pod is the Kube resource, so it is the Pod's status
. Within that Pod's status, there are containerStatuses
and initContainerStatus
and ephemeralContainerStatuses
, and the SignatureValidationFailed
and ImagePullBackOff
reason/messaging will be tucked down in there for impacted container(s).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I should have known these things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per conversation with @wking I'll reword this to something like the following:
If signature verification fails for any configured policy, the associated image pull will be rejected. Events will be created in the associated namespace explaining the verification failure, and the Pod
status
for containers consuming that image will set a relevantstate
.
But let me know if there's any further feedback for that new wording
//// | ||
|
||
[id="security-understanding-trusting-install_{context}"] | ||
== Trusting an {product-title} installation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you also want to have a section about trusting updates, describing the Cincinnati stream?
It might not be relevant for this article, I don’t know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can add that section from this blog, which I'm using as a reference for this overall PR, if you think it would be valuable to add. Or if it's something that can wait until a v2 of this doc, then I'm okay with that, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can exclude Cincinnati/OCP updates to ship a v1 of this content, and then we can think through if/how to add something in that space in follow-up work.
<1> This policy for `test0.com` and the policy from `mypolicy-0` will be appended together. | ||
|
||
The previous example resources define two policies, both of which apply to `test0.com` pullspecs, and only one of which applies to `test1.com` pullspecs. | ||
When image integrity is validated at pull time for images whose pullspecs have configured policies with matching `scopes` values, CRI-O will retrieve any Sigstore signature images associated with the image and check them against the configured policies. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably be a bit more precise about what “matching” means … and I should double-check.
In the underlying c/image, only the closest matching scope is used; I’m not immediately sure wether {Cluster,}ImagePolicy
also uses the “closest” scope semantics or whether the wider scopes are “inherited” down into the narrower ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ClusterImagePolicy API Godocs include:
If multiple scopes match a given image, only the policy requirements for the most specific scope apply. The policy requirements for more general scopes are ignored.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would the wording change look like in light of this feedback? Or do I need to add another sentence to clarify what matching means?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd just inline the Godocs sentence after the one we're commenting on here, so folks here about how "maching scopes
..." matter, and then the next sentence gives more precision about what qualifies as "matching".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I added the text snippet you provided
.Example `ClusterImagePolicy` resource 2 | ||
[source,yaml] | ||
---- | ||
kind: ClusterImagePolicy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kind: ClusterImagePolicy | |
apiVersion: config.openshift.io/v1alpha1 | |
kind: ClusterImagePolicy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, does this need to be added to the other example too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, all the ClusterImagePolicy YAML examples will need this apiVersion
line
fulcioCAWithRekor: | ||
fulcioCAData: dGVzdC1jYS1kYXRhLWRhdGE= | ||
fulcioSubject: <example_input_here> <1> | ||
rekorKeyData: dGVzdC1yZWtvci1rZXktZGF0YQ== | ||
fulcioSubject: | ||
oidcIssuer: https://OIDC.example.com | ||
signedEmail: test-user@example.com |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fulcioCAWithRekor: | |
fulcioCAData: dGVzdC1jYS1kYXRhLWRhdGE= | |
fulcioSubject: <example_input_here> <1> | |
rekorKeyData: dGVzdC1yZWtvci1rZXktZGF0YQ== | |
fulcioSubject: | |
oidcIssuer: https://OIDC.example.com | |
signedEmail: test-user@example.com | |
fulcioCAWithRekor: | |
fulcioCAData: dGVzdC1jYS1kYXRhLWRhdGE= | |
rekorKeyData: dGVzdC1yZWtvci1rZXktZGF0YQ== | |
fulcioSubject: | |
oidcIssuer: https://OIDC.example.com | |
signedEmail: test-user@example.com |
signedPrefix: test-remap-signed-prefix | ||
|
||
---- | ||
<1> Specifies OIDC issuer and the email of the Fulcio authentication configuration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should also explain the other fields and their possible values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, is there any content that describes these parameters, or could anyone provide me such information?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are Godocs like these. But I'd expect those to get rendered under here. Or maybe we only doc GA-cluster APIs there? For these longer-form docs, I'd expect something that weaves some user stories about how the knobs all come together to deliver a valuable target, with notes to allow customers to riff off that target, and I'm not aware of anything existing today in that space. For example this enhancement user story is very thin, and the policy YAML quoted later in the enhancement doesn't have associated text explaining the environment it fits into and how the configuration plugs into that environment.
I agree that these kind of longer stories would be very useful. I think node team can add them as follow-up improvements on an initial pass in this pull, because y'all are the experts on Sigstore and the sorts of environments customers are likely to set up and want to plug their policies into.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would explaining more of these parameters here honestly be useful in the context of this particular doc? I feel like the purpose of this doc is to mostly just show that ClusterImagePolicy exists and that it now plays a role in internal core image validation.
To me, it seems like if we really wanted to show users how they could use this feature for their own consumption, that would belong in some different doc, perhaps some procedure, and that this other doc would be a better location for any sort of comprehensive reference info.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I generally think that there only needs to be one place for full reference documentation, and this article does not look like the canonical place for that. Linking to the reference wouldn’t hurt.
6b32e54
to
feb18e4
Compare
@skopacz1: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
prefix: test-remap-prefix | ||
signedPrefix: test-remap-signed-prefix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(For the record, these values are invalid and will either be rejected, or won’t do anything — real prefixes contain at least a host name with at least one dot, or a host:port. But then again, the …Data
fields don’t contain really realistic inputs either.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prefix: test-remap-prefix | |
signedPrefix: test-remap-signed-prefix | |
prefix: test.remap.prefix | |
signedPrefix: test.remap.signed.prefix |
What do you think about this? theses are valid for the regex validation, and look more like a prefix of the repository
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that if we care about the example values, it would be better to make this all realistic. scopes: … mirror.internal/vendorA/productA
; prefix: mirror.internal/vendorA/productA
; signedPrefix: vendorA.example/productA
or something (and maybe the same vendorA.example
in the Fulcio conditions).
But, also, it seems to me that details of these CRs are really not the primary focus of the article.
= Container security and validation | ||
|
||
{product-title} is released as a set of container images that comprise the different functions of the software. | ||
These images must be validated in order to ensure that they have not been tampered with by an external party, and to ensure that they are the intended images provided by Red{nbsp}Hat. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can rephrase to highlight "customer admins can use ClusterImagePolicy for their own policy too!" with something like:
{product-title} runs containers from containers images.
These images may be validated in order to ensure that they have not been tampered with by an external party.
Then the following sections are:
- Image integrity validation at pull time, which can be dropped or tweaked.
- Image authenticity validation, introducing ClusterImagePolicy as a generic tool.
- Trusting an {product-title} installation, which gets into Red Hat images for the first time.
- Unfortunately for continuity, we have to take a bit of a detour into bootstrapping/installer trust, which is complicated.
- Trusting a connected installation, still talking about client/installer trust.
- Trusting a disconnected installation, still talking about client/installer trust.
Hrm, and seems like we want an additional section like Trusting an {product-title} post-install, that points out the trusted installer will roll out a trusted release image with the openshift
ClusterImagePolicy (for TechPreview clusters), and that that pulls in the ClusterImagePolicy security discussed under Image authenticity validation for images in the quay.io/openshift-release-dev/ocp-release
scope. That prevents folks from running an quay.io/openshift-release-dev/ocp-release
image that hasn't been signed by the trusted Red Hat key used to sign OCP release images, e.g. in the event that quay.io or a customer-configured mirror registry is compromised.
Or maybe that only matters when we start to talk about updates, and we're puning on that for now, because really, you'd need a compromised mirror and an admin tricked into updating to a malicious digest or a by-tag release image reference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where I’m a little lost is that the goal, per the title of the PR (not of the document), is to talk about OCP product images … which don’t currently use sigstore / ClusterImagePolicy
IIRC (just in a tech preview). So if that is the goal, why bring ClusterImagePolicy
into the picture at all, now? It would certainly be relevant in a future release, but I don’t think we are there yet.
OTOH the title of the document (as opposed to the PR) talks about container “security and validation”, i.e. not restricted to core OCP, and in that case ClusterImagePolicy
is certainly the most relevant part, and the OCP trust bootstrapping a side issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OCP trust bootstrapping is more of an installer issue, and I'm fine with them leading there, and doc'ing the long-GA processes around that.
ClusterImagePolicy is more of a node issue, and I'm fine with the node folks leading on that. There was some work associated with the 4.16 tech-preview of the type in #76613, #77203, #78396.
I'm here because 4.17 is releasing the tech-preview openshift
ClusterImagePolicy roughly as laid out in this enhancement. There's a release note landed today via #81997, but it doesn't cover mechanics, and I'd expect customer security folks to have some questions about mechanics. But maybe they don't? I'm fine letting this float until there's some clarity around the kinds of security questions customers ask, and if anyone is particularly interested in if/how to manage installer trust or have customers create their own ClusterImagePolicies.
// * security/container_security/security-understanding.adoc | ||
|
||
:_mod-docs-content-type: CONCEPT | ||
[id="security-understanding-container-validation_{context}"] | ||
= Container security and validation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pedantic: “container security” is much larger than the scope of this document.
“Container image autenticity?” “Preventing supply chain attacks on container images”?
The This is because your PR targets the If the update in your PR does NOT apply to version 4.18 onward, please re-target this PR to go directly into the appropriate version branch or branches (enterprise-4.x) instead of main. |
@skopacz1: This pull request references OSDOCS-12209 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.18.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
@skopacz1: No Jira issue is referenced in the title of this pull request. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
OSDOCS-11885
Version(s): 4.17+
This PR adds 1. general content about how OCP validates the integrity and authenticity of container images and 2. specific details about how the 4.17 TP ContainerImagePolicy resource plays a part in the authenticity validation.
QE review:
Preview: Container security and validation