Add Github workflow run information to the signing certificate #624
Comments
@asraa Any thoughts on this? Do you have the context on the set of information we initially chose to include in issued certs for GitHub? Some of this information seems more like build provenance rather than identity. It also depends where we draw the line for what represents a GitHub Actions identity. If we add this additional information, each run will get its own certificate with different identity information in the OIDs, so we're changing the certificate identity from a per-workflow identity to a per-run identity. |
Hey @tonistiigi - thanks for the issue!
Yes, I totally see the concern here. To reiterate Hayden's point on identity: I think the information we include in the cert is meant to pinpoint the Workflow itself as an identity, not the Run (the repo, commit hash and ref pinning the workflow content). An analogy would be that GitHub logins pinpoint GitHub usernames/emails and do not include extensions like "log in time" to map to the actual actor, given that usernames can be taken over. Do you have a specific use case in mind? I work on a project that creates build provenance and signs over the specific WorkflowRun information with Fulcio-issued certs; that may be a similar path for what you are trying to do. |
I agree that fields like these (eg. links to build logs) could also be part of provenance but in the example case, the attacker has already managed to trigger a workflow. So they have full control of the builder. Where builder writes the provenance payload from the environment to say build logs is at path There could even be a policy that verifies that the builder logs in provenance match up with the signer. |
I think it'd be difficult to build a verification policy that enforces run ID rather than workflow ID. I can have a policy that says "I only trust builds from workflow X" because the workflow is provided out of band or some trusted well-known workflow. Run IDs aren't known beforehand, and I can't think of a way to publish the "trusted" run IDs and differentiate those from run IDs where the workflow run was started by an attacker. |
@haydentherapper Not a policy that verifies a specific RunID but that the signer was allowed to sign such a provenance that points to build logs at a specific RunID. |
Can you explain more what you mean by "So they have full control of the builder"? In the case of SLSA3+, the attacker cannot control what the build service writes into the provenance, besides build information such as compiler arguments, env variables and repo source. For example, for our SLSA3+ Go builder, we record run ID, run number and run attempt: https://github.com/slsa-framework/slsa-github-generator/blob/main/internal/builders/go/README.md#example-provenance From there you can fetch the logs. |
@laurentsimon It depends on what tool you use to generate the provenance. The condition for this case is that attacker already triggered a workflow they can control. So if as part of your workflow you are running a process that runs the builder/generator process the attacker can modify this process before it is invoked. Or it can just instead run its own process that just generates the same JSON bytes that the provenance generator in the actual release build would generate. |
Gotcha. What tool are you using to generate the provenance? We are going to release a generic provenance in a few weeks (see https://github.com/slsa-framework/slsa-github-generator/blob/main/internal/builders/generic/README.md) which I think may allow you to do what you're asking. It lets you compile your project and "attach" a non-forgeable provenance, which contains the run information you're looking for. The slsa-verifier will let you verify the provenance, and you can then peek into the provenance and get the run info. Please let us know if this would work for your use case; and if it does not, what would :-) /cc @ianlewis |
@laurentsimon I consider anything that runs in the Github VM insecure for this case. It is not even that the attacker needs to find a place to inject into your build (which is possible), but they can just run whatever they want and make it output the same value that your generator-builder does. Then they can get it properly signed as well. If the signer identity included the RunID then they could not fake the build logs link and would be caught instead of cleaning up traces that this run ever existed. |
I'm not following. The attacker cannot influence the content of the provenance, except for the build steps (and the final hash). The attacker cannot influence the run ID, run attempts and run number: the trusted builder (the one I provided links for) retrieves this information itself, without any input from the attacker. The attacker does not control the builder: it has an interface the attacker can call, but that's it. The builder I linked above uses a reusable workflow, which enforces isolation from the developer workflow (see 1, 2). The provenance generation is done in a different VM than the build itself, so the attacker cannot control it. I must be misunderstanding something. Please correct me. |
I don't know what you mean by the attacker but it is not what I think. Going back to Fulcio, I (attacker) have obtained a capability to run my code in workflows in Github. Doesn't really matter how, for simplicity let's say credentials of a project maintainer were stolen. Inside my code, I can contact Fulcio and request a signing certificate with the fields listed above. With this certificate, I can now sign absolutely anything, any random bytes, any provenance, including the bytes that were previously generated by your "trusted builder". Users viewing that provenance will see that it has a verified signature, it points to (fake) build logs that look legit etc. If the attacker now cleans up after itself, removing github runs etc. only way they could be caught is by someone doing audit on the Fulcio transparency log. |
Can you elaborate on your threat model? What do you mean by attacker?
The certificate used by the trusted builder is not accessible by your repo. The trusted builder has its own identity. When you verify the provenance, you first verify that it was generated by the trusted builder. So in your case, verification would fail because it would have the wrong identity (your repo's identity). Once the identity of the builder is verified, the verification checks the repo the source code came from, in this case your repo. Both are available in the cert issued by Fulcio in the trusted builder we have built, so they are distinguishable from one another.
One attack that provenance does not solve is an attacker who pushes code to your repo. In this case, adding a run ID to the cert does not help either, unless you parse the logs and search for malice. In an incident response, that is probably fine, but in the general case, it's unlikely you're going to do this for every verification. Either way, the trusted builder gives you the same guarantees: you can get the run ID etc. from the provenance and trust it if the builder is SLSA3+. |
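The two-step check described in this comment (builder identity first, then source repository) can be sketched roughly as follows. This is only an illustration: `CertIdentity`, its field names, and the example workflow and repository strings are hypothetical and do not reflect the actual Fulcio certificate layout or the slsa-verifier API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CertIdentity:
    """Hypothetical, simplified view of identity fields read from a signing cert."""
    builder_workflow: str   # workflow the certificate was issued to
    source_repository: str  # repository that invoked the builder


def verify_provenance_identity(cert: CertIdentity,
                               trusted_builder: str,
                               expected_repository: str) -> bool:
    # Step 1: the certificate must have been issued to the trusted builder.
    if cert.builder_workflow != trusted_builder:
        return False
    # Step 2: only then check which repository the source code came from.
    return cert.source_repository == expected_repository


# Hypothetical identities, for illustration only.
TRUSTED = "example-org/trusted-builder/.github/workflows/build.yml@refs/tags/v1"
via_builder = CertIdentity(TRUSTED, "example-org/app")
signed_in_own_workflow = CertIdentity(
    "example-org/app/.github/workflows/release.yml@refs/heads/main",
    "example-org/app",
)
```

Under these assumptions, a provenance signed from the repository's own workflow fails at step 1, even though the repository itself matches.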
Which of the fields listed above is different? If I as a repo author can't trigger workflow with the correct identity then who can?
That's exactly the case I'm talking about. Even when there is a compromise to the access to workflows, attacker should not be able to fake the build logs (as logs are managed by Github itself). |
@tonistiigi As I understand it, in your threat model the attacker has control over the developer's Github credentials and can start a workflow in a repository they have access to. Am I understanding correctly?
I think what @laurentsimon is saying is that, by using the reusable workflow he linked to, the OIDC token that is retrieved from the Github provider and used to sign the provenance is linked to the identity of the reusable workflow and not the workflow of the developer. The workflow code is also present in a repo not controlled by the developer. An attacker in your scenario can trigger the reusable workflow but cannot forge the provenance unless they take over the slsa-github-generator repo as well, or they are somehow able to escape and get control of the build environment job VM and jump job VMs to the job VM generating provenance. However, we determine that we are protected from this by the VM security boundary and Github's control over VM execution in our threat model. So because the attacker cannot forge the reusable workflow's identity, if they sign it in the user's workflow instead, any verification you do would be able to catch that the id used to sign the provenance is not the id of the reusable workflow. |
maybe
yes
Thanks. So Subject in the cert is always That being said, I did open up this issue against Fulcio and not against slsa-github-generator . Unless maintainers want to declare that the only safe way to use Fulcio/Cosign with Github is to use a reusable workflow from an external repo that the maintainers always trust, I think the concerns are still valid. |
no worries, I hope my explanations are making some sense.
If you use the reusable workflow, yes, as that is the ID used to create the certificate and do the signing.
I understand. I think it depends on what you consider safe. If tying a signature back to source code and/or specific build run IDs is needed to be safe, then I agree with you that more metadata is needed. Others might be fine with just the signature as it is. I think we got to this point because the proposal to add the run id, run count, and run attempt to the signature sounded like it was solving a problem that you could instead solve by generating provenance and signing that. This is because provenance formats like SLSA can support a more flexible format than adding metadata to the certificate could. slsa-github-generator was brought up just as an example of a tool that could be used to generate provenance. In fact, we include exactly this kind of information in our implementation. Though there is a dearth of these tools currently, you could theoretically use any tool to do this; we just happened to be working on one. Ultimately, I think that the idea of tools like Github's OIDC and fulcio is to create short-lived certificates that are used and immediately discarded. So having metadata on a cert itself is likely not what we really want to do because users who follow this practice can't go back and look at it. What we really want is to include this info in provenance metadata and use fulcio to sign that. That way we can have the metadata in a verifiable signature format and also immediately get rid of certs and avoid having to store them. |
Interesting. So you mean it is signed by Fulcio root directly, not by the user's key? Is there some discussion/work going on that I could read on this? I guess that would mean that you can only sign very strictly defined payloads. This isn't just about signing provenance attestations, but artifacts and other related objects as well. You can make some strict rules on how a provenance definition matches up with Github's OIDC token but it already gets fuzzy when I use my Gmail token instead. And if you define a specific payload object that defines the token scope precisely for each object then the user would need to keep hold of that payload the same way they keep the cert today. |
I think what was meant is that "use Fulcio to generate a short lived certificate", and sign with that. This is for instance how the https://github.com/slsa-framework/slsa-github-generator works. |
Apologies it wasn't clear. I don't mean signing using fulcio's root but rather via an "ephemeral certificate" which is a short lived certificate issued by the fulcio server and is used once. The expectation is that using ephemeral certificates is not necessarily specific to any kind of data. For example,
Here it...
You could similarly use the Github OIDC provider rather than oauth2.sigstore.dev/Google if you wanted to. So the idea is that this is what a builder would do when generating provenance. The provenance includes a sha256sum or whatever of the binary, and other metadata like the run id etc. and is signed with an ephemeral cert. After that provenance can be verified via the signature, the binary via the sha256sum, and the cert itself is never stored or used again. |
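A rough sketch of the flow described in this comment: the provenance records a digest of the binary plus run metadata, and the binary is later checked against that digest. The field names below are illustrative assumptions, not the SLSA or in-toto schema, and the signing step with the ephemeral cert is deliberately left out.

```python
import hashlib


def make_provenance(artifact: bytes, run_id: str) -> dict:
    # Illustrative field names only; a real builder would emit a
    # standard provenance format and sign it with the ephemeral cert.
    return {
        "subject_sha256": hashlib.sha256(artifact).hexdigest(),
        "run_id": run_id,
    }


def artifact_matches(provenance: dict, artifact: bytes) -> bool:
    # Recompute the digest and compare it with the one recorded at build time.
    return provenance["subject_sha256"] == hashlib.sha256(artifact).hexdigest()


# Hypothetical binary contents and run id.
binary = b"example binary contents"
provenance = make_provenance(binary, run_id="12345")
```

The digest check only ties the binary to the provenance; trusting the `run_id` field still depends on verifying the signature over the provenance and the identity of whoever signed it.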
I think I understand how cosign/fulcio work but your latest comments have me quite confused. In both comments you say that cert is thrown away and never used again. So what is the point of signing and how do you verify the signature?
You could throw away the cert metadata and just keep the public key component but as I mentioned in the previous comment then your payload needs to be in a strict format to verify the signing policies. |
Ah, yes. You're right. I confused things a bit. The signature is included in the DSSE that wraps the provenance info. The public key cert is stored in rekor and is retrieved from there for validation. The private key is what matters to be thrown out and is discarded after signing (the public key is what is printed). I'm not sure it convinces me that public keys are a good place to store build metadata but you're right that storage is needed for public keys (either in transparency log or otherwise) and me bringing it up probably didn't help. |
#945 will obsolete this request, so marking as closed. |
Description
Currently the certificates created via Github token add the following GH info fields:
Would be good if this info would also include the workflow run information. Based on https://docs.github.com/en/actions/deployment/security-hardening-your-deployments/about-security-hardening-with-openid-connect#understanding-the-oidc-token the token includes the run ID, run count and attempt count.
Having this info available would make a more direct connection to where the actual process that did the signing ran and look up the build logs if they are available. Could imagine a case where untrusted party has managed to trigger a workflow run on their terms and then tries to make it look like a legitimate release/branch build.
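For reference, the run fields mentioned above appear in the GitHub OIDC token as the `run_id`, `run_number` and `run_attempt` claims. A minimal sketch of reading them from a token payload follows; the helper names are hypothetical, and the decoding here does not verify the token signature, so it is suitable for inspection only.

```python
import base64
import json


def unverified_claims(jwt: str) -> dict:
    """Decode a JWT payload WITHOUT checking the signature (inspection only)."""
    payload_b64 = jwt.split(".")[1]
    # base64url segments in a JWT omit padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def run_info(claims: dict) -> dict:
    # Pick out only the per-run claims; missing claims come back as None.
    return {key: claims.get(key) for key in ("run_id", "run_number", "run_attempt")}
```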