Semantic equivalency, reproducible builds, and a new "verifiable build" track #873
I'm not generally in favor of treating reproducibility as anything other than a strict binary. Putting that aside, I've got a few questions about how a semantic equivalency track would work:
- Who is the intended user for a semantic equivalency track? What does their workflow look like? What value does the track bring to them?
- The name "semantic equivalence" implies a property of two or more packages, whereas reproducibility is a property of a single package. Do we need more than one package for verification? Who provides that second package? If the second package is trustworthy enough to use as a benchmark for the one being verified, why not use it in the first place?
- What would a semantic equivalency track look like? Assuming that L0 means "not reproducible" and the highest level means "bit-for-bit reproducible," we'd need a principled way to divide the space between into levels. Having a known end level also constrains our ability to react to changes in the state of the art for determining semantic equivalency -- we'd either have to modify existing levels or add levels that break the convention of using counting numbers in increasing order (e.g. Repro L5 and Repro L6 might be weaker than Repro L4).
- What would we attach the level to for a semantic equivalence track? A package, a particular version of a package, a repo, a project, something else?
- Do we expect to pass around an attestation for the artifact's level, or would we add it to the build provenance? What does the attestation mean (who is attesting to what)? |
@kpk47 :
That's fine! But I wanted to start a discussion.
A potential user of a package who wants to know that "the built package I'm installing corresponds to the source code it putatively was generated from". Ideally they want 100% confidence, but more is better than less.
Something like, "Run OSSGadget & see if it reports semantic equivalence (or that it's a reproducible build)". I'm sure more details would need to be ironed out before it went anywhere.
No. The idea is that "When I re-execute the build from known source, I get a package that is semantically equivalent to the package posted". They originally called it "reproducible builds", but that was confusing since most people mean "bit-for-bit" when they say "reproducible build". Naming is hard.
Agreed. My current proposal has only one intermediate state. |
To add a little context, and what I was thinking about when I started writing the tool -- I wanted to determine the likelihood that a (for example) npm package actually reflected the source repository it was linked to. IIRC, there were a bunch of cases of malware where the registry account was compromised but the source repo wasn't, and the malicious version published clearly didn't bear any resemblance to the repo contents. Obviously, it's better if projects have clear build scripts defined, but many don't. So I came up with different strategies that seemed reasonable:
Based on which strategies work, we assign a rough confidence. There's definitely room for improvement here. Here's sample output from my favorite string padding library:
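To make the "rough confidence" idea concrete, here is a minimal sketch of how a comparison between a published package and a rebuild from source might work. This is a hypothetical illustration, not OSSGadget's actual implementation: the function names, the tar format, and the fraction-of-matching-files confidence score are all assumptions for the example.

```python
import hashlib
import tarfile

# Hypothetical sketch (not OSSGadget's actual code): compare a published
# package against a rebuild from source, reporting which files match
# byte-for-byte, which differ, and which exist on only one side.

def file_digests(archive_path):
    """Map each regular-file member's path to a SHA-256 digest of its contents."""
    digests = {}
    with tarfile.open(archive_path) as tar:
        for member in tar.getmembers():
            if member.isfile():
                data = tar.extractfile(member).read()
                digests[member.name] = hashlib.sha256(data).hexdigest()
    return digests

def compare_packages(published, rebuilt):
    """Return (matching, differing, only_published, only_rebuilt, confidence)."""
    pub, reb = file_digests(published), file_digests(rebuilt)
    shared = pub.keys() & reb.keys()
    matching = {p for p in shared if pub[p] == reb[p]}
    differing = shared - matching
    only_published = pub.keys() - reb.keys()
    only_rebuilt = reb.keys() - pub.keys()
    # A crude confidence: fraction of all files that are identical on both sides.
    total = len(pub.keys() | reb.keys())
    confidence = len(matching) / total if total else 1.0
    return matching, differing, only_published, only_rebuilt, confidence
```

A real tool would weight differences by how suspicious they are (a changed `.js` payload matters far more than a changed `README`), which is where the multiple strategies mentioned above come in.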
|
I commented on potential levels for a "reproducible" track in #230 (comment). One related set of requirements from the 0.1 spec is pinned dependencies. Content from the linked comment:
Binary reproducibility might be the highest bar, but that doesn't mean we cannot illuminate the lower levels on the way there, highlighting their benefit. One challenge here, of course, is whether the different levels are "common enough" to warrant being grouped together. |
We discussed this in our weekly specification meeting and decided to move it to our backlog. It is a large issue that could be split up and possibly deduped with existing issues, but nobody volunteered to pick it up. |
I couldn't attend the meeting last week. I'm interested in supporting the work, so we have a volunteer :-). |
This part, where we have tiers for "semantically equivalent" and then "reproducible build", gets a big +1 from me. A project should have a motivation to get timestamps out of their build when possible, and I think this gives them that carrot. One of my back burner projects involves comparing the responses to external requests (e.g. dependency downloads during a build) and there are parts of those responses that will always be different (e.g. auth tokens). The focus for me has been excluding comparisons on parts that I expect to change while flagging parts that should be the same, which fits in nicely with the "semantically equivalent" that we're defining here. |
The threat being countered is the case where the package on the repository is subverted, say via a subverted build process or via rights to the distribution repository, but the attacker didn't have the permissions necessary to subvert the source code. Reproducible builds (when verified) counter this risk, because they can detect this case. Semantic equivalency also counters this risk (though not quite as well as reproducible builds). |
Brandon Mitchell to Everyone (Sep 11, 2023, 12:42 PM) made this great quote:
|
My feeling is that reproducible builds don't need a separate track, but rather we ought to document how they are a possible solution to achieve the Build track. That connection is very unclear in the spec now, so it would be good to make it more clear. In the case that the build is not deterministic, such that multiple builds result in different output bits, that seems like a challenge for the implementation: how do you know that two different builds are "close enough" to determine that it still qualifies as SLSA Build Level X? One way would be a claim (either an explicit attestation or an implicit part of the builder) that "benign" changes are ignored. I like the OSS Gadget approach of talking about levels of similarity, which is effectively a "confidence" about how benign the changes are. |
Note: In meeting today 2023-09-11, Mark L thought this might be better within the current build track instead of a separate track. Marcela was inclined the same way as well. So we end up with more levels. |
I wasn't able to attend the call today again unfortunately. Would properties related to this be added in L4 and above or would there be "properties of reproducibility" that are associated with levels 1-3 as they exist now as well? |
Per the meeting today, it was decided to craft some text in a Google doc. Here's the Google doc: https://docs.google.com/document/d/1Jk0yZnkTC3dfp8G5dmO8K9r1Kc7TRX2QVOwcFSKw1OQ/edit |
I added my proposal and rationale to keep reproducibility separate from the build track to the document above. |
Related: #8
Related: slsa-framework/slsa#977
Related: slsa-framework/slsa#873
Signed-off-by: John Andersen <johnandersen777@protonmail.com>
I have an idea that probably needs some refinement, but I think there may be something here. In short: there's a newer "backoff" idea of reproducible builds called "semantic equivalency" that is somewhat easier to achieve than reproducible builds. Since there's a backoff system, this suggests to me that it may be appropriate to have a whole new track. The current build track imposes requirements on protecting a build and sharing information about the build. The possible new track "verifiable build" would impose requirements on the ability to independently verify the results of a build. There are other possibilities, e.g., making "semantic equivalency" part of a new SLSA level 4, with reproducible builds in SLSA level 5 (I'm not sure where hermetic builds go in that case). Below is my thinking, discussion welcome!
===
Reproducible builds are where you rebuild software from a given source and produce a bit-for-bit identical version of the built result. In many ways it's a "gold standard" for verifying builds. Reproducible builds were proposed for SLSA in #5.
Different source code typically produces different build results, but you can identify the source with a cryptographic hash. Different tools typically produce different results, so you must specify the tools used. But the biggest challenge for reproducible builds is that there are many things, such as embedded timestamps, that can make it hard to create a reproducible build.
In some situations reproducible builds are trivial or at least not hard. In others, they only appear easy because the developers have spent many hours to achieve reproducible builds. Good for them! But others find it challenging to create reproducible builds. The survey "SLSA++: A Survey of Supply Chain Security Practices and Beliefs" (published 2023, survey was done in 2022) has info on reproducible builds. In particular, reproducible builds and hermetic builds were considered much more difficult than the other practices surveyed: over 50% of respondents stated that this practice was either extremely difficult or very difficult.
The tool "OSSGadget" includes a tool to measure what they're about to call "semantic equivalency" or "semantically equivalent". (They once used the term "reproducible build", but that was confusing, so they're going to switch names to make the idea clearer to everyone.) A project build is semantically equivalent if "its build results can be either (1) recreated exactly (a bit-for-bit reproducible build), or if (2) the differences between the release package and a rebuilt package are not expected to produce functional differences in normal cases." For example, builds would be considered semantically equivalent if the differences only included differences in date/time stamps. It'd also be fine if the build added/removed files that would not affect the execution of the code, presuming that the code was not malicious to start with and followed "normal" practices -- for example, adding/removing a ".gitignore" file (we would expect that a non-malicious program would not run ".gitignore" and wouldn't do something different depending on the presence of ".gitignore").
@scovetta pointed me towards this "semantically equivalent" measure, and I think it's really promising. Sure, it'd be best if builds were reproducible, but where that's unavailable and those involved are unwilling to change the build process, what's the alternative? This alternative enables end-users to estimate the likelihood of a package being maliciously built (presumably as part of deciding whether or not the package is safe to install).
I had previously mooted the idea of "reproducible builds but ignoring date/timestamps" (because date/timestamps are a common problem for creating reproducible builds). My commendation to the OSSGadget developers & others for developing this alternative.
The threat model is a little different in the case of "semantically equivalent". The assumption isn't that "it is impossible for these differences to cause damage". The assumption is that "the original source code was benign, reasonably coded, and did not do damage". The question is: is this non-reproducible package likely to have been generated from that source, even though it's not a reproducible build?
Here's an example that might clarify the threat model. It's possible that a program could look for ".gitignore" and run it if present. The source code repo might not have a .gitignore file, but a malicious package could add .gitignore and fill it with a malicious application. That would cause malicious code to be executed, but it would also be highly suspicious to run a ".gitignore" file (that's not what they are for), so it's reasonable to assume that the source code didn't do that. If an attacker can insert a file that would cause malicious code to execute in a reasonably-coded app, then that would be a problem. "What's reasonable" is hard to truly write down, but a whitelist of specific filenames seems like a reasonable place to start.
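The two-part definition above (bit-for-bit match, or differences limited to timestamps and an allowlist of benign added/removed files) can be sketched as a simple check. This is a hedged illustration: the allowlist contents, the timestamp regex, and the file-dict representation are assumptions for the example, not OSSGadget's actual rules.

```python
import re

# Hypothetical sketch of the "semantically equivalent" check described above.
# IGNORABLE_FILES is the kind of filename whitelist suggested in the text;
# the actual list and timestamp patterns would need careful curation.
IGNORABLE_FILES = {".gitignore", ".gitattributes"}
TIMESTAMP_RE = re.compile(r"\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}(:\d{2})?")

def normalize(text):
    """Blank out embedded date/time stamps before comparing contents."""
    return TIMESTAMP_RE.sub("<TIMESTAMP>", text)

def semantically_equivalent(release, rebuild):
    """release/rebuild: dicts mapping file path -> file text.

    Bit-for-bit identical trivially passes. Otherwise, shared files may
    differ only in timestamps, and files may be added/removed only if
    their names are on the allowlist of files not expected to affect
    execution.
    """
    for path in release.keys() | rebuild.keys():
        name = path.rsplit("/", 1)[-1]
        if path not in release or path not in rebuild:
            if name not in IGNORABLE_FILES:
                return False  # unexplained added/removed file
        elif normalize(release[path]) != normalize(rebuild[path]):
            return False      # content difference beyond timestamps
    return True
```

Note that under this sketch a bit-for-bit reproducible build passes trivially, which matches clause (1) of the definition; clause (2) is the timestamp normalization plus the allowlist.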
Sure, ideally everything would have a reproducible build. Since that day isn't here, what can we do to take piecemeal steps towards that?
Making this a separate track has its advantages. Semantically equivalent builds, and reproducible builds, imply the ability for independent verification by the recipient. Of course, this only matters if someone does the verification, but it is different than making assertions about the process used to create the build being acquired.
SLSA version 1.0 only had build levels 1-3 defined, because there were many challenges in working out how to define level 4. Maybe a separate track is the way forward. If not, maybe "semantically equivalent" goes in a new level 4, with "reproducible build" being in level 5. In any case, this idea of "semantically equivalent" gives us something new to discuss and think about.
Discussion welcome.
Links: