editorial: clarify requirements around cache use by the build platform #901

Closed
wants to merge 6 commits into from
15 changes: 12 additions & 3 deletions docs/spec/v1.0/requirements.md
@@ -307,13 +307,22 @@ The build platform MUST guarantee the following:
- It MUST NOT be possible for one build to persist or influence the build
environment of a subsequent build. In other words, an ephemeral build
environment MUST be provisioned for each build.
- The build platform MUST NOT open services that allow for remote influence
unless all such interactions are captured as `externalParameters` in the
provenance.

If the build platform leverages a cache for builds, it MUST guarantee the following:

- It MUST NOT be possible for one build to inject false entries into a build
cache used by another build, also known as "cache poisoning". In other
words, the output of the build MUST be identical whether or not the cache is
used.
- The build platform MUST NOT open services that allow for remote influence
unless all such interactions are captured as `externalParameters` in the
provenance.
- If the build platform is capable of providing the provenance information for
an external resource when a cache is not in use, then the provenance
information MUST remain unchanged if a cache is used. In other words, the
information in the provenance MUST be identical whether or not the cache is
Member

I think this + @mlieberman85's suggestion re the artifact hash addresses the reproducibility issue. There may be a separate question about whether intermediate artifacts that can impact the target artifact should be recorded for completeness but that's out of scope here.

Member Author

@mlieberman85, do you have a suggestion for the content of this PR related to your earlier comment?

Member

So, this makes sense to me, but I do wonder if it's not clear enough to someone who isn't familiar with what we're implying? Maybe just a clarification that unique identifiers such as checksums are what we're talking about here?

Member Author

@arewm, Sep 6, 2023

Unique identifiers of what? The produced artifact or the cached entries used?

If the former, are you suggesting that we indicate that the hash of the artifact can differ if the cache is used because we are not claiming anything about reproducibility of the artifact?

If the latter, that seems like it might fit better in the previous bullet like so:

-  It MUST NOT be possible for one build to inject false entries into a build
    cache used by another build, also known as "cache poisoning". In other
    words, the output of the build MUST be identical whether or not the cache is
    used. This SHOULD be achieved using unique identifiers in the cache such
    as checksums.

Member

@adityasaky, Sep 11, 2023

> In other words, the output of the build MUST be identical whether or not the cache is used.

This implicitly requires reproducibility, no? Unless this only applies to the provenance predicate, allowing the hash of the produced artifact to still change.

used. Communication with the build cache MUST NOT be represented in
`resolvedDependencies`.

There are no sub-requirements on the build itself. Build L3 is limited to
ensuring that a well-intentioned build runs securely. It does not require that
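
For readers following the thread, the checksum idea raised in the review can be illustrated with a minimal sketch. It is not part of the PR or of any real build platform: `VerifiedCache` and `fetch_dependency` are hypothetical names, and the sketch assumes the build already knows the SHA-256 digest of each external resource it expects. Under that assumption, a cache entry is served only if its content matches the expected digest, so a poisoned entry is treated as a miss and the resolved content is the same whether or not the cache is used.

```python
import hashlib


class VerifiedCache:
    """Hypothetical content-addressed cache keyed by SHA-256 digest."""

    def __init__(self):
        self._store = {}

    def put(self, data: bytes) -> str:
        # The key is derived from the content, never supplied by the caller.
        digest = hashlib.sha256(data).hexdigest()
        self._store[digest] = data
        return digest

    def get(self, expected_digest: str):
        data = self._store.get(expected_digest)
        if data is None:
            return None
        if hashlib.sha256(data).hexdigest() != expected_digest:
            # Entry does not match its key: drop it and report a miss.
            del self._store[expected_digest]
            return None
        return data


def fetch_dependency(expected_digest, fetch_upstream, cache=None):
    """Return the resource bytes, using the cache only when it verifies."""
    if cache is not None:
        cached = cache.get(expected_digest)
        if cached is not None:
            return cached
    data = fetch_upstream()
    if hashlib.sha256(data).hexdigest() != expected_digest:
        raise ValueError("upstream content does not match expected digest")
    if cache is not None:
        cache.put(data)
    return data


payload = b"dependency contents"
digest = hashlib.sha256(payload).hexdigest()
cache = VerifiedCache()

without_cache = fetch_dependency(digest, lambda: payload)
with_cache = fetch_dependency(digest, lambda: payload, cache)

# Simulate another build tampering with the stored entry.
cache._store[digest] = b"malicious"
after_poisoning = fetch_dependency(digest, lambda: payload, cache)
```

In this sketch the digest identifies the external resource rather than the cache, so a provenance record built from it would not need to change when a cache is introduced. Whether that satisfies the wording debated above is a question for the spec authors, not something this example settles.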