diff --git a/spec/v1/CHANGELOG.md b/spec/v1/CHANGELOG.md new file mode 100644 index 00000000..5e3f9ace --- /dev/null +++ b/spec/v1/CHANGELOG.md @@ -0,0 +1,11 @@ +# Changelog + +## v1.1 + +- Clarified that subjects are assumed to be immutable and that it is +acceptable to use a non-cryptographic digest (though cryptographic +digests are still strongly recommended). + +## v1 + +Initial release. diff --git a/spec/v1/README.md b/spec/v1/README.md index 48dae99a..522a8b58 100644 --- a/spec/v1/README.md +++ b/spec/v1/README.md @@ -1,6 +1,6 @@ # Specification for in-toto attestation layers -Version: v1.0 +Version: v1.1 Index: diff --git a/spec/v1/bundle.md b/spec/v1/bundle.md index c7dee9d7..6d9a168f 100644 --- a/spec/v1/bundle.md +++ b/spec/v1/bundle.md @@ -1,7 +1,5 @@ # Bundle layer specification -Version: v1.0 - An attestation Bundle is a collection of multiple attestations in a single file. This allows attestations from multiple different points in the software supply chain (e.g. Provenance, Code Review, Test Result, vuln scan, ...) to diff --git a/spec/v1/digest_set.md b/spec/v1/digest_set.md index 15971614..8cd19220 100644 --- a/spec/v1/digest_set.md +++ b/spec/v1/digest_set.md @@ -1,16 +1,14 @@ # DigestSet field type specification -Version: v1.0 - -Set of one or more cryptographic digests for a single software artifact or -metadata object. +Set of one or more cryptographic digests, or other immutable references, +for a single software artifact or metadata object. ## Schema ```json { - "": "", - "": "", + "": "", + "": "", ... } ``` @@ -23,6 +21,12 @@ algorithms below use lowercase hex encoding. Usually there is just a single key/value pair, but multiple entries MAY be used for algorithm agility. +Each entry in a DigestSet MUST be an immutable reference to an artifact. It is +STRONGLY RECOMMENDED to use a commonly accepted, cryptographically secure digest +algorithm to achieve this immutability. 
See [Use cases for non-cryptographic, +immutable, digests](#use-cases-for-non-cryptographic-immutable-digests) for +further guidance. + ### Supported algorithms #### `sha256`, `sha224`, `sha384`, `sha512`, `sha512_224`, `sha512_256`, `sha3_224`, `sha3_256`, `sha3_384`, `sha3_512`, `shake128`, `shake256`, `blake2b`, `blake2s`, `ripemd160`, `sm3`, `gost`, `sha1`, `md5` @@ -144,6 +148,38 @@ matches. New algorithms MUST document how the value is encoded, e.g. URL-safe base64, lowercase hex, etc... +### Use cases for non-cryptographic, immutable, digests + +While cryptographic digests are the strongly recommended immutable identifier, +users might need to refer to an artifact by some other means. For example, +it might be technically infeasible to compute a digest over the content, or +the user might interact with the content through an interface that doesn't +expose them to the entirety of the content. + +In these situations, users MAY use a non-cryptographic identifier in a DigestSet +so long as the risk of the object being mutated is acceptable for the +application. + +One concrete example of where a non-cryptographic hash can be useful is when +referring to Virtual Machine images. Often these images are very large +(impractical to run a cryptographic hash over) and users often interact with +them via APIs that the platform provides that don't involve the user having +complete custody of the content. Platforms like AWS and GCP provide 'ids' for +users to use when referring to these images. A user may say something like +"create an instance with image 123". In that case the user doesn't actually have +the bits that correspond to 'image 123' so they cannot digest it themselves. And +by the time the image has started it can be difficult, if not impossible, to +digest the original content that was used to boot the instance. + +These IDs can often be treated as immutable and may be perfectly suited to users' +threat profiles. 
Allowing DigestSets to use these types of identifiers allows +providers to make statements about the content of these VM images using the +identifiers their users have ready access to. + +In addition, using an ID like this does not preclude including a cryptographic +hash in the DigestSet as well. If possible, including both may provide the most +flexibility for the user's various use cases. + ## Examples - `{"sha256": "abcd", "sha512": "1234"}` matches `{"sha256": "abcd"}` diff --git a/spec/v1/predicate.md b/spec/v1/predicate.md index f9cc4a1d..e2fd7dc4 100644 --- a/spec/v1/predicate.md +++ b/spec/v1/predicate.md @@ -1,7 +1,5 @@ # Predicate layer specification -Version: v1.0 - The Predicate is the innermost layer of the attestation, containing arbitrary metadata about the [Statement]'s `subject`. diff --git a/spec/v1/resource_descriptor.md b/spec/v1/resource_descriptor.md index 5285a29d..3d87e11b 100644 --- a/spec/v1/resource_descriptor.md +++ b/spec/v1/resource_descriptor.md @@ -1,7 +1,5 @@ # ResourceDescriptor field type specification -Version: v1.0 - A size-efficient description of any software artifact or resource (mutable or immutable). diff --git a/spec/v1/statement.md b/spec/v1/statement.md index 611fc471..630f773c 100644 --- a/spec/v1/statement.md +++ b/spec/v1/statement.md @@ -1,7 +1,5 @@ # Statement layer specification -Version: v1.0 - The Statement is the middle layer of the attestation, binding it to a particular subject and unambiguously identifying the types of the [Predicate]. @@ -38,6 +36,9 @@ Additional [parsing rules] apply. > Set of software artifacts that the attestation applies to. Each element > represents a single software artifact. Each element MUST have `digest` set. > +> Subjects are assumed to be _immutable_, i.e. the artifacts identified by the +> subject SHOULD NOT change. +> > The `name` field may be used as an identifier to distinguish this artifact > from others within the `subject`. 
Similarly, other ResourceDescriptor fields > may be used as required by the context. The semantics are up to the producer