generic fetcher: Official support with ADR
This change includes the generic fetcher ADR which is a necessary
step for the feature to be officially supported.

Signed-off-by: Jan Koscielniak <jakoscie@redhat.com>
kosciCZ committed Nov 12, 2024
1 parent 5c43194 commit edfeca5
Showing 1 changed file with 104 additions and 0 deletions.
104 changes: 104 additions & 0 deletions docs/adr/0001-add-generic-fetcher.md
@@ -0,0 +1,104 @@
# Add generic fetcher

## Context

The main motivation for this change is to support users who need to download arbitrary files that don't fit
within an established package ecosystem that cachi2 could otherwise support. The target audience is users who
want to use cachi2 to achieve hermetic builds and want an easy way to include these arbitrary files, which cachi2
will account for in the SBOM it produces.

## Decision

A new package manager for generic artifacts must be introduced. This package manager uses a custom
lockfile located in the input repository. Based on that lockfile, it will download files, save them into the requested
location, and verify their checksums. Below is a more detailed overview of the implementation.

### Lockfile format

Cachi2 expects the lockfile to be named `artifacts.lock.yaml`.
To account for possible future breaking changes, the lockfile contains a `metadata` section with a `version`
field that indicates the version of the lockfile format. It also contains a list of artifacts (files) to download.
Each artifact specifies a URL, a checksum, and optionally the resulting filename.

```yaml
metadata:
  # uses X.Y semantic versioning
  version: "1.0"
artifacts:
  - download_url: https://huggingface.co/instructlab/granite-7b-lab/resolve/main/model-00001-of-00003.safetensors?download=true
    filename: granite-model-1.safetensors
    checksum: sha256:d16bf783cb6670f7f692ad7d6885ab957c63cfc1b9649bc4a3ba1cfbdfd5230c
```
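
The structural rules described for the lockfile can be sketched as a small validation step. This is only an illustration, not cachi2's actual implementation: it assumes the YAML has already been parsed into a Python dict, and the function name `validate_lockfile` and the constant `SUPPORTED_MAJOR` are hypothetical.

```python
SUPPORTED_MAJOR = 1  # assumption: this sketch only understands 1.Y lockfiles


def validate_lockfile(data: dict) -> list[dict]:
    """Return the artifact entries after basic structural checks."""
    metadata = data.get("metadata") or {}
    version = str(metadata.get("version", ""))

    # The format uses X.Y versioning, so both parts must be numeric.
    major, sep, minor = version.partition(".")
    if not (sep and major.isdigit() and minor.isdigit()):
        raise ValueError(f"version must look like X.Y, got {version!r}")
    if int(major) != SUPPORTED_MAJOR:
        raise ValueError(f"unsupported lockfile version {version}")

    artifacts = data.get("artifacts")
    if not isinstance(artifacts, list):
        raise ValueError("lockfile must contain a list of artifacts")

    # download_url and checksum are required; filename is optional.
    for artifact in artifacts:
        for key in ("download_url", "checksum"):
            if not artifact.get(key):
                raise ValueError(f"artifact is missing required key {key!r}")
    return artifacts
```

Rejecting an unknown major version up front is one way the `version` field could guard against breaking format changes.
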
#### Lockfile properties

Below is an explanation of the individual properties of the lockfile.

##### download_url (required)

A string containing the download URL of the artifact.

##### checksum (required)

A string in the format `algorithm:hash`. It must be provided to ensure at least some
degree of confidence in the identity of the artifact.
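
A minimal sketch of how a checksum in the `algorithm:hash` format could be verified against a downloaded file, using Python's standard `hashlib`. The function name `verify_checksum` and the `SUPPORTED_ALGORITHMS` allowlist are assumptions made for illustration; the ADR does not state which algorithms cachi2 accepts.

```python
import hashlib
from pathlib import Path

# Assumption: an illustrative allowlist, not cachi2's actual list.
SUPPORTED_ALGORITHMS = {"sha256", "sha512"}


def verify_checksum(path: Path, checksum: str) -> None:
    """Raise ValueError unless the file at `path` matches `algorithm:hash`."""
    algorithm, sep, expected = checksum.partition(":")
    if not sep or not expected:
        raise ValueError(f"checksum must be 'algorithm:hash', got {checksum!r}")
    if algorithm not in SUPPORTED_ALGORITHMS:
        raise ValueError(f"unsupported checksum algorithm {algorithm!r}")

    digest = hashlib.new(algorithm)
    with path.open("rb") as f:
        # Read in 1 MiB chunks so large artifacts are not loaded into memory at once.
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)

    if digest.hexdigest() != expected.lower():
        raise ValueError(f"checksum mismatch for {path}")
```
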
##### filename (optional)

This key is provided mainly for the user's convenience, so that files end up in expected locations. It is optional; if
not specified, it will be derived from the `download_url`. The filename is a path inside cachi2's output directory for
the generic fetcher (`{cachi2-output-dir}/deps/generic`). Cachi2 will verify that the resulting filenames, including those
derived from download URLs, do not overlap.
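
The filename fallback and the overlap check can be sketched as follows. The ADR does not say how a filename is derived from the URL, so taking the last path segment is an assumption, as are the function names `resolve_filename` and `check_no_overlap`.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def resolve_filename(artifact: dict) -> str:
    """Use the explicit filename, else the last path segment of download_url."""
    filename = artifact.get("filename")
    if not filename:
        # Assumption: drop the query string and keep the final path segment.
        url_path = unquote(urlsplit(artifact["download_url"]).path)
        filename = posixpath.basename(url_path)
    if not filename:
        raise ValueError(f"cannot derive a filename from {artifact['download_url']}")
    # Normalize so that "a/../b.bin" and "b.bin" count as the same target.
    return posixpath.normpath(filename)


def check_no_overlap(artifacts: list[dict]) -> dict[str, str]:
    """Map each resolved filename to its URL, failing on duplicates."""
    seen: dict[str, str] = {}
    for artifact in artifacts:
        name = resolve_filename(artifact)
        if name in seen:
            raise ValueError(
                f"{name!r} is targeted by both {seen[name]} and {artifact['download_url']}"
            )
        seen[name] = artifact["download_url"]
    return seen
```

Running the check over explicit and derived names together matters because an explicit `filename` in one entry can collide with the name derived from another entry's URL.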

### SBOM components

Artifacts fetched with the generic fetcher will all be recorded in the SBOM that cachi2 produces. Because no
extra information about these files can be derived beyond a download location and a filename, they will always be recorded
as SBOM components with a purl of type `generic`.

Additionally, each SBOM component will contain [externalReferences] of type `distribution` indicating the URL used to download
the file, which makes the SBOM easier to handle for tools that process it.

Here is an example SBOM generated for the file above.

```json
{
  "bomFormat": "CycloneDX",
  "components": [
    {
      "name": "granite-model-1.safetensors",
      "purl": "pkg:generic/granite-model-1.safetensors?checksum=sha256:d16bf783cb6670f7f692ad7d6885ab957c63cfc1b9649bc4a3ba1cfbdfd5230c&download_url=https://huggingface.co/instructlab/granite-7b-lab/resolve/main/model-00001-of-00003.safetensors",
      "properties": [
        {
          "name": "cachi2:found_by",
          "value": "cachi2"
        }
      ],
      "type": "file",
      "externalReferences": [
        {
          "url": "https://huggingface.co/instructlab/granite-7b-lab/resolve/main/model-00001-of-00003.safetensors",
          "type": "distribution"
        }
      ]
    }
  ],
  "metadata": {
    "tools": [
      {
        "vendor": "red hat",
        "name": "cachi2"
      }
    ]
  },
  "specVersion": "1.4",
  "version": 1
}
```
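
The mapping from a lockfile entry to a component like the one above can be sketched as a small function. It simply mirrors the layout of the example, including the unencoded qualifier values in the purl. The function name `to_sbom_component` is hypothetical, and a real implementation might build the purl with a dedicated library instead of string formatting.

```python
def to_sbom_component(filename: str, checksum: str, download_url: str) -> dict:
    """Build a CycloneDX-style component dict for one generic artifact."""
    # Assumption: qualifiers are concatenated as-is, matching the example above.
    purl = f"pkg:generic/{filename}?checksum={checksum}&download_url={download_url}"
    return {
        "name": filename,
        "purl": purl,
        "properties": [{"name": "cachi2:found_by", "value": "cachi2"}],
        "type": "file",
        # The distribution reference repeats the URL so tools need not parse the purl.
        "externalReferences": [{"url": download_url, "type": "distribution"}],
    }
```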

## Consequences

As mentioned before, this package manager enables users to fetch arbitrary files with cachi2 and have them accounted for
in the SBOM.

[externalReferences]: https://cyclonedx.org/docs/1.6/json/#components_items_externalReferences
