Reference Types Working Group remaining tasks #337

Closed
7 tasks done
sudo-bmitch opened this issue Aug 25, 2022 · 9 comments

@sudo-bmitch
Contributor

sudo-bmitch commented Aug 25, 2022

From our meeting today, we are merging #335 with the following items left for a future PR:

  • "Refers": add "field" or "to"?
  • "Referrers": add "list"?
  • Change defined words to lower case rather than upper case in the spec.
  • Change "registry SHOULD accept a manifest with a refers" to "MUST". Include details of why (without saying GC).
  • Change <reference> to <digest> and define <digest> in the Referrers API. Note that "tags may be supported in the future".
  • Define an error code if the reference cannot be parsed in the Referrers API.
  • Add "note" to "Multiple clients could attempt to update the tag simultaneously resulting in race conditions and data loss."
@mikebrow
Member

  • discuss an implied refer digest for/to the image.index when the image.index includes an artifact manifest without a refer reference
  • discuss what it means to be an artifact that does not have a refers (potentially used as an end leaf only for some number of graphs)

sudo-bmitch added this to the v1.1.0 milestone Aug 26, 2022
@sudo-bmitch
Contributor Author

@mikebrow I don't think we want to change how existing artifacts, like a Helm chart, are handled. Does that negate some of the thoughts you had with implied refers digests and leaf nodes in a graph?

@Jamstah
Contributor

Jamstah commented Jan 23, 2023

discuss an implied refer digest for/to the image.index when the image.index includes an artifact manifest without a refer reference

Is this proposing that the referrers API should include referrers from image indexes? I was looking to see if I could find any discussions around doing that...

@sudo-bmitch
Contributor Author

@Jamstah
Contributor

Jamstah commented Jan 23, 2023

That's more allowing an index to have a subject field, which I agree doesn't add up because of the graph looping.

I was more thinking:

  • I create an image index at registry.io/repo:index with two platforms, amd64 @sha256:abcdef and arm @sha256:ghijkl
  • I call registry.io/v2/repo/referrers/abcdef

Would it make sense for that to return:

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.index.v1+json",
      "size": 1234,
      "digest": "sha256:a1a1a1...",
      "artifactType": "application/vnd.oci.image.index.v1+json"
    }
  ]
}

Would need to cover this in the spec I think, and add index info to this section:

Each descriptor is of an image or artifact manifest in the same namespace with a subject field that specifies the value of <digest>. The descriptors MUST include an artifactType field that is set to the value of artifactType for an artifact manifest if present, or the configuration descriptor's mediaType for an image manifest.

I'm guessing this has been discussed too, but I couldn't see it :/
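
The artifactType rule in the quoted spec text can be sketched as a small function. This is only an illustration: the function name `descriptor_artifact_type` and the use of parsed manifest JSON as a plain dict are assumptions, not part of the spec or of any registry implementation.

```python
from typing import Optional


def descriptor_artifact_type(manifest: dict) -> Optional[str]:
    """Pick the artifactType to report in a referrers descriptor.

    `manifest` is assumed to be the parsed JSON of a manifest that
    already has a matching subject field.
    """
    # Artifact manifest: use its own artifactType when it is present.
    artifact_type = manifest.get("artifactType")
    if artifact_type:
        return artifact_type
    # Image manifest: fall back to the config descriptor's mediaType.
    config = manifest.get("config") or {}
    return config.get("mediaType")
```

An image index has neither field, which is one reason the quoted section would need new wording before index entries could be listed.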

@sudo-bmitch
Contributor Author

The referrers API only returns manifests with a subject field whose digest matches the requested digest. Treating the index entries as implicit subjects has the same issue as giving the index an explicit subject.
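
A minimal sketch of that matching rule, assuming a hypothetical in-memory store that maps each manifest digest to its parsed JSON. The function name `referrers` and the store layout are illustrative, and the descriptors omit `size` and `artifactType` for brevity.

```python
from typing import Dict

INDEX_MEDIA_TYPE = "application/vnd.oci.image.index.v1+json"


def referrers(store: Dict[str, dict], digest: str) -> dict:
    """Build a referrers response for `digest` from an in-memory store."""
    descriptors = []
    for manifest_digest, manifest in store.items():
        subject = manifest.get("subject")
        # Only an explicit subject counts; being listed in an index's
        # "manifests" array does not make that index a referrer.
        if subject and subject.get("digest") == digest:
            descriptors.append({
                "mediaType": manifest.get("mediaType"),
                "digest": manifest_digest,
            })
    return {
        "schemaVersion": 2,
        "mediaType": INDEX_MEDIA_TYPE,
        "manifests": descriptors,
    }
```

With this rule, an index that lists sha256:abcdef as one of its platforms is not returned when querying the referrers of sha256:abcdef.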

@Jamstah
Contributor

Jamstah commented Jan 24, 2023

OK, I think I get it. It's all about the direction of the parent-child relationship. An artifact is the child of its subject. An index is the parent of its target. If we (implicitly) add a target as a subject of an index, they become both parents and children of each other. This hits problems where clients are both following indirection (to handle tags/indexes) and referrers (to copy image metadata).

If the subject->artifact link is always parent->child, there is no looping and the client can:

  • when copying, copy all referrers and follow indirection
  • when displaying a UI tree, all referrers and indirections are children
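
The copy case above could be sketched as a walk that treats both index entries and referrers as children. This assumes a hypothetical in-memory store of digest to parsed manifest; the name `digests_to_copy` is illustrative, and blobs are left out to keep the sketch short.

```python
from typing import Dict, Set


def digests_to_copy(store: Dict[str, dict], root: str) -> Set[str]:
    """Collect the manifests reachable from `root` going parent to child."""
    # Reverse lookup: subject digest -> digests of manifests that refer to it.
    by_subject: Dict[str, list] = {}
    for digest, manifest in store.items():
        subject = manifest.get("subject")
        if subject:
            by_subject.setdefault(subject["digest"], []).append(digest)

    seen: Set[str] = set()
    stack = [root]
    while stack:
        digest = stack.pop()
        if digest in seen:
            continue
        seen.add(digest)
        # Indirection: an index points down at its platform manifests.
        for entry in store.get(digest, {}).get("manifests", []):
            stack.append(entry["digest"])
        # Referrers: artifacts whose subject is this manifest are children.
        stack.extend(by_subject.get(digest, []))
    return seen
```

The `seen` set is only a safeguard here. If an index were also treated as a referrer of its own entries, the walk would climb back up from a platform manifest to every index that lists it, which is the looping problem described above.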

I had it in my head that a subject link can be used for indirection, but I've been thinking about it, and can't really come up with a good use case for that. A tag is for individual indirection and an index is for dispatch indirection, a subject should never be used for indirection. (do you agree?)

The only reason I was thinking to add index entries as referrers was to make GC easier, so the registry can make a GC decision on a digest without needing to evaluate the whole graph. I can imagine there might be use cases for clients to want to ask the registry "Is this digest a child of anything?", but I can't think of them right now, so the registry will either need to process the graph as a whole for GC, or maintain its own metadata about index references that is not exposed and use that.

Should we consider putting any GC guidelines into the distribution spec? For example:

  • Registries should not consider any tagged digest for GC
  • Registries should not consider any image/artifact with a subject that exists for GC
  • Registries should not consider any digest that is included in an index for GC
  • Any other digest is a candidate for garbage collection (but not guaranteed to be collected)
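
A literal reading of these four proposed rules can be sketched as a filter over manifests. The store layout (digest to parsed manifest), the `tags` mapping (tag name to digest), and the name `gc_candidates` are all assumptions for illustration; this is not how any particular registry implements GC.

```python
from typing import Dict, Set


def gc_candidates(store: Dict[str, dict], tags: Dict[str, str]) -> Set[str]:
    """Apply the proposed rules one manifest at a time, without recursion."""
    tagged = set(tags.values())
    # Every digest that appears as an entry of some index in the store.
    in_index = {
        entry["digest"]
        for manifest in store.values()
        for entry in manifest.get("manifests", [])
    }
    candidates: Set[str] = set()
    for digest, manifest in store.items():
        if digest in tagged:
            continue  # rule 1: tagged digests are kept
        subject = manifest.get("subject")
        if subject and subject.get("digest") in store:
            continue  # rule 2: its subject still exists
        if digest in in_index:
            continue  # rule 3: listed in an index
        candidates.add(digest)  # rule 4: a candidate, not a guarantee
    return candidates
```

Read this way, the rules are not transitive: an untagged index still protects its entries until the index itself is collected, so reclaiming a whole tree would take several passes.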

@sudo-bmitch
Contributor Author

I had it in my head that a subject link can be used for indirection, but I've been thinking about it, and can't really come up with a good use case for that. A tag is for individual indirection and an index is for dispatch indirection, a subject should never be used for indirection. (do you agree?)

I'm not quite sure I follow, but probably.

The only reason I was thinking to add index entries as referrers was to make GC easier, so the registry can make a GC decision on a digest without needing to evaluate the whole graph. I can imagine there might be use cases for clients to want to ask the registry "Is this digest a child of anything?", but I can't think of them right now, so the registry will either need to process the graph as a whole for GC, or maintain its own metadata about index references that is not exposed and use that.

We've avoided adding GC to the spec explicitly, but the general advice is to treat the referrers to a manifest the same as you would treat child manifests of an index. As long as the index is tagged, many registries would keep that index and all child manifests. So if a registry is keeping an image manifest, it would also keep all artifacts with a subject field pointing to that image manifest. Inversely, when a manifest is deleted, any untagged manifest with a subject field pointing to the deleted manifest is often safe to remove.

Overly specific guidance is difficult because GC has been implemented differently for different reasons. Some registries maintain untagged manifests for various reasons (maybe time since it was untagged or last pulled, or n number of previous values of a tag). Then there are registries like ttl.sh that delete any tagged image after a timeout.

A reverse reference API might be useful for other reasons, but probably not for GC.

For GC, what I've seen described most is a mark and sweep method, where a registry marks all manifests to preserve, and then recursively marks all child objects (manifests and blobs). With the fallback tag, that model still works since there's a tagged index pointing to the artifacts. When the referrers API is added, registries should treat those referrers as child manifests when recursively marking objects to preserve.
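
A minimal sketch of that mark and sweep model with referrers treated as children. It assumes tagged manifests are the only roots and uses a hypothetical in-memory layout (`manifests` as digest to parsed JSON, `blobs` as a set of digests, `tags` as tag name to digest); real registries choose their roots and retention policies differently.

```python
from typing import Dict, Set


def mark_and_sweep(manifests: Dict[str, dict], blobs: Set[str],
                   tags: Dict[str, str]) -> Set[str]:
    """Return the digests left unmarked, i.e. those eligible for removal."""
    # Reverse lookup so referrers can be marked as children of their subject.
    by_subject: Dict[str, list] = {}
    for digest, manifest in manifests.items():
        subject = manifest.get("subject")
        if subject:
            by_subject.setdefault(subject["digest"], []).append(digest)

    marked: Set[str] = set()
    stack = list(tags.values())  # assumed roots: every tagged manifest
    while stack:
        digest = stack.pop()
        if digest in marked:
            continue
        marked.add(digest)
        manifest = manifests.get(digest)
        if manifest is None:
            continue  # a blob, or a digest missing from the store
        # Children of an index are its entries.
        stack.extend(e["digest"] for e in manifest.get("manifests", []))
        # Children of an image or artifact manifest are its blobs.
        if "config" in manifest:
            stack.append(manifest["config"]["digest"])
        stack.extend(layer["digest"] for layer in manifest.get("layers", []))
        # Referrers are marked the same way as child manifests.
        stack.extend(by_subject.get(digest, []))

    # Sweep: anything stored but never marked.
    return (set(manifests) | set(blobs)) - marked
```

In this sketch, a signature whose subject is a kept image survives along with its own layers, while an untagged manifest and any artifacts pointing only at it are swept.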

@jdolitsky
Member

I think all remaining tasks have been addressed and the working group is no longer in session.
