*: add ability to use a shared CAS directory #190
@cyphar this is a bit of a POC or strawman; I'd love some feedback on whether you'd take a feature like this and how you'd expect to see it implemented (I can think of about a thousand different approaches, so I'm not sure which you'd prefer).
On Wed, Oct 11, 2017 at 02:53:52PM +0000, Jonathan Boulle wrote:
> This is particularly useful when working with multiple image layouts
> and wanting to share blobs between them.
I'm not sure if umoci is interested in picking this up, but as a
related idea, oci-discovery has a CAS-protocol registry [1] with an
oci-cas-template-v1 entry [2]. The URI Template approach is inspired
by parcel (e.g. [3]). With a similar URI Template approach for refs
[4], image-spec layouts can become a special-case ref-engine-discovery
approach (xiekeyang/oci-discovery#20). With that approach, you could
add casEngines entries to the oci-layout file [5] to point at your
central CAS engine:
  $ cat oci-layout
  {
    "imageLayoutVersion": "1.1.0",
    "refEngines": [
      {
        "protocol": "oci-index-template-v1",
        "uri": "index.json"
      }
    ],
    "casEngines": [
      {
        "protocol": "oci-cas-template-v1",
        "uri": "file:///wherever/you/keep/your/blobs/{algorithm}/{encoded}"
      }
    ]
  }
> This patch adds a `--shared-cas` flag to all umoci commands that
> work with image layouts.
The casEngines-in-oci-layout approach lets you declare this sort of
thing in the layout itself, where all consumers can access it. That
means you'd only need a --shared-cas entry when you initially create a
new layout.
If there is any umoci interest in this approach, I'm happy to help
code it up. oci-index-template-v1 is already implemented in Go [6],
and I was planning on writing an oci-cas-template-v1 implementation
this week anyway (I just hadn't decided on which repo to put it in yet
;).
[1]: https://github.com/xiekeyang/oci-discovery/blob/0be7eae246ae9a975a76ca209c045043f0793572/cas-engine-protocols.md
[2]: https://github.com/xiekeyang/oci-discovery/blob/0be7eae246ae9a975a76ca209c045043f0793572/cas-template.md
[3]: cyphar/parcel@b106252#diff-6a735bf86dcc8fb712e11c207df01ae4R88
[4]: https://github.com/xiekeyang/oci-discovery/blob/0be7eae246ae9a975a76ca209c045043f0793572/index-template.md
[5]: https://github.com/wking/image-spec/blob/ref-engine-discovery-layout/image-layout.md#layout-object
[6]: https://github.com/xiekeyang/oci-discovery/blob/0be7eae246ae9a975a76ca209c045043f0793572/tools/refengine/indextemplate/indextemplate.go
@wking As I've mentioned before, my issue with `casEngines` is because of precisely the layer violation you are referring to here.
On Thu, Oct 12, 2017 at 07:41:23AM +0000, Aleksa Sarai wrote:
> @wking As I've mentioned before, my issue with `casEngines` is
> because of precisely the layer violation you are referring to here.
I expect you're referencing [1]. Can you provide more details on the
issue you see (possibly in an oci-discovery issue, if you feel it's
too off-topic here)? Is it the lack of blob-media-type routing you
added to parcel [2]? I agree that the OCI CAS Template Protocol [3]
does not address that, but there's no reason you couldn't register [4]
a new CAS-engine protocol that addresses it. For example, if we
registered your parcel blobURIs as parcel-blobURIs-v1, you could have
a casEngines entry like:
  "casEngines": [
    {
      "protocol": "parcel-blobURIs-v1",
      "blobURIs": [
        {
          "mediaType": "application/vnd.oci.image.manifest.v1+json",
          "templates": [
            "https://manifests.cyphar.com/opensuse/{parcel.discovery.nameDigest}/manifests/{parcel.fetch.blob.digestAlgorithm}/{parcel.fetch.blob.digest}"
          ]
        },
        …
      ]
    },
    …
  ]
And for things like Docker's registry and its auth dance, you could
have a ‘docker’ CAS-engine protocol similar to the ‘docker’ ref-engine
protocol I sketched in [5].
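As a rough illustration of the media-type routing such a protocol would need, the following Go sketch picks the templates registered for a given blob media type. The `BlobURIs` type, the `templatesFor` helper, and the example.com URLs are hypothetical; they only mirror the shape of the JSON above and are not taken from parcel.

```go
package main

import "fmt"

// BlobURIs mirrors one element of the hypothetical "blobURIs" array.
type BlobURIs struct {
	MediaType string
	Templates []string
}

// templatesFor returns the templates registered for mediaType,
// or nil when no entry matches.
func templatesFor(entries []BlobURIs, mediaType string) []string {
	for _, entry := range entries {
		if entry.MediaType == mediaType {
			return entry.Templates
		}
	}
	return nil
}

func main() {
	entries := []BlobURIs{
		{
			MediaType: "application/vnd.oci.image.manifest.v1+json",
			Templates: []string{"https://manifests.example.com/{algorithm}/{encoded}"},
		},
		{
			MediaType: "application/vnd.oci.image.layer.v1.tar+gzip",
			Templates: []string{"https://layers.example.com/{algorithm}/{encoded}"},
		},
	}
	fmt.Println(templatesFor(entries, "application/vnd.oci.image.manifest.v1+json"))
}
```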
I've rewritten significant portions of parcel's specification, by
the way.
Glad to see that moving forward :). I filed some small copy-edit PRs
a few hours back ;).
[1]: xiekeyang/oci-discovery#1 (comment)
[2]: https://github.com/cyphar/parcel/blame/b10625235275423b27032b764d8888e76aa2b122/DESIGN.md#L335-L339
[3]: https://github.com/xiekeyang/oci-discovery/blob/0be7eae246ae9a975a76ca209c045043f0793572/cas-template.md
[4]: https://github.com/xiekeyang/oci-discovery/blob/0be7eae246ae9a975a76ca209c045043f0793572/cas-engine-protocols.md
[5]: https://github.com/xiekeyang/oci-discovery/blob/0be7eae246ae9a975a76ca209c045043f0793572/well-known-uri-ref-engine-discovery.md#example-1
My issue is that it doesn't "smell" right for semantic information about this sort of "shared CAS" to be stored in descriptors. I'm a huge proponent of doing it top-down, distribution style. Effectively, it boils down to this: I'm not convinced that misusing descriptors to do something more peer-to-peer is better in the long run than having a federated distribution system (similar to the existing distribution system for packages).
No, it's the above, but you're right that the reason I added multiple backend support was with the intention of allowing combining blobs at a higher level (similar to what the PR does, but at the pull stage and not at the actual operational stage). But all of this is quite off-topic. If you want, we can discuss this in the parcel issue tracker (or oci-discovery), or by email.
@jonboulle I'm not opposed to the idea (as we discussed recently), but I'll have to get back to you on which layer this should be done at. Also, out of interest, did you find any additional issues with symlinking blobs other than
@cyphar manual symlink munging works, but we definitely want to get out of that business if at all possible and punt whatever we can into the OCI tooling, especially if there's any chance that something like opencontainers/image-tools#40 might happen. (You might consider this my attempt to create precedent, with a clear use case and active implementation.) I should mention that since our particular use doesn't involve particularly long-running systems (i.e. there's a certain level of ephemerality in the CAS usage), we're happy for the implementation to change across some releases if we need to iterate towards a converged solution.
// FIXME: We can make this a list.
BlobAlgorithm = digest.SHA256
// DefaultBlobAlgorithm is the default supported digest algorithm for blobs
DefaultBlobAlgorithm = digest.SHA256
Deprecate in favor of go-digest's Canonical?
Spun off into #197.
This commit hard-codes the blobs/{algorithm}/{encoded} template [1], but sets the stage for future work to relax that positioning [2].

I'm adding a PutIndex call in the tests, because the CAS implementation now has its own temp directory which is not known to the dirEngine. Casengine's dir implementation does not flock its temporary directory, but it is protected from Clean by the pruneExpire logic.

The "Deprecated:" syntax is discussed in [3,4,5].

[1]: https://github.com/opencontainers/image-spec/blob/v1.0.0/image-layout.md#blobs
[2]: https://github.com/openSUSE/umoci/pull/190
[3]: https://blog.golang.org/godoc-documenting-go-code
[4]: golang/blog@257114a
[5]: golang/go#10909

Signed-off-by: W. Trevor King <wking@tremily.us>
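For readers unfamiliar with the "Deprecated:" convention mentioned in the commit message, here is a small Go sketch of it. Which identifier the commit actually deprecates is not shown in this thread, so the choice below (deprecating `BlobAlgorithm` in favor of `DefaultBlobAlgorithm`) is an assumption, and `Algorithm` and `SHA256` are local stand-ins for go-digest's types so the example stays self-contained.

```go
package main

import "fmt"

// Algorithm is a local stand-in for go-digest's digest.Algorithm.
type Algorithm string

// SHA256 is a local stand-in for digest.SHA256.
const SHA256 Algorithm = "sha256"

const (
	// DefaultBlobAlgorithm is the default supported digest algorithm for blobs.
	DefaultBlobAlgorithm = SHA256

	// BlobAlgorithm is the previous name for DefaultBlobAlgorithm.
	//
	// Deprecated: Use DefaultBlobAlgorithm instead.
	BlobAlgorithm = DefaultBlobAlgorithm
)

func main() {
	// Old callers keep compiling, while documentation tools that
	// understand the convention can flag the deprecated identifier.
	fmt.Println(BlobAlgorithm == DefaultBlobAlgorithm)
}
```

The convention is a doc-comment paragraph that begins with "Deprecated:"; keeping the old name as an alias lets existing callers migrate gradually.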
I've walked a bit down this path with wking/umoci@2908a4a. While I still think decoupling the CAS and ref engines is useful, I no longer think you can configure a digest-listing CAS engine with just a URI Template. URI Templates are not reversible in general (jtacoma/uritemplates#2), and even if they were, there are issues with reversing reference resolution. You need to reverse expansion to list stored digests (casengines, umoci). In casengine, I'm getting around that by requiring the caller to provide their own reverser (wking/casengine@76d4062). But since you don't need digest-listing support for most operations and because the base URI for relative expansion is unclear, including a reversing regexp pattern in the engine config seems like too large a hack. That means any CAS engines configured solely via

However, I'm still not sure we need to go as far as bdd8bdf and pass down location information from every command that might interact with CAS. Operations like

If loading the CAS URI Template or base directory from

And while the
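A tiny Go example shows why URI Templates cannot be reversed in general. It uses a deliberately ambiguous, hypothetical template with two adjacent variables; the `expand` helper is simple substitution written for this sketch, not a real URI Template library.

```go
package main

import (
	"fmt"
	"strings"
)

// expand performs simple {name} substitution (a small subset of RFC 6570).
func expand(template string, vars map[string]string) string {
	pairs := make([]string, 0, 2*len(vars))
	for name, value := range vars {
		pairs = append(pairs, "{"+name+"}", value)
	}
	return strings.NewReplacer(pairs...).Replace(template)
}

func main() {
	// Hypothetical template with no separator between the variables.
	const template = "blobs/{algorithm}{encoded}"

	a := expand(template, map[string]string{"algorithm": "sha256", "encoded": "abc"})
	b := expand(template, map[string]string{"algorithm": "sha", "encoded": "256abc"})

	// Two different variable assignments produce the same URI, so the
	// URI alone cannot tell us which digest it was expanded from.
	fmt.Println(a, b, a == b)
}
```

The default blobs/{algorithm}/{encoded} template avoids this particular ambiguity because the separator cannot appear in either variable, which is why reversal can work for specific templates even though it fails in general.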
I personally think that `ListBlobs` should be something that is optionally supported. Aside from
On Thu, Oct 19, 2017 at 08:36:43AM -0700, Aleksa Sarai wrote:
> I personally think that `ListBlobs` should be something that is
> optionally supported.
Yeah. This is where I was going with my second paragraph in [1].
I'll post again once I have a working implementation.
> However, `parcel` discussions are a bit out-of-place here. 😉
This isn't really a parcel discussion. I think a URI Template
approach to blob location would be a good way to address @jonboulle's
shared-CAS use case *in umoci*. wking/umoci@2908a4a, which I mentioned
in [1], is initial work in that direction, and I'll keep working on it
until I have a working example of the “loading a template URI from
oci-layout” idea for you to consider. So I think these alternative
approaches are on-topic for this PR, although I'll try to keep my
references short.
Alternatively, I could open a WIP PR with my wking/umoci@2908a4a and
compare that approach with this one in that PR. That divides the
conversation, but would keep URI Template discussion out of this
branch of the conversation. Let me know if you want me to file that
alternative PR (which would not imply any approval of that PR's
direction).
[1]: https://github.com/openSUSE/umoci/pull/190#issuecomment-337742882
Sorry for the delays. I just re-read the spec and you're completely right that this is supported (though I anticipate that the reason for this wording is to allow for the "external artifact" feature that ACI has). I also now see why @wking was talking so much about

I will re-review this pull request next week (please note that I have some assignments and exams coming up, so I'll have less time to work than usual, so apologies upfront for any delays).
This commit hard-codes the blobs/{algorithm}/{encoded} template [1], but sets the stage for future work to relax that positioning [2].

I'm adding a PutIndex call in the tests, because the CAS implementation now has its own temp directory which is not known to the dirEngine. Casengine's dir implementation does not use .umoci-* temporary directories (it uses .casengine-* temporary directories), so it's protected from Clean. And the .casengine-* implementation does not currently provide its own Clean() implementation, although I may add that in the future.

The "Deprecated:" syntax is discussed in [3,4,5].

Also adjust Close() to return the first error it encounters, but to continue to optimistically attempt the remaining cleanup, logging any subsequent errors.

[1]: https://github.com/opencontainers/image-spec/blob/v1.0.0/image-layout.md#blobs
[2]: https://github.com/openSUSE/umoci/pull/190
[3]: https://blog.golang.org/godoc-documenting-go-code
[4]: golang/blog@257114a
[5]: golang/go#10909

Signed-off-by: W. Trevor King <wking@tremily.us>
With this change, users can configure their blob storage once at init time with an optional --blob-uri. Most other commands (which do not need path -> blob conversion) can load the blob location from the oci-layout file (the 1.1.0 format is in flight with [1,2]). The only other user-facing change is to 'umoci gc', which gains a --digest-regexp. Folks who customized their blob URI will need to supply --digest-regexp to reverse whichever blob URI they're using.

This seems like a more convenient interface to me than requiring all callers to provide the custom blob location [3]. And it is more powerful as well, allowing users to shard their blob storage [4], etc. if they feel moved to do so.

[1]: xiekeyang/oci-discovery#20
[2]: https://github.com/wking/image-spec/blob/ref-engine-discovery-layout/image-layout.md
[3]: https://github.com/openSUSE/umoci/pull/190
[4]: opencontainers/image-spec#449

Signed-off-by: W. Trevor King <wking@tremily.us>
This commit hard-codes the blobs/{algorithm}/{encoded} template [1], but sets the stage for future work to relax that positioning [2].

I'm adding a PutIndex call in the tests, because the CAS implementation now has its own temp directory which is not known to the dirEngine. Casengine's dir implementation does not use .umoci-* temporary directories (it uses .casengine-* temporary directories), so it's protected from Clean. And the .casengine-* implementation does not currently provide its own Clean() implementation, although I may add that in the future.

The "Deprecated:" syntax is discussed in [3,4,5].

Also adjust Close() to return the first error it encounters, but to continue to optimistically attempt the remaining cleanup, logging any subsequent errors.

Bumping go-mtree pulls in [6] and gives us a lowercase sirupsen import that is compatible with oci-discovery and casengine.

[1]: https://github.com/opencontainers/image-spec/blob/v1.0.0/image-layout.md#blobs
[2]: https://github.com/openSUSE/umoci/pull/190
[3]: https://blog.golang.org/godoc-documenting-go-code
[4]: golang/blog@257114a
[5]: golang/go#10909
[6]: vbatts/go-mtree#144

Signed-off-by: W. Trevor King <wking@tremily.us>
umoci currently always expects blobs to live within the image layout it's working on, in the aptly-named "blobs" subdirectory. However, the OCI specification [1] permits this directory to be empty and for implementations to look in other locations for referenced blobs. This is particularly useful when working with multiple image layouts and wanting to share blobs between them.

This patch adds a `--shared-cas` flag to all umoci commands that work with image layouts. The flag expects a directory to be passed, and if set, this directory will be used for all blob operations.

In the first implementation, we extend the `dirEngine` implementation of the CAS interface to accept another (optional) `sharedCasPath` field. Analogously to the CLI flag, if this field is supplied, it will be used internally by the dirEngine for all blob-related operations.

[1]: https://github.com/opencontainers/image-spec/blob/7c889fafd04a893f5c5f50b7ab9963d5d64e5242/image-layout.md#blobs

Signed-off-by: Jonathan Boulle <jonathanboulle@gmail.com>
Signed-off-by: Jonathan Boulle <jonathanboulle@gmail.com>
Signed-off-by: Jonathan Boulle <jonathanboulle@gmail.com>
This is fairly stale by now, closing. I'd be happy to re-discuss this issue again if it's still an issue you're having with