layout: sharding the blob store #449
Signed-off-by: Aleksa Sarai <asarai@suse.com>
On Sat, Nov 05, 2016 at 04:50:36PM -0700, Aleksa Sarai wrote:
That may be me ;). I'd rather phrase this as “I'd like the whole Whether a particular implementation of that API (e.g. image-layout)
This has come up before in #94 and #208, with the bulk of the |
Agreed this is a dupe of #208. |
@cyphar I guess in particular #208 (comment) challenges the premise of this issue |
This does not seem to be a dupe of #208. Even though a pull operation should never call Also, there can be third-party tools (e.g. malware scanners, backup tools) that are not aware of the OCI manifest and hence result in calling Can we reconsider this issue? |
Since the layout of Some of my ideas and pros/cons:
My preference is 1. Also, we would need to define a new field for the list of supported blob layouts in the e.g. {
  "imageLayoutVersion": "42.0.0",
  "supportedBlobLayouts": [ // if empty, "v1compat" is implicitly selected
    "v1compat",
    "sharded"
    // there can be other layouts that are specific to the distribution protocol? (e.g. "ipfs")
  ]
} |
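As a rough sketch of how a reader might consume the field proposed above: the `supportedBlobLayouts` name and the implicit "v1compat" default come from the comment, while the function name `pick_blob_layout` and the "first layout the reader understands" policy are assumptions made only for illustration. The input here is strict JSON, so the `//` comments from the example would have to be dropped.

```python
import json
from typing import Sequence

# Default named in the proposal: an empty or missing list means "v1compat".
DEFAULT_LAYOUT = "v1compat"


def pick_blob_layout(oci_layout_json: str, understood: Sequence[str]) -> str:
    """Return the first layout in `understood` that the image declares.

    `understood` is the reader's own preference order (hypothetical policy).
    """
    doc = json.loads(oci_layout_json)
    supported = doc.get("supportedBlobLayouts") or [DEFAULT_LAYOUT]
    for name in understood:
        if name in supported:
            return name
    raise ValueError(f"no common blob layout; image supports {supported}")


if __name__ == "__main__":
    example = """{
      "imageLayoutVersion": "42.0.0",
      "supportedBlobLayouts": ["v1compat", "sharded"]
    }"""
    print(pick_blob_layout(example, ["sharded", "v1compat"]))
```

A reader that only understands "v1compat" would still work against such an image, which is the compatibility property the proposal seems to be after.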
For folks who want to diverge as little as possible from things already in image-spec. Downsides to this approach include:

* Non-sharded blobs [1], although it's not clear to me that modern filesystems suffer from having many entries in one directory [2].
* Possible duplicate blobs between two layouts. You can address this with symlinks or similar, but you'd need extra tooling to do that. With a single CAS bucket, there's only one place that the blob could be, so deduping is free (but garbage collection becomes more complicated).

[1]: opencontainers/image-spec#449
[2]: opencontainers/image-spec#94 (comment)
Bump the layout to v1.1 to support this. This makes it possible to distribute layouts that use other protocols, for example new ref-engine protocols or a sharded blob store [1]. You can also reference external ref- and CAS-engines, although obviously the utility of such depends on the availability of those external engines.

[1]: opencontainers#449

Signed-off-by: W. Trevor King <wking@tremily.us>
With this change, users can configure their blob storage once at init time with an optional --blob-uri. Most other commands (which do not need path -> blob conversion) can load the blob location from the oci-layout layout file (the 1.1.0 format is in flight with [1,2]).

The only other user-facing change is to 'umoci gc', which gains a --digest-regexp. Folks who customized their blob URI will need to supply --digest-regexp to reverse whichever blob URI they're using.

This seems like a more convenient interface to me than requiring all callers to provide the custom blob location [3]. And it is more powerful as well, allowing users to shard their blob storage [4], etc. if they feel moved to do so.

[1]: xiekeyang/oci-discovery#20
[2]: https://github.com/wking/image-spec/blob/ref-engine-discovery-layout/image-layout.md
[3]: https://github.com/openSUSE/umoci/pull/190
[4]: opencontainers/image-spec#449

Signed-off-by: W. Trevor King <wking@tremily.us>
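To illustrate why garbage collection needs a way to "reverse" a custom blob location, here is a minimal sketch of a forward mapping and a matching regular expression. Everything in it is hypothetical: the two-character sharding scheme, the `DIGEST_REGEXP` pattern, and the function names are assumptions for illustration, and none of it reflects the actual syntax umoci accepts for --blob-uri or --digest-regexp.

```python
import re
from typing import Optional


def blob_path(digest: str) -> str:
    """Forward direction: digest -> path, using an assumed sharded scheme."""
    algorithm, encoded = digest.split(":", 1)
    return f"blobs/{algorithm}/{encoded[:2]}/{encoded[2:]}"


# Reverse direction: a pattern that recovers the digest parts from a path.
# This plays the role the commit message gives to a digest regexp.
DIGEST_REGEXP = re.compile(
    r"^blobs/(?P<algorithm>[a-z0-9]+)/(?P<prefix>[a-f0-9]{2})/(?P<suffix>[a-f0-9]+)$"
)


def digest_from_path(path: str) -> Optional[str]:
    """Return the digest stored at `path`, or None if the path does not match."""
    match = DIGEST_REGEXP.match(path)
    if match is None:
        return None
    return f"{match['algorithm']}:{match['prefix']}{match['suffix']}"


if __name__ == "__main__":
    digest = "sha256:" + "0a" * 32
    path = blob_path(digest)
    print(path, "->", digest_from_path(path))
```

A collector walking the store can only compare on-disk files against the set of referenced digests if the reverse mapping exists, which is presumably why the regexp has to be supplied alongside a custom location.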
One issue that I'm quite worried about is the performance impact of having too many blobs inside an OCI image. Now, practically speaking I would be surprised if `n > 20` in most cases, but some people have expressed that they would like to have the entire universe bottled into an OCI image. I will refrain from commenting on how good of an idea I think that is, but if it's going to be a "valid usecase" then we should reconsider how we've organised the blob directory.

Namely, the current method of `blobs/<algo>/<digest>` will cause problems if the number of digests becomes quite large, due to implementation issues of filesystems. Essentially all filesystems are not designed to handle accesses of directories with many files well. If you look at how `git`, `camlistore` and many other such projects implement their blob storage it looks more like `blobs/<algo>/<prefix>/<suffix>` (or in `camlistore`'s case, three sets of `<prefix>/`).

Naturally this would be a backwards incompatible change (you can't really implement this scheme as well as retaining the old one because then you have an exponential number of ways to read the same blob data, almost certainly leading to countless implementation bugs). So we should probably consider this for post-1.0.0.
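The difference between the two schemes can be sketched in a few lines. This is only an illustration under stated assumptions: the two-character prefix length mirrors what git does for loose objects, and the function names and the `levels` parameter are made up here, not taken from image-spec.

```python
from pathlib import PurePosixPath


def flat_blob_path(digest: str) -> PurePosixPath:
    """Current layout: blobs/<algo>/<digest>."""
    algo, encoded = digest.split(":", 1)
    return PurePosixPath("blobs", algo, encoded)


def sharded_blob_path(digest: str, prefix_len: int = 2, levels: int = 1) -> PurePosixPath:
    """Hypothetical sharded layout: blobs/<algo>/<prefix>/.../<suffix>.

    `levels` is the number of <prefix>/ directories (1 resembles git,
    3 resembles the camlistore case mentioned above).
    """
    algo, encoded = digest.split(":", 1)
    if len(encoded) <= prefix_len * levels:
        raise ValueError("digest too short to shard")
    prefixes = [encoded[i * prefix_len:(i + 1) * prefix_len] for i in range(levels)]
    suffix = encoded[prefix_len * levels:]
    return PurePosixPath("blobs", algo, *prefixes, suffix)


if __name__ == "__main__":
    digest = "sha256:abcd" + "e" * 60
    print(flat_blob_path(digest))
    print(sharded_blob_path(digest))
    print(sharded_blob_path(digest, levels=3))
```

With hex digests and a two-character prefix, each shard level fans out into at most 256 directories, so any single directory holds far fewer entries than the flat layout would for the same number of blobs.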