Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DirectoryTree download support #358

Merged
merged 6 commits into from
May 11, 2023

Conversation

mandelsoft
Copy link
Contributor

@mandelsoft mandelsoft commented May 6, 2023

What this PR does / why we need it:

The standard configuration now provides a downloader for resources of type directoryTreeand the legacy type filesystem.

It acts on the mimetypes for an artifact set (application/vnd.oci.image.manifest.v1+tar+gzip) and a tar/tgz archive (application/x-tar, application/x-tar+gzip, application/x-tgz). The default configuration extracts the content to a
filesystem folder. If the blob format is an artifact set, for example provided by the access method ociArtifact,
the default configuration accepts the image config mimetype (application/vnd.oci.image.config.v1+json).
In this case the final filesystem content provided by the image is download by evaluating the layered image file system.

The default behaviour can just be used with

import "github.com/open-component-model/ocm/pkg/contexts/ocm/download"
download.For(octx).Download(printer,rsourceAccess,targetdir,vfs)

If the resource access describes an appropriate artifact, the new handler is automatically selected.

As usual, the target is always a virtual filesystem.

Like all download handlers. the dirtree download handler can also explicitly be used with

import "github.com/open-component-model/ocm/pkg/contexts/ocm/download/handlers/dirtree"
dirtree.New().Download(...)

In this mode, the behaviour can be influenced by specifying any list of accepted OCI artifact config mime types.

With New(mimetypes ...string).SetArchiveMode(true) it is possible to enable the archive target mode. The content is downloaded to an archive instead of extracted filesystem content.

The handler checks the mimetypes, only, but the default registration is done exclusively for the directory content resource types.
A context can be extended for other resource types with

download.For(octx),Register(type, [mimetype], dirtree.New...(...))

The handler also provides additional methods, which can be used to execute more specific tasks, for example
methods for an optimized content access providing an internal virtual filesystem or an archive byte stream trying to avoid unnecessary conversions depending on the actual input format,

The localization package has been adapted, accordingly. The localize.Instantiate method now prefers to use the
download handlers to get to the filesystem content to be configured, instead of expecting an archive resource.
Therefore, any (potentially own) resource type with any format can be used, as long as there is an appropriate downloader configured for the used OCM context. An optional additional parameter can be used to restrict the accepted resource types
to an explicitly given set of types.

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

Release Notes:

jensh007
jensh007 previously approved these changes May 10, 2023
@phoban01
Copy link
Contributor

phoban01 commented May 10, 2023

With the following component:

NAME      VERSION IDENTITY TYPE          RELATION
config    1.0.0            PlainText     local
image     6.3.5            ociImage      external
manifests 1.0.0            directoryTree local

I find that I have to use dirtree package in order for the resource to be extracted correctly into the filesystem:

memfs := memoryfs.New()

_, _, err = dirtree.New().Download(common.NewPrinter(io.Discard), res, "", memfs)
if err != nil {	
  log.Fatal(err)
}

vfs.Walk(memfs, "", func(path string, info fs.FileInfo, err error) error {
	fmt.Println(info.Name(), info.IsDir())
	return nil
}}

From reading the PR I believe it should be possible to simply execute download.ForContext(...

@mandelsoft
Copy link
Contributor Author

@phoban01 If you use dirtree.New(...).Download(...) you explicitly use the dirtree downloader and nothing else.
It only checks the mimetypes, but not the resource types. So, you can enforce to use it on resources, regardless of their type to download dirtree-like resources (with a matching mime type).

If you use download.For(...).Download(...) it tries to find a registered downloader with registration criterias matching the actual resource. This can be used without bothering with the kind of actually used resource (to just download it, whatever it is in a standard manner). If a matching downloader is found, it is used, otherwise just the blob is downloaded as provided by the access method. Here, for sure the dirtree downloader is used for the standard scenarios (it is registered for). This is especially the directoryTree resource type with the tar-like mimetypes and the oci artifact archive mime type. But it is not used for other scenarios.

So, if you want to use an own resource type (directly expressing the dedicated new meaning of a general filesystem content), for example gitOpsTemplate, which is more expressive than just directoryTree. you can

  • either register the dirtree handler in advance for your OCM context and for this resource type at the registry (then it would automatically be chosen for all downloads using this context)
  • or you know what you are doing, and explicitly call the dirtree downloader on such a resource.

If, for example gitOpsTemplate should be a standard resource type, we should add such a registration as part of the standard.

Another possible scenario, where you might want to use the explicit dirtree usage is to overwrite the standard behaviour for a special use case. For example, an OCI image is typically downloaded as OCI artifact with the distribution spec format. But. if you want to access the effective filesystem, you could explicitly use the dirtree downloader for an OCI image, which handles this for you. It would make less sense to use it on a helm chart OCI artifact, because here the layers have a different meaning than building a directory tree.

@phoban01
Copy link
Contributor

Thanks for the explanation @mandelsoft. My assumption was that handlers for built-in OCM types would be registered by default but it seems that I need to explicitly register handlers which is fine.

@mandelsoft
Copy link
Contributor Author

mandelsoft commented May 11, 2023

Yes, for builtin types they should already be registered. So far for dirtree-like types, this is only directoryTree and the legacy type filesystem. Do we have other matching standard types?

@phoban01: What about gitOpsTemplate.Should we officially introduce this type? Then I'll add it to the default registration.

@phoban01
Copy link
Contributor

I don't personally ever use this type gitOpsTemplate so I don't think it needs to be added. There are other types and formats which we should start considering (for example web assembly) but I think that's a separate discussion :smile.

@mandelsoft
Copy link
Contributor Author

mandelsoft commented May 11, 2023

@phoban01 , may be you could approve this PR, It just added the explanation to the docu of the dirtree package and rebased to the fixed validation job on main.

@mandelsoft mandelsoft merged commit 8b33f96 into open-component-model:main May 11, 2023
robertwol pushed a commit that referenced this pull request Sep 25, 2023
* oci image dirtree downloader

* cleanup reader handling for artifactset

* archive downloader for dirtree

* adapt localization to dirtree downloader

* get rid of os package when working with vfs

* add dirtree downloader docu
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants