remove redundant intermediate images in build --oci flow #2308

Closed
preminger opened this issue Nov 3, 2023 · 3 comments · Fixed by #2936

@preminger
Contributor

Currently, building an OCI-SIF using buildkit first builds a tar file of the OCI image, and then converts that tar to an OCI-SIF by treating it as the source for an oci-archive: URI.

We should see if we can pipe the output of buildkit straight into the OCI-SIF conversion, to avoid creating the intermediate tar file.

@preminger added the enhancement (New feature or request) label Nov 3, 2023
@dtrudg added this to the SingularityCE 4.2.0 milestone Apr 26, 2024
@dtrudg self-assigned this May 20, 2024
@dtrudg
Member

dtrudg commented May 20, 2024

This is somewhat difficult...

To convert the OCI tar to OCI-SIF we first ensure we have the image in an OCI layout, from which it's easy to perform mutations etc.

We 'fetch' the tar file into a layout using ggcr. Within the ggcr code the tar will be opened multiple times to retrieve different files. It's not possible to work on a tar stream that cannot be re-opened from the start.
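The constraint can be illustrated with a minimal sketch. It uses Python's standard `tarfile` module as a stand-in for ggcr (which is Go), and the helper names `make_tar`, `read_reopening`, and `read_streaming` are hypothetical. A reader that re-opens the archive for each file works on a tar file on disk, while a forward-only stream (`mode="r|"`, similar to reading from a pipe) fails as soon as a file behind the current position is requested.

```python
import io
import tarfile


def make_tar(files):
    """Build an uncompressed tar archive in memory from {name: bytes}."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tf:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tf.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_reopening(raw, wanted):
    """Re-open the archive from the start for every requested file.

    This mirrors a reader that opens the tar multiple times, which
    needs a source that can be re-read from the beginning.
    """
    out = {}
    for name in wanted:
        with tarfile.open(fileobj=io.BytesIO(raw), mode="r:") as tf:
            out[name] = tf.extractfile(name).read()
    return out


def read_streaming(raw, wanted):
    """Try the same lookups on a forward-only stream (mode 'r|').

    Listing the members consumes the stream, so reading any file
    afterwards would require seeking backwards, which is not allowed.
    """
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r|") as tf:
        members = {m.name: m for m in tf}  # advances to end of stream
        return {name: tf.extractfile(members[name]).read() for name in wanted}


if __name__ == "__main__":
    raw = make_tar({"manifest.json": b"[]", "layer.tar": b"layer-bytes"})
    print(read_reopening(raw, ["layer.tar", "manifest.json"]))
    try:
        read_streaming(raw, ["manifest.json"])
    except tarfile.TarError as err:
        print("stream read failed:", err)
```

The file names in the toy archive are placeholders; only the access pattern matters here.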

We need to directly extract the tar into the layout to bypass the tar temp file.

@dtrudg
Member

dtrudg commented May 20, 2024

It's not that difficult... the buildkit tar is an OCI archive rather than a legacy Docker archive these days, so we can simply extract it and we have a layout.
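As a rough sketch of "extract it and we have a layout", the following unpacks a toy OCI archive into a directory and checks for the files the OCI image layout requires (`oci-layout`, `index.json`, and a `blobs/` directory). It is written in Python with the standard library only; the function names and the toy blob name are assumptions for illustration, not SingularityCE code.

```python
import io
import json
import os
import tarfile
import tempfile


def make_tar(files):
    """Build an uncompressed tar archive in memory from {name: bytes}."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tf:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tf.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def extract_to_layout(raw, dest):
    """Unpack the archive bytes into dest, producing a layout directory."""
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r:") as tf:
        # Use the safer 'data' extraction filter where this Python has it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tf.extractall(dest, **kwargs)


def looks_like_oci_layout(path):
    """Check for the marker file, the index, and the blobs directory."""
    try:
        with open(os.path.join(path, "oci-layout")) as f:
            marker = json.load(f)
        with open(os.path.join(path, "index.json")) as f:
            index = json.load(f)
    except (OSError, ValueError):
        return False
    return (
        isinstance(marker, dict)
        and "imageLayoutVersion" in marker
        and isinstance(index, dict)
        and isinstance(index.get("manifests"), list)
        and os.path.isdir(os.path.join(path, "blobs"))
    )


if __name__ == "__main__":
    raw = make_tar({
        "oci-layout": b'{"imageLayoutVersion": "1.0.0"}',
        "index.json": b'{"schemaVersion": 2, "manifests": []}',
        "blobs/sha256/abc": b"toy blob",  # placeholder, not a real digest
    })
    with tempfile.TemporaryDirectory() as dest:
        extract_to_layout(raw, dest)
        print(looks_like_oci_layout(dest))
```

Note that this approach still writes a full copy of the image to disk, which is the cost the later comments aim to avoid.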

@dtrudg
Member

dtrudg commented May 20, 2024

It appears that, after #2935, the better approach is to keep the tarball but read from it without extracting to a layout directory. This can be achieved by treating it as a docker-archive rather than an oci-archive.

The buildkit tar is an OCI layout, but with the additional docker-specific manifest.json, so ggcr can read from it directly as a tarball.
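To make the distinction concrete, here is a small heuristic sketch in Python (standard library only). It is not how ggcr decides anything: the function names are hypothetical, and the toy `manifest.json` content, including the `blobs/sha256/...` paths, is an assumption about what such a file might contain. It shows that a tarball carrying both the OCI layout files and a Docker-style `manifest.json` can be read directly, with no extraction to a directory.

```python
import io
import json
import tarfile


def make_tar(files):
    """Build an uncompressed tar archive in memory from {name: bytes}."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tf:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tf.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def archive_kinds(raw):
    """Return the archive conventions a tarball appears to satisfy."""
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r:") as tf:
        names = set(tf.getnames())
    kinds = set()
    if {"oci-layout", "index.json"} <= names:
        kinds.add("oci-archive")
    if "manifest.json" in names:
        kinds.add("docker-archive")
    return kinds


def docker_manifest(raw):
    """Parse manifest.json straight from the tarball, without extracting."""
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r:") as tf:
        return json.load(tf.extractfile("manifest.json"))


if __name__ == "__main__":
    # Toy stand-in for a buildkit output tarball (contents are assumed).
    manifest = [{
        "Config": "blobs/sha256/cfg",
        "RepoTags": None,
        "Layers": ["blobs/sha256/layer1"],
    }]
    raw = make_tar({
        "oci-layout": b'{"imageLayoutVersion": "1.0.0"}',
        "index.json": b'{"schemaVersion": 2, "manifests": []}',
        "manifest.json": json.dumps(manifest).encode(),
    })
    print(sorted(archive_kinds(raw)))
    print(docker_manifest(raw)[0]["Layers"])
```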

@dtrudg changed the title from "remove intermediate tar creation in build --oci flow" to "remove redundant intermediate images in build --oci flow" May 20, 2024
dtrudg added a commit to dtrudg/singularity that referenced this issue May 20, 2024
With sylabs#2935 in place we can now translate a Docker archive tarball into
an OCI-SIF without creating a redundant copy of the image as a temporary
OCI layout.

We've previously been processing the buildkit tarball as an oci-archive,
which still requires extraction to a temporary layout, as ggcr's tarball
package doesn't accept an archived layout without a manifest.json.

Because the buildkit output tarball includes the Docker specific
manifest.json, in addition to the standard OCI layout files, it can be
treated as a docker-archive - now avoiding the need for the temporary
layout.

Significantly reduces disk space needed by `build --oci`, and the
associated I/O.

Fixes sylabs#2308
dtrudg added a commit to dtrudg/singularity that referenced this issue May 21, 2024
cyanezstange pushed a commit to cyanezstange/singularity that referenced this issue Jun 4, 2024
cyanezstange pushed a commit to cyanezstange/singularity that referenced this issue Jun 20, 2024