Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explode flattened buildpacks when pack build #1787

Merged
merged 39 commits into from
Jul 10, 2023
Merged

Conversation

jjbustamante
Copy link
Member

@jjbustamante jjbustamante commented May 30, 2023

Summary

After the pack 0.30.0-pre2 was released some users reported some troubles when trying to use the new flatten flag added to the pack builder create and pack buildpack package commands. See slack thread

  1. When pack build app --buildpack was used with a flattened buildpack on a GHA runner it was causing the runner to run out of disk space
  2. When pack build app --buildpack was used with a flattened buildpack that is also present into the builder we want to use, pack was not able to deduplicate those buildpacks correctly.

This PR is trying to fix both issues.

Output

Before

  • Issue 1, when a buildpack is flattened we are saving in the same layer N number of buildpacks, during pack build an ephemeral builder is created, in this process we read each buildpack layer tar to disk in a temp folder and then create this builder (that is deleted at the end of the command execution) adding those tars to it. Because each flattened layer contains more than 1 buildpack, the iteration over the layers creates a file on disk with all the flattened buildpacks. that is the reason for the process to consume more hard drive space.

  • Issue 2, when pack is creating the ephemeral builder and some buildpacks are pass through -b flag, pack calculates the hash of the tar file inside the provided buildpack and then compares this value against the hash of the same buildpack in the builder, because the provided buildpack is flattened (there are more than 1 buildpack in the layer) we need to split those buildpacks into individual ones.

After

  • To fix issue 1: This PR adds some logic to keep track of the layers DiffIds when a buildpack is fetched, if the same layer DiffID is found in more than 1 buildpack then only 1 of them returns the content of the layer tar (with all the other buildpacks inside of it) and the other returns an empty tar. In this way we only use the same hard drive space when the layer tar is written into disk.

  • To fix issue 2: This PR adds some logic when a flattened buildpack is fetched, Pack will try to write different tar files for each buildpack flattened inside a layer and after that the calculation of the hash to execute the deduplication logic remains the same.

Documentation

  • Should this change be documented?
    • Yes, see #___
    • No

Related

Resolves #1798

@github-actions github-actions bot added this to the 0.30.0 milestone May 30, 2023
@github-actions github-actions bot added type/chore Issue that requests non-user facing changes. type/enhancement Issue that requests a new feature or improvement. labels May 30, 2023
pkg/buildpack/buildpack.go Fixed Show fixed Hide fixed
When `pack build --buildpack <flattened buildpack>` is invoked,
`pack` doesn't know the buildpack is flattened, so it treats it like
`pack build --buildpack bp/1 --buildpack bp2`,
resulting in unnecessary tar files being created
and duplicate layers being added to the ephemeral builder,
which can exhaust disk space depending on the environment.

This optimization avoids writing the tar file more than once
by returning a no-op read closer
when the diffID has already been queued for downloading.

Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Juan Bustamante <jbustamante@vmware.com>
Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Juan Bustamante <jbustamante@vmware.com>
@jjbustamante jjbustamante force-pushed the fix/flatten-tar-extras-2 branch 2 times, most recently from 3cb9f5a to 6f76a88 Compare June 7, 2023 20:27
@jjbustamante jjbustamante added type/bug Issue that reports an unexpected behaviour. and removed type/enhancement Issue that requests a new feature or improvement. labels Jun 7, 2023
@github-actions github-actions bot added the type/enhancement Issue that requests a new feature or improvement. label Jun 8, 2023
@jjbustamante jjbustamante force-pushed the fix/flatten-tar-extras-2 branch 2 times, most recently from 97e1fda to c8041e7 Compare June 9, 2023 16:24
@jjbustamante jjbustamante force-pushed the fix/flatten-tar-extras-2 branch 2 times, most recently from 285b0dc to da011d3 Compare June 13, 2023 18:42
@jjbustamante jjbustamante added flatten feature and removed type/chore Issue that requests non-user facing changes. labels Jun 13, 2023
@github-actions github-actions bot added the type/chore Issue that requests non-user facing changes. label Jun 14, 2023
pkg/buildpack/buildpack.go Fixed Show resolved Hide resolved
@jjbustamante jjbustamante force-pushed the fix/flatten-tar-extras-2 branch 2 times, most recently from 56fc8bf to d4441ca Compare June 15, 2023 02:03
calculate which buildpacks must be added into the ephemeral builder

Skipping pack build -b <flattened-buildpack> in windows container
because acceptance tests doesn't support it yet

Signed-off-by: Juan Bustamante <jbustamante@vmware.com>
Signed-off-by: Juan Bustamante <jbustamante@vmware.com>
Signed-off-by: Juan Bustamante <jbustamante@vmware.com>
natalieparellano and others added 3 commits June 26, 2023 12:59
Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Juan Bustamante <jbustamante@vmware.com>
internal/builder/builder.go Outdated Show resolved Hide resolved
pkg/buildpack/buildpack.go Fixed Show resolved Hide resolved
pkg/buildpack/buildpack.go Outdated Show resolved Hide resolved
pkg/buildpack/buildpack.go Outdated Show resolved Hide resolved
pkg/buildpack/buildpack.go Outdated Show resolved Hide resolved
Signed-off-by: Natalie Arellano <narellano@vmware.com>
natalieparellano and others added 7 commits June 27, 2023 13:17
Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Juan Bustamante <jbustamante@vmware.com>
// write the rest of the package
var header *tar.Header
for {
header, err = tr.Next()

Check failure

Code scanning / CodeQL

Arbitrary file write during zip extraction ("zip slip")

Unsanitized archive entry, which may contain '..', is used in a [file system operation](1).
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

handled at line 435

if origVersion == "" {
// the first header only contains the id - e.g., /cnb/buildpacks/<buildpack-id>,
// read the next header to get the version
secondHeader, err := tr.Next()

Check failure

Code scanning / CodeQL

Arbitrary file write during zip extraction ("zip slip")

Unsanitized archive entry, which may contain '..', is used in a [file system operation](1).
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

handled at line 397

)

for {
header, err = tr.Next()

Check failure

Code scanning / CodeQL

Arbitrary file write during zip extraction ("zip slip")

Unsanitized archive entry, which may contain '..', is used in a [file system operation](1).
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handled at line 353

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we don't need to do it as part of this PR (because we use tar readers all over) but it would be cleaner to wrap tar.NewReader with our own reader that sanitizes the path as part of calling tr.Next()

Signed-off-by: Juan Bustamante <jbustamante@vmware.com>
Signed-off-by: Juan Bustamante <jbustamante@vmware.com>
Signed-off-by: Juan Bustamante <jbustamante@vmware.com>
Signed-off-by: Juan Bustamante <jbustamante@vmware.com>
@jjbustamante
Copy link
Member Author

@natalieparellano @joe-kimmel-vmw @jkutner I will love if you can review the PR again

Copy link
Member

@natalieparellano natalieparellano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me :) Thanks for all of your hard work in getting this over the finish line @jjbustamante

@joe-kimmel-vmw
Copy link
Contributor

IMO this is in good shape. I did make one tiny suggestion in this PR (set to merge into this branch): https://github.com/buildpacks/pack/pull/1829/files

@jkutner jkutner merged commit 34a6b7c into main Jul 10, 2023
@jkutner jkutner deleted the fix/flatten-tar-extras-2 branch July 10, 2023 00:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flatten feature type/bug Issue that reports an unexpected behaviour. type/chore Issue that requests non-user facing changes. type/enhancement Issue that requests a new feature or improvement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fixups for flatten feature on pack 0.30.0
4 participants