Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix snapshotter ignore list; do not attempt to delete whiteouts of ignored paths #1652

Merged

Conversation

kamaln7
Copy link
Contributor

@kamaln7 kamaln7 commented May 18, 2021

Fixes #1626

Description

With a Dockerfile like:

FROM ubuntu:18.04
RUN apt-get update && apt-get install -y sudo && rm -rf /var/lib/apt/lists/*

and --use-new-run, Kaniko would build a faulty cache image that adds a whiteout for /dev. This results in any builds that attempt to reuse the cache image to try and delete /dev, which obviously does not work and so fails the whole build:

DEBU[0007] Whiting out /.wh.dev                         
error building image: error building stage: failed to execute command: extracting fs from image: removing whiteout .wh.dev: unlinkat //dev/console: device or resource busy   

Change 1: Add a safe-guard to GetFSFromLayers that prevents Kaniko from deleting whiteouts of ignored paths

This is more of a bandaid in that it prevents bad cache images from failing builds entirely. With this change, bad cache images can still be used as Kaniko no longer tries to delete /dev even though there is a whiteout for it.

Change 2: Fix the snapshotter ignore list not including filesystem mounts

This is the actual root issue and the change stops Kaniko from erroneously adding a whiteout for /dev (or other similar paths). The apt-get command in the Dockerfile adds the following symlink:

symlink /lib/systemd/system/sudo.service -> /dev/null

When building the list of changed files, this symlink is resolved and so /dev/null is added to the list (which causes the parent dir /dev to be added as well). There is an ignore list check in the snapshot code, but it turns out that the ignore list that it gets doesn't include filesystem mounts. This is because the snapshotter is created before DetectFilesystemIgnoreList is run, and snapshotter takes a copy of the ignore list.

To fix this, I changed up the order of operations a bit and made sure that the ignore list is fully initialized as early as possible (in newStageBuilder). I also cleaned up the ignore list code a bit.

Before/After

docker run -ti --rm \
          -v "$PWD:/workspace" \
          -v docker-config.json:/kaniko/.docker/config.json:ro \
          kaniko \
          --force \
          --dockerfile=Dockerfile \
          --cache \
          --use-new-run=1 \
          --snapshotMode redo \
          --destination=image -v debug

Before:

INFO[0003] RUN apt-get update && apt-get install -y sudo && rm -rf /var/lib/apt/lists/*
DEBU[0003] using new RunMarker command
INFO[0003] cmd: /bin/sh
INFO[0003] args: [-c apt-get update && apt-get install -y sudo && rm -rf /var/lib/apt/lists/*]
INFO[0003] Running: [/bin/sh -c apt-get update && apt-get install -y sudo && rm -rf /var/lib/apt/lists/*]
Get:1 http://security.ubuntu.com/ubuntu bionic-security InRelease [88.7 kB]
Get:2 http://security.ubuntu.com/ubuntu bionic-security/main amd64 Packages [2150 kB]
....
Unpacking sudo (1.8.21p2-3ubuntu1.4) ...
Setting up sudo (1.8.21p2-3ubuntu1.4) ...
DEBU[0008] files changed [/tmp /usr/share/apport/package-hooks /usr/share/apport/package-hooks/source_sudo.py /usr/share/doc /usr/share/doc/sudo /usr/share/doc/sudo/changelog.Debian.gz /usr/share/doc/sudo/copyright /usr/share/doc/sudo/examples /usr/share/lintian/overrides /usr/share/lintian/overrides/sudo /usr/include /usr/include/sudo_plugin.h /usr/lib /usr/lib/sudo /usr/lib/sudo/libsudo_util.so.0 /usr/lib/sudo/libsudo_util.la /usr/lib/sudo/group_file.so /usr/lib/sudo/libsudo_util.so.0.0.0 /usr/lib/sudo/sudoers.so /usr/lib/sudo/sudoers.la /usr/lib/sudo/system_group.so /usr/lib/sudo/group_file.la /usr/lib/sudo/libsudo_util.so /usr/lib/sudo/sesh /usr/lib/sudo/sudo_noexec.so /usr/lib/sudo/system_group.la /usr/lib/sudo/sudo_noexec.la /usr/lib/tmpfiles.d /usr/lib/tmpfiles.d/sudo.conf /usr/sbin /usr/sbin/visudo /usr/bin /usr/bin/sudo /usr/bin/sudoedit /usr/bin/sudoreplay /etc /etc/sudoers.d /etc/sudoers.d/README /etc/pam.d /etc/pam.d/sudo /etc/sudoers /var/cache/apt/archives /var/cache/apt/archives/partial /var/log/apt /var/log/apt/eipp.log.xz /var/log/apt/term.log /var/log/apt/history.log /var/log/dpkg.log /var/lib /var/lib/sudo /var/lib/sudo/lectured /var/lib/dpkg /var/lib/dpkg/status /var/lib/dpkg/triggers/Lock /var/lib/dpkg/info /var/lib/dpkg/info/sudo.md5sums /var/lib/dpkg/info/sudo.prerm /var/lib/dpkg/info/sudo.preinst /var/lib/dpkg/info/sudo.postrm /var/lib/dpkg/info/sudo.list /var/lib/dpkg/info/sudo.postinst /var/lib/dpkg/info/sudo.conffiles /var/lib/dpkg/lock /var/lib/dpkg/status-old /var/lib/dpkg/updates /var/lib/apt /var/lib/apt/lists /var/lib/apt/extended_states /lib/systemd/system /lib/systemd/system/sudo.service]
INFO[0008] Taking snapshot of files...
DEBU[0008] Taking snapshot of files [/var/lib/dpkg/info/sudo.conffiles /tmp /usr/share/doc/sudo/copyright /usr/lib /etc/sudoers.d /var/log/apt/term.log /var/lib/dpkg/info/sudo.md5sums /var/lib/dpkg/info/sudo.list /lib /lib/systemd/system/sudo.service /usr/share/lintian/overrides/sudo /usr/lib/sudo/libsudo_util.so.0.0.0 /usr/lib/sudo/group_file.so /usr/bin /var/cache /usr/share/doc/sudo /usr/lib/sudo/sudoers.la /usr/lib/tmpfiles.d/sudo.conf /var/log /dev/null /lib/systemd /usr/bin/sudo /etc /etc/sudoers.d/README /var/cache/apt /var/cache/apt/archives/partial /var/log/apt/eipp.log.xz /var/lib/dpkg/status-old /usr/bin/sudoreplay //usr/share /usr/share/lintian/overrides /usr/include /usr/include/sudo_plugin.h /usr/lib/sudo/group_file.la /usr/lib/tmpfiles.d /var/cache/apt/archives /var/lib/sudo/lectured /var/lib/dpkg/triggers/Lock /var/lib/dpkg/info/sudo.postrm /var/lib/dpkg/lock /var/lib/apt/extended_states /usr/lib/sudo/system_group.la /var/lib/dpkg /var/lib/dpkg/info/sudo.prerm /usr/share/doc/sudo/examples /usr/lib/sudo/libsudo_util.la /usr/sbin/visudo /var/lib/dpkg/info/sudo.preinst /usr/share/apport/package-hooks /usr/share/apport/package-hooks/source_sudo.py /usr/lib/sudo/libsudo_util.so.0 /usr/lib/sudo/sudo_noexec.so /etc/pam.d/sudo /etc/sudoers /var /lib/systemd/system /usr/share/apport /usr/lib/sudo/libsudo_util.so /dev /usr/lib/sudo /usr/lib/sudo/sesh /var/lib/sudo /var/lib/dpkg/status /var/lib/dpkg/updates /usr/lib/sudo/sudo_noexec.la /etc/pam.d /usr/share/doc /var/log/apt /var/lib/dpkg/triggers /var/log/dpkg.log /var/lib /var/lib/apt/lists /usr/share/doc/sudo/changelog.Debian.gz /usr/share/lintian /var/log/apt/history.log /usr/lib/sudo/sudoers.so /usr/lib/sudo/system_group.so /var/lib/dpkg/info /var/lib/dpkg/info/sudo.postinst /usr /usr/sbin /usr/bin/sudoedit /var/lib/apt]
DEBU[0008] Adding whiteout for /dev

Note that the Taking snapshot of files list contains /dev/null and /dev, and the /dev whiteout.

After:

...
Unpacking sudo (1.8.21p2-3ubuntu1.4) ...
Setting up sudo (1.8.21p2-3ubuntu1.4) ...
DEBU[0009] files changed [/tmp /usr/share/apport/package-hooks /usr/share/apport/package-hooks/source_sudo.py /usr/share/doc /usr/share/doc/sudo /usr/share/doc/sudo/changelog.Debian.gz /usr/share/doc/sudo/copyright /usr/share/doc/sudo/examples /usr/share/lintian/overrides /usr/share/lintian/overrides/sudo /usr/include /usr/include/sudo_plugin.h /usr/lib /usr/lib/sudo /usr/lib/sudo/libsudo_util.so.0 /usr/lib/sudo/libsudo_util.la /usr/lib/sudo/group_file.so /usr/lib/sudo/libsudo_util.so.0.0.0 /usr/lib/sudo/sudoers.so /usr/lib/sudo/sudoers.la /usr/lib/sudo/system_group.so /usr/lib/sudo/group_file.la /usr/lib/sudo/libsudo_util.so /usr/lib/sudo/sesh /usr/lib/sudo/sudo_noexec.so /usr/lib/sudo/system_group.la /usr/lib/sudo/sudo_noexec.la /usr/lib/tmpfiles.d /usr/lib/tmpfiles.d/sudo.conf /usr/sbin /usr/sbin/visudo /usr/bin /usr/bin/sudo /usr/bin/sudoedit /usr/bin/sudoreplay /etc /etc/sudoers.d /etc/sudoers.d/README /etc/pam.d /etc/pam.d/sudo /etc/sudoers /var/cache/apt/archives /var/cache/apt/archives/partial /var/log/apt /var/log/apt/eipp.log.xz /var/log/apt/term.log /var/log/apt/history.log /var/log/dpkg.log /var/lib /var/lib/sudo /var/lib/sudo/lectured /var/lib/dpkg /var/lib/dpkg/status /var/lib/dpkg/triggers/Lock /var/lib/dpkg/info /var/lib/dpkg/info/sudo.md5sums /var/lib/dpkg/info/sudo.prerm /var/lib/dpkg/info/sudo.preinst /var/lib/dpkg/info/sudo.postrm /var/lib/dpkg/info/sudo.list /var/lib/dpkg/info/sudo.postinst /var/lib/dpkg/info/sudo.conffiles /var/lib/dpkg/lock /var/lib/dpkg/status-old /var/lib/dpkg/updates /var/lib/apt /var/lib/apt/lists /var/lib/apt/extended_states /lib/systemd/system /lib/systemd/system/sudo.service]
DEBU[0009] resolved symlink /usr/lib/sudo/libsudo_util.so.0 to /usr/lib/sudo/libsudo_util.so.0.0.0
DEBU[0009] resolved symlink /usr/lib/sudo/libsudo_util.so to /usr/lib/sudo/libsudo_util.so.0.0.0
DEBU[0009] resolved symlink /usr/bin/sudoedit to /usr/bin/sudo
DEBU[0009] resolved symlink /lib/systemd/system/sudo.service to /dev/null
DEBU[0009] path /dev/null is ignored, ignoring it
INFO[0009] Taking snapshot of files...
DEBU[0009] Taking snapshot of files [/usr/lib/sudo/sesh /var/log/apt/eipp.log.xz /var/log/apt/term.log /var/lib/dpkg/info/sudo.conffiles /usr/share/doc/sudo/examples /usr/lib/sudo/libsudo_util.so.0.0.0 /usr/bin/sudo /var/lib/dpkg/info/sudo.list /var/lib/apt/extended_states /var/cache/apt/archives /var/cache/apt/archives/partial / /usr/share/apport /usr/lib/tmpfiles.d /usr/lib/tmpfiles.d/sudo.conf /usr/bin /etc/sudoers /var/log/apt /var/lib /lib/systemd /lib/systemd/system/sudo.service /var/lib/dpkg/info/sudo.prerm /usr/share/apport/package-hooks /usr/lib/sudo/sudo_noexec.so /usr/lib/sudo/system_group.la /usr/bin/sudoreplay /var/log /var/log/apt/history.log /usr/lib/sudo/libsudo_util.so /usr/sbin/visudo /etc/sudoers.d/README /usr/lib/sudo/sudoers.so /var/cache/apt /var/cache /var/lib/sudo /var/lib/dpkg/info/sudo.md5sums /var/lib/dpkg/info/sudo.postrm /var/lib/dpkg/status-old /usr/share/apport/package-hooks/source_sudo.py /usr/share/doc/sudo/changelog.Debian.gz /etc/pam.d /var/lib/dpkg /var/lib/dpkg/updates /var/lib/apt /var/lib/apt/lists /lib/systemd/system /usr/share/lintian/overrides /usr/share/lintian /usr/lib /usr/lib/sudo/libsudo_util.la /usr/sbin /var/log/dpkg.log /lib /etc/pam.d/sudo /var/lib/dpkg/triggers/Lock /usr /usr/share/lintian/overrides/sudo /usr/include /usr/lib/sudo/system_group.so /usr/lib/sudo/group_file.la /etc/sudoers.d /usr/share/doc/sudo /usr/include/sudo_plugin.h /usr/lib/sudo /usr/bin/sudoedit /tmp /usr/lib/sudo/libsudo_util.so.0 /var/lib/dpkg/triggers /var/lib/dpkg/info /usr/share /usr/lib/sudo/sudoers.la /usr/lib/sudo/sudo_noexec.la /var/lib/dpkg/lock /usr/lib/sudo/group_file.so /etc /var /var/lib/dpkg/status /var/lib/dpkg/info/sudo.preinst /var/lib/dpkg/info/sudo.postinst /usr/share/doc /usr/share/doc/sudo/copyright /var/lib/sudo/lectured]

Note that /dev/null is correctly ignored and there is no whiteout for /dev.

Submitter Checklist

These are the criteria that every PR should meet, please check them off as you
review them:

  • Includes unit tests
  • Adds integration tests if needed.

See the contribution guide for more details.

Reviewer Notes

  • The code flow looks good.
  • Unit tests and or integration tests added.

Release Notes

- Fix creation of faulty cache images in certain scenarios when `--use-new-run` is set

Kamal Nasser added 2 commits April 29, 2021 05:36
* include filesystem mounts in ignorelist of snapshotter
* clean up ignore list logic
@google-cla google-cla bot added the cla: yes CLA signed by all commit authors label May 18, 2021
@kamaln7 kamaln7 force-pushed the fix-snapshotter-ignorelist branch from 6c36ea0 to a954eb1 Compare May 18, 2021 02:05
@kamaln7
Copy link
Contributor Author

kamaln7 commented May 18, 2021

Some of the integration tests are failing with this error—is this a known error?

INFO[0003] Retrieving image gcr.io/kaniko-test/hardlink-base:latest from registry gcr.io 
2061        error building image: GET https://gcr.io/v2/token?scope=repository%3Akaniko-test%2Fhardlink-base%3Apull&service=gcr.io: DENIED: Project kaniko-test has been deleted.

@tejal29
Copy link
Contributor

tejal29 commented May 18, 2021

Some of the integration tests are failing with this error—is this a known error?

INFO[0003] Retrieving image gcr.io/kaniko-test/hardlink-base:latest from registry gcr.io 
2061        error building image: GET https://gcr.io/v2/token?scope=repository%3Akaniko-test%2Fhardlink-base%3Apull&service=gcr.io: DENIED: Project kaniko-test has been deleted.

They could be flakes. Running the jobs again.

@kamaln7
Copy link
Contributor Author

kamaln7 commented May 26, 2021

Thanks @tejal29. I noticed that the two newer PRs are also failing with the same error 🤔

@tejal29
Copy link
Contributor

tejal29 commented Jun 1, 2021

Thanks @kamaln7. Somehow the project was deleted.
The error should be fixed now and I will release kaniko this Friday.

@kamaln7
Copy link
Contributor Author

kamaln7 commented Jun 1, 2021

Got it, thank you @tejal29! I assume the changes are also good? I’m worried about accidentally breaking existing behavior, although the tests pass.

@tejal29 tejal29 merged commit f21639d into GoogleContainerTools:master Jun 4, 2021
@kamaln7
Copy link
Contributor Author

kamaln7 commented Jun 14, 2021

Hey @tejal29, I just wanted to follow up and check on the status of the release?

@kvaps
Copy link
Contributor

kvaps commented Aug 20, 2021

This PR is breaking unit test:

--- FAIL: Test_GetFSFromLayers_with_whiteouts_include_whiteout_enabled (0.00s)
    fs_util_test.go:1124: expected [/tmp/layers-test123265391/foobar] to equal [/tmp/layers-test123265391/foobar /tmp/layers-test123265391/.wh.foobar]
--- FAIL: Test_GetFSFromLayers_with_whiteouts_include_whiteout_disabled (0.00s)
    fs_util_test.go:1238: expected whiteout foobar file to be deleted. However found it.
--- FAIL: Test_GetFSFromLayers_ignorelist (0.00s)
    fs_util_test.go:1319: expected [/tmp/layers-test933363577/testdir/file /tmp/layers-test933363577/other-file] to equal [/tmp/layers-test933363577/.wh.testdir /tmp/layers-test933363577/testdir/file /tmp/layers-test933363577/other-file]

@kingcrunch
Copy link

kingcrunch commented Mar 11, 2023

The unit tests are still broken

--- FAIL: Test_GetFSFromLayers_with_whiteouts_include_whiteout_enabled (0.00s)
    fs_util_test.go:1097: expected [/tmp/Test_GetFSFromLayers_with_whiteouts_include_whiteout_enabled871778633/001/foobar] to equal [/tmp/Test_GetFSFromLayers_with_whiteouts_include_whiteout_enabled871778633/001/foobar /tmp/Test_GetFSFromLayers_with_whiteouts_include_whiteout_enabled871778633/001/.wh.foobar]
--- FAIL: Test_GetFSFromLayers_with_whiteouts_include_whiteout_disabled (0.00s)
    fs_util_test.go:1207: expected whiteout foobar file to be deleted. However found it.
--- FAIL: Test_GetFSFromLayers_ignorelist (0.00s)
    fs_util_test.go:1284: expected [/tmp/Test_GetFSFromLayers_ignorelist593273375/001/testdir/file /tmp/Test_GetFSFromLayers_ignorelist593273375/001/other-file] to equal [/tmp/Test_GetFSFromLayers_ignorelist593273375/001/.wh.testdir /tmp/Test_GetFSFromLayers_ignorelist593273375/001/testdir/file /tmp/Test_GetFSFromLayers_ignorelist593273375/001/other-file]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
candidate-release cla: yes CLA signed by all commit authors
Projects
None yet
Development

Successfully merging this pull request may close these issues.

"--use-new-run" problem with /dev/shm
4 participants