Fail to build image with --isolation chroot #1568
Comments
Any idea what NPM is doing? |
Thanks @42wim for the issue. Just to make sure, were you running as a root or non-root user? |
Running as root user on all tests. I also just ran the test as a non-root user (only on the system with the 5.x kernel), with exactly the same result.
|
Any |
Most likely something to do with the way /proc or /sys is set up in the chroot. It would be really helpful if we could see exactly what is failing in npm. Could you add strace to your container and then run strace npm version? |
BTW This worked for me.
|
buildah version |
Is this what you mean? FROM centos:7
RUN yum -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
RUN yum -y install npm strace
RUN strace -s4000 -o /tmp/strace/output -ff npm version
|
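Since strace -ff -o /tmp/strace/output writes one log per process (output.<pid>), a small script can help find the calls that were denied. The following is only a sketch, not something posted in this thread: the function names are hypothetical, and it assumes the usual strace line layout of "call(args) = -1 ERRNO (message)".

```python
import re
from pathlib import Path

# Matches a failed strace line such as:
#   name(args...) = -1 EPERM (Operation not permitted)
FAILED_CALL = re.compile(r"^(?P<call>\w+)\(.*\)\s+=\s+-1\s+(?P<errno>E[A-Z0-9]+)")


def failed_calls(text, errnos=("EPERM", "EACCES")):
    """Return (syscall, errno) pairs for failed calls found in one strace log."""
    hits = []
    for line in text.splitlines():
        match = FAILED_CALL.match(line.strip())
        if match and match.group("errno") in errnos:
            hits.append((match.group("call"), match.group("errno")))
    return hits


def scan_directory(directory="/tmp/strace"):
    """Map each per-process log (output.<pid>) to the failed calls it contains."""
    results = {}
    for path in sorted(Path(directory).glob("output.*")):
        found = failed_calls(path.read_text(errors="replace"))
        if found:
            results[path.name] = found
    return results
```

Running scan_directory() inside the container after the strace step would list only the logs that contain permission-style failures, which narrows down which npm child process hit the problem.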
Didn't try on 1.8.0, but I just built it: same results for me (as in, still failing).
|
Maybe runc? What version do you have @42wim? |
On the centos 7.5 box the default centos one: runc-1.0.0-59.dev.git2abd837.el7.centos.x86_64 |
Did some more testing with running different container images:
FROM centos:7
RUN yum -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
RUN yum -y install npm
RUN npm version
So something strange in CentOS causes it to fail. I'm pretty sure it was working a month or two ago. |
Interesting, thanks for all the testing and info @42wim ! |
Not runc, --chroot does not use runc |
Always the same stack trace? |
A strace of the buildah process itself near the end where it actually opens
|
That is a seccomp failure. Latest seccomp.json file allows statx. If you grab the one from upstream it should fix the issue. |
We need an updated version of containers-common on Centos. |
That's unfortunately not the issue, The rpm is Same version as on https://src.fedoraproject.org/rpms/skopeo/raw/master/f/seccomp.json and as on This is how the strace looks when it works (eg on
I'm wondering if it doesn't have something to do with the symlink |
Have you tried this on a newer kernel? I.e., do you see the same issue on Fedora? |
The tests in my previous comment: #1568 (comment) are all running in docker containers on the same centos 7 host (3.10.0-862.14.4.el7.x86_64) and only the centos:latest container fails. So this doesn't look like a kernel issue as they all are using the same kernel. But I also ran on a 5.0.5-1.el7.elrepo.x86_64 kernel, with the same issues (I don't have a "physical" fedora host) |
Ok so this is container image specific, most likely not a Buildah bug. Has any Centos image ever worked? |
Yes, we're using buildah (in a centos docker container) in our Jenkins pipeline (since October last year), mostly building centos images. They all worked fine and all of them still do, except the ones with an npm command in them. |
Progress! |
Grab this seccomp.json file and replace the one in /usr/share/containers/seccomp.json with it. |
Still fails with that seccomp.json |
Strange. I will have to try this here. grep statx /usr/share/containers/seccomp.json |
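The grep above only shows that the string statx appears somewhere in the file. A slightly stricter check is to parse the JSON and confirm that statx sits in a rule whose action is SCMP_ACT_ALLOW. This is a minimal sketch with hypothetical helper names; it assumes the usual profile layout (a "syscalls" list whose entries carry "names", or a single "name" in older profiles, plus an "action") and it ignores any "args" conditions and the profile's default action.

```python
import json


def load_profile(path="/usr/share/containers/seccomp.json"):
    """Read a seccomp profile from disk (the path is the one used in this thread)."""
    with open(path) as handle:
        return json.load(handle)


def allows(profile, syscall):
    """Return True if some rule explicitly allows the named syscall."""
    for rule in profile.get("syscalls", []):
        # Newer profiles group syscalls in "names"; older ones use a single "name".
        names = list(rule.get("names") or [])
        if "name" in rule:
            names.append(rule["name"])
        if syscall in names and rule.get("action") == "SCMP_ACT_ALLOW":
            return True
    return False
```

For example, allows(load_profile(), "statx") reports whether the installed profile lists statx as allowed. Note that this inspects only the JSON file; as the later comments in this issue show, a profile that lists statx does not by itself guarantee that the call gets through.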
I did an strace again on the build with
While on the failing build the syscall_332 passes....
For both builds makes no difference if statx is in Without
With
|
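For readers following the strace output: an strace build that predates a syscall prints it by number, so syscall_332 on x86_64 is statx. The sketch below rewrites such placeholders using a small hand-written table; the helper name is hypothetical, and the table covers only the x86_64 stat family, so other architectures or syscalls would need their own entries.

```python
import re

# x86_64 syscall numbers for the stat family (architecture-specific).
X86_64_STAT_SYSCALLS = {
    4: "stat",
    5: "fstat",
    6: "lstat",
    262: "newfstatat",
    332: "statx",
}

# Placeholder that strace prints for a syscall number it cannot name.
UNNAMED = re.compile(r"\bsyscall_(\d+)\b")


def name_unknown_syscalls(line, table=X86_64_STAT_SYSCALLS):
    """Replace syscall_<n> placeholders with names from the table, when known."""
    def swap(match):
        number = int(match.group(1))
        # Leave the placeholder untouched if the number is not in the table.
        return table.get(number, match.group(0))

    return UNNAMED.sub(swap, line)
```

Piping a log through name_unknown_syscalls line by line makes it easier to compare a working and a failing trace side by side.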
@pcmoore Any ideas on this? |
@weirdwiz PTAL |
From what I can tell it looks like statx(...) is allowed by the seccomp filter so it makes me wonder if there is another problem somewhere. What is the host system running distribution wise? What version of libseccomp is on the host (NOTE: libseccomp v2.4.1 is current)? |
@pcmoore the host doesn't seem to matter; you should be able to reproduce this by just running docker as described in #1568 (comment). Originally tested on a centos 7.5 and a centos 7.6 host. I just spun up an ubuntu 18.04 LTS on digitalocean and ran the same commands, with the same issue. Only difference is that I needed to use the |
Okay, I just wanted to make sure the host's libseccomp filter wasn't the issue. Looking closer at the log it appears that this is just the libseccomp filter for buildah and not the entire container that is the problem, yes? Assuming that is the case I suspect the problem is the old version of libseccomp that ships in Centos. The libseccomp v2.3.x stream has a number of bugs and is no longer supported upstream. Can you confirm the version of libseccomp present in the fedora/ubuntu/archlinux:latest? It may be that CentOS/RHEL simply needs to update their libseccomp package. |
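To compare images quickly, the libseccomp version reported inside each container (for example by rpm -q libseccomp on RPM-based images) can be checked against the v2.3.x stream mentioned above. This is a rough sketch with hypothetical function names; the cut-off simply encodes the statement in this comment that 2.3.x is no longer supported upstream, and the package strings in the usage note are illustrative, not output captured from this issue.

```python
import re


def parse_version(text):
    """Extract the first major.minor.micro triple from a version or package string."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def in_unsupported_stream(text):
    """Return True for libseccomp 2.3.x or older, per the comment above."""
    major, minor, _micro = parse_version(text)
    return (major, minor) <= (2, 3)
```

With this, in_unsupported_stream("libseccomp-2.3.1-3.el7.x86_64") returns True and in_unsupported_stream("2.4.1") returns False, which gives a quick way to flag images that may hit the problem discussed here.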
@pcmoore that was the solution! |
@42wim Could you open a Bugzilla for this issue on RHEL7 at bugzilla.redhat.com? CC me on the bug and reference this issue. |
Done |
@42wim Could you link the bugzilla here? |
Bugzilla issue can be found on https://bugzilla.redhat.com/show_bug.cgi?id=1712146 |
Thanks, I will close the issue and follow the progress there. |
TL;DR: Red Hat is not rebasing it, so centos7/rhel7 users are out of luck. |
Well it does work on RHEL8. |
Description
Fails to build an image using --isolation chroot (works fine with default setting).
Steps to reproduce the issue:
buildah bud --isolation chroot .
Describe the results you received:
npm version fails
Describe the results you expected:
The output of buildah bud .
Output of buildah version:
Running 1.8.1 but same issue on 1.6 and 1.7.x
Tested on centos 7.5
Linux test-1 3.10.0-862.14.4.el7.x86_64 #1 SMP Wed Sep 26 15:12:11 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
And centos 7.6
Linux test-2 5.0.5-1.el7.elrepo.x86_64 #1 SMP Wed Mar 27 13:30:25 EDT 2019 x86_64 x86_64 x86_64 GNU/Linux