rpmdb damaged in Docker build with fedora:29 base #3

Closed
robnagler opened this issue May 22, 2019 · 14 comments

Comments

@robnagler

Here's the error during dnf clean:

error: rpmdb: damaged header #680 retrieved -- skipping.

Some versions:

  • Docker version 18.09.1, build 4c52b90
  • fedora:29 d09302f77cfc
  • dnf-plugin-ovl-0.0.2-1.20181107gitfd1a5a5.fc29.noarch

Looking at the code, I think the file actually needs to be opened with O_CREAT; at least, that's what touch does. touch /var/lib/rpm/* works fine.

Also reported in https://bugzilla.redhat.com/show_bug.cgi?id=1646543 and radiasoft/containers#91.

@FlorianLudwig
Owner

FlorianLudwig commented May 23, 2019

Hi @robnagler

Thank you for reporting this. So far I have been unable to reproduce it; I use several images based on fedora:29 and have not come across it yet.

Could you provide a Dockerfile to reproduce this?

Thanks.

@robnagler
Author

Thanks @FlorianLudwig for the quick response.

We are building large images so bear with me, because the error only shows up after a couple of base images are built, even when dnf-plugin-ovl is installed early in the process.

The following script produces the error:

#!/bin/bash
set -euo pipefail
export build_no_touch_rpmdb=1
export build_image_base=radiasoft/beamsim:20190521.170321
docker pull radiasoft/beamsim:20190521.170321 | cat
[[ -d sirepo ]] || git clone -q https://github.com/radiasoft/sirepo
cd sirepo
curl https://depot.radiasoft.org/index.sh | bash -s container-build

It will not produce the error if you disable build_no_touch_rpmdb with:

export build_no_touch_rpmdb=

This (default mode) triggers touch of /var/lib/rpm/*.

radiasoft/beamsim:20190521.170321 contains:

$ docker run --rm radiasoft/beamsim:20190521.170321 rpm -q dnf-plugin-ovl
dnf-plugin-ovl-0.0.2-1.20181107gitfd1a5a5.fc29.noarch

And radiasoft/beamsim builds without the error, even though the same code runs for the dnf clean. I'm not sure what's triggering the error in the sirepo build, but I tested it multiple times to be sure it was reproducible.

The script can be run on VirtualBox centos/7 with these params in /etc/docker/daemon.json:

{
    "storage-driver": "overlay2",
    "storage-opts": [
        "overlay2.override_kernel_check=true"
    ]
}

We have curl installers for CentOS 7, so if you want a copy of our VM with Docker set up, you can run the following (works on Mac and Linux):

curl https://depot.radiasoft.org/index.sh | bash -s vagrant-dev centos/7
vagrant ssh -c 'radia_run redhat-docker'
vagrant reload
vagrant ssh -c 'radia_run redhat-docker'

The images are large and take tens of minutes to build:

$ docker images
REPOSITORY              TAG                 IMAGE ID            CREATED             SIZE
d/sirepo                20190524.144425     fa3b2dcda727        14 minutes ago      9.94GB
d/sirepo                alpha               fa3b2dcda727        14 minutes ago      9.94GB
d/sirepo                dev                 fa3b2dcda727        14 minutes ago      9.94GB
d/sirepo                latest              fa3b2dcda727        14 minutes ago      9.94GB
radiasoft/beamsim       20190521.170321     4a4a024db504        2 days ago          5.42GB

@FlorianLudwig
Owner

@robnagler

somewhat unrelated:

curl https://depot.radiasoft.org/index.sh | bash -s

it outputs:

usage: curl https://depot.radiasoft.org | bash -s [verbose|quiet] [<installer>|*/*] [extra args]

which seems incorrect:

  1. https://depot.radiasoft.org should be https://depot.radiasoft.org/index.sh
  2. [verbose|quiet] [<installer>|*/*] [extra args] suggests that all arguments are optional, but they are not

@robnagler
Author

Thanks @FlorianLudwig. depot.radiasoft.org now returns index.sh instead of a useless, blank HTML page.

@robnagler
Author

Arguments are optional in some cases. For example,

mkdir sirepo
cd sirepo
curl https://depot.radiasoft.org | bash

It's a bit hard to provide the appropriate context in a one-line usage message...

@FlorianLudwig
Owner

I see.

Well, now I am running

curl https://depot.radiasoft.org/index.sh | bash -s container-build

inside a git checkout of sirepo on a centos 7 box.

Could you point me to the source of the Dockerfile that is currently building this image?

@robnagler
Author

The Dockerfile is dynamically generated, which isn't very helpful.

The basic workflow is:

        build_run_yum
        build_sudo_install
        build_root_setup
        build_fedora_patch
        build_home_env
        build_run_dir
        build_as_root
        build_rsmanifest
        chown -R "$build_run_user:" "$build_guest_conf"
        su "$build_run_user" "$0"
        build_sudo_remove
        build_fedora_clean

build_run_yum is the first entry point, which is:

    if [[ ! ${build_no_touch_rpmdb:-} ]]; then
        build_msg 'touch /var/lib/rpm/*'
        # Avoid corrupting rpm db
        # https://github.com/moby/moby/issues/10180#issuecomment-378005800
        # Tried dnf-plugin-ovl, but that did not work. This definitely works:
        touch /var/lib/rpm/*
    fi

This is how all our CentOS & Fedora containers are built so this same code is always executed before any other yum/dnf operations.

@FlorianLudwig
Owner

FlorianLudwig commented May 24, 2019

I ran the following without error:

git clone -q https://github.com/radiasoft/sirepo
cd sirepo/
export build_no_touch_rpmdb=1
export build_image_base=radiasoft/beamsim:20190521.170321
curl https://depot.radiasoft.org/index.sh | bash -s container-build

results:

docker images
REPOSITORY                    TAG                 IMAGE ID            CREATED             SIZE
docker.io/centos/sirepo       20190524.160711     2d318d669d55        16 minutes ago      9.94 GB
docker.io/centos/sirepo       alpha               2d318d669d55        16 minutes ago      9.94 GB
docker.io/centos/sirepo       dev                 2d318d669d55        16 minutes ago      9.94 GB
docker.io/centos/sirepo       latest              2d318d669d55        16 minutes ago      9.94 GB
docker.io/radiasoft/beamsim   20190521.170321     4a4a024db504        2 days ago          5.42 GB

@robnagler
Author

dnf clean does not error, but it reliably produces this message:

error: rpmdb: damaged header #680 retrieved -- skipping.

With the touch, it does not produce this message.
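Because dnf clean still succeeds when this happens, the message is easy to miss in a long build log. A minimal sketch of scanning captured build output for it, assuming the log is available as a string; the function name damaged_headers is made up for illustration:

```python
import re

# Matches the message quoted in this issue; the header number varies.
DAMAGED_HEADER = re.compile(
    r"error: rpmdb: damaged header #(\d+) retrieved -- skipping\."
)


def damaged_headers(log_text):
    """Return the header numbers reported as damaged in the build output."""
    return [int(m.group(1)) for m in DAMAGED_HEADER.finditer(log_text)]
```

A build wrapper could fail the build whenever this list is non-empty, rather than relying on the exit status of dnf.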

@FlorianLudwig
Owner

oh, indeed it does!

I will have another look next week.

@AaronDMarasco

@robnagler, #6 fixes this in Fedora 32; it may also have fixed F29 if it was the same parsing issue.

@robnagler
Author

Thanks @AaronDMarasco. The problem, I believe, is that this line needs to change to use O_CREAT, because that's what touch does. See the first comment in this issue.

@FlorianLudwig
Owner

@robnagler In Python, opening a file in "a" mode is equivalent to "O_WRONLY|O_CREAT|O_APPEND|O_CLOEXEC".
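To make the comparison with touch concrete, here is a minimal sketch of opening a file in "a" mode without writing to it. The helper names touch_like and touch_all are made up for illustration and are not taken from dnf-plugin-ovl. Unlike touch, this does not update the modification time; what the two have in common is that the file is opened for writing and created if it is missing, which on overlayfs is generally what triggers the copy-up.

```python
import os
import tempfile


def touch_like(path):
    """Open path in append mode and close it again without writing.

    Append mode creates a missing file and never truncates an existing one.
    """
    with open(path, "a"):
        pass


def touch_all(directory):
    """Apply touch_like to every regular file directly inside directory."""
    for name in sorted(os.listdir(directory)):
        full = os.path.join(directory, name)
        if os.path.isfile(full):
            touch_like(full)


if __name__ == "__main__":
    # Demonstrate on a scratch directory rather than /var/lib/rpm.
    scratch = tempfile.mkdtemp()
    touch_like(os.path.join(scratch, "example"))
    touch_all(scratch)
    print(os.listdir(scratch))
```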

Sorry I never followed up on this issue; I got lost in the build system of sirepo. I think @AaronDMarasco might be right and the issue is not related to the open/touch part but to the detection of overlayfs.

You could either run the current release, 0.0.3, and check whether "OverlayFS detected" is logged, try the current master branch, or wait for the next release. I am looking into how to write some tests to cover the differences Aaron pointed me towards, and I will release a new version soonish.
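For checking independently whether a container's root filesystem is overlayfs, one possible approach is to parse /proc/mounts. The sketch below is an assumption about how such a check could look, not the plugin's actual detection code; the function names root_fs_type and is_overlay_root are hypothetical.

```python
def root_fs_type(mounts_text):
    """Return the filesystem type mounted at "/" in /proc/mounts-style text.

    Returns None if no entry for "/" is found.
    """
    fs_type = None
    for line in mounts_text.splitlines():
        # Fields: device, mount point, filesystem type, options, dump, pass
        fields = line.split()
        if len(fields) >= 3 and fields[1] == "/":
            # A later entry for the same mount point shadows earlier ones.
            fs_type = fields[2]
    return fs_type


def is_overlay_root(mounts_text):
    """Report whether the root filesystem is of type "overlay"."""
    return root_fs_type(mounts_text) == "overlay"
```

Inside a running container, the input would be the contents of /proc/mounts, for example read with open("/proc/mounts").read().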

@robnagler
Author

We've been using touch, and I'll stick with that. Thanks for trying to fix the problem more generally.
