podman/buildah cannot see each other's local images #13970

Closed
NetDwarf opened this issue Apr 22, 2022 · 25 comments · Fixed by #14499

Labels: kind/bug (Categorizes issue or PR as related to a bug), locked - please file new issue/PR (Assist humans wanting to comment on an old issue or PR with locked comments)

Comments

@NetDwarf commented Apr 22, 2022

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

buildah and podman can have a different storage-driver on the same host system and in that case cannot see each other's images. This in itself is possibly expected behavior, but it presents as a bug to the user. (See expected results section)

Steps to reproduce the issue:

  1. buildah and podman are configured with different storage drivers (e.g., overlay and vfs)
  2. buildah commit $(buildah from scratch) foo
  3. podman run --rm -it localhost/foo

Describe the results you received:

WARN[0000] failed, retrying in 1s ... (1/3). Error: initializing source docker://localhost/foo:latest: pinging container registry localhost: Get "https://localhost/v2/": dial tcp 127.0.0.1:443: connect: connection refused 
WARN[0001] failed, retrying in 1s ... (2/3). Error: initializing source docker://localhost/foo:latest: pinging container registry localhost: Get "https://localhost/v2/": dial tcp 127.0.0.1:443: connect: connection refused 
WARN[0002] failed, retrying in 1s ... (3/3). Error: initializing source docker://localhost/foo:latest: pinging container registry localhost: Get "https://localhost/v2/": dial tcp 127.0.0.1:443: connect: connection refused 
Error: initializing source docker://localhost/foo:latest: pinging container registry localhost: Get "https://localhost/v2/": dial tcp 127.0.0.1:443: connect: connection refused

Describe the results you expected:

This is not cut and dried, as there are at least three conceivable outcomes.

  1. The warning includes a reference to the localhost image stored under another storage driver, and tells you to switch to driver <x> in order to use it
  2. buildah and podman share the same storage-driver setting
  3. podman switches to a different storage driver when it finds the image under that one

Additional information you deem important (e.g. issue happens only occasionally):
The obvious workaround is to set the driver explicitly, e.g.:

Use podman run --storage-driver=<driver> ...

Or STORAGE_DRIVER=<driver> podman run ...

Or set it permanently in ~/.config/containers/storage.conf

[storage]
driver = "overlay" # or vfs

and execute podman system reset, but be aware that it removes all current containers and images.
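
Putting the permanent variant together end to end might look like this (a minimal sketch, assuming overlay is the driver your images actually live under; adjust the driver name as needed):

$ mkdir -p ~/.config/containers
$ printf '[storage]\ndriver = "overlay"\n' > ~/.config/containers/storage.conf
$ podman system reset --force   # WARNING: destroys all existing containers and images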

Output of podman version:

Version:      3.4.6
API Version:  3.4.6
Go Version:   go1.18.1
Built:        Thu Jan  1 01:00:00 1970
OS/Arch:      linux/amd64

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)

No (not latest version)

Additional environment details (AWS, VirtualBox, physical, etc.):
This is on Debian/testing in rootless mode; however, I think this happens with other setups as well, as long as the storage drivers differ.

openshift-ci bot added the kind/bug label Apr 22, 2022
@vrothberg (Member)

Thanks for reaching out, @NetDwarf.

> buildah and podman are configured with different storage drivers (e.g., overlay and vfs)

In order to share images, the tools need to use the same storage driver. May I ask why you're using vfs for Podman? The performance of vfs is extremely poor compared to overlay.
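
A quick way to compare which driver each tool is actually using (a sketch; the Go-template keys shown here may differ between versions):

$ podman info --format '{{.Store.GraphDriverName}}'
$ buildah info --format '{{.store.GraphDriverName}}'

If the two commands print different names, the tools are operating on separate image stores.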

@NetDwarf (Author) commented Apr 22, 2022

> Thanks for reaching out, @NetDwarf.
>
> > buildah and podman are configured with different storage drivers (e.g., overlay and vfs)
>
> In order to share images, the tools need to use the same storage driver. May I ask why you're using vfs for Podman? The performance of vfs is extremely poor compared to overlay.

This was not intentional at all. I was just presented with the problem above and had to do some digging to find the culprit. It was actually a bit more complicated than that; however, to keep the bug report relevant and concise, I described only the resulting behavior.

The part I left out, also because I am missing some information such as the previous storage-driver settings:

Actually, it worked at first: podman images didn't show buildah's images, but podman run <buildah-image> worked (which in hindsight is weird). After pulling docker.io/golang:alpine (not necessarily related) I got:

ERRO[0000] User-selected graph driver "overlay" overwritten by graph driver "vfs" from database - delete libpod local files to resolve

(or similar; that's the one still in my scrollback buffer)
A possible solution I found was to rm -rf ~/.local/share/containers, after which it stopped working altogether, as described above. podman system reset -f didn't do anything either; given that it does not adjust the storage driver, it's obvious in hindsight that this couldn't work.

I assume PEBKAC, however the resulting behavior still presents as a bug.

@rhatdan (Member) commented Apr 22, 2022

> ERRO[0000] User-selected graph driver "overlay" overwritten by graph driver "vfs" from database - delete libpod local files to resolve

This error is usually caused by someone changing the storage driver after they have pulled an image. Podman records the storage driver in its internal database. If you want to change the storage driver, then you need to do a podman system reset to reset (destroy) all storage.

This can also happen on older systems where the user used a rootless container without fuse-overlayfs being installed, which would default to vfs. Later installing fuse-overlayfs will cause podman to switch to overlay and potentially show this error.
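
One way to check for that situation (an illustrative sketch, not an official diagnostic):

$ command -v fuse-overlayfs || echo "not installed; rootless podman may fall back to vfs"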

@NetDwarf (Author) commented Apr 22, 2022

Neither of you even read the issue. Thank you for nothing! Sorry.

@vrothberg (Member) commented Apr 22, 2022

I considered the issue done given the above answers. As mentioned above, the same driver must be used.

If something's left unaddressed please point it out; ideally without sarcasm.

@NetDwarf (Author) commented Apr 22, 2022

I did not explicitly set a different driver; the described behavior occurred regardless.

The issue is either:

  1. That there is no error message that hints at a different storage driver being used by buildah (or just by other containers in the container storage)
  2. That podman does not select the correct storage-driver given an existing image with the same name but different storage-driver
  3. That buildah and podman do not share a storage-driver setting.

As the expected behavior is not necessarily obvious, you can pick one of the above. Given that the described behavior can happen without explicitly setting a storage driver (and, by the way, is not worked around by podman system reset -f), the user is left with a broken podman/buildah combination for no obvious reason.

@vrothberg (Member)

Thanks for elaborating! Replying in-line below.

> 1. That there is no error message that hints at a different storage driver being used by buildah (or just by other containers in the container storage)

Curious what @rhatdan thinks. Some users may desire running with multiple storage drivers simultaneously, where a warning would harm the experience.

> 2. That podman does not select the correct storage-driver given an existing image with the same name but different storage-driver

The storage driver is a global setting. Supporting multiple drivers simultaneously is out of scope.

> 3. That buildah and podman do not share a storage-driver setting.

That is indeed an interesting case:

$ podman system reset -f
$ podman --storage-driver=vfs pull alpine > /dev/null
$ buildah images
REPOSITORY   TAG   IMAGE ID   CREATED   SIZE

Explanation: buildah and podman default to using overlay if supported on the host. A custom storage driver is usually configured in storage.conf. However, Podman stores the storage driver it has been initialized with in its database.
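
Continuing that reproduction, pointing buildah at the same driver makes the image visible (a sketch; both tools accept --storage-driver):

$ buildah --storage-driver=vfs images   # now lists docker.io/library/alpine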

> As the expected behavior is not necessarily obvious, you can pick one of the above.

I concur.

> Given that the described behavior can happen without explicitly setting a storage driver (and, by the way, is not worked around by podman system reset -f), the user is left with a broken podman/buildah combination for no obvious reason.

I do not know how/if the storage drivers can change like that without a user/caller setting it explicitly, or stemming from an upgrade.

@vrothberg reopened this Apr 22, 2022
@NetDwarf (Author)

> Explanation: buildah and podman default to using overlay if supported on the host. A custom storage driver is usually configured in storage.conf. However, Podman stores the storage driver it has been initialized with in its database.

Possibly this is the bug, as podman was set to vfs even though overlay was supported. Out of desperation I also did sudo apt autoremove --purge podman buildah and sudo apt install podman buildah, and that didn't work either. So there is either a setting left over (I am not sure where the database is stored), or podman uses a different heuristic than buildah to detect overlay support, and they come to different conclusions.

storage.conf was not present anywhere on the system. I had to create it with the correct storage setting (as explained in the first post).

@vrothberg (Member) commented Apr 22, 2022

> So there is either a setting left over (I am not sure where the database is stored)

Very likely that's the case. It's stored in $HOME/.local/share/containers/.
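
For reference, on a rootless Podman 3.x setup the database in question is typically the BoltDB file under the storage root (the exact path is an assumption and may vary by version and configuration):

$ ls ~/.local/share/containers/storage/libpod/
bolt_state.db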

> or podman uses a different heuristic than buildah to detect overlay support, and they come to different conclusions.

They use the very same code.

I think the tools should probably only default to overlay if the storage is empty and otherwise use whatever is there.

But I may very well be missing some background. @rhatdan @giuseppe WDYT?

@NetDwarf (Author)

> Very likely that's the case. It's stored in $HOME/.local/share/containers/.

Hmm, it happened after removing that folder (correlation, not necessarily causation). I also tried various combinations of removing that folder again, podman system reset -f, system restarts, reinstalling the tools, etc., to no avail. Basically, I tried to clear out any old configuration.

@rhatdan (Member) commented Apr 22, 2022

The only way to fix this would be to write the storage.conf file to the user's home directory, if it did not exist, when you executed

$ podman --storage-driver=vfs pull alpine > /dev/null

With "vfs" as the storage driver. The problem with this is storage.conf does not inherit. So once this is written any global settings from /usr/share/containers/storage.conf and /etc/containers/storage.conf are ignored. We did this back when we supported libpod.conf, and it ended up being an update headache.

We could change storage.conf to inherit missing fields, like we do with containers.conf, but that would be a fairly big change.

I am not sure the benefit is worth the risk for what may be just a corner case.

@vrothberg (Member) commented Apr 22, 2022

@rhatdan, I think we can behave slightly differently; see my comment above:

> I think the tools should probably only default to overlay if the storage is empty and otherwise use whatever is there.

@NetDwarf (Author)

Yeah, I figured that it is not easy to have a global setting or adaptive behavior. Would an error message in a very specific case be adequate instead? That is, when podman does not find the image under its current storage driver, but an image with the same name exists under another driver. Something like:
Could not find image localhost/foo for storage driver <y>, but there exists an image with the same name for storage driver <x>. Try the same command with --storage-driver <x> instead.
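
That lookup can already be emulated by hand today, which suggests the information is available in principle (a rough sketch; the driver list here is hypothetical and should match whatever drivers are in use on your system):

$ for d in overlay vfs; do podman --storage-driver="$d" image exists localhost/foo && echo "found under $d"; done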

@rhatdan (Member) commented Apr 22, 2022

@vrothberg I think this might be more difficult than you think. What if I then did:

buildah --storage-driver=btrfs pull alpine
buildah --storage-driver=overlayfs pull fedora
buildah --storage-driver=vfs pull ubi8

What is the default? I am not crazy about containers/storage inspecting what has happened in the directory beforehand and guessing at the driver.
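
For what it's worth, the storage root does hint at which drivers have been used, since each driver keeps its own subdirectories (directory names shown as an illustration of a typical rootless setup, not a stable interface):

$ ls ~/.local/share/containers/storage/
libpod/  overlay/  overlay-images/  overlay-layers/  vfs/  vfs-images/  vfs-layers/  ...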

@vrothberg (Member)

@rhatdan, I'd think overlayfs > btrfs > vfs. Isn't there a preference list in c/storage?

@giuseppe (Member)

As I understood it, the issue is that podman overwrites the storage driver depending on what it has in its database:

> ERRO[0000] User-selected graph driver "overlay" overwritten by graph driver "vfs" from database - delete libpod local files to resolve

Should we just make the error clearer and mention that podman might not see images created by other tools?

@vrothberg (Member)

> Should we just make the error clearer and mention that podman might not see images created by other tools?

I like the idea. @rhatdan WDYT?

@rhatdan (Member) commented May 4, 2022

SGTM

@NetDwarf (Author) commented May 4, 2022

This is a bit of a mixture now. The error message that @giuseppe quoted is not the one you get when the image is in another format; it was an error message leading up to the described behavior. This is the actual error message:

WARN[0000] failed, retrying in 1s ... (1/3). Error: initializing source docker://localhost/foo:latest: pinging container registry localhost: Get "https://localhost/v2/": dial tcp 127.0.0.1:443: connect: connection refused 
WARN[0001] failed, retrying in 1s ... (2/3). Error: initializing source docker://localhost/foo:latest: pinging container registry localhost: Get "https://localhost/v2/": dial tcp 127.0.0.1:443: connect: connection refused 
WARN[0002] failed, retrying in 1s ... (3/3). Error: initializing source docker://localhost/foo:latest: pinging container registry localhost: Get "https://localhost/v2/": dial tcp 127.0.0.1:443: connect: connection refused 
Error: initializing source docker://localhost/foo:latest: pinging container registry localhost: Get "https://localhost/v2/": dial tcp 127.0.0.1:443: connect: connection refused

Among possibly other error-handling options, you could:

  1. Change the error message to include a warning that the image could be in another format
  2. Only include the warning when a local image is explicitly selected (i.e. localhost/foo)
  3. When a local image is explicitly selected, look for images with the same name in other formats and either load it dynamically or give a precise warning based on that information

The first option is the worst in my opinion, as the warning would appear whenever someone selects a wrong image, for example if you mistype docker.io/ubutnu.
The third option depends on how easy/cost-effective it is to look for images in other formats; it is the best option from a user perspective.
The second option is an acceptable compromise in my opinion. It might, however, be hard to determine whether an image is explicitly local (how many ways are there to specify that?).

github-actions bot commented Jun 4, 2022

A friendly reminder that this issue had no activity for 30 days.

@rhatdan (Member) commented Jun 6, 2022

@giuseppe could you change the error message so we can close this issue?

@giuseppe (Member) commented Jun 6, 2022

opened a PR: #14499

giuseppe added a commit to giuseppe/libpod that referenced this issue Jun 6, 2022
make the error clearer and state that images created by other tools
might not be visible to Podman when it overrides the graph driver.

Closes: containers#13970

[NO NEW TESTS NEEDED]

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
@NetDwarf (Author) commented Jun 7, 2022

The error message the user is going to see is this:

WARN[0000] failed, retrying in 1s ... (1/3). Error: initializing source docker://localhost/foo:latest: pinging container registry localhost: Get "https://localhost/v2/": dial tcp 127.0.0.1:443: connect: connection refused 
WARN[0001] failed, retrying in 1s ... (2/3). Error: initializing source docker://localhost/foo:latest: pinging container registry localhost: Get "https://localhost/v2/": dial tcp 127.0.0.1:443: connect: connection refused 
WARN[0002] failed, retrying in 1s ... (3/3). Error: initializing source docker://localhost/foo:latest: pinging container registry localhost: Get "https://localhost/v2/": dial tcp 127.0.0.1:443: connect: connection refused 
Error: initializing source docker://localhost/foo:latest: pinging container registry localhost: Get "https://localhost/v2/": dial tcp 127.0.0.1:443: connect: connection refused

So the fix does not change anything here, as the new (and still cryptic) error message is emitted in a different place. And I called attention to exactly that just before the PR. I am underwhelmed, to say the least.

mheon pushed a commit to mheon/libpod that referenced this issue Jun 14, 2022 (same commit message as above)
@CtrlC-Root commented Oct 9, 2022

I just ran into this and found this issue after searching online. Here's one workaround:

$ rm -rf ~/.local/share/containers
$ STORAGE_DRIVER=overlay podman images
$ STORAGE_DRIVER=overlay buildah images

At this point they are both configured to use the same storage driver. However every time I run podman I see this error message:

$ buildah images                                                                                                                                                       
REPOSITORY       TAG      IMAGE ID       CREATED         SIZE
localhost/test   latest   ec68aa4219e4   8 seconds ago   2.34 KB
$ podman images                                                                                                                                                        
ERRO[0000] User-selected graph driver "vfs" overwritten by graph driver "overlay" from database - delete libpod local files to resolve.  May prevent use of images created by other tools 
ERRO[0000] User-selected graph driver "vfs" overwritten by graph driver "overlay" from database - delete libpod local files to resolve.  May prevent use of images created by other tools 
REPOSITORY      TAG         IMAGE ID      CREATED         SIZE
localhost/test  latest      ec68aa4219e4  10 seconds ago  2.34 kB

I installed these tools through system packages and did not create any new configuration files or tweak any existing ones.

$ ls -l /etc/containers/                                                                                                                                               
total 8
-rw-r--r-- 1 root root  209 Dec  7  2019 policy.json
-rw-r--r-- 1 root root 3721 Dec  7  2019 registries.conf
$ ls .local/share/containers/                                                                                                                                          
storage/
$ ls .local/share/containers/storage/                                                                                                                                  
defaultNetworkBackend  libpod/  mounts/  overlay/  overlay-containers/  overlay-images/  overlay-layers/  storage.lock  tmp/  userns.lock

I guess I need to create a storage.conf somewhere to tell podman to use overlay by default? But why is the default for buildah and podman different in the first place? And why does podman say the user-selected driver is vfs?

@mheon (Member) commented Oct 10, 2022

You need to remove the Podman database to get rid of that error; a config file will not help. Podman has detected a change that would break all existing containers, pods, etc., and it rejects the change until podman system reset is issued to remove those existing containers/pods/volumes.
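
As a command sequence, that advice might look like this (destructive; a sketch to adapt rather than run blindly):

$ podman system reset --force   # removes all containers, pods, volumes, and images
$ podman info --format '{{.Store.GraphDriverName}}'   # confirm which driver is now in use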

github-actions bot added the locked - please file new issue/PR label Sep 13, 2023
github-actions bot locked as resolved and limited conversation to collaborators Sep 13, 2023