pasta regression: starting user container fails with "No routable interface for IPv4: IPv4 is disabled" #21896
Comments
That message means that IPv4 is going to be disabled in the container. We already have an improved message as suggested by @Luap99 upstream: https://archives.passt.top/passt-dev/20240227162949.2442048-1-sbrivio@redhat.com/
The problem is that pasta(1) doesn't know what to do in this case, as routes in the container are copied from the outer network namespace, by default. A question from my side: you don't expect connectivity to work in the container in this case, right? Shouldn't then Podman be started with |
I do expect connectivity to the host. Also, when running it with e.g.
Hmm, I don't know how pasta works, but conceptually the container is its own machine in network terms and should have its own routing? I.e. usually that means setting a default route to the host and letting the host decide where to go from there. The old slirp stack looked like this in the container:
Of course pasta has to emulate this somehow, I'm afraid I don't know how this works internally. |
That doesn't fail, but is also way too strong -- you can't do port mappings any more, and the container can't communicate with anything outside any more. |
Sure, it's a separate network namespace. This is a matter of default configuration: pasta(1) is designed to avoid NAT if not needed, so, by default, it sources addresses and routes from the parent network namespace. It doesn't have a hardcoded 10.0.2.2 like slirp4netns has. One can change that with
There's a problem with this: the container is told that 10.0.2.2 will route any IPv4 traffic. But if packets don't match the single route you have on the host, they won't go anywhere. It's also a lie in some sense. We could change the default in pasta to not fail if there's no interface with a default route, and just copy the routes that actually exist, instead of creating a default route in the container which can't work, like slirp4netns does. Is this going to mislead users, though? |
That is, |
That fails with the same error, I'm afraid.
I don't understand this, I'm afraid. From the container's POV this is completely correct, same as for any other machine: You have some explicit routes about the ranges you know, and delegate unknown addresses to the default route, i.e. "kick the can down the road". The container cannot, and also does not need to know if the host (or your home router, or your ISP, etc.) can actually reach any IP.
Yeah, that sounds right. I'm curious (please forgive my ignorance): why does it specifically check for a default route, conceptually? If it "just copies the routes", that should be fine? In our case, the host does have a route:
so the container can also reach anything within the 172.27* network, and of course the host itself. I.e. it can do exactly what the host does, with one extra hop between container and host, which is true for any kind of IP network. |
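To illustrate the routing argument above, here is a minimal sketch of longest-prefix route lookup, with and without a default route. It is not how pasta or the kernel implements routing. The `lookup` helper and the route tables are hypothetical, and the `/24` prefix length on the 172.27 network is an assumption; only the 172.27 range and the slirp4netns gateway 10.0.2.2 come from the discussion.

```python
import ipaddress


def lookup(routes, destination):
    """Return the next hop for destination by longest-prefix match, or None."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routes if dest in net]
    if not matches:
        return None  # no matching route: the network is unreachable
    # The most specific route (longest prefix) wins.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


# Hypothetical table: one on-link route, no default route (the reported setup).
local_only = [(ipaddress.ip_network("172.27.0.0/24"), "direct")]

# Same table plus a default route via a gateway, slirp4netns style.
with_default = local_only + [(ipaddress.ip_network("0.0.0.0/0"), "10.0.2.2")]

print(lookup(local_only, "172.27.0.15"))  # matches the on-link route
print(lookup(local_only, "8.8.8.8"))      # no route at all
print(lookup(with_default, "8.8.8.8"))    # delegated to the default gateway
```

With only the on-link route, the local network stays reachable and everything else has no route. Adding a default route just hands unknown destinations to the next hop, whether or not that hop can actually deliver them.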
Ah, sorry, you have to add
Several standards aim at coordinating network configuration in such a way that things ultimately work. Say, DHCP tells a host that it can use a particular default gateway, and router advertisements carry information about a prefix, usually global, that can be used. pasta(1) tries to do the same while configuring the container. If the container had absolutely nothing to do with parent (user, networking) namespaces, then it also wouldn't need to know that
Because it tries to figure out which interface should be used to copy addresses and routes, and a sensible choice is to use an upstream interface, which will have a default route.
Fine, yes, except that picking We can/should change pasta to cover this obvious case, but with multiple interfaces and no default routes, it should force the user to provide explicit configuration. I would wait for @Luap99 to chime in before going ahead with this. And you also need a temporary workaround, I suppose? |
Thanks for the explanation! (To fully understand it I'd need to know how pasta works, but I take it this is not just a simple oversight, but a deliberate implementation.)
And this bit shows my ignorance -- with "real" networking there wouldn't be anything to "pick" for the container, it'd just need to set up a route to the host. But I suppose that's not how a userspace networking stack works, as the kernel doesn't actually do the routing -- pasta has to make that decision.
Don't worry about that, I'll find one for our tests (presumably by just adding a fake default route which leads to nowhere or so). I'm more worried about breaking actual user scenarios -- isolated networks are not uncommon especially in corporate or CI environments. Thanks! |
Well, kind of... pasta doesn't actually do routing -- you can't do routing if you're running unprivileged, you just have Layer-4 sockets. Look at those pink sockets on the right of this diagram: pasta can't touch the blue stuff. The kernel does it. But we need to configure something in the inner network namespace for the kernel to do it.
Isolated networks, sure, but usually there's a default route. Anyway, if this is not blocking for you, I'll wait a bit for more feedback and then implement the change I was mentioning. |
As far as actual workarounds go, users can set the pasta options in containers.conf so it will affect everything:
or revert back to slirp4netns if pasta is not working for them for whatever reason(s)
I think making pasta better at picking a correct interface is desirable, as in this case, but I also agree that there is not much that can be done if there is more than one interface available. |
Another noticeable difference with pasta is that you cannot connect to the interface IP from the container and reach the host interface: pasta uses the same IP address inside the namespace, so the traffic will never reach the host. |
The introduction of pasta [1] regressed user containers if there is no default route [2]. While that is being sorted out, add a fake interface with a default route for our offline tests, to unbreak upstream podman PRs testing. Fixes cockpit-project#1595 [1] containers/podman#21563 [2] containers/podman#21896
I sent cockpit-project/cockpit-podman#1600 to work around this. This shouldn't affect the revdeps tests in podman, just our nightly podman-next scenario. So not much of your concern I figure 😁 |
can this be closed? |
I haven't seen any fix for it. |
I haven't implemented that yet. As the workaround is already in place, it's not extremely urgent I'd say. |
podman 5 has landed in Fedora >= 40, so our tests start failing due to containers/podman#21896. Extend the hack from commit cecb2cc to apply to all our images, not just the podman-next scenario.
podman 5 regressed user containers if there is no default route [1]. While that is being sorted out, add a fake interface with a default route for our offline tests, to unbreak upstream podman PRs testing. Same hack as in cockpit-project/cockpit-podman@cecb2cc6e8f2 [1] containers/podman#21896
Fixed in version 2024_03_18.615d370 and matching Fedora 39 update. |
@sbrivio-rh Thanks, I am going to close this one then. |
There might be isolated testing environments where default routes and global connectivity are not needed, a single interface has all non-loopback addresses and routes, and still passt and pasta are expected to work. In this case, it's pretty obvious what our upstream interface should be, so go ahead and select the only interface with at least one route, disabling DHCP and implying --no-map-gw as the documentation already states. If there are multiple interfaces with routes, though, refuse to start, because at that point it's really not clear what we should do.
Reported-by: Martin Pitt <mpitt@redhat.com>
Link: containers/podman#21896
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
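The selection rule described in this commit message can be sketched roughly as follows. This is an illustration only, not pasta's actual C implementation. The `Choice` type, the `pick_upstream` function, the `(interface, destination)` route representation, and taking the first interface with a default route are all assumptions made for the example.

```python
from typing import NamedTuple


class Choice(NamedTuple):
    ifname: str
    # True when no default route exists; per the commit message, DHCP is
    # then disabled and --no-map-gw is implied.
    fallback: bool


def pick_upstream(routes):
    """Pick an upstream interface from (interface, destination) pairs.

    A destination of "default" marks a default route (hypothetical encoding).
    """
    # Ignore loopback: only non-loopback routes count.
    candidates = [(ifname, dst) for ifname, dst in routes if ifname != "lo"]

    # Usual case: an interface with a default route is the upstream one.
    with_default = [ifname for ifname, dst in candidates if dst == "default"]
    if with_default:
        return Choice(with_default[0], False)

    # No default route: accept only an unambiguous single interface.
    names = sorted({ifname for ifname, _ in candidates})
    if len(names) == 1:
        return Choice(names[0], True)
    if not names:
        raise ValueError("no interface with routes")
    raise ValueError("multiple interfaces with routes and no default route: "
                     "explicit configuration needed")


print(pick_upstream([("eth0", "172.27.0.0/24")]))
```

The isolated single-interface setup from this report takes the fallback path, while a host with, say, two routed interfaces and no default route is rejected so that the user has to configure the interface explicitly.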
This is still problematic. Tracked in https://bugzilla.redhat.com/show_bug.cgi?id=2277954 now, but at least it has a workaround which isn't as intrusive/unsafe as adding general internet access to the VM. |
see #22737 |
Excellent, thanks @sbrivio-rh ! I tested the passt F40 update in cockpit-project/cockpit-podman#1768 and it's happy! 🌟 |
Yesterday's PR #21563 introduced a major regression with starting user containers on machines without a default route. This wasn't spotted by the "revdeps" tests, as they run in tmt and are always connected to the internet. But in cockpit-podman's CI the tests run without a default route (as we want to ensure that nothing in the test requires internet access, to make tests reproducible and avoid network flakes), and they started to fail last night.
Reproducer:
1. dnf -y copr enable rhcontainerbot/podman-next >&2; dnf -y update --repo 'copr*'
2. podman pull docker.io/busybox
3. podman run -it --rm docker.io/busybox whoami works fine.
4. sudo ip route del default (that's not what happens in our tests, but it's easiest to do as a human)
5. podman run -it --rm docker.io/busybox whoami exits with code 126 and fails with the "No routable interface for IPv4: IPv4 is disabled" error from the title.
First of all it's kind of a lie -- IPv4 is not disabled, there is both a lo and an eth0 with routing to a local network (in our case), just no default route. And also, that should not be an error, and even less so fatal.
This is with podman-5.0.0~dev-1.20240301003639672061.main.140.38546de7b.fc39.x86_64 and passt-0^20240220.g1e6f92b-1.fc39.x86_64 on Fedora 39.