[regression] Multipass corrupts packet filter rules (iptables) on older distros such as Bionic #2183
Comments
Hi @julius-ziegler, We made some fairly significant changes to detecting when Which distro are you using? Could you also please post the output of:
We'll go from there. Thanks! |
Sorry for the lengthy dumps, this is the output of the commands, in the "broken" state
I also made a dump in the "working" state. This is only the diff between "working" and "broken":
Hope this helps! |
Hey @julius-ziegler, Thanks for those dumps. They will be helpful. When you say the difference between "working" and "broken", do you mean "working" is Multipass v1.6.2 and "broken" is Multipass v1.7 with everything else being the same? |
No, I can restore "working" state by
or equally
|
I will experiment a bit with downgrading multipass, I did not think of this before. |
Oh, that's really weird because, if I'm reading the diff correctly, "working" has Multipass-related rules in the tables and I'm not sure how it would if the snap is disabled/removed. |
Yes, that is strange indeed. But I just double checked, this is exactly how the tables look at the moment, and the ping is working. |
Here are the logs as txt file, I like to look at them like this with something like meld: |
So the one difference I see is that in the "broken" state, we are first in the various sets of rules whereas when it's working, we are last. I'll see if maybe I missed a flag on the refactoring of this for |
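The ordering observation above can be checked mechanically. The following is a minimal Python sketch, assuming the dumps are in `iptables-save` text format (table headers like `*filter`, rules as `-A CHAIN ...`). The function name `first_marker_index` and the `marker` argument are hypothetical; `marker` is whatever substring identifies the rules of interest in your own dump.

```python
def first_marker_index(dump, marker):
    """For each (table, chain), report where the first rule containing
    `marker` sits, as (zero-based index, total rules in that chain)."""
    positions = {}  # (table, chain) -> index of first matching rule
    counts = {}     # (table, chain) -> number of rules seen so far
    table = None
    for raw in dump.splitlines():
        line = raw.strip()
        if line.startswith("*"):
            # Table header, e.g. "*nat" or "*filter".
            table = line[1:]
        elif line.startswith("-A "):
            # Rule line; the second token is the chain name.
            key = (table, line.split()[1])
            index = counts.get(key, 0)
            counts[key] = index + 1
            if marker in line and key not in positions:
                positions[key] = index
    return {key: (idx, counts[key]) for key, idx in positions.items()}
```

An index of 0 means the matching rule is first in its chain; an index equal to the total minus one means it is last. Running this on both the "working" and "broken" dumps makes the first-versus-last difference easy to see per chain.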
It is not trivial to go back to 1.6.2 for a regression test, right? Via the snap store it seems to not be possible. Thanks again! |
Another difference in the refactoring is that we now use Right, going back to the older version is not trivial at all and we'll just pass on that for now. I'll keep digging. |
Actually a |
@Saviq I am getting @townsend2010 I forgot to mention that when shelling into the multipass instance, I also do not get internet (can't ping the host e.g.). |
That might be because the refresh was too long ago, not sure. |
I've been trying to chase this down to no avail. I compared what we do when setting up the tables in 1.6.2 and 1.7, and our logic is still the same. I'm still stumped on how you have persistent Multipass entries in your I also installed Docker on my machine, issued the same command as you, and ping inside the Docker container works regardless of whether Multipass is installed. Now, let's try to drill down into differences between our hosts. Could you please provide the following?
Thanks! |
This problem hit us so hard because it occurred on a very busy server running a couple of services via Docker (there is also some Kubernetes and LXD stuff on it). The whole environment on the server is equally complicated. I will try to reproduce it myself on a "cleaner" server. |
@julius-ziegler I temporarily published 1.6.2 to the |
Thanks a ton @Saviq, I actually cannot reproduce the problem with 1.6.2. |
@julius-ziegler that's a great data point then. Can you please dump all of your tables again and compare to those when broken with 1.7.0? |
This is the log of the iptables commands in the "works-with-1.6.2" state. It is identical to the broken state, except for some (randomized/dynamically-chosen?) IP addresses |
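Comparing two dumps that differ only in dynamically chosen addresses is easier if the addresses are masked first. Below is a minimal Python sketch of that idea, assuming the dumps are plain text and only IPv4 addresses vary; the names `normalize` and `diff_ignoring_addresses` are hypothetical, and the regular expression is a loose match rather than a strict IPv4 validator.

```python
import difflib
import re

# Loose IPv4 pattern with an optional /prefix; not a strict validator.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}(?:/\d{1,2})?\b")


def normalize(dump):
    """Split a dump into lines and replace every IPv4 address by a placeholder."""
    return [IPV4_RE.sub("<ip>", line) for line in dump.splitlines()]


def diff_ignoring_addresses(working, broken):
    """Return unified-diff lines for two dumps after masking addresses.

    An empty list means the dumps differ only in IPv4 addresses, if at all.
    """
    return list(
        difflib.unified_diff(
            normalize(working), normalize(broken), "working", "broken", lineterm=""
        )
    )
```

Note that masking the prefix length as well hides differences such as `/24` versus `/16`; drop the optional group from the pattern if those matter.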
I'm starting to think this may be some incompatibility of |
For reference, the version of |
Here is some data on the affected machine:
Ubuntu 18.04.5 LTS
Docker version 19.03.11, build 42e35e61f3
Here are the configured networks. The LXD one might be something exotic. I will try to get my colleague involved in the discussion, who knows more about this. IP address for ens3: 192.168.179.4 |
Hi, I just read through this issue, which affected our company network (I am said colleague). While most things have already been said by @julius-ziegler, I wanted to add a few things: I was able to reproduce this issue in an empty Ubuntu Bionic VM. I installed Docker, then successfully ran:
I then installed multipass:
Now, the above ping no longer works. This rules out the explanation that the issue is caused by the host being otherwise cluttered with miscellaneous iptables rules. I tested this with:
@townsend2010 The reason for the rules being persistent was that we used EDIT:
followed by a reboot is enough to regain connectivity. |
Hey @julius-ziegler & @younishd, I'm able to reproduce this in a Bionic VM as described in #2183 (comment). An interesting data point is that this does work using a Focal VM, so it's looking like there may be some incompatibility with Bionic and the version of |
Hey @julius-ziegler and @younishd, I finally found the issue in our code and have proposed a fix. After CI successfully runs on the PR, you should be able to test this via Due to the nature of the bug, the iptables are messed up, so you'll need to reboot the machine after refreshing Multipass. Thanks! |
The symptom of this issue is that on older distros such as Bionic, installing version 1.7 of Multipass would cause any NAT'ed networks such as LXD, Docker, etc. to no longer have access outside of the host. This was due to a change in Multipass where it tries to detect if nftables or legacy iptables are in use. On kernels older than 5.2, nftables isn't properly supported, but the logic to detect the kernel version was broken, so |
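To illustrate the kind of kernel version check described above, here is a minimal Python sketch. It is not Multipass's actual code (Multipass is written in C++). It assumes the kernel release string starts with `major.minor`, as in `4.15.0-112-generic`; the names `MIN_NFT_KERNEL`, `parse_kernel_release`, and `kernel_supports_nftables` are hypothetical, and only the 5.2 threshold comes from the comment above.

```python
import re

# Threshold taken from the comment above; the constant name is hypothetical.
MIN_NFT_KERNEL = (5, 2)


def parse_kernel_release(release):
    """Extract (major, minor) from a release string such as '4.15.0-112-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % (release,))
    return int(match.group(1)), int(match.group(2))


def kernel_supports_nftables(release):
    """Return True when the kernel is at least MIN_NFT_KERNEL.

    The comparison is done on integer tuples. Comparing the raw strings
    would be a pitfall in general (not necessarily the one in Multipass),
    because '5.10' sorts before '5.2' as text.
    """
    return parse_kernel_release(release) >= MIN_NFT_KERNEL
```

On a live Linux host, the release string could be obtained with `platform.release()` from the standard library and passed to `kernel_supports_nftables`.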
@townsend2010 thanks, we will give it a whirl. Ironically, we are just in the process of moving the affected service to a Focal machine. |
Hi @julius-ziegler, Yeah, Focal will not be affected by this since the kernel can handle the |
2190: [firewall] Fix detection of nftables support in kernel r=Saviq a=townsend2010
Fixes #2183
2191: Fix compilation with Clang 12 on Linux. r=Saviq a=luis4a0
Clang is picky with lambda captures.
Co-authored-by: Chris Townsend <christopher.townsend@canonical.com>
Co-authored-by: Luis Peñaranda <luis.penaranda@canonical.com>
This is now released as version 1.7.1 in the Snap Store (revision 5309). |
Hello, after the update there is still no connectivity. Ping simply doesn't work, nor does anything else (e.g. apt update). The last working version was 1.6. @Saviq, can you restore the stable/1.6 channel? I tried snap refresh multipass --channel stable/1.6, but it still installs 1.7.1. |
We have an issue that seems clearly related, and the reproduction is even more fundamental (no name resolution involved; even pinging an IP address fails):
sudo docker run -it --rm alpine /bin/sh
/ # ping 1.1.1.1
works fine.
then on host system:
sudo snap install multipass
After that, the ping in the docker container stops working.
This error seems to have popped up relatively recently (ca. last week). We observed it first in a more complicated environment, where docker is used to run some CI, and multipass is run by snapcraft, but what I showed here is the minimal setup to reproduce it.
Originally posted by @julius-ziegler in #1435 (comment)