Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v0.58] backport my networking fixes #1954

Merged
merged 7 commits into from
Apr 15, 2024
Merged

Conversation

Luap99
Copy link
Member

@Luap99 Luap99 commented Apr 12, 2024

Backport #1946 and #1945

When the netns program fails to configure the netns or we fail for any
other reason during the setup we must make sure to remove the netns
mount again. Without it the next command sees the existing mount and
thinks the netns was setup correctly but than later fails during the
custom resolv.conf mount because the resolv.conf source file was never
created.

For future we should consider adding checks due ensure pasta/slirp4netns
is still running when we access the netns to make it more fault
tolerant.

The reason this is a common problem is that on boot pasta can likely
fail if it was started before the networking was fully configured (i.e.
no default route).

Fixes containers/podman#22168

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Add a function to read a pidfile, this helps to avoid some duplication.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
We have little to no control over what happens tot he slirp4netns/pasta
process after we started it. It could crash or get killed then we end up
in state where we think networking works when it doesn't.

To fix this each time we access the rootless-netns we should also make
to program is still running, if it is not try to recover by starting it
again. This ensures that we are much more fault tolerant.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The IsRootless() check is dangerous in a sense that it may not do what
you think it does. It also returns true even when podman is run as root
and not in the podman userns but as part of a different userns. Could be
a other container manager that configured the userns.

This results in us doing the rootless-netns logic even when we should
not need to. To fix this we now check for the
_CONTAINERS_USERNS_CONFIGURED env var to make sure we are not
re-exe'ed. This is what we actually care about.

This is a regression compared to podman 4.X, because the code was moved
into c/common the IsRootless() check was changed to the c/storage
version which has the userns special case.

Fixes containers/podman#22218

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
While this is a none issue normally because we run in a unprivileged
userns we cannot modify the host mounts in any way. However in case
where the rootless netns logic might be executed from a non userns
context we might change the mount tree if the mounts are shared which is
the systemd default. While this should never happen let's make sure we
never mess up the system by accident in case there are more bugs and
explicitly make our mount tree private.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
This here just logs unnecessary errors in case there is an error during
the Run() call (podman unshare --rootless-netns). runInner() will
already call cleanup on errors if it created a new netns so we only need
to cleanup when there is no error.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
This is good to prevent any leaks but more important here there is a
bug because we cache the last assigned ip. However when a network is
removed the recreated with a different LeaseRange that ip might be very
well outside the expected range and the logic seems to handle this
correctly. I could fix it there but deleting the full bucket seems best
as it avoid other issues and leaking the bucket forever.

Fixes containers/podman#22034

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Copy link
Contributor

openshift-ci bot commented Apr 12, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Luap99

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@mheon
Copy link
Member

mheon commented Apr 12, 2024

LGTM

@rhatdan
Copy link
Member

rhatdan commented Apr 13, 2024

/lgtm

@openshift-ci openshift-ci bot added the lgtm label Apr 13, 2024
@Luap99 Luap99 merged commit 173bc93 into containers:v0.58 Apr 15, 2024
5 checks passed
@Luap99 Luap99 deleted the 0.58-backports branch April 15, 2024 13:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants