[remote] error creating cgroup [...]: dial unix /run/systemd/private: connect: EPERM #12362
Comments
This one is getting nasty. It took me three or four re-runs to get CI passing in my PR.
[sys] 278 podman shell completion test
|
And I think this is the same symptom, in unit tests. The EPERM is on
|
@Luap99 PTAL |
The failure has nothing to do with the shell completion test IMO. It fails at a simple pod create. I cannot help to debug this. |
I cannot reproduce this on a 1minutetip VM. I tried this and let it run for 4 hours, no failures:
[build podman from sources @ main]
# bin/podman system service -t 0 &
[...]
# while :;do env PODMAN=$(pwd)/bin/podman-remote hack/bats completion || break;done
I'm seeing a weird EPERM in a completely unrelated test (Static Build):
I've restarted that test, operating on the assumption that it's a flake. I don't know if it's related, and if it is, I can't imagine what it would mean. |
@giuseppe Can you look at this? It looks like podman cannot create the pod cgroup via systemd. |
Something is giving permission denied. The process running Podman is running as root and unconfined_t, and the socket is 777. |
I see the following AVC in the Cirrus logs:
|
Oh, this is interesting. This time the failure happens near the end of the test, not (as usual) in the beginning:
What I find interesting:
I know I shouldn't bash systemd, but I'm wondering, is it completely impossible that this is a systemd bug? |
The AVC indicates that podman inside of a container is attempting to connect to systemd.
SELinux is blocking this, so the connection never reaches systemd. The question is why podman is running within a locked-down container when this happens. |
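For anyone triaging AVCs like these by hand, here is a minimal Go sketch that pulls the `key=value` fields out of a single audit line so that `scontext`, `tcontext`, and `tclass` can be compared at a glance. It assumes the usual audit log layout of whitespace-separated `key=value` tokens; the helper name `avcFields` is hypothetical and is not part of podman or any audit library.

```go
package main

import "strings"

// avcFields is a hypothetical helper: it splits one audit AVC line on
// whitespace and collects every key=value token into a map. Tokens without
// an '=' (such as "avc:", "denied", "{", "}") are skipped. Surrounding
// double quotes on values are stripped. Values containing spaces are not
// handled; this is only meant for quick triage.
func avcFields(line string) map[string]string {
	fields := make(map[string]string)
	for _, tok := range strings.Fields(line) {
		i := strings.IndexByte(tok, '=')
		if i <= 0 {
			continue
		}
		fields[tok[:i]] = strings.Trim(tok[i+1:], `"`)
	}
	return fields
}
```

Calling `avcFields(line)["scontext"]` on a denial gives the source context, which is the field that shows whether the denied process was running under a container domain rather than the expected one.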
The common factor is the remote tests. Can the server change its own label, instead of the container's, by accident? |
Theoretically yes, but I do not see how. The likely candidate was right here: |
selinux.SetTaskLabel would change the label of the service process, but no one calls that. |
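One way to test the "service relabeled itself" theory would be to have the service log its own label at interesting points. The following is a minimal sketch, assuming a Linux host with SELinux enabled, where `/proc/self/attr/current` holds the calling process's context. The names `selinuxContext`, `parseContext`, and `currentContext` are hypothetical and are not taken from podman or the go-selinux library.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"strings"
)

// selinuxContext holds the colon-separated parts of an SELinux label.
type selinuxContext struct {
	User, Role, Type, Level string
}

// parseContext splits "user:role:type[:level]". The level may itself contain
// colons (for example "s0:c1,c2"), so only the first three separators count.
func parseContext(label string) (selinuxContext, error) {
	parts := strings.SplitN(strings.TrimSpace(label), ":", 4)
	if len(parts) < 3 {
		return selinuxContext{}, fmt.Errorf("malformed SELinux label %q", label)
	}
	ctx := selinuxContext{User: parts[0], Role: parts[1], Type: parts[2]}
	if len(parts) == 4 {
		ctx.Level = parts[3]
	}
	return ctx, nil
}

// currentContext reads the label of the calling process from procfs.
// On hosts without SELinux the read or the parse is expected to fail.
func currentContext() (selinuxContext, error) {
	raw, err := os.ReadFile("/proc/self/attr/current")
	if err != nil {
		return selinuxContext{}, err
	}
	// The kernel may append a trailing NUL byte to the label.
	return parseContext(string(bytes.TrimRight(raw, "\x00\n")))
}
```

If the `Type` returned by `currentContext()` ever changes from `unconfined_t` to a container type during a test run, that would point at the server process itself being relabeled.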
There are actually more AVCs.
|
There are provisions for cat'ing the server log, but I don't see them triggered anywhere. If you're still playing with your reproducer PR, you could try:
diff --git a/Makefile b/Makefile
index ceecda274..8cef891a4 100644
--- a/Makefile
+++ b/Makefile
@@ -586,6 +586,7 @@ remotesystem:
rc=$$?;\
kill %1;\
rm -f $$SOCK_FILE;\
+ echo "------------";echo "server log";cat $(PODMAN_SERVER_LOG);\
else \
echo "Skipping $@: 'timeout -v' unavailable'";\
fi;\ |
I think the issue is here: |
We just turned on remote checkpoint testing, and the test immediately before the failing test is a checkpoint test. |
This should fix the SELinux issue we are seeing with talking to /run/systemd/private.
Fixes: containers#12362
Also unset the XDG_RUNTIME_DIR if set, since we don't know when running as a service if this will cause issues.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
New podman-remote flake. Two triggers, both in the command completion system test:
Seen November 17 and 18 on different PRs, both times sys remote f34 root.