Running containers inside of a container environment (with docker-compose.yml) using podman? #746

Closed
Conan-Kudo opened this issue May 10, 2018 · 91 comments

@Conan-Kudo

/kind feature

Description

I'd like to be able to run and test batches of containers defined with docker-compose.yml. As it is now, doing this with actual Docker inside an environment that runs through Docker gets rather risky and leaky in all kinds of bad ways.

For building containers, I'm starting to use buildah for this, but I don't quite yet have an answer for running them. The goal is to be able to build and test in a manner that is consistent with how people can do it on their local machines, and easily transition to OpenShift for production run environments.

Additional environment details (AWS, VirtualBox, physical, etc.):
GitLab CI runners with Docker container (of Fedora with buildah + podman)

@mheon
Member

mheon commented May 10, 2018

I'd love to get something similar to docker compose up and running (I don't know if we'd go for exact compatibility in this case, though). The ability to define a pod with a number of containers, sharing various namespaces and starting in a specific order as needed, is already baked into the backend, though work would be needed to make a good user interface to expose all of that.
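
As a rough illustration of "starting in a specific order", a compose-style dependency map can be turned into a start order with a topological sort. This is only a sketch: the service names and the `depends_on` mapping are made up for the example, and it does not reflect how libpod orders containers internally.

```python
from graphlib import TopologicalSorter

# Hypothetical compose-style services: name -> set of services it depends on.
depends_on = {
    "web": {"app"},
    "app": {"db", "cache"},
    "db": set(),
    "cache": set(),
}


def start_order(deps):
    """Return service names so that every dependency precedes its dependents."""
    return list(TopologicalSorter(deps).static_order())


order = start_order(depends_on)
print(order)
```

Services with no dependency between them (here `db` and `cache`) may come out in either order; only the dependency edges are guaranteed to be respected.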

@giuseppe
Member

Slightly related: we need to ensure that we correctly pass NOTIFY_SOCKET from systemd down to runC. Not only would we get startup ordering through systemd dependencies, but containers wouldn't need to poll for another service to be ready (and if they do, the poll will return immediately) if that service is configured to use NOTIFY_SOCKET.
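
For readers unfamiliar with the mechanism, this is roughly what a service does to report readiness over NOTIFY_SOCKET. It is a minimal sketch of the sd_notify datagram protocol, not code from podman or runc; the function name `sd_notify` is just chosen here to mirror the libsystemd call.

```python
import os
import socket


def sd_notify(state: str = "READY=1") -> bool:
    """Send a state string to the socket named by NOTIFY_SOCKET.

    Returns False when NOTIFY_SOCKET is unset, so the caller can fall back
    to whatever readiness mechanism it used before.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    # A leading "@" denotes a Linux abstract-namespace socket.
    if path.startswith("@"):
        addr = b"\0" + path[1:].encode()
    else:
        addr = path
    # The notify socket is a datagram socket, so no connect/accept is needed.
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode(), addr)
    return True
```

For this to work inside a container, the runtime has to make the socket reachable there and pass the variable through, which is the part being discussed in this thread.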

@rhatdan
Member

rhatdan commented May 11, 2018

@giuseppe I worked on that a while ago. But not sure if I got it all working. Would also like to get socket activation working properly. Both would be cool features that don't work in Docker.

@rhatdan
Member

rhatdan commented May 14, 2018

@edsantiago @TomSweeneyRedHat Could you guys attempt to set up a test to make sure NOTIFY_SOCKET and SD_NOTIFY work with podman?

@edsantiago
Member

@rhatdan in progress ... but infuriatingly nonworking. And according to my notes, reminiscent of my frustrations in October 2016. It Just Ain't Working. The simplest reproducer I can come up with is:

# NOTIFY_SOCKET=/run/systemd/notify podman run --rm fedora date

This just hangs. No error, also no output. It also hangs in such a way that podman ps also hangs -- maybe the locking issue in #658? /bin/ps reports a long-running, presumably-also-hung runc create process:

# ps auxww --forest |grep -5 runc
...
root     20545  0.0  0.0  86084  1808 ?        Ssl  20:50   0:00 /usr/libexec/crio/conmon -c ced645cf9da7b61613126a938300b796c47ec76ae275dda4d76719af101949b7 -u ced645cf9da7b61613126a938300b796c47ec76ae275dda4d76719af101949b7 -r /usr/bin/runc -b /var/lib/containers/storage/overlay-containers/ced645cf9da7b61613126a938300b796c47ec76ae275dda4d76719af101949b7/userdata -p /var/run/containers/storage/overlay-containers/ced645cf9da7b61613126a938300b796c47ec76ae275dda4d76719af101949b7/userdata/pidfile -l /var/lib/containers/storage/overlay-containers/ced645cf9da7b61613126a938300b796c47ec76ae275dda4d76719af101949b7/userdata/ctr.log --exit-dir /var/run/libpod/exits --socket-dir-path /var/run/libpod/socket
root     20546  0.0  0.2 392772 10656 ?        Sl   20:50   0:00  \_ /usr/bin/runc create --bundle /var/lib/containers/storage/overlay-containers/ced645cf9da7b61613126a938300b796c47ec76ae275dda4d76719af101949b7/userdata --pid-file /var/run/containers/storage/overlay-containers/ced645cf9da7b61613126a938300b796c47ec76ae275dda4d76719af101949b7/userdata/pidfile ced645cf9da7b61613126a938300b796c47ec76ae275dda4d76719af101949b7
root     20555  0.1  0.2 316848  8836 ?        Ssl  20:50   0:00      \_ /usr/bin/runc init

The init process is not killable, even with -9. The create process can be killed, but only with -9. Attempting to podman rm the container while runc is running results in:

failed to delete container ced645cf9da7b61613126a938300b796c47ec76ae275dda4d76719af101949b7: cgroups: unable to remove paths /sys/fs/cgroup/systemd/libpod_parent/libpod-ced645cf9da7b61613126a938300b796c47ec76ae275dda4d76719af101949b7, /sys/fs/cgroup/freezer/libpod_parent/libpod-ced645cf9da7b61613126a938300b796c47ec76ae275dda4d76719af101949b7, /sys/fs/cgroup/pids/libpod_parent/libpod-ced645cf9da7b61613126a938300b796c47ec76ae275dda4d76719af101949b7, /sys/fs/cgroup/net_cls/libpod_parent/libpod-ced645cf9da7b61613126a938300b796c47ec76ae275dda4d76719af101949b7, /sys/fs/cgroup/net_prio/libpod_parent/libpod-ced645cf9da7b61613126a938300b796c47ec76ae275dda4d76719af101949b7, /sys/fs/cgroup/perf_event/libpod_parent/libpod-ced645cf9da7b61613126a938300b796c47ec76ae275dda4d76719af101949b7, /sys/fs/cgroup/cpuset/libpod_parent/libpod-ced645cf9da7b61613126a938300b796c47ec76ae275dda4d76719af101949b7, /sys/fs/cgroup/cpu/libpod_parent/libpod-ced645cf9da7b61613126a938300b796c47ec76ae275dda4d76719af101949b7, /sys/fs/cgroup/cpuacct/libpod_parent/libpod-ced645cf9da7b61613126a938300b796c47ec76ae275dda4d76719af101949b7, /sys/fs/cgroup/memory/libpod_parent/libpod-ced645cf9da7b61613126a938300b796c47ec76ae275dda4d76719af101949b7, /sys/fs/cgroup/blkio/libpod_parent/libpod-ced645cf9da7b61613126a938300b796c47ec76ae275dda4d76719af101949b7, /sys/fs/cgroup/devices/libpod_parent/libpod-ced645cf9da7b61613126a938300b796c47ec76ae275dda4d76719af101949b7, /sys/fs/cgroup/hugetlb/libpod_parent/libpod-ced645cf9da7b61613126a938300b796c47ec76ae275dda4d76719af101949b7

Same results when running from a systemd init file (without the NOTIFY_SOCKET= declaration, since systemd presumably sets that). Using NOTIFY_SOCKET=/nonexistentfile works perfectly fine.

Am close to giving up for today. This has taken a good chunk of time.

@mheon
Member

mheon commented May 14, 2018

@edsantiago Separate locking issue from #658 - this is us holding the container lock until runc has finished executing, to try and order container operations. The root cause here appears to be the runc init hang.

@rhatdan
Member

rhatdan commented May 15, 2018

This seems to be close, although I am getting Connection Refused.

#!/bin/sh
export NOTIFY_SOCKET=/run/podman_notify.sock
$(rm -f ${NOTIFY_SOCKET}; nc -U ${NOTIFY_SOCKET} -l) &
sleep 1
podman run -v /usr/bin/nc:/usr/bin/nc fedora /usr/bin/nc -U ${NOTIFY_SOCKET}<<EOF
echo ready
EOF

# sh -x notify_sock.sh
+ export NOTIFY_SOCKET=/run/podman_notify.sock
+ NOTIFY_SOCKET=/run/podman_notify.sock
+ sleep 1
++ rm -f /run/podman_notify.sock
++ nc -U /run/podman_notify.sock -l
+ podman run fedora mount
+ grep podman
tmpfs on /run/podman_notify.sock type tmpfs (rw,nosuid,nodev,seclabel,mode=755)
+ podman run -v /usr/bin/nc:/usr/bin/nc fedora /usr/bin/nc -U /run/podman_notify.sock
Ncat: Connection refused.

The passing of the socket is there and the mounting of the socket is there. I don't know why ncat is refusing the connection.

@edsantiago
Member

I think systemd uses a DGRAM socket, so you need --udp. In my runs this morning, this hangs consistently:

window1# rm -f /run/mysock;ncat -l -U --udp /run/mysock

window2# NOTIFY_SOCKET=/run/mysock podman run --rm fedora date     (hangs as described above)

It does not hang without --udp, but my sdnotify test container fails with EPROTOTYPE (Protocol wrong type for socket).
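
The socket-type mismatch can be reproduced without podman or ncat. The sketch below binds a Unix socket of a given type and then sends `READY=1` to it as a datagram, the way an sd_notify client would. The helper name `try_notify` and the socket filename are made up for the example; on Linux I would expect the stream case to report EPROTOTYPE, matching the error quoted above, but treat that as an expectation rather than a guarantee for other platforms.

```python
import errno
import os
import socket
import tempfile


def try_notify(listener_type: int) -> str:
    """Bind a listener of the given type, send READY=1 as a datagram,
    and return either the received payload or the errno name."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "mysock")
        with socket.socket(socket.AF_UNIX, listener_type) as listener, \
                socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as client:
            listener.bind(path)
            if listener_type == socket.SOCK_STREAM:
                listener.listen(1)
            try:
                client.sendto(b"READY=1", path)
            except OSError as exc:
                return errno.errorcode.get(exc.errno, str(exc.errno))
            return listener.recv(64).decode()


print("DGRAM listener: ", try_notify(socket.SOCK_DGRAM))
print("STREAM listener:", try_notify(socket.SOCK_STREAM))
```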

@rhatdan
Member

rhatdan commented May 15, 2018

Date is not doing anything with the socket, so this looks like the integration between podman/runc and the socket file is causing issues. I will see if I can repeat the failure on my machine.

@edsantiago
Member

Yes - my use of date was simply to try the simplest container that would not be mucking with sd_notify. ISTM that runc is the bottleneck

@rhatdan
Member

rhatdan commented May 24, 2018

Seems to be working for me now

# NOTIFY_SOCKET=/run/mysock podman run --rm fedora date  
Thu May 24 11:43:32 UTC 2018

With podman in master.

@rhatdan
Member

rhatdan commented May 24, 2018

Never mind, it is hanging.

@rhatdan
Member

rhatdan commented May 24, 2018

This is where runc is hanging.
openat(AT_FDCWD, "/proc/self/fd/4", O_WRONLY|O_CLOEXEC

@edsantiago
Member

I'm somewhat leaning toward it being a runc issue, not podman, but have no actual evidence to base that on.

@rhatdan
Member

rhatdan commented May 24, 2018

I agree, I am now thinking this is an issue with runc. I need to set up runc with the NOTIFY_SOCKET to see if it hangs also.

@rhatdan
Member

rhatdan commented May 24, 2018

It looks like --udp is the key flag that is causing the issue. If I remove --udp, runc finishes right away.

@edsantiago
Member

Yes, but as best I can tell --udp recreates the way systemd creates the /run/systemd/notify socket

@edsantiago
Member

# lsof /run/systemd/notify
COMMAND PID USER   FD   TYPE             DEVICE SIZE/OFF NODE NAME
systemd   1 root   30u  unix 0x000000006092aaba      0t0 1412 /run/systemd/notify type=DGRAM
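
Where lsof is not installed, the same information can be read from `/proc/net/unix` on Linux. The helper below is a small sketch (the function name `unix_socket_types` is invented for this example); it maps the kernel's numeric socket type to a name for every bound socket with a given path.

```python
# Numeric socket types as they appear in the "Type" column.
SOCK_TYPES = {1: "STREAM", 2: "DGRAM", 5: "SEQPACKET"}


def unix_socket_types(path, table="/proc/net/unix"):
    """Return the socket types of every bound Unix socket whose path matches."""
    found = []
    with open(table) as fh:
        next(fh, None)  # skip the header line
        for line in fh:
            fields = line.split()
            # Columns: Num RefCount Protocol Flags Type St Inode [Path]
            if len(fields) >= 8 and fields[7] == path:
                found.append(SOCK_TYPES.get(int(fields[4], 16), fields[4]))
    return found


if __name__ == "__main__":
    try:
        print(unix_socket_types("/run/systemd/notify"))
    except OSError as exc:  # e.g. not Linux, or /proc not mounted
        print("cannot read socket table:", exc)
```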

@rhatdan
Member

rhatdan commented May 24, 2018

So is this the equivalent of doing what systemd does for socket activation?

@edsantiago
Member

That's what I think, and it's what I'm trying to do, and the behavior is consistent... but I don't really know.

@rhatdan
Member

rhatdan commented May 24, 2018

I sent an email off to systemd-maint/lennart asking them what is the best way to implement this.

@rhatdan
Member

rhatdan commented May 24, 2018

lsof /run/systemd/notify /run/mysock
lsof: WARNING: can't stat() fuse.gvfsd-fuse file system /run/user/3267/gvfs
Output information may be incomplete.
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
systemd 1 root 33u unix 0x000000009b008e4d 0t0 13639 /run/systemd/notify type=DGRAM
ncat 13931 root 3u unix 0x00000000cddfc00c 0t0 362330 /run/mysock type=DGRAM

But it looks like you are correct. I would figure runc will hang if done with socket activation.

@giuseppe Did you ever run systemd containers using sd_notify?

@giuseppe
Member

I think this happens because of the interaction of conmon and runc. We have system containers using NOTIFY_SOCKET, but system containers don't pass in additional file descriptors and this makes the difference.

In any case, the issue is for sure in runc:

$ NOTIFY_SOCKET=/run/systemd/notify sudo -E podman  --runtime /usr/local/bin/crun run --rm alpine date  
Thu May 24 14:19:42 UTC 2018

@edsantiago
Member

I will be gone for 2 weeks and unable to play here during that time. Once the runc hang gets resolved, Shishir has a great, tiny, simple way to test sdnotify in a container: https://github.com/shishir-a412ed/runc-notify

@rhatdan
Member

rhatdan commented May 24, 2018

Actually I think this is pure runc.
NOTIFY_SOCKET=/run/systemd/notify /usr/bin/runc create --bundle /var/lib/containers/storage/overlay-containers/f4a79a66f2ece89aae2038017596cb4b3928bebae05095e598a8695506962809/userdata --pid-file /var/run/containers/storage/overlay-containers/f4a79a66f2ece89aae2038017596cb4b3928bebae05095e598a8695506962809/userdata/pidfile f4a79a66f2ece89aae2038017596cb4b3928bebae05095e598a8695506962809

Hangs. No conmon involved.

@giuseppe
Member

this is probably a regression, I remember NOTIFY_SOCKET working well with runc, I'll take a look

@giuseppe
Member

I've opened a PR for runc: opencontainers/runc#1807

@rhatdan
Member

rhatdan commented Jun 7, 2018

@mrunalp suggests that we should go back to runc run, rather than using runc create/start. He believes this is the best way to run runc, and that the PR opened by @giuseppe is not likely to get merged.

@mheon @baude Why did we switch to runc create/start? Can we go back?
@mrunalp Any comments?

giuseppe added a commit to giuseppe/libpod that referenced this issue Nov 28, 2018
with opencontainers/runc#1807 we moved the
systemd notify initialization from "create" to "start", so that the
OCI runtime doesn't hang while waiting on reading from the notify
socket.  This means we also need to set the correct NOTIFY_SOCKET when
start'ing the container.

Closes: containers#746

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
@Conan-Kudo
Author

Wait, what? Why did this get closed? I don't see anything relating to having functionality like this implemented in podman git master...

@mheon mheon reopened this Nov 28, 2018
@mheon
Member

mheon commented Nov 28, 2018

At some point we started discussing issues related to sdnotify. Those are fixed. The core of the issue is not.

@baude is working on some things that do touch the scope of the original issue, but I don't think they're exactly what you're looking for

@Conan-Kudo
Author

@mheon If it's something like OpenShift ImageStream+BuildConfig+DeployConfig yaml with podman, that works too.

@mheon
Member

mheon commented Nov 28, 2018

The opposite, actually - Kube (and maybe Openshift) YAML from Podman containers

@rhatdan
Member

rhatdan commented Nov 28, 2018

@Conan-Kudo Not really sure what that means.

We want to experiment with using podman commands that people are used to to generate the environment. Our goal is not to force a user to edit a configuration/yaml/json... file to build an application containing multiple pods/containers working together, using podman. Then use podman to extract out of the libpod configuration, kubernetes yaml files to be able to easily launch the same environment into OpenShift/Kubernetes.

Simplest would be to launch a container with podman and then extract out a yaml file to describe how to run the same container in kubernetes.
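
To make the idea concrete, here is a hand-written sketch of the kind of description that would be extracted for a single container. It is not the output of any podman command; the helper `pod_manifest` and the example container are hypothetical, and only a few basic Pod fields are filled in.

```python
import json


def pod_manifest(name, image, command):
    """Build a minimal Kubernetes Pod description for one container."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {"name": name, "image": image, "command": list(command)}
            ]
        },
    }


# Hypothetical container, as if it had been started with
# `podman run --name demo fedora sleep 600`.
manifest = pod_manifest("demo", "fedora", ["sleep", "600"])
# Emitted as JSON here to avoid a third-party YAML dependency.
print(json.dumps(manifest, indent=2))
```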

@Conan-Kudo
Author

@rhatdan The idea would be that you'd be able to make a minimal YAML/JSON definition in the OpenShift style to spin up groups of containers with Podman that also happened to just easily import right into OpenShift, so the mechanical process of starting an application as a container would work the same way for single node (Podman) and multi-node (OpenShift).

@mheon
Member

mheon commented Nov 28, 2018

I would really like to get that case covered - I think our current Kubernetes/Openshift JSON generation misses the original point of Compose (single-node orchestration)

@rhatdan
Member

rhatdan commented Nov 28, 2018

Well one case @baude is looking at is Replay which would take the generated kubernetes yaml and recreate the containers/pods with podman. I was actually asked about that last night after mentioning it in a talk I was giving to the NYLUG.

My fear, though, is trying to support all Kubernetes YAML config, which could become a huge time sink.

@baude
Member

baude commented Nov 28, 2018

I wouldn't worry about supporting all the kube yaml config. Most of it will be too tough to process and make assumptions in podman about. Again, if we take this approach of only kube, we will have what I would refer to as a "lite" approach to this.

@Gert-dev

Gert-dev commented Feb 5, 2019

Perhaps slightly unrelated, but as Podman provides a basic "Docker-compatible CLI" via the docker script (which seems to just be a script to place in /usr/bin/docker that in turn executes /usr/bin/podman), how well does Docker Compose work with this script? Does it not work at all due to reliance on specifics of Docker itself? Does it work, but require minor adjustments?

@jess-sch

jess-sch commented Feb 5, 2019 via email

@rhatdan
Member

rhatdan commented Feb 6, 2019

We don't support Docker-compose, so I am not sure how much of it is talking to the docker socket versus executing the command. Podman is a replacement for the Docker CLI, not the Docker engine API.
We do have podman varlink for a remote API, but it does not follow the Docker API.

@SergeyBear

SergeyBear commented Feb 25, 2019

I also tried to find an alternative to docker-compose for podman, because creating/updating pods using bash ends up with heavy logic, and Ansible roles execute much slower.

'podman play' is a very promising feature, but I think it might mean podman will have to adapt to k8s API changes all the time.

Docker-compose is a separate tool, so maybe a separate podman-compose/podman-kube/podman-<your_name_here> wrapper over podman would do a better job.

@SergeyBear

It's probably worth adding a notice to the README that docker-compose functionality is out of scope for podman. People sometimes think that docker-compose is part of the docker CLI and expect to find it in podman.

vrothberg added a commit to vrothberg/libpod that referenced this issue Feb 25, 2019
Also mention that Podman does/will not support `docker-compose`.

Fixes: containers#746 (comment)
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
@vrothberg
Member

It's probably worth adding a notice to the README that docker-compose functionality is out of scope for podman. People sometimes think that docker-compose is part of the docker CLI and expect to find it in podman.

Thanks for the suggestion, @SergeyBear. I've opened #2428 to address it.

@SergeyBear

SergeyBear commented Feb 26, 2019

IMHO, adding a k8s generator to podman may lead to problems like those k8s had with runtime and storage drivers, when developers had to add and support a lot of technologies, which eventually ended with the creation of CRI and CSI. Some will need k8s support, others swarm, and so on...
Maybe it is better to present 'podman play' as a 'compose alternative' for local deployment with k8s-API-compliant syntax? Because eventually people will want ReplicaSets, Services and so on to work on podman and will start to create issues...

@rhatdan
Member

rhatdan commented Feb 26, 2019

@SergeyBear Sure. The goal was to make it easy to transition from a traditional container environment to a Kubernetes environment. But once we did that we needed a way to allow users to transition back, which is why we added play. But we did not want to lock ourselves into just Kubernetes, so I could definitely see us supporting other formats. That is why the command is podman generate kube; if some other format took off we might support that also.

@SergeyBear

SergeyBear commented Feb 26, 2019

I just found a great post about podman play and generate that describes the use cases in detail.
Looks like it can indeed be an alternative to docker-compose 👍

@dustymabe
Contributor

I found this project called podman compose. Haven't tried it but maybe what you are looking for: https://github.com/muayyad-alsadi/podman-compose

@rhatdan
Member

rhatdan commented Aug 22, 2019

We are actually working to move this under the github.com/containers umbrella.

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 23, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 23, 2023