Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sd-notify: do not hang when NOTIFY_SOCKET is used with create #1807

Merged
merged 1 commit into from
Mar 13, 2020

Conversation

giuseppe
Copy link
Member

@giuseppe giuseppe commented May 25, 2018

if NOTIFY_SOCKET is used, do not block the main runc process waiting for events on the notify socket. Bind mount the parent directory of the notify socket, so that "start" can create the socket and it is still accessible from the container.

Signed-off-by: Giuseppe Scrivano gscrivan@redhat.com

@giuseppe
Copy link
Member Author

/cc @rhatdan @mrunalp

@dqminh
Copy link
Contributor

dqminh commented Jun 2, 2018

@giuseppe can you describe a minimal test case for this ? Would be nice to setup an integration test too.

With this, do we even need to pass notifySocket to signalHandler anymore ?

@giuseppe
Copy link
Member Author

giuseppe commented Jun 3, 2018

@dqminh currently if the NOTIFY_SOCKET environment variable is set for "runc create" then runc just hangs indefinitely.
signalHandler.forward() is still the caller for the new process. Would you prefer to move it somewhere else?

@rhatdan
Copy link
Contributor

rhatdan commented Jun 4, 2018

@giuseppe Couldn't you move the wait for ever functionality out of the run and into the start?

@giuseppe
Copy link
Member Author

giuseppe commented Jun 4, 2018

I cannot do that as the socket needs to be opened by create so that it can be bind mounted into the container. We would need to unlink it before creating it again with start (and the container would not see it)

@rhatdan
Copy link
Contributor

rhatdan commented Jun 20, 2018

@dqminh We are waiting for reply from you.
@avagin @caniszczyk @crosbymichael @cyphar @hqhq, @mrunalp @rjnagal @vmarmol PTAL

Would love to get podman support for sd-notify so we can run podman/runc with sd-notify containers.

@mrunalp
Copy link
Contributor

mrunalp commented Jul 9, 2018

The cost of the extra process falls on the users of the socket so I am fine with this as there is no simpler way.

@mrunalp
Copy link
Contributor

mrunalp commented Jul 9, 2018

LGTM

Approved with PullApprove

@rhatdan
Copy link
Contributor

rhatdan commented Jul 9, 2018

@rhatdan
Copy link
Contributor

rhatdan commented Jul 27, 2018

Could someone @dqminh @avagin @caniszczyk @crosbymichael @cyphar @hqhq, @rjnagal @vmarmol PTAL???

@dqminh
Copy link
Contributor

dqminh commented Aug 5, 2018

Hi, sorry for the delay !

Having to spawn one more process is unfortunate, but i cant think of a better way if we want to keep the same flow where the forwarding of notification is setup when we first create the container. (However in this case perhaps we should not overload init and instead give the process a separate operation ).

Another way i can think of is to have Create just handling the mounting of socket in host to socket in container and setup correct env variable (perhaps set MAINPID ?), but dont actually start handling readiness notification at all. Instead we defer the handling of notifications' forwarding to Start. So Start will be something like

  1. read state of the container if required ( path to notify socket on the host needs to be here somewhere)
  2. if notify socket path exists, then start listener
  3. notify the container process that it should start executing
  4. start forwarding messages

In this case, NOTIFY_SOCKET=something runc create will be completed immediately, but runc start of such container will also act as the sd_notify forwarder hence will stay alive until such forwarding completed. Basically a change in behavior, but we dont need to execute another long-live process without context attached.

Does that make sense ?

@giuseppe giuseppe force-pushed the notify-no-block branch 2 times, most recently from db96894 to cd9b959 Compare August 6, 2018 12:03
@giuseppe
Copy link
Member Author

giuseppe commented Aug 6, 2018

what stopped me from doing that in the first iteration was that we cannot create the socket once the container is already created (and the socket bind mounted), so that the listener had to be created from create.
I've done an additional change now, where we bind mount the parent directory of the socket, so we can create the socket later.
With this change we lose the ability to keep in the container the same file name specified in the host NOTIFY_SOCKET, in the container it is always /run/notify/notify.sock but I don't think it really matters for the applications.

signals.go Outdated
@@ -70,7 +70,7 @@ func (h *signalHandler) forward(process *libcontainer.Process, tty *tty, detach
h.notifySocket.run(pid1)
return 0, nil
} else {
go h.notifySocket.run(0)
h.notifySocket.run(os.Getpid())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: does it matter here that we set PID to runc process ? Cant we always set it to the container's process instead ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we keep runc running, the container's process won't be a direct child of systemd and systemd won't get the SIGCHLD once it terminates. Differently, it will get the SIGCHLD for the runc process and can link it to the service being terminated.

@dqminh
Copy link
Contributor

dqminh commented Aug 6, 2018

@giuseppe Thanks ! This looks great and much simpler than before.

@giuseppe
Copy link
Member Author

giuseppe commented Aug 9, 2018

does the new version look fine? @dqminh @mrunalp

@TomSweeneyRedHat
Copy link

TomSweeneyRedHat commented Aug 16, 2018

Re-ping @dqminh @mrunalp . This is part of the fix for a Podman issue containers/podman#746

@giuseppe
Copy link
Member Author

ping, can this be finally reviewed?

@crosbymichael
Copy link
Member

I don't know enough about this to be able to give a good review or even test. Someone other maintainers will have to as I wouldn't feel comfortable either way. Sorry.

@crawford
Copy link

crawford commented Oct 2, 2018

What does this need in order to be merged? I'd love to have sdnotify support and I'm happy to help out with this PR where I can.

@mrunalp
Copy link
Contributor

mrunalp commented Oct 2, 2018

LGTM

Approved with PullApprove

@mrunalp
Copy link
Contributor

mrunalp commented Oct 2, 2018

@dqminh Could you take a look?

@giuseppe
Copy link
Member Author

can someone please take a look?

@filbranden
Copy link
Contributor

@giuseppe In #2014 I'm adding support for integration tests in an environment where systemd is running and also a Travis-CI job to run those tests on PRs.

Do you think you might be able to add an integration test that would test the NOTIFY_SOCKET feature, perhaps on top of that PR?

I guess having a test case pass would make it easier to get this merged.

Cheers!
Filipe

@giuseppe
Copy link
Member Author

@filbranden thanks for the suggestion. I've looked into it but I don't see any way for using UNIX datagram sockets from a busybox container so that I could communicate back with the systemd socket

@filbranden
Copy link
Contributor

@giuseppe Yes busybox will probably not be enough... Can we consider a slightly larger minimal image that would have enough tools (like a somewhat more complete nc) to perform the test? Something based on Alpine perhaps?

Or, alternatively, can we consider copying a binary into the unpacked tree of busybox image before running the runc test on it?

Let me know if I can help in any way.

PR #2014 has been merged, so rebasing on top of latest head should already get you the tests including systemd and the Travis CI config to run them as part of every pending PR...

Cheers!
Filipe

@giuseppe
Copy link
Member Author

yes, using Alpine will help, then I can run the equivalent of:

$ systemd-run --unit foo --service-type=notify -p NotifyAccess=all podman run --rm alpine sh -c 'apk add nmap-ncat; echo READY=1 | ncat -u -U $NOTIFY_SOCKET ; sleep 100'
$ systemctl is-active foo

@giuseppe
Copy link
Member Author

The PR was opened almost 2 years ago.

Since there is no interest in fixing the issue, I am closing it.

@giuseppe giuseppe closed this Feb 17, 2020
@saschagrunert
Copy link
Contributor

The PR was opened almost 2 years ago.

Since there is no interest in fixing the issue, I am closing it.

Hey @giuseppe, now is the time where we're also interested in fixing this bug. May I ask you to re-open the issue that we can finally merge it? 😇

@rhatdan
Copy link
Contributor

rhatdan commented Mar 11, 2020

Yes @giuseppe Let's reopen and see if we can get @AkihiroSuda to review.

@AkihiroSuda AkihiroSuda reopened this Mar 11, 2020
@AkihiroSuda
Copy link
Member

Does crun already have this? Can we have an integration test?

notify_socket.go Outdated Show resolved Hide resolved
@giuseppe
Copy link
Member Author

giuseppe commented Mar 11, 2020

Does crun already have this? Can we have an integration test?

yes crun already supports it. As well as the runc distributed by Fedora/RHEL, where we are backporting this patch.

if NOTIFY_SOCKET is used, do not block the main runc process waiting
for events on the notify socket.  Bind mount the parent directory of
the notify socket, so that "start" can create the socket and it is
still accessible from the container.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
@AkihiroSuda
Copy link
Member

AkihiroSuda commented Mar 12, 2020

LGTM, but I'd like to see integration test suite if possible. Can be follow-up PR if difficult.

Approved with PullApprove

@mrunalp
Copy link
Contributor

mrunalp commented Mar 13, 2020

LGTM

Approved with PullApprove

@mrunalp mrunalp merged commit 167e33c into opencontainers:master Mar 13, 2020
Snaipe pushed a commit to aristanetworks/monsoon that referenced this pull request Apr 13, 2023
* Running the etcd container with NOTIFY_SOCKET mounted
(to use systemd Type=notify) causes podman to hang so
for now just use exec
* opencontainers/runc#1807
liuyangxy pushed a commit to fedora-riscv/runc that referenced this pull request May 13, 2023
runc and podman to work with sd_notify
liuyangxy pushed a commit to fedora-riscv/runc that referenced this pull request May 13, 2023
runc and podman to work with sd_notify
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.