Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use fifo for create / start instead of signal handling #886

Merged
merged 1 commit into from
Jun 13, 2016

Conversation

crosbymichael
Copy link
Member

@crosbymichael crosbymichael commented Jun 6, 2016

This removes the use of a signal handler and SIGCONT to signal the init
process to exec the users process.

Closes #871

@@ -168,6 +168,9 @@ func (l *LinuxFactory) Create(id string, config *configs.Config) (Container, err
if err := os.MkdirAll(containerRoot, 0700); err != nil {
return nil, newGenericError(err, SystemError)
}
if err := syscall.Mkfifo(filepath.Join(containerRoot, execFifoFilename), 0655); err != nil {
Copy link
Contributor

@wking wking Jun 6, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0655 is a pretty strange permission set. I'm not clear on the intended auth here, but the signal-based approach was limited by kill(2)'s CAP_KILL and/or UID requirements (although SIGCONT had a same-session special case). I expect you want 0600 here to match that (ish ;).

// use channel 'c' to ensure that the goroutine was scheduled so that the open is called
// before we pivot into the rootfs or else we won't have access to the path on the host.
go func() {
close(goStarted)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really help? Because scheduler still can reschedule on OpenFile call

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It gets us closer than without

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's like 30% less race chance, but still... It's really bad.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this code is definitely racy :( Maybe we can create fifo in rootfs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we just call the raw syscall or still an issue?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think putting it in the rootfs will help because of readonly containers and such. we would want to delete it after we are done and could'nt

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is no race free way to make this work, we could just use either SIGUSR1/SIGUSR2 signals that are reserved for user code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed it in a race free way

8767396

;)

@cyphar
Copy link
Member

cyphar commented Jun 7, 2016

Personally, I liked the signal approach to create-start. What is the reason for the change? While SIGCONT has issues with Go 1.6, I'm not sure why we aren't just switching to SIGUSR1 -- IMO signals are the right Unix interface here (not FIFOs).

While I understand @wking's concerns with PID namespaces, it feels like we still have several other problems that we can't solve this way (if we used sockets) -- why should runc start work if runc ps or runc exec won't (since the PID is hardcoded into the namespace_paths)?

// clean up the signal handler
signal.Stop(s)
close(s)
// wait for the fifo be be opened on the other side before
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo "be be"

@wking
Copy link
Contributor

wking commented Jun 7, 2016

On Mon, Jun 06, 2016 at 10:05:32PM -0700, Aleksa Sarai wrote:

Personally, I liked the signal approach to create-start. What is the
reason for the change? While SIGCONT has issues with Go 1.6, I'm
not sure why we aren't just switching to SIGUSR1 -- IMO signals
are the right Unix interface here (not FIFOs).

The problem with signals is that you're just launching them, and don't
know:

  • If/when they are recieved and acted upon, or
  • Which code will be running when the process recieves the signal 1.

I'm not entirely sure how much of that you fix with a FIFO, but the
ccon implementation [2](which uses a two-way Unix socket for this)
can (but doesn't actually implement yet) have the container process
send a “message received, I'll go exec that code” message back to the
‘start’ process. That way:

  • The ‘start’ process that actually triggers the exec knows it, and
    can exit success. Any parallel ‘start’ requests who are queued up
    waiting for the socket will know that they didn't trigger the exec,
    and can exit failure (as required by the spec 3).

  • You know you'll never racily send a signal that is processed by
    user-specified code, because the container process should be closing
    (and in ccon, the host-process will remove post-post start, once I
    have time to code that up, the filesystem portion of the
    socket/FIFO).

    “Attempting to start a container that does not exist MUST
    generate an error. Attempting to start an already started
    container MUST have no effect on the container and MUST generate
    an error.”

@crosbymichael crosbymichael force-pushed the start-pipe branch 5 times, most recently from 70cfec3 to ae91b58 Compare June 7, 2016 18:41
@crosbymichael
Copy link
Member Author

@LK4D4 @cyphar i updated and now its "race free' ;)

@wking
Copy link
Contributor

wking commented Jun 7, 2016

On Mon, Jun 06, 2016 at 10:05:32PM -0700, Aleksa Sarai wrote:

While I understand @wking's concerns with PID namespaces, it feels
like we still have several other problems that we can't solve this
way (if we used sockets) -- why should runc start work if runc ps or
runc exec won't (since the PID is hardcoded into the
namespace_paths)?

I don't see “we have out-of-spec operations that break under these
conditions” as a reason to avoid having more reliable support for
in-spec operations (although in this case, my main concern with
signals is the race which may lead to user-specified code handling a
signal intended for the runtime's idler code 1).

And while using a filesystem object (in this cases a named pipe) gets
you out of the “I'm in a different PID namespace” issue 2, it still
has “I'm in a different mount namespace and can't access runC's state
registry”. I don't see a way around that, although I do wish we had
better wording about this issue than the spec's current [3](e.g. [4]).

 Subject: Linux: Adding the PID namespace inode to state JSON
 Date: Wed, 30 Dec 2015 21:34:57 -0800
 Message-ID: <20151231053457.GG3680@odin.tremily.us>

@crosbymichael
Copy link
Member Author

ping,

I've been running stress tests and everything looks good with this change

@dqminh
Copy link
Contributor

dqminh commented Jun 13, 2016

LGTM

Approved with PullApprove

@LK4D4
Copy link
Contributor

LK4D4 commented Jun 13, 2016

@crosbymichael LGTM, but need rebase. Feel free to merge yourself after it.

@mrunalp
Copy link
Contributor

mrunalp commented Jun 13, 2016

I am reviewing/testing this.

@crosbymichael
Copy link
Member Author

rebased

}
return nil
}

func (c *linuxContainer) Exec() error {
path := filepath.Join(c.root, execFifoFilename)
f, err := os.OpenFile(path, os.O_RDONLY, 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If two processes end up opening this file at the same time, then one of them will be stuck waiting on the ReadAll below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True but we don't support multiple processes handling the container. I can try to protect the libcontainer call with a lock but its not going to help out of process calls.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Mon, Jun 13, 2016 at 11:12:02AM -0700, Michael Crosby wrote:

}
return nil

}

+func (c *linuxContainer) Exec() error {

  • path := filepath.Join(c.root, execFifoFilename)
  • f, err := os.OpenFile(path, os.O_RDONLY, 0)

True but we don't support multiple processes handling the container.
I can try to protect the libcontainer call with a lock but its not
going to help out of process calls.

I thought putting a byte in the pipe would help with this 1. Could
we make the reads unblocking? If you got the byte, success. If you
didn't get the byte, exit 1.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@crosbymichael I agree that with current usage this should be fine. We could add flocks in the future if necessary.

@@ -21,6 +20,7 @@ import (
type linuxStandardInit struct {
pipe io.ReadWriteCloser
parentPid int
rootfd int
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Can we call this statedirfd? rootfd seems to imply that it is a fd to the rootfs.

@cyphar
Copy link
Member

cyphar commented Jun 13, 2016

I still want to take some time to review this. Unfortunately I have exams until Friday. Can we wait on merging until next week?

@crosbymichael
Copy link
Member Author

@cyphar we need this to fix the racyness and 1.6 issues that we currently have with using signals

@cyphar
Copy link
Member

cyphar commented Jun 13, 2016

Alright. I'll take a look at this after merging, and make follow-up PRs if necessary.

@mrunalp
Copy link
Contributor

mrunalp commented Jun 13, 2016

LGTM

Approved with PullApprove

@crosbymichael
Copy link
Member Author

@mrunalp updating your nits and added a lock to that function

@mrunalp
Copy link
Contributor

mrunalp commented Jun 13, 2016

@crosbymichael Thanks! I'll wait for that before merging.

This removes the use of a signal handler and SIGCONT to signal the init
process to exec the users process.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
@crosbymichael
Copy link
Member Author

updated

@mrunalp
Copy link
Contributor

mrunalp commented Jun 13, 2016

We have some more references to root instead of state but some of it predates this PR so we can fixup in a follow on.

@mrunalp
Copy link
Contributor

mrunalp commented Jun 13, 2016

LGTM

Approved with PullApprove

@mrunalp
Copy link
Contributor

mrunalp commented Jun 13, 2016

@LK4D4 this needs another LGTM as the PR was updated :)

@LK4D4
Copy link
Contributor

LK4D4 commented Jun 13, 2016

@mrunalp CI is not ready too :)

@LK4D4
Copy link
Contributor

LK4D4 commented Jun 13, 2016

LGTM

Approved with PullApprove

@LK4D4 LK4D4 merged commit 85873d9 into opencontainers:master Jun 13, 2016
@crosbymichael crosbymichael deleted the start-pipe branch June 13, 2016 19:41
stefanberger pushed a commit to stefanberger/runc that referenced this pull request Sep 8, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants