Use fifo for create / start instead of signal handling #886
@@ -168,6 +168,9 @@ func (l *LinuxFactory) Create(id string, config *configs.Config) (Container, err
if err := os.MkdirAll(containerRoot, 0700); err != nil {
	return nil, newGenericError(err, SystemError)
}
if err := syscall.Mkfifo(filepath.Join(containerRoot, execFifoFilename), 0655); err != nil {
0655 is a pretty strange permission set. I'm not clear on the intended auth here, but the signal-based approach was limited by kill(2)'s CAP_KILL and/or UID requirements (although SIGCONT had a same-session special case). I expect you want 0600 here to match that (ish ;).
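To illustrate the suggestion above, here is a minimal sketch of creating the FIFO with owner-only permissions. The value of execFifoFilename, the helper names, and the temporary directory are assumptions made for the example; this is not the code from the PR.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// execFifoFilename mirrors the identifier in the diff; the value is assumed.
const execFifoFilename = "exec.fifo"

// createExecFifo (hypothetical helper) creates an owner-only named pipe in
// dir and returns the resulting file mode. The kernel still applies the
// process umask, which can only remove bits from 0600.
func createExecFifo(dir string) (os.FileMode, error) {
	path := filepath.Join(dir, execFifoFilename)
	if err := syscall.Mkfifo(path, 0600); err != nil {
		return 0, err
	}
	fi, err := os.Stat(path)
	if err != nil {
		return 0, err
	}
	return fi.Mode(), nil
}

// fifoDemo creates the FIFO in a throwaway directory and reports whether the
// result is a named pipe, plus its permission bits.
func fifoDemo() (bool, os.FileMode, error) {
	dir, err := os.MkdirTemp("", "fifo-demo")
	if err != nil {
		return false, 0, err
	}
	defer os.RemoveAll(dir)
	mode, err := createExecFifo(dir)
	if err != nil {
		return false, 0, err
	}
	return mode&os.ModeNamedPipe != 0, mode.Perm(), nil
}

func main() {
	isFifo, perm, err := fifoDemo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("fifo=%v perm=%o\n", isFifo, perm)
}
```

With 0600, only the owning UID (or a process that bypasses file permission checks) can open the pipe, which is roughly the restriction the signal-based approach had.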
// use channel 'c' to ensure that the goroutine was scheduled so that the open is called
// before we pivot into the rootfs or else we won't have access to the path on the host.
go func() {
	close(goStarted)
Does this really help? The scheduler can still reschedule on the OpenFile call.
It gets us closer than without
Yeah, it's like 30% less race chance, but still... It's really bad.
So, this code is definitely racy :( Maybe we can create the fifo in the rootfs?
Could we just call the raw syscall, or is that still an issue?
I don't think putting it in the rootfs will help because of read-only containers and such. We would want to delete it after we are done and couldn't.
If there is no race free way to make this work, we could just use either SIGUSR1/SIGUSR2 signals that are reserved for user code.
Personally, I liked the signal approach to create-start. What is the reason for the change? While I understand @wking's concerns with PID namespaces, it feels like we still have several other problems that we can't solve this way (if we used sockets) -- why should
// clean up the signal handler
signal.Stop(s)
close(s)
// wait for the fifo be be opened on the other side before
typo "be be"
On Mon, Jun 06, 2016 at 10:05:32PM -0700, Aleksa Sarai wrote:
The problem with signals is that you're just launching them, and don't
I'm not entirely sure how much of that you fix with a FIFO, but the
70cfec3 to ae91b58
On Mon, Jun 06, 2016 at 10:05:32PM -0700, Aleksa Sarai wrote:
I don't see “we have out-of-spec operations that break under these
And while using a filesystem object (in this case a named pipe) gets
ping, I've been running stress tests and everything looks good with this change
@crosbymichael LGTM, but need rebase. Feel free to merge yourself after it.
I am reviewing/testing this.
d038207 to 483a68a
rebased
	}
	return nil
}

func (c *linuxContainer) Exec() error {
	path := filepath.Join(c.root, execFifoFilename)
	f, err := os.OpenFile(path, os.O_RDONLY, 0)
If two processes end up opening this file at the same time, then one of them will be stuck waiting on the ReadAll below.
True, but we don't support multiple processes handling the container. I can try to protect the libcontainer call with a lock but it's not going to help out-of-process calls.
On Mon, Jun 13, 2016 at 11:12:02AM -0700, Michael Crosby wrote:
}
return nil
}
+func (c *linuxContainer) Exec() error {
+ path := filepath.Join(c.root, execFifoFilename)
+ f, err := os.OpenFile(path, os.O_RDONLY, 0)
True, but we don't support multiple processes handling the container.
I can try to protect the libcontainer call with a lock but it's not
going to help out-of-process calls.

I thought putting a byte in the pipe would help with this [1]. Could
we make the reads non-blocking? If you got the byte, success. If you
didn't get the byte, exit 1.
@crosbymichael I agree that with current usage this should be fine. We could add flocks in the future if necessary.
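As a rough illustration of the handshake discussed in this thread, the sketch below has one side write a single byte into the FIFO while the other side reads and treats an empty read as failure. It uses a blocking read guarded by an in-process mutex, not the non-blocking variant suggested above, and the mutex does nothing for out-of-process callers. All function names, the FIFO filename, and the temporary directory are assumptions for the example.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"sync"
	"syscall"
)

// execMu serializes in-process callers only; it cannot protect against
// a second process opening the same FIFO.
var execMu sync.Mutex

// signalStart (hypothetical) plays the role of the waiting init process:
// the open blocks until a reader appears, then one byte is written.
func signalStart(path string) error {
	f, err := os.OpenFile(path, os.O_WRONLY, 0)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = f.Write([]byte("0"))
	return err
}

// waitForStart (hypothetical) plays the role of Exec: it opens the FIFO for
// reading and fails if no byte arrives before the writer closes its end.
func waitForStart(path string) error {
	execMu.Lock()
	defer execMu.Unlock()
	f, err := os.OpenFile(path, os.O_RDONLY, 0)
	if err != nil {
		return err
	}
	defer f.Close()
	data, err := io.ReadAll(f)
	if err != nil {
		return err
	}
	if len(data) == 0 {
		return errors.New("no start byte received")
	}
	return nil
}

// handshakeDemo runs both sides against a FIFO in a temporary directory.
func handshakeDemo() error {
	dir, err := os.MkdirTemp("", "handshake-demo")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "exec.fifo")
	if err := syscall.Mkfifo(path, 0600); err != nil {
		return err
	}
	writerErr := make(chan error, 1)
	go func() { writerErr <- signalStart(path) }()
	if err := waitForStart(path); err != nil {
		return err
	}
	return <-writerErr
}

func main() {
	if err := handshakeDemo(); err != nil {
		fmt.Fprintln(os.Stderr, "handshake failed:", err)
		os.Exit(1)
	}
	fmt.Println("handshake completed")
}
```

If a second reader raced with the first, only one of them would receive the byte; the byte check turns the loser's read into an explicit error once the writer closes, instead of a silent success.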
@@ -21,6 +20,7 @@ import (
type linuxStandardInit struct {
	pipe io.ReadWriteCloser
	parentPid int
	rootfd int
nit: Can we call this statedirfd? rootfd seems to imply that it is a fd to the rootfs.
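For context on why a directory fd is useful here, the following is a minimal sketch of opening a file relative to a previously opened state directory, so the open no longer depends on the host path being reachable. It is an illustration only, not the PR's implementation: the helper names and the FIFO filename are assumptions, and a directory rename stands in for the pivot into the rootfs.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// openStateDir (hypothetical) opens the state directory while its host path
// is still reachable and returns the raw directory fd.
func openStateDir(dir string) (int, error) {
	return syscall.Open(dir, syscall.O_DIRECTORY|syscall.O_RDONLY|syscall.O_CLOEXEC, 0)
}

// openInStateDir (hypothetical) opens name relative to statedirfd with
// openat(2), so no absolute host path is needed.
func openInStateDir(statedirfd int, name string, flags int) (*os.File, error) {
	fd, err := syscall.Openat(statedirfd, name, flags|syscall.O_CLOEXEC, 0)
	if err != nil {
		return nil, err
	}
	return os.NewFile(uintptr(fd), name), nil
}

// openatDemo reports whether the FIFO can still be opened through the
// directory fd after the original path has stopped resolving.
func openatDemo() (bool, error) {
	base, err := os.MkdirTemp("", "statedir-demo")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(base)
	dir := filepath.Join(base, "state")
	if err := os.Mkdir(dir, 0700); err != nil {
		return false, err
	}
	if err := syscall.Mkfifo(filepath.Join(dir, "exec.fifo"), 0600); err != nil {
		return false, err
	}
	statedirfd, err := openStateDir(dir)
	if err != nil {
		return false, err
	}
	defer syscall.Close(statedirfd)
	// Stand-in for the pivot: after the rename the old path is gone.
	if err := os.Rename(dir, filepath.Join(base, "moved")); err != nil {
		return false, err
	}
	// O_NONBLOCK lets a read-only open of a FIFO return without a writer.
	f, err := openInStateDir(statedirfd, "exec.fifo", syscall.O_RDONLY|syscall.O_NONBLOCK)
	if err != nil {
		return false, err
	}
	defer f.Close()
	fi, err := f.Stat()
	if err != nil {
		return false, err
	}
	return fi.Mode()&os.ModeNamedPipe != 0, nil
}

func main() {
	ok, err := openatDemo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("opened fifo via directory fd:", ok)
}
```

Because the fd is taken synchronously before the path becomes unreachable, this style of open does not rely on a goroutine being scheduled in time.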
I still want to take some time to review this. Unfortunately I have exams until Friday. Can we wait on merging until next week?
@cyphar we need this to fix the raciness and 1.6 issues that we currently have with using signals
Alright. I'll take a look at this after merging, and make follow-up PRs if necessary.
@mrunalp updated for your nits and added a lock to that function
@crosbymichael Thanks! I'll wait for that before merging.
This removes the use of a signal handler and SIGCONT to signal the init process to exec the user's process.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
483a68a to 3aacff6
updated
We have some more references to root instead of state, but some of it predates this PR so we can fix it up in a follow-on.
@LK4D4 this needs another LGTM as the PR was updated :)
@mrunalp CI is not ready either :)
config-schema: Optimize code
This removes the use of a signal handler and SIGCONT to signal the init
process to exec the user's process.
Closes #871