-
Notifications
You must be signed in to change notification settings - Fork 43
shim: Start shims inside appropriate namespaces #637
Conversation
b206a54
to
a3b5c9b
Compare
pkg/nsenter/nsenter.go
Outdated
} | ||
|
||
func getNSPathFromPID(pid int, nsType NSType) string { | ||
return fmt.Sprintf("/proc/%d/ns/%s", pid, nsType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: (file)path.Join()
maybe (here and below)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes I will replace it.
@@ -0,0 +1,183 @@ | |||
// | |||
// Copyright (c) 2018 Intel Corporation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The commit mentions "Based on the CNI implementation of the
package 'ns' on the plugins repository", so do you need to reference some of those details here? Or is this a clean-room re-implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really a brand new re-implementation here. I have just got inspiration from what the ns
package is doing for CNI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there's anything that was loosely copied from the ns package, we have to include the right copyrights.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I understand, then I'll add them.
pkg/nsenter/nsenter.go
Outdated
|
||
// List of namespace types. | ||
const ( | ||
NSTypeCGroup NSType = "cgroup" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bonus points for sorting the NSType
's and CloneFlagsTable
alphabetically 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
;)
pkg/nsenter/nsenter.go
Outdated
return file, nil | ||
} | ||
|
||
func sortNSList(nsList []Namespace) []Namespace { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a unit-test for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely !
pkg/nsenter/nsenter.go
Outdated
|
||
// NsEnter executes the passed closure under the given namespace, | ||
// restoring the original namespace afterwards. | ||
func NsEnter(unsortedNSList []Namespace, toRun func() error) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the importance and delicacy of this change, we really could do with some unit tests. I think you should be able to write unit tests for these functions that for examle create a new ns, switch to it and record their (hopefully new) namespace, assuming they all start with:
if os.Geteuid() != 0 {
t.Skip(testDisabledAsNonRoot)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I was planning on implementing some testing here. I'll work on that today.
pkg/nsenter/nsenter.go
Outdated
NSTypeCGroup NSType = "cgroup" | ||
NSTypeIPC = "ipc" | ||
NSTypeNet = "net" | ||
NSTypeMount = "mnt" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sboeuf As discussed I had tried doing the exact same thing taking inspiration from the CNI implementation for net namespaces. Unfortunately setns in golang does not work for mount namespaces due to multithreaded nature of golang. We may have to resort to C code to be able to switch to mount namespaces, similar to what libcontainer does.
If we are trying to solve the shim exit bug for exec, we may get away with solving this by just switching to PID namespaces.
@sboeuf Did some digging up on the issues filed surrounding the mount namespace issue when I was implementing support for devices. See this: Also, that issue has code that checks if the namespace change was successful, we could do something similar in our tests: |
@sboeuf Doing things the way you are doing currently works (most of the time) and provides for a simple solution (instead of reexecing a new process), but I came across this article which does state that it is not 100% foolproof : https://www.weave.works/blog/linux-namespaces-and-go-don-t-mix Basically golang may spawn a new thread for a blocking syscall even after LockOsThread() The issue seems to have been resolved in golang 1.10, but we may see an unsuccessful setns call in earlier versions. Doing things in the shim code, may not be such a bad approach after all. |
cbf2a11
to
acfe46d
Compare
@jodh-intel PR updated with almost full coverage for unit tests and more than that a real test that we're entering namespaces.
Basically, this package does the most that a Go package can do with netns. So my proposal is that we could take it in since it will help spawning our shims into expected PID and network namespaces, and worst case, it won't do any harm in case we hit the issues explained above once every 10000000000 times. At least this whole PR brings the structure for the shims to be switched to the expected namespaces. I think we should try to look into a better way to solve this, but this will involve more researches. |
acfe46d
to
9f1b05c
Compare
@sameo @jodh-intel @amshinde PR is ready for another review ;) |
pkg/nsenter/nsenter.go
Outdated
return nil | ||
} | ||
|
||
func getFileFromNS(nsPath string) (*os.File, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this function is not exported, but it is not very intuitive what this function does without looking at the code. Could you add some documentation comments here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
pkg/nsenter/nsenter.go
Outdated
go func() { | ||
defer wg.Done() | ||
runtime.LockOSThread() | ||
innerError = containedCall() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't there be an UnlockOsThread() call here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes sure I think we can do that. But it won't change anything since this goroutine ends after that.
pkg/nsenter/nsenter_test.go
Outdated
assert.Nil(t, err, "Should not fail since closure should return nil: %v", err) | ||
} | ||
|
||
func areNSIdentical(nsPath1, nsPath2 string) (bool, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont think you are calling this function anywhere in the tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah you're right, that was a temp function that I should have removed !
pkg/nsenter/nsenter_test.go
Outdated
|
||
evalExpectedNSPath, err := filepath.EvalSymlinks(ns.Path) | ||
if err != nil { | ||
evalExpectedNSPath = err.(*os.PathError).Path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be useful to add comments here describing how you are doing the comparison.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes good point
@@ -314,7 +315,22 @@ func (h *hyper) exec(pod *Pod, c Container, cmd Cmd) (*Process, error) { | |||
Process: *hyperProcess, | |||
} | |||
|
|||
process, err := prepareAndStartShim(pod, h.shim, c.id, token, h.state.URL, cmd) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer passing additional arguments here rather than duplicating the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough ! I have removed it.
This package goal is to make very simple the execution of any code inside any set of namespaces. Based on the CNI implementation of the package 'ns' on the plugins repository, this new package takes a list of namespaces as input and a routine that should be called once all those namespaces have been entered. This way, this package provide a generic way to spawn any binary into the expected set of namespaces. It exports only one function NsEnter() making the usage very simple. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This commit gives the ability to start a new shim process inside a set of new namespaces. This will be useful to start the container process which should be "init" process of its PID namespace. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This new commit adds to the shim the ability to be run within any existing namespace. This will be very useful in order to start shims representing exec'ed processes related to an existing container. In this case, we will need to start those shims in the namespaces of the container process shim. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The shim representing the container process needs to be in its own PID namespace as the "init" process. The reason is that we need to make sure it will wait for its exec'ed processes in the same PID namespace to be killed before to return. This happens inside each agent, but if this is not reproduced on the host, we get some race conditions leading the shim container process to return before its exec'ed processes are killed. And this leads those shims to never return their exit codes. This commit solves this issue by making sure a new shim representing a container process enters a new PID namespace, and any exec'ed process related to this container enters this PID namespace previously created. Fixes #613 Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
9f1b05c
to
5c73774
Compare
@amshinde PR updated, PTAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
@grahamwhaley @jodh-intel ready to be merged |
This PR first creates a new package
nsenter
so that we can spawn any code into a set of namespaces. Then, the virtcontainers code is modified to rely on this new package. At last, it starts shims from agents within the right PID namespace. In details, a shim representing a container process should enter its own PID namespace so that we can spawn other shims corresponding to exec'ed processes inside the same namespace. This is needed if we don't want some race conditions to happen, due to the exec'ed shims to never return their exit codes if the container process returned first.Fixes #613