Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

validator: ensure user doesn't try to mount /sys without userns #807

Closed
wants to merge 1 commit into from
Closed

validator: ensure user doesn't try to mount /sys without userns #807

wants to merge 1 commit into from

Conversation

cyphar
Copy link
Member

@cyphar cyphar commented May 8, 2016

Inside a user namespace, mount permissions become more nuanced. sysfs
requires you to have either CAP_SYS_ADMIN in the root user namespace
(for us this means that we're not using a user namespace) or that you
have CAP_SYS_ADMIN in the user namespace the network namespace was
created in. This means that having just a user namespace and no private
network namespace will result in errors when mounting sysfs1. Warn the
user about this in the validator.

Closes #799.

Signed-off-by: Aleksa Sarai asarai@suse.de

/cc @dqminh @mrunalp

Inside a user namespace, mount permissions become more nuanced. sysfs
requires you to have either CAP_SYS_ADMIN in the root user namespace
(for us this means that we're not using a user namespace) or that you
have CAP_SYS_ADMIN in the user namespace the network namespace was
created in. This means that having just a user namespace and no private
network namespace will result in errors when mounting sysfs[1]. Warn the
user about this in the validator.

[1]: http://lists.linuxfoundation.org/pipermail/containers/2013-August/033388.html

Signed-off-by: Aleksa Sarai <asarai@suse.de>
@cyphar
Copy link
Member Author

cyphar commented May 16, 2016

PTAL.

/ping @mrunalp @dqminh @hqhq

@dqminh
Copy link
Contributor

dqminh commented May 16, 2016

mmm, I think it's still possible that the user have cap_sys_admin on both the user namespace and the root ? ( though i dont know what's the purpose for that ). I dont think there's a lightweight method to check if an user possess cap_sys_admin either.

So i think this LGTM. I guess we can tune it if user complaints :p

@cyphar
Copy link
Member Author

cyphar commented May 16, 2016

@dqminh I'm not sure how that works actually. Because you lose all capabilities in your previous user namespace if you setns into a namespace (or unshare it). Maybe there's some odd setup where this happens, but it shouldn't happen within runC -- but we can change it if someone complains (as you said).

@crosbymichael
Copy link
Member

I'm not really a fan of these checks that prohibit runc from trying and letting the system return the error. The point being is that when/if this is fixed in future kernels we have to make a change and push an update for runc instead of not having this type of kernel validation and letting the kernel return the errors.

@cyphar
Copy link
Member Author

cyphar commented May 16, 2016

My main justification when it comes to things like this is that the errors people get from the operating system don't make sense (you get an error when trying to mount /sys, but there's no other information apart from EACCES). And if the error happens in the new process we create, we don't even send the error text to the parent so the user only gets exit status 1 as an error message. I noticed this particularly when working with trying to implement rootless containers, I had to keep recompiling runC with debugging statements and messing around with process setup so I could actually get the error.

@crosbymichael
Copy link
Member

idk, pros and cons on both sides. i don't know what is best

@cyphar
Copy link
Member Author

cyphar commented May 16, 2016

At the very least, we should probably fix the current issues with not actually outputting the error you get inside the init process. Unfortunately, we still have to do a lot if we want to return errors from inside nsenter (we'll have to serialise the JSON manually in C).

@mrunalp
Copy link
Contributor

mrunalp commented May 16, 2016

I agree with @crosbymichael. The configuration isn't 100% foolproof and we can provide better examples on how to make certain features work.

@hqhq
Copy link
Contributor

hqhq commented May 17, 2016

I also think we should not prohibit runc from trying because in theory the config could be valid and kernel rules might change. And the poor error message in this case is also pain, maybe some kinds of warning can be a compromise?

@crosbymichael
Copy link
Member

Ya, if we cannot get a useful error message out of the C code then we should probably do a warning and not abort start for something like this.

@cyphar
Copy link
Member Author

cyphar commented Oct 1, 2016

The nsenter rewrites have improved the logging part of the nsenter code (though there still is some left to be desired).

@cyphar cyphar closed this Oct 1, 2016
@cyphar cyphar deleted the sysfs-validator-netns branch October 1, 2016 13:16
stefanberger pushed a commit to stefanberger/runc that referenced this pull request Sep 8, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants