-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Rework of start*.sh scripts #1052
[WIP] Rework of start*.sh scripts #1052
Conversation
This looks like the kind of thing that would be useful to me, too... I have done some big workarounds to set up my users' environments, including setting things like environment variables. My second thought is "there's a reason these are removed", but it doesn't fully apply here. When starting the image, we sudo root -> user. But if sudo is granted to users, then one can do user->root, and security could matter more. It's not inconceivable to me that someone would allow users to run certain commands with sudo, without wanting them to have full root access. I don't know the sudoers syntax myself, but could this be limited to just the root -> user direction? |
Hi Richard! I think as we have it setup right now, the user can escape and become a root user using sudo, and then it can do whatever it pleases, such as edit these files. I think... This PR is mainly a small refactoring and allowing also LD_LIBRARY_PATH and environment variables match PYTHON* and not only PYTHONPATH to be passed from the initial environment to the |
I ran into this error, I wonder if it is a temporary fluke though, hmm...
|
I'm testing this on my JupyterHub deployment and failed to have both the PATH and LD_LIBRARY_PATH transferred now. Hmm...... I'm running as root, then... # investigate if this env var survive a user switch
root# export LD_LIBRARY_PATH=YES
root# cat /etc/sudoers.d/notebook
Defaults env_keep += "PATH LD_LIBRARY_PATH PYTHON*"
jovyan ALL=(ALL) NOPASSWD:ALL
root# sudo -u jovyan bash -c 'echo OK: ${LD_LIBRARY_PATH:-NO}'
OK: YES
root# sudo -E -u jovyan bash -c 'echo OK: ${LD_LIBRARY_PATH:-NO}'
OK: NO
# comment out the env_keep part
root# visudo -f /etc/sudoers.d/notebook
root# cat /etc/sudoers.d/notebook
# Defaults env_keep += "PATH LD_LIBRARY_PATH PYTHON*"
jovyan ALL=(ALL) NOPASSWD:ALL
root# sudo -u jovyan bash -c 'echo OK: ${LD_LIBRARY_PATH:-NO}'
OK: NO
root# sudo -E -u jovyan bash -c 'echo OK: ${LD_LIBRARY_PATH:-NO}'
OK: NO |
I'm digging in on this issue. I'm currently on my way to resolve it by properly reviewing the start.sh script and learning everything about sudoers. |
Woah, crazy timing! I just submitted this. #1053 |
9ef56b6
to
7fff90b
Compare
Only if Of course this only matters if someone opts-in. But they could opt-in with sudo permission to only sudo to run certain commands. But then this change decreases that security level. Of course we don't support people limiting to certain commands, but it could be done and some expects it to be as secure as regular sudo. and if something is made to be secure, I'd rather make decisions explicit. ... maybe we already do worse things and this doesn't matter. |
base-notebook/start.sh
Outdated
# - Can multiple usernames be coupled to the userid 0? | ||
# - In this code, we see a big if/else clause about "id --uid == 0", but "id | ||
# --gid == 0" would be quite powerful as well right? Would it make sense to | ||
# check for either "id --uid == 0" or "id --gid == 0"? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# - Can multiple usernames be coupled to the userid 0?
# - In this code, we see a big if/else clause about "id --uid == 0", but "id
# --gid == 0" would be quite powerful as well right? Would it make sense to
# check for either "id --uid == 0" or "id --gid == 0"?
I think the user management (usermod
, mv /home/jovyan ...
) won't work with just gid=0 ... it needs to be uid=0 to have those permissions. Backed up by a quick but not very complete test...
Fantastic work, @consideRatio! I really like how you improved the documentation and made things more explicit. As for your questions, I don't know any of the answers off the top of my head. I think I'm at a similar level to you; I would only be able to find the answers by digging into the man pages (and possibly source code) for several hours. As for the security implications, I am getting confused by the following:
I wish I could be more help here, but I don't actually have experience with configuring @rkdarst, assuming that we can configure |
I realized that it would be good to explicitly decide what is the appropriate default Some possibilities:
Unfortunately these two are incompatible. To argue against the security-conscious sysadmin perspective, the documentation states explicitly in bold, "You should only enable sudo if you trust the user or if the container is running on an isolated host." I feel like if someone enables |
I am trying to understand a bit about how If we decide to make a |
3665eb2
to
3308271
Compare
@maresb yeah haha #include is a statement which has meaning i think, while # include would be a comment :) I've started to think quite clearly now about everything: Our modification of
Our use of -E / --preserve-env: I think there shouldn't be a difference between starting the docker container as root or jovyan with regards to what the user will have in its environment, and we should assume that any environment variables set when starting the container should be presumed accessible to the actual user we transition to run as in the start.sh script. But... If we want to hide something from the user, we can create an exception, for example of any variable listed in |
3308271
to
407f7ff
Compare
You know, it just occured to me that #787 is relevant here (which I've been using at least as long as it's been open). These problems come because we try to start the directly starts the notebook, without a chance for user configuration. In my cluster, I use #787 and then have an extensive hooks directory that would do things like this after the sudo. It seems more natural we are doing complex things here, so first the system is booted, then we change to the user, then the user's environment is set up, then the notebook runs. If I reworked this, I would remove all the conda stuff before the sudo, and have a user hook that does "source activate conda" in there. This doesn't work if someone starts the image manually and sudos manually as part of testing stuff (but well, then what we see here wouldn't happen either...). |
Thanks for linking that PR! I have been thinking about these hooks as well, lets work on this also while making it safe to merge without risk breaking peoples deployments! Really glad to have you thinking about these issues with me @rkdarst and @marrsb! ❤️ |
I don't want to sidetrack this PR, but for the record, my current use case for
so that I can do something like
With @consideRatio's major improvements already in this PR, perhaps it's wise to take this one step at a time, deferring these to a follow-up PR. |
run-hooks /usr/local/bin/before-notebook.d | ||
echo "Executing the command: ${cmd[@]}" | ||
exec sudo -E -H -u $NB_USER PATH=$PATH XDG_CACHE_HOME=/home/$NB_USER/.cache PYTHONPATH=${PYTHONPATH:-} "${cmd[@]}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the PATH=$PATH
is here, it actually does the right thing and path is correct (/opt/conda/bin is in the front, not at the end). It seems like a lot of work to remove this and then find the ways to add it again (even though the new way is actually a bit more "correct", though - but with wider-reaching side-effects and more quirks).
Simple and explicit would be adding in LD_LIBRARY_PATH=$LD_LIBRARY_PATH here... I'm not sure it's best but it's minimal until we decide what's best (about to file a new issue with thoughts...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the PATH=$PATH is here, it actually does the right thing and path is correct (/opt/conda/bin is in the front, not at the end).
I'm pretty sure that PATH=$PATH
is not doing the right thing here, even though it may look like it is. See #1053 for details. It's very subtle, and even the which
command gives the "wrong" answer.
Hello, I did not have the time to review all the changes. But maybe while performing this big refactoring, it could be interesting (or not) to also implement what has been suggested here: #1034 (comment). Since you have checked in detail the behavior of the big Best |
Allow you to define variables to be unset before running the command that start.sh is supposed to start. These variables will still be available in the hooks run before.
It is meant to allow you to opt out from non-error non-warning logs generated by start.sh.
565903a
to
63295ba
Compare
Current considerationsI'm letting this PR become a workbench fix all kinds of issues and limitations relating to the startup scripts. Here are some considerations. root/user based hooks@rkdarst have made work in #787 regarding hooks to run as root / user.
exec command in an already started container as a user
When a It would be good to have a documented approach on how to do this. We may need a dedicated helper script since re-executing start.sh could trigger chown and hooks etc to re-execute. At the same time, we have some parts in start.sh that we may want to also have in this script such as the newly added JUPYTER_ENV_VARS_TO_UNSET. Currently I suggest we extract the "run command" part of
NB_UMASK implementation adjustmentI think it may be problematic that the NB_UMASK configuration from #781 is applied from the @rkdarst I think your initial questions section in #1055 is addressed by this discussion. I'm not sure on how to go, but I think it can be suitable to:
|
@rkdarst I'm struggling with sourcing of scripts. I don't know how the start.sh script started as root can source something that is meant to run as the user, and I dislike to introduce something that will behave inconsistently. I think one may need something like this in the case where sudo needs to be used to switch to the user environment: https://unix.stackexchange.com/a/269080/257111 |
I think, like that , sourcing won't work well. If it's setting
environment variables (the reason for source instead of executing),
they wont' be propagated back up to the script.
What I did was sourced the root ones during the root part, and the
user parts after the primary sudo to the user. If this is what I
think it is, it's what we'll have to do anyway...
p.s. I'm sorry I stopped responding to this... it seems like it became
a much larger task than I could focus on (and also had too many
different goals to focus on). But if it would help, I'm happy to
start paying more attention. Though I'd sort of rather restarting and
porting over the relevant changes one step at a time, last time I
looked it seemed like this was changing a whole lot at once...
|
Based on my understanding, it now looks to me like The root of the problem (which @rkdarst seems to have been trying to solve with the hooks system) seems to be the lack of a good script to run a command as a user. I think we should do this first, and then we can reevaluate if the |
@consideRatio thank you for your time and work on rethinking the startup scripts. I agree they have grown unwieldy over the years as use of these images has grown and we (I) failed to define a clear scope for the project. I truly appreciate that others have stepped up to review these changes and share their experience (@maresb, @rkdarst) as I have not had much energy to contribute deep thought to open source lately. That said, if there's something specific I can contribute here, please let me know and I'll try to do it. |
I'm taking a pass through old PRs and cleaning up ones that haven't been touched in some time. I'm planning to leave this one alone given the depth of conversation and relation to other on-going discussions (e.g., jupyterhub/mybinder.org-deploy#1474) |
I think this PR has a lot of relevant discussion and changes, but is at this point by itself too large and hard to overview what it tries to accomplish. I suggest we close it. It is still referenced from various issues it meant to close for findability. |
PR Summary
Fixes #1053, Fixes #1034, Closes #1054
When we transition from a root user to the actual user that runs the command passed to start.sh, we are doing a lot of complicated things to preserve the environment variables, but we currently fail with this while also having a complicated bash script that is hard to understand.
With this PR, we will start to properly preserve the
PATH
,LD_LIBRARY_PATH
,and
PYTHON*
variables. We will also allow the $NB_USER's that were granted sudo to retain its own PATH when running sudo commands.The original use case for me was that I needed to set
LD_LIBRARY_PATH
andGRANT_SUDO
, butLD_LIBRARY_PATH
was stripped by the default configuration when usingsudo
even with the custom configuration to preserve it, because it was part of the defaultenv_delete
list of environment variables in the sudoers configuration.Related extra part of the PR
I've also added a feature in the last commit, which I could extract to a standalone PR later, introduced an environment variable named
JUPYTER_ENV_VARS_TO_UNSET
that allow us to list environment variables to unset just before we exec the command given to start.sh. This allow hooks to use something sensitive to be stripped from the environment later.Review suggestions
Read the PR summary, then review the commit changes commit by commit in order.
Original outdated text below
When we run
sudo -u jovyan
, even with-E
(--preserve-env
), we endup with reset environment variables. By running
sudo -V
as a root useror
sudo sudo -V
we get information about what environment variableswill be reset when we would do for example
sudo -E -u jovyan
.A use case I ran into was the need to preserve the
LD_LIBRARY_PATH
variable. I want to help provide information about where to find
libraries installed after the image is built. If they were installed
before the image was built I could have done...
... which would have
ldconfig
search through the folder and summarizeinformation about libraries into
/etc/ld.so.cache
. But, they are notthere yet, and there will be plenty of files in there later.
Anyhow. This PR ensures we preserve LD_LIBRARY_PATH as well when we
switch from the root user to the jovyan kind of user with sudo
privileges.