[WIP] Rework of start*.sh scripts #1052

Closed


consideRatio
Collaborator

@consideRatio consideRatio commented Mar 27, 2020

PR Summary

Fixes #1053, Fixes #1034, Closes #1054

When we transition from root to the actual user that runs the command passed to start.sh, we do a lot of complicated things to preserve environment variables. We currently fail at this, and the bash script that does it is hard to understand.

With this PR, we start to properly preserve the PATH, LD_LIBRARY_PATH,
and PYTHON* variables. We also allow an $NB_USER that was granted sudo to retain their own PATH when running sudo commands.

My original use case was that I needed to set both LD_LIBRARY_PATH and GRANT_SUDO. LD_LIBRARY_PATH was stripped when using sudo, even with custom configuration to preserve it, because it is part of the default env_delete list of environment variables in the sudoers configuration.

Related extra part of the PR

In the last commit I've also added a feature, which I could extract to a standalone PR later: an environment variable named JUPYTER_ENV_VARS_TO_UNSET that lets us list environment variables to unset just before we exec the command given to start.sh. This allows hooks to use something sensitive that is then stripped from the environment before the command runs.
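
A minimal sketch of how such an unset step could look. It assumes the variable holds a comma-separated list of names; the delimiter, the helper name `unset_listed_env_vars`, and the `DEMO_*` variables are assumptions for illustration and are not taken from the PR.

```bash
#!/usr/bin/env bash
# Hypothetical helper: unset every variable named in a comma-separated list.
unset_listed_env_vars() {
    local list="$1" name
    local -a names
    # Split on commas without touching the global IFS.
    IFS=',' read -r -a names <<< "$list"
    for name in "${names[@]}"; do
        # Tolerate spaces around entries, e.g. "A, B".
        name="${name//[[:space:]]/}"
        # Skip empty entries such as those produced by a trailing comma.
        [ -n "$name" ] && unset "$name"
    done
    return 0
}

# Example: a hook needed DEMO_TOKEN, but the final command should not see it.
export DEMO_TOKEN="s3cret" DEMO_KEEP="visible"
JUPYTER_ENV_VARS_TO_UNSET="DEMO_TOKEN"
unset_listed_env_vars "${JUPYTER_ENV_VARS_TO_UNSET:-}"
unset JUPYTER_ENV_VARS_TO_UNSET
echo "DEMO_TOKEN=${DEMO_TOKEN:-<unset>} DEMO_KEEP=${DEMO_KEEP:-<unset>}"
# In start.sh this step would be followed by: exec "${cmd[@]}"
```

Because the unset happens in the same shell that later calls exec, hooks that ran earlier still saw the variables.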

Review suggestions

Read the PR summary, then review the commit changes commit by commit in order.

Original outdated text below

When we run sudo -u jovyan, even with -E (--preserve-env), we end
up with reset environment variables. By running sudo -V as a root user
or sudo sudo -V we get information about what environment variables
will be reset when we would do for example sudo -E -u jovyan.

Environment variables to remove:
        *=()*
        RUBYOPT
        RUBYLIB
        PYTHONUSERBASE
        PYTHONINSPECT
        PYTHONPATH
        PYTHONHOME
        TMPPREFIX
        ZDOTDIR
        READNULLCMD
        NULLCMD
        FPATH
        PERL5DB
        PERL5OPT
        PERL5LIB
        PERLLIB
        PERLIO_DEBUG
        JAVA_TOOL_OPTIONS
        SHELLOPTS
        BASHOPTS
        GLOBIGNORE
        PS4
        BASH_ENV
        ENV
        TERMCAP
        TERMPATH
        TERMINFO_DIRS
        TERMINFO
        _RLD*
        LD_*
        PATH_LOCALE
        NLSPATH
        HOSTALIASES
        RES_OPTIONS
        LOCALDOMAIN
        CDPATH
        IFS
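
To see why LD_LIBRARY_PATH is caught by this list, here is a rough sketch that matches variable names against a subset of the entries above. It assumes the entries behave like shell-style wildcards on the variable name; the function name `is_removed_by_sudo` is made up for this example, and the `*=()*` entry (which matches on values) is left out.

```bash
#!/usr/bin/env bash
# Subset of the "Environment variables to remove" list quoted above.
env_delete_patterns=(PYTHONUSERBASE PYTHONINSPECT PYTHONPATH PYTHONHOME 'LD_*' '_RLD*' IFS CDPATH)

# Return 0 if the variable name matches any pattern, 1 otherwise.
is_removed_by_sudo() {
    local name="$1" pattern
    for pattern in "${env_delete_patterns[@]}"; do
        # The unquoted right-hand side makes [[ ]] treat the pattern as a glob.
        if [[ "$name" == $pattern ]]; then
            return 0
        fi
    done
    return 1
}

for name in PATH LD_LIBRARY_PATH PYTHONPATH PYTHONSTARTUP; do
    if is_removed_by_sudo "$name"; then
        echo "$name: on the removal list"
    else
        echo "$name: not on the removal list"
    fi
done
```

Note that only four PYTHON* names are listed explicitly, so a name such as PYTHONSTARTUP is not matched by this particular list.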

A use case I ran into was the need to preserve the LD_LIBRARY_PATH
variable. I want to help provide information about where to find
libraries installed after the image is built. If they were installed
before the image was built I could have done...

RUN echo "/usr/local/nvidia/lib64" > /etc/ld.so.conf.d/nvidia.conf \
 && ldconfig

... which would have ldconfig search through the folder and summarize
information about libraries into /etc/ld.so.cache. But, they are not
there yet, and there will be plenty of files in there later.

Anyhow. This PR ensures we preserve LD_LIBRARY_PATH as well when we
switch from the root user to the jovyan kind of user with sudo
privileges.

@rkdarst
Contributor

rkdarst commented Mar 27, 2020

This looks like the kind of thing that would be useful to me, too... I have done some big workarounds to set up my users' environments, including setting things like environment variables.

My second thought is "there's a reason these are removed", but it doesn't fully apply here. When starting the image, we sudo root -> user. But if sudo is granted to users, then one can do user->root, and security could matter more. It's not inconceivable to me that someone would allow users to run certain commands with sudo, without wanting them to have full root access.

I don't know the sudoers syntax myself, but could this be limited to just the root -> user direction?

@consideRatio
Collaborator Author

Hi Richard!

I think as we have it set up right now, the user can escape and become root using sudo, and can then do whatever it pleases, such as edit these files. I think... This PR is mainly a small refactoring that also allows LD_LIBRARY_PATH and environment variables matching PYTHON*, not only PYTHONPATH, to be passed from the initial environment to the jovyan environment.

@consideRatio
Collaborator Author

I ran into this error, I wonder if it is a temporary fluke though, hmm...

=================== 3 passed, 1 deselected in 15.26 seconds ====================
docker build --build-arg TEST_ONLY_BUILD=1 --rm --force-rm -t jupyter/tensorflow-notebook:latest ./tensorflow-notebook
Sending build context to Docker daemon  8.192kB
Step 1/4 : ARG BASE_CONTAINER=jupyter/scipy-notebook
Step 2/4 : FROM $BASE_CONTAINER
 ---> d0b40dffbc41
Step 3/4 : LABEL maintainer="Jupyter Project <jupyter@googlegroups.com>"
 ---> Running in b2d80307fae2
Removing intermediate container b2d80307fae2
 ---> c2d1fcea414c
Step 4/4 : RUN pip install --quiet     'tensorflow==2.1.0' &&     fix-permissions $CONDA_DIR &&     fix-permissions /home/$NB_USER
 ---> Running in f5c16fadec8d
ERROR: Exception:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/pip/_vendor/urllib3/contrib/pyopenssl.py", line 313, in recv_into
    return self.connection.recv_into(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/OpenSSL/SSL.py", line 1840, in recv_into
    self._raise_ssl_error(self._ssl, result)
  File "/opt/conda/lib/python3.7/site-packages/OpenSSL/SSL.py", line 1646, in _raise_ssl_error
    raise WantReadError()
OpenSSL.SSL.WantReadError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/pip/_vendor/urllib3/contrib/pyopenssl.py", line 313, in recv_into
    return self.connection.recv_into(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/OpenSSL/SSL.py", line 1840, in recv_into
    self._raise_ssl_error(self._ssl, result)
  File "/opt/conda/lib/python3.7/site-packages/OpenSSL/SSL.py", line 1663, in _raise_ssl_error
    raise SysCallError(errno, errorcode.get(errno))
OpenSSL.SSL.SysCallError: (104, 'ECONNRESET')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/pip/_vendor/urllib3/response.py", line 425, in _error_catcher
    yield
  File "/opt/conda/lib/python3.7/site-packages/pip/_vendor/urllib3/response.py", line 507, in read
    data = self._fp.read(amt) if not fp_closed else b""
  File "/opt/conda/lib/python3.7/site-packages/pip/_vendor/cachecontrol/filewrapper.py", line 62, in read
    data = self.__fp.read(amt)
  File "/opt/conda/lib/python3.7/http/client.py", line 457, in read
    n = self.readinto(b)
  File "/opt/conda/lib/python3.7/http/client.py", line 501, in readinto
    n = self.fp.readinto(b)
  File "/opt/conda/lib/python3.7/socket.py", line 589, in readinto
    return self._sock.recv_into(b)
  File "/opt/conda/lib/python3.7/site-packages/pip/_vendor/urllib3/contrib/pyopenssl.py", line 328, in recv_into
    return self.recv_into(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/pip/_vendor/urllib3/contrib/pyopenssl.py", line 318, in recv_into
    raise SocketError(str(e))
OSError: (104, 'ECONNRESET')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/pip/_internal/cli/base_command.py", line 186, in _main
    status = self.run(options, args)
  File "/opt/conda/lib/python3.7/site-packages/pip/_internal/commands/install.py", line 331, in run
    resolver.resolve(requirement_set)
  File "/opt/conda/lib/python3.7/site-packages/pip/_internal/legacy_resolve.py", line 177, in resolve
    discovered_reqs.extend(self._resolve_one(requirement_set, req))
  File "/opt/conda/lib/python3.7/site-packages/pip/_internal/legacy_resolve.py", line 333, in _resolve_one
    abstract_dist = self._get_abstract_dist_for(req_to_install)
  File "/opt/conda/lib/python3.7/site-packages/pip/_internal/legacy_resolve.py", line 282, in _get_abstract_dist_for
    abstract_dist = self.preparer.prepare_linked_requirement(req)
  File "/opt/conda/lib/python3.7/site-packages/pip/_internal/operations/prepare.py", line 482, in prepare_linked_requirement
    hashes=hashes,
  File "/opt/conda/lib/python3.7/site-packages/pip/_internal/operations/prepare.py", line 287, in unpack_url
    hashes=hashes,
  File "/opt/conda/lib/python3.7/site-packages/pip/_internal/operations/prepare.py", line 159, in unpack_http_url
    link, downloader, temp_dir.path, hashes
  File "/opt/conda/lib/python3.7/site-packages/pip/_internal/operations/prepare.py", line 303, in _download_http_url
    for chunk in download.chunks:
  File "/opt/conda/lib/python3.7/site-packages/pip/_internal/network/utils.py", line 39, in response_chunks
    decode_content=False,
  File "/opt/conda/lib/python3.7/site-packages/pip/_vendor/urllib3/response.py", line 564, in stream
    data = self.read(amt=amt, decode_content=decode_content)
  File "/opt/conda/lib/python3.7/site-packages/pip/_vendor/urllib3/response.py", line 529, in read
    raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
  File "/opt/conda/lib/python3.7/contextlib.py", line 130, in __exit__
    self.gen.throw(type, value, traceback)
  File "/opt/conda/lib/python3.7/site-packages/pip/_vendor/urllib3/response.py", line 443, in _error_catcher
    raise ProtocolError("Connection broken: %r" % e, e)
pip._vendor.urllib3.exceptions.ProtocolError: ('Connection broken: OSError("(104, \'ECONNRESET\')")', OSError("(104, 'ECONNRESET')"))
Removing intermediate container f5c16fadec8d
The command '/bin/sh -c pip install --quiet     'tensorflow==2.1.0' &&     fix-permissions $CONDA_DIR &&     fix-permissions /home/$NB_USER' returned a non-zero code: 2
Makefile:47: recipe for target 'build/tensorflow-notebook' failed
make: *** [build/tensorflow-notebook] Error 2

@consideRatio
Collaborator Author

consideRatio commented Mar 27, 2020

I'm testing this on my JupyterHub deployment and failed to have both the PATH and LD_LIBRARY_PATH transferred now. Hmm......

I'm running as root, then...

# investigate if this env var survives a user switch
root# export LD_LIBRARY_PATH=YES

root# cat /etc/sudoers.d/notebook
Defaults env_keep += "PATH LD_LIBRARY_PATH PYTHON*"
jovyan ALL=(ALL) NOPASSWD:ALL

root# sudo -u jovyan bash -c 'echo OK: ${LD_LIBRARY_PATH:-NO}'
OK: YES
root# sudo -E -u jovyan bash -c 'echo OK: ${LD_LIBRARY_PATH:-NO}'
OK: NO

# comment out the env_keep part
root# visudo -f /etc/sudoers.d/notebook 

root# cat /etc/sudoers.d/notebook
# Defaults env_keep += "PATH LD_LIBRARY_PATH PYTHON*"
jovyan ALL=(ALL) NOPASSWD:ALL

root# sudo -u jovyan bash -c 'echo OK: ${LD_LIBRARY_PATH:-NO}'
OK: NO
root# sudo -E -u jovyan bash -c 'echo OK: ${LD_LIBRARY_PATH:-NO}'
OK: NO
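
The manual checks above can be wrapped in a small probe. This is only a sketch: the helper names `var_survives` and `report` are invented here, and `env` / `env -i` are used as unprivileged stand-ins for the runner so the logic can be tried without root. The sudo lines are only attempted when running as root with a jovyan user present.

```bash
#!/usr/bin/env bash
# Usage: var_survives NAME RUNNER [RUNNER_ARGS...]
# Sets NAME to a unique marker, runs "printenv NAME" through RUNNER and
# reports whether the marker came out the other side.
var_survives() {
    local name="$1"; shift
    local marker="probe-$$" seen
    # printenv exits non-zero when the variable is missing; treat that
    # (and a failing runner) as "did not survive".
    seen="$(env "$name=$marker" "$@" printenv "$name" 2>/dev/null || true)"
    [ "$seen" = "$marker" ]
}

report() {
    local name="$1"; shift
    if var_survives "$name" "$@"; then
        echo "$name survives: $*"
    else
        echo "$name is dropped by: $*"
    fi
}

# Stand-ins that need no privileges: plain env keeps everything,
# env -i drops everything.
report LD_LIBRARY_PATH env
report LD_LIBRARY_PATH env -i

# The check from the transcript above, only attempted as root with sudo
# installed. The user name jovyan comes from the thread and must exist.
if [ "$(id -u)" -eq 0 ] && command -v sudo >/dev/null && id jovyan >/dev/null 2>&1; then
    report LD_LIBRARY_PATH sudo -u jovyan
    report LD_LIBRARY_PATH sudo -E -u jovyan
fi
```

Running the probe before and after editing /etc/sudoers.d/notebook gives a repeatable version of the comparison done by hand.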

@consideRatio
Collaborator Author

I'm digging in on this issue. I'm currently on my way to resolve it by properly reviewing the start.sh script and learning everything about sudoers.

@maresb
Contributor

maresb commented Mar 27, 2020

Woah, crazy timing! I just submitted this. #1053

@maresb
Contributor

maresb commented Mar 27, 2020

I like your approach, actually better than mine. Simply passing through PATH solves my problem and ensures the expected behavior.

The question of security implications seems subtle. The secure_path stuff was added by @parente in 6fa67cc. If he's available it'd be great to hear his thoughts.

@consideRatio changed the title from "Preserve LD_LIBRARY_PATH and PYTHON* as sudo powered jovyan" to "WIP: Preserve LD_LIBRARY_PATH and PYTHON* as sudo powered jovyan" on Mar 27, 2020
@consideRatio consideRatio force-pushed the preserve-ld-library-path branch from 9ef56b6 to 7fff90b Compare March 27, 2020 22:44
@rkdarst
Contributor

rkdarst commented Mar 27, 2020

> I think as we have it setup right now, the user can escape and become a root user using sudo, and then it can do whatever it pleases, such as edit these files.

Only if GRANT_SUDO=1, an explicit opt in. If this was the default, almost certainly somewhere in the world, people could access data not intended for them. At least nbgrader will have security problems.

Of course this only matters if someone opts in. But they could opt in with sudo permission limited to running certain commands, and then this change decreases that security level. Of course we don't support people limiting sudo to certain commands, but it could be done, and some would expect it to be as secure as regular sudo. If something is made to be secure, I'd rather make decisions explicit.

... maybe we already do worse things and this doesn't matter.

# - Can multiple usernames be coupled to the userid 0?
# - In this code, we see a big if/else clause about "id --uid == 0", but "id
# --gid == 0" would be quite powerful as well right? Would it make sense to
# check for either "id --uid == 0" or "id --gid == 0"?
Collaborator Author

@rkdarst and @maresb, perhaps you can help me understand some of these questions I've raised above?

Contributor

# - Can multiple usernames be coupled to the userid 0?
# - In this code, we see a big if/else clause about "id --uid == 0", but "id
#   --gid == 0" would be quite powerful as well right? Would it make sense to
#   check for either "id --uid == 0" or "id --gid == 0"? 

I think the user management (usermod, mv /home/jovyan ...) won't work with just gid=0 ... it needs to be uid=0 to have those permissions. Backed up by a quick but not very complete test...

@maresb
Contributor

maresb commented Mar 28, 2020

Fantastic work, @consideRatio! I really like how you improved the documentation and made things more explicit.

As for your questions, I don't know any of the answers off the top of my head. I think I'm at a similar level to you; I would only be able to find the answers by digging into the man pages (and possibly source code) for several hours.

As for the security implications, I am getting confused by the following:

[This is me, typing in a block quote for indentation.]

So start.sh is effectively two scripts: one root user, one non-root user. (The non-root-user script we can ignore for this discussion.) As far as I understand, the root-user part of start.sh is intended to be run with docker run --user root. It modifies /etc/sudoers and changes the user to jovyan.

As I understand @rkdarst, we should avoid making changes to jovyan's resulting environment which would weaken security, for example, allowing jovyan to pass PATH for his sudo commands, allowing him to effectively replace the system binaries with his own. Thus in any case, we should forbid jovyan from passing PATH to sudo, but maybe we could allow it for sudo only from root.

Now if jovyan can run sudo start.sh, and sudo start.sh can pass PATH to sudo, then we have effectively allowed jovyan to pass PATH to sudo.

Maybe this is okay for the following reason. If someone restricts jovyan to only be able to use sudo with certain commands, the start.sh would presumably not be on the list.

I wish I could be more help here, but I don't actually have experience with configuring /etc/sudoers.

@rkdarst, assuming that we can configure /etc/sudoers so that root can pass any environment variables to sudo but other users cannot, would that be an acceptable solution?

@maresb
Contributor

maresb commented Mar 28, 2020

I realized that it would be good to explicitly decide what is the appropriate default secure_path for jovyan running with GRANT_SUDO. The current settings in the master branch are in my opinion all scrambled: start.sh tries to pass PATH in the sudo command, but this is overridden by secure_path. If we want to keep secure_path secure, then it shouldn't have any directories owned by jovyan. However, start.sh currently appends /opt/conda/bin! Moreover in this context, appending (as opposed to prepending) does not seem sensible, since the resulting environment has the mixture of /usr/bin/python and /opt/conda/bin/conda.
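
The difference between appending and prepending can be seen with two stand-in directories. This sketch creates a hypothetical `demo-tool` in a fake "system" directory and a fake "conda" directory (both names are placeholders, not the real /usr/bin or /opt/conda/bin) and shows which one a lookup finds for each ordering.

```bash
#!/usr/bin/env bash
workdir="$(mktemp -d)"
mkdir -p "$workdir/system/bin" "$workdir/conda/bin"
for prefix in system conda; do
    # Each stand-in tool just prints which directory it came from.
    printf '#!/bin/sh\necho %s\n' "$prefix" > "$workdir/$prefix/bin/demo-tool"
    chmod +x "$workdir/$prefix/bin/demo-tool"
done

# Look up demo-tool with the given PATH, in a subshell so the real PATH
# of this script is left alone.
resolve() { ( PATH="$1"; command -v demo-tool ); }

# "Appending" the conda directory: the system copy wins.
appended="$(resolve "$workdir/system/bin:$workdir/conda/bin")"
# "Prepending" the conda directory: the conda copy wins.
prepended="$(resolve "$workdir/conda/bin:$workdir/system/bin")"

echo "appended:  $appended"
echo "prepended: $prepended"
rm -rf "$workdir"
```

Any tool that exists in only one of the two directories is found either way, which is how an appended entry can produce a mix of system and Conda binaries.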

Some possibilities:

  • From a user's perspective, I would naively expect sudo to carry over my full Conda environment and (almost) all environment variables.

  • From a security-conscious sysadmin perspective, I might expect sudo to have a path consisting only of system binaries.

Unfortunately these two are incompatible.

To argue against the security-conscious sysadmin perspective, the documentation states explicitly in bold, "You should only enable sudo if you trust the user or if the container is running on an isolated host." I feel like if someone enables GRANT_SUDO but doesn't fully trust the user, then it's their responsibility to properly configure their sudoers to fit their needs, not ours.

@maresb
Contributor

maresb commented Mar 28, 2020

I am trying to understand a bit about how sudoers works. As I understand, in the master branch, start.sh writes a new Defaults secure_path= line to /etc/sudoers.d/path. The sudoers file uses # as a comment character, except for the last line #includedir /etc/sudoers.d, which is a directive containing the # character (WTF). Thus it seems that the Defaults secure_path= in /etc/sudoers.d/path overrides the one from /etc/sudoers.

If we decide to make a secure_path just for jovyan, it looks like we can do Defaults:jovyan secure_path=...
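
As a sketch of that idea, the following writes a per-user Defaults line to a temporary file and, if visudo happens to be installed, asks it to check the syntax. The helper name `write_user_secure_path` and the secure_path value are illustrative assumptions; nothing here installs the file into /etc/sudoers.d.

```bash
#!/usr/bin/env bash
# Write a drop-in that scopes secure_path to one invoking user.
# Arguments: user name, secure_path value, output file.
write_user_secure_path() {
    local user="$1" secure_path="$2" out="$3"
    printf 'Defaults:%s secure_path="%s"\n' "$user" "$secure_path" > "$out"
}

dropin="$(mktemp)"
write_user_secure_path jovyan "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" "$dropin"
cat "$dropin"

# Check the syntax before installing it anywhere; visudo may not be present.
if command -v visudo >/dev/null; then
    visudo -c -f "$dropin" || echo "visudo reported a problem with $dropin" >&2
else
    echo "visudo not found, skipping syntax check" >&2
fi
```

Checking the file before it lands in /etc/sudoers.d matters because a syntax error in a sudoers file can leave sudo unusable.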

@consideRatio consideRatio force-pushed the preserve-ld-library-path branch from 3665eb2 to 3308271 Compare March 29, 2020 02:56
@consideRatio
Collaborator Author

consideRatio commented Mar 29, 2020

@maresb yeah haha #include is a statement which has meaning I think, while # include would be a comment :)

I've started to think quite clearly now about everything:

Our modification of secure_path:

  1. If we don't grant sudo, our choice of secure_path configuration for sudoers will only impact the single use of sudo we make from within start.sh, where we have seemingly wanted to pass along PATH entirely, which seems safe to me.
  2. If we do grant sudo, we have the option to limit the jovyan user we start up as by having secure_path set. But this jovyan could then simply do sudo su and update the /etc/sudoers file, for example. So it doesn't make sense to me to trouble the sudo-empowered jovyan; we should instead let this power user not run into such trouble. So secure_path should be disabled no matter what!

Our use of -E / --preserve-env:
We use sudo --preserve-env (-E), which makes all environment variables that were set when the container started (except certain ones such as LD_LIBRARY_PATH) be passed along when we switch from root to the actual user. That this happens is very reasonable to me, since not starting as root would give the jovyan user that environment directly.

I think there shouldn't be a difference between starting the docker container as root or jovyan with regards to what the user will have in its environment, and we should assume that any environment variables set when starting the container are accessible to the actual user we transition to in the start.sh script. But if we want to hide something from the user, we can create an exception, for example for any variable listed in JUPYTER_ENV_VARS_TO_UNSET.

@consideRatio consideRatio force-pushed the preserve-ld-library-path branch from 3308271 to 407f7ff Compare March 29, 2020 03:59
@consideRatio
Collaborator Author

@maresb I made plenty of changes and made this PR a lot more readable.
@parente I now consider it ready for review if you would have some time to spend on this.

I've updated the initial post with a PR summary and wrote some review suggestions to help you avoid reading too much.

@consideRatio changed the title from "WIP: Preserve LD_LIBRARY_PATH and PYTHON* as sudo powered jovyan" to "Review: Preserve environment (PATH, LD_LIBRARY_PATH, PYTHON*) when starting container as root" on Mar 29, 2020
@consideRatio changed the title from "Review: Preserve environment (PATH, LD_LIBRARY_PATH, PYTHON*) when starting container as root" to "Review: Preserve environment better when starting container as root" on Mar 29, 2020
@rkdarst
Contributor

rkdarst commented Mar 29, 2020

You know, it just occurred to me that #787 is relevant here (which I've been using at least as long as it's been open). These problems come because we try to directly start the notebook, without a chance for user configuration. In my cluster, I use #787 and then have an extensive hooks directory that does things like this after the sudo. Since we are doing complex things here, this order seems more natural: first the system is booted, then we change to the user, then the user's environment is set up, then the notebook runs.

If I reworked this, I would remove all the conda stuff before the sudo, and have a user hook that does "source activate conda" in there. This doesn't work if someone starts the image manually and sudos manually as part of testing stuff (but well, then what we see here wouldn't happen either...).

@consideRatio
Collaborator Author

consideRatio commented Mar 29, 2020

Thanks for linking that PR! I have been thinking about these hooks as well. Let's work on this too, while making it safe to merge without risking breaking people's deployments!

Really glad to have you thinking about these issues with me @rkdarst and @maresb! ❤️

@maresb
Contributor

maresb commented Mar 29, 2020

I don't want to sidetrack this PR, but for the record, my current use case for start.sh is to use it as a wrapper for executing commands in the container from the host via docker exec. Thus I would make a "container run" command like

alias crun='docker exec my-container start.sh'

so that I can do something like crun ls to run commands as jovyan. I can see two issues with this:

  1. We might not want to run all hooks in such a scenario. We might need two types of hooks: startup hooks and environment hooks. Startup hooks are intended to run once and start necessary services, while environment hooks help achieve the correct environment for running commands as above. (Then we would want the cartesian product of these with @rkdarst's user/root hooks, and things get complicated...)

  2. Logging is going to cause conflicts. In addition to seeing the output of ls the output of the debugging commands would be mixed in. My ideal solution to this would be to have a default log-level of info and output to stderr, but make this adjustable via environment variables. It may be useful to use an established logging library. I did a few quick searches and found logr, but I have never tried it out.
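
A possible shape for such logging, written without any external library. The variable name `START_SH_LOG_LEVEL` and the function names are invented for this sketch; the only points it illustrates are "write to stderr" and "default to info, adjustable through the environment".

```bash
#!/usr/bin/env bash
# Map a level name to a number so levels can be compared.
log_level_number() {
    case "$1" in
        debug) echo 10 ;;
        info) echo 20 ;;
        warning) echo 30 ;;
        error) echo 40 ;;
        *) echo 20 ;;  # unknown levels fall back to info
    esac
}

# Usage: log LEVEL MESSAGE...
# Messages go to stderr so they never mix with the stdout of the
# wrapped command.
log() {
    local level="$1"; shift
    local threshold
    threshold="$(log_level_number "${START_SH_LOG_LEVEL:-info}")"
    if [ "$(log_level_number "$level")" -ge "$threshold" ]; then
        printf '[%s] %s\n' "$level" "$*" >&2
    fi
}

log debug "hooks directory scanned"    # hidden at the default level
log info "Executing the command: ls"   # shown on stderr
```

With something like this in place, `crun ls 2>/dev/null` would show only the output of ls, and setting the level to error would silence the informational lines entirely.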

With @consideRatio's major improvements already in this PR, perhaps it's wise to take this one step at a time, deferring these to a follow-up PR.

run-hooks /usr/local/bin/before-notebook.d
echo "Executing the command: ${cmd[@]}"
exec sudo -E -H -u $NB_USER PATH=$PATH XDG_CACHE_HOME=/home/$NB_USER/.cache PYTHONPATH=${PYTHONPATH:-} "${cmd[@]}"
Contributor

When the PATH=$PATH is here, it actually does the right thing and path is correct (/opt/conda/bin is in the front, not at the end). It seems like a lot of work to remove this and then find the ways to add it again (even though the new way is actually a bit more "correct", though - but with wider-reaching side-effects and more quirks).

Simple and explicit would be adding in LD_LIBRARY_PATH=$LD_LIBRARY_PATH here... I'm not sure it's best but it's minimal until we decide what's best (about to file a new issue with thoughts...)
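
A sketch of what the explicit approach could look like if it were generalized slightly: collect NAME=value pairs for PATH, LD_LIBRARY_PATH and any PYTHON* variable that is set, then hand them to sudo on the command line. The function and array names are made up, the command is only printed rather than executed, and whether sudo honors assignments for variables on its removal list depends on the sudoers policy in effect.

```bash
#!/usr/bin/env bash
# Fill the global array env_assignments with NAME=value pairs for the
# variables that should cross the root -> user switch.
build_env_assignments() {
    local name
    env_assignments=()
    # ${!PYTHON@} expands to the names of all variables starting with PYTHON.
    for name in PATH LD_LIBRARY_PATH ${!PYTHON@}; do
        # Forward only variables that are set, so nothing is introduced
        # as an empty value (unlike PYTHONPATH=${PYTHONPATH:-}).
        if [ -n "${!name+x}" ]; then
            env_assignments+=("$name=${!name}")
        fi
    done
}

LD_LIBRARY_PATH="/usr/local/nvidia/lib64"
PYTHONPATH="/opt/extra"
build_env_assignments
# Print the command instead of running it; NB_USER falls back to jovyan here.
printf '%q ' sudo -E -H -u "${NB_USER:-jovyan}" "${env_assignments[@]}" "$@"
echo
```

Printing the command first makes it easy to inspect exactly which assignments would be passed before wiring this into an exec line.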

Contributor

> When the PATH=$PATH is here, it actually does the right thing and path is correct (/opt/conda/bin is in the front, not at the end).

I'm pretty sure that PATH=$PATH is not doing the right thing here, even though it may look like it is. See #1053 for details. It's very subtle, and even the which command gives the "wrong" answer.

@consideRatio changed the title from "Review: Preserve environment better when starting container as root" to "WIP: Preserve environment better when starting container as root" on Mar 30, 2020
@romainx
Collaborator

romainx commented Mar 30, 2020

Hello,

I did not have the time to review all the changes. But maybe while performing this big refactoring, it could be interesting (or not) to also implement what has been suggested here: #1034 (comment).

Since you have checked in detail the behavior of the big start.sh, your opinion is valuable on this topic.
What do you think about that? Does it make sense?

Best

Allow you to define variables to be unset before running the command
that start.sh is supposed to start. These variables will still be
available in the hooks run before.
It is meant to allow you to opt out from non-error non-warning logs
generated by start.sh.
@consideRatio consideRatio force-pushed the preserve-ld-library-path branch from 565903a to 63295ba Compare April 6, 2020 07:50
@consideRatio
Collaborator Author

@romainx I added a commit regarding #1034!

This is now no longer a PR to fix a specific thing, but a bigger rework of the startup scripts.

@consideRatio changed the title from "WIP: Preserve environment better when starting container as root" to "Rework of start*.sh scripts" on Apr 6, 2020
@consideRatio
Collaborator Author

Current considerations

I'm letting this PR become a workbench to fix all kinds of issues and limitations relating to the startup scripts. Here are some considerations.

root/user based hooks

@rkdarst has done work in #787 regarding hooks to run as root / user.

  • -user / -root suffixed hooks added
  • tests with mounted hooks added

exec command in an already started container as a user

docker exec or kubectl exec can run a command in an already started container. This is also the mechanism used in Kubernetes pod lifecycle hooks.

When an exec command runs, it runs independently from the command that started the container in the first place. This means any root -> user transitions made in past executions of start.sh are irrelevant, and we may want to switch to the user we configured in the initial call to start.sh.

It would be good to have a documented approach on how to do this. We may need a dedicated helper script, since re-executing start.sh could cause chown, hooks, etc. to run again. At the same time, some parts of start.sh, such as the newly added JUPYTER_ENV_VARS_TO_UNSET, may also be wanted in this script.

Currently I suggest we extract the "run command" part of start.sh to be reused in a standalone manner without triggering user creation, chown, hooks, etc.

# 1. Waiting for the initial start.sh to finish can be vital as it creates the user etc.
# <waiting logic>
# Implementation idea: we let file existing/not-existing indicate if start.sh has finished running hooks etc, and let this be created/deleted in start.sh just before running the passed command.

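# 2. Run command as user
# The command to run is taken from the script arguments.
cmd=("$@")
if [ "$(id -u)" -eq 0 ]; then
    # Started as root: switch to the configured user before running.
    exec sudo --preserve-env --set-home --user "$NB_USER" "${cmd[@]}"
else
    # Already running as the user: run the command directly.
    exec "${cmd[@]}"
fi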
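
The waiting logic from step 1 could be sketched roughly as follows, using the marker-file idea described above. The function name, the marker path and the timeout default are placeholders; the demo simulates start.sh finishing by touching the file from a background job.

```bash
#!/usr/bin/env bash
# Block until the marker file exists or the timeout (in seconds) expires.
# start.sh would create the marker just before running the passed command.
wait_for_start_sh() {
    local marker="$1" timeout="${2:-60}" waited=0
    until [ -e "$marker" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "Timed out after ${timeout}s waiting for $marker" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
}

marker="$(mktemp -d)/start-sh-done"
# Simulate start.sh finishing its setup one second from now.
( sleep 1; touch "$marker" ) &
if wait_for_start_sh "$marker" 10; then
    echo "start.sh has finished its setup, safe to run the command"
fi
wait
```

A file on disk is visible to later docker exec or kubectl exec sessions, which is why it works as a signal between otherwise unrelated processes.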

NB_UMASK implementation adjustment

I think it may be problematic that the NB_UMASK configuration from #781 is applied in jupyter_notebook_config.py rather than in the startup script. Perhaps this should be part of the transition to user space instead of being part of jupyter_notebook_config.py, which assumes you only need it for your Jupyter server.


@rkdarst I think your initial questions section in #1055 is addressed by this discussion. I'm not sure how to proceed, but I think it could be suitable to:

  • Add -user.d and -root.d hooks.
  • Extract the final launch command logic into a separate script that can be run by itself.
    • Let this script ensure start.sh has run hooks once, through a waiting mechanism.
    • Let this script adjust to being started as root or as the user.

@consideRatio
Collaborator Author

@rkdarst I'm struggling with sourcing of scripts. I don't know how the start.sh script started as root can source something that is meant to run as the user, and I'd dislike introducing something that behaves inconsistently.

I think one may need something like this in the case where sudo needs to be used to switch to the user environment: https://unix.stackexchange.com/a/269080/257111

@consideRatio changed the title from "Rework of start*.sh scripts" to "[WIP] Rework of start*.sh scripts" on Apr 16, 2020
@rkdarst
Contributor

rkdarst commented Apr 16, 2020 via email

@maresb
Contributor

maresb commented Apr 19, 2020

Based on my understanding, it now looks to me like -user.d and -root.d hooks are the wrong approach. (I changed my mind.) They add another layer of complexity to the startup scripts. One can always use if [ $(id -u) == 0 ] in order to customize behavior based on if one is root or not.
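
For illustration, a single hook that customizes its behavior this way might look like the sketch below. The function name `run_hook` and the echoed messages are placeholders; the uid is accepted as an optional argument only so that both branches can be tried without being root.

```bash
#!/usr/bin/env bash
# Hypothetical hook body. The uid defaults to that of the current process.
run_hook() {
    local uid="${1:-$(id -u)}"
    if [ "$uid" -eq 0 ]; then
        # Root-only work would go here, e.g. usermod or chown.
        echo "root branch"
    else
        # User-level work would go here, e.g. activating an environment.
        echo "user branch"
    fi
}

run_hook        # uses the uid of the current process
run_hook 0      # force the root branch
run_hook 1000   # force the user branch
```

One hook directory with branching scripts keeps the startup flow in a single place, at the cost of each hook having to carry its own check.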

The root of the problem (which @rkdarst seems to have been trying to solve with the hooks system) seems to be the lack of a good script to run a command as a user. I think we should do this first, and then we can reevaluate if the -user.d and -root.d hooks are actually necessary.

(Apologies to @rkdarst if I'm misrepresenting #787.)

@parente
Member

parente commented Apr 19, 2020

@consideRatio thank you for your time and work on rethinking the startup scripts. I agree they have grown unwieldy over the years as use of these images has grown and we (I) failed to define a clear scope for the project.

I truly appreciate that others have stepped up to review these changes and share their experience (@maresb, @rkdarst) as I have not had much energy to contribute deep thought to open source lately. That said, if there's something specific I can contribute here, please let me know and I'll try to do it.

@parente
Member

parente commented Nov 29, 2020

I'm taking a pass through old PRs and cleaning up ones that haven't been touched in some time. I'm planning to leave this one alone given the depth of conversation and relation to other on-going discussions (e.g., jupyterhub/mybinder.org-deploy#1474)

@consideRatio
Collaborator Author

I think this PR has a lot of relevant discussion and changes, but at this point it is too large, and it is hard to get an overview of what it tries to accomplish. I suggest we close it. It is still referenced from the various issues it was meant to close, so it remains findable.

Development

Successfully merging this pull request may close these issues.

Wrong PATH in start.sh when running as root
NOTEBOOK_ARGS passed to image, but has no effect
5 participants