
Add information about using NFS with z2jh #421

Open
yuvipanda opened this issue Jan 18, 2018 · 63 comments

Comments

@yuvipanda
Collaborator

NFS is still a very popular storage setup, and is a good fit for use with z2jh in several cases:

  1. When you are supporting a large number of users
  2. When you are running on bare metal and NFS is your only option
  3. When your utilization % (% of total users active at any time) is very low, causing you to spend more on storage than compute.

While we don't want to be on the hook for teaching users to set up and maintain NFS servers, we should document how to use an NFS server that already exists.

@cam72cam
Contributor

cam72cam commented Jan 19, 2018

@yuvipanda I'd like to be a guinea pig on this. I am trying to set up a persistent EFS volume and use that as the user storage.

So far I've created and applied:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-persist
spec:
  capacity:
    storage: 123Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: <fs-id>.efs.us-east-1.amazonaws.com
    path: "/"
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: efs-persist
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 11Gi

After that I added the following to my config.yaml:

singleuser:
  storage:
    static:
      pvc-name: efs-persist

I am pretty sure I am missing a few key ideas here

Edit:
First change was to add "type: static" to the storage section in the config.
Second was changing pvc-name to pvcName.

@yuvipanda
Collaborator Author

w00t, thanks for volunteering :)

The other two things to keep in mind are:

  1. subPath (https://github.com/jupyterhub/zero-to-jupyterhub-k8s/blob/master/jupyterhub/values.yaml#L138). This specifies where inside the share the user's home directory should go. Although it defaults to just {username}, I would recommend something like home/{username}
  2. User permissions. This can be a little tricky, since IIRC when kubelet creates a directory for subPath mounting it makes it uid 0 / gid 0, which is problematic for our users (with uid 1000 by default). The way we've worked around it right now is by using anongid / anonuid properties in our NFS share, but that's not a good long-term solution. I've been working on https://github.com/yuvipanda/nfs-flex-volume as another option here. Is anongid / anonuid an option with EFS?
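For reference, on a plain NFS server we control, that workaround is just an export option. A sketch of the /etc/exports entry (the export path and client range are placeholders; 1000:1000 matches the default jovyan uid/gid):

# all_squash maps every client uid/gid to anonuid/anongid, so directories
# created through the mount end up owned by 1000:1000 rather than root:root
/export/home  10.0.0.0/16(rw,sync,no_subtree_check,all_squash,anonuid=1000,anongid=1000)

(run exportfs -ra after editing the file)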

@cam72cam
Contributor

  1. I saw that, thanks for the clearer explanation
  2. I don't think that anonuid/gid is an option on EFS

I'll take a look through the nfs-flex-volume repo

@yuvipanda
Collaborator Author

@cam72cam another way to check that everything works right now except for permissions is to set:

singleuser:
  uid: 0
  fsGid: 0
  cmd: 
    - jupyterhub-singleuser
    - --allow-root

If you can launch servers with that, then we can confirm that the uid situation is the only problem.

@cam72cam
Contributor

I am currently getting the following response:

 HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Pod in version \"v1\" cannot be handled as a Pod: v1.Pod: Spec: v1.PodSpec: Containers: []v1.Container: v1.Container: VolumeMounts: []v1.VolumeMount: v1.VolumeMount: SubPath: ReadString: expects \" or n, parsing 184 ...ubPath\": {... at {\"kind\": \"Pod\", \"spec\": {\"containers\": [{\"imagePullPolicy\": \"IfNotPresent\", \"lifecycle\": {}, \"ports\": [{\"containerPort\": 8888, \"name\": \"notebook-port\"}], \"volumeMounts\": [{\"subPath\": {\"username\": null}, \"name\": \"home\", \"mountPath\": \"/home/jovyan\"}, {\"readOnly\": true, \"name\": \"no-api-access-please\", \"mountPath\": \"/var/run/secrets/kubernetes.io/serviceaccount\"}], \"env\": [{\"name\": \"JUPYTERHUB_HOST\", \"value\": \"\"}, {\"name\": \"JUPYTERHUB_CLIENT_ID\", \"value\": \"user-efs4\"}, {\"name\": \"JUPYTERHUB_API_TOKEN\", \"value\": \"355dc09aca1143f580ee0435339cc18d\"}, {\"name\": \"JUPYTERHUB_USER\", \"value\": \"efs4\"}, {\"name\": \"EMAIL\", \"value\": \"efs4@local\"}, {\"name\": \"GIT_AUTHOR_NAME\", \"value\": \"efs4\"}, {\"name\": \"JUPYTERHUB_ADMIN_ACCESS\", \"value\": \"1\"}, {\"name\": \"JUPYTERHUB_SERVICE_PREFIX\", \"value\": \"/user/efs4/\"}, {\"name\": \"JPY_API_TOKEN\", \"value\": \"355dc09aca1143f580ee0435339cc18d\"}, {\"name\": \"JUPYTERHUB_API_URL\", \"value\": \"http://100.65.96.26:8081/hub/api\"}, {\"name\": \"JUPYTERHUB_BASE_URL\", \"value\": \"/\"}, {\"name\": \"JUPYTERHUB_OAUTH_CALLBACK_URL\", \"value\": \"/user/efs4/oauth_callback\"}, {\"name\": \"GIT_COMMITTER_NAME\", \"value\": \"efs4\"}, {\"name\": \"MEM_GUARANTEE\", \"value\": \"1073741824\"}], \"image\": \"jupyterhub/k8s-singleuser-sample:v0.5.0\", \"resources\": {\"limits\": {}, \"requests\": {\"memory\": 1073741824}}, \"args\": [\"jupyterhub-singleuser\", \"--ip=\\\"0.0.0.0\\\"\", \"--port=8888\"], \"name\": \"notebook\"}], \"securityContext\": {\"runAsUser\": 1000, \"fsGroup\": 1000}, \"volumes\": [{\"persistentVolumeClaim\": {\"claimName\": \"efs-persist\"}, \"name\": \"home\"}, {\"emptyDir\": {}, \"name\": \"no-api-access-please\"}], \"initContainers\": []}, \"metadata\": {\"labels\": {\"hub.jupyter.org/username\": \"efs4\", \"heritage\": \"jupyterhub\", \"component\": \"singleuser-server\", \"app\": \"jupyterhub\"}, \"name\": \"jupyter-efs4\"}, \"apiVersion\": \"v1\"}","reason":"BadRequest","code":400}

I suspect it has to do with:

              'volumes': [{'name': 'home',
                           'persistentVolumeClaim': {'claimName': 'efs-persist'}},
                          {'aws_elastic_block_store': None,
                           'azure_disk': None,
                           'azure_file': None,
                           'cephfs': None,
                           'cinder': None,
                           'config_map': None,
                           'downward_api': None,
                           'empty_dir': {},
                           'fc': None,
                           'flex_volume': None,
                           'flocker': None,
                           'gce_persistent_disk': None,
                           'git_repo': None,
                           'glusterfs': None,
                           'host_path': None,
                           'iscsi': None,
                           'name': 'no-api-access-please',
                           'nfs': None,
                           'persistent_volume_claim': None,
                           'photon_persistent_disk': None,
                           'portworx_volume': None,
                           'projected': None,
                           'quobyte': None,
                           'rbd': None,
                           'scale_io': None,
                           'secret': None,
                           'storageos': None,
                           'vsphere_volume': None}]},

It should have the nfs option set there if I understand correctly

Actually:

'volume_mounts': [{'mountPath': '/home/jovyan',
                                                 'name': 'home',
                                                 'subPath': {'username': None}},
                                                {'mount_path': '/var/run/secrets/kubernetes.io/serviceaccount',
                                                 'name': 'no-api-access-please',
                                                 'read_only': True,
                                                 'sub_path': None}],

It appears that subPath is not being set correctly.

@yuvipanda
Collaborator Author

So it turns out that manually specifying a subPath of home/{username} was required; we should investigate why.
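For reference, the storage section that ends up working (the same one as in the full writeup further down) is:

singleuser:
  storage:
    type: "static"
    static:
      pvcName: "efs-persist"
      subPath: 'home/{username}'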

@yuvipanda
Collaborator Author

The PVC needs to be in the same namespace as JupyterHub, so the pods can find it.
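In practice that just means creating the claim in the hub's namespace, e.g. (namespace and manifest name are placeholders):

kubectl --namespace=<hub-namespace> apply -f <pvc-manifest>.yaml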

@yuvipanda
Collaborator Author

The PVC needs to be told how to find the PV to match, and this is done by using:

  1. Labels to match PVC and PV
  2. Setting the storageClassName of the PVC to '' (so Kubernetes does not try to dynamically provision a new PV for it)

So

apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-persist
  labels:
    <some-label1-key>: <some-label1-value>
    <some-label2-key>: <some-label2-value>
spec:
  capacity:
    storage: 123Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: <fs-id>.efs.us-east-1.amazonaws.com
    path: "/"
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: efs-persist
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  selector:
    matchLabels:
       <some-label1-key>: <some-label1-value>
       <some-label2-key>: <some-label2-value>
  resources:
    requests:
      storage: 11Gi

Things to note:

  1. Specify labels that uniquely identify this PV in your entire k8s cluster.
  2. The size requests bits are ignored both in the PV and PVC for EFS specifically, since it grows as you use it.
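A quick way to check that the matching worked (assuming the names above):

kubectl get pv efs-persist                                # STATUS should be Bound
kubectl --namespace=<hub-namespace> get pvc efs-persist   # STATUS Bound, VOLUME efs-persist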

@cam72cam
Contributor

cam72cam commented Jan 19, 2018

Ok, I've got the mount working. I did not do the label stuff yet, simply set 'storageClassName: ""' in the claim. That seemed to work just fine.

I ran into a speed bump where I had to change the security groups to allow access from EC2 to EFS. As a temporary measure I added both the EFS volume and the EC2 instances to the "default" security group. Eventually part of the initial kops config should add the correct security groups.

I am now getting a permission error:
PermissionError: [Errno 13] Permission denied: '/home/jovyan/.jupyter'

I am going to try to change the permissions on the EFS drive first, and if that does not work try the root hack that @yuvipanda mentioned

EDIT: A manual chown on the EFS volume to 1000:1000 seems to have worked!
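For anyone following along, the manual fix amounts to mounting the share from an EC2 instance in the same VPC and chowning the home tree. A rough sketch (the mount point and filesystem id are placeholders):

sudo mkdir -p /mnt/efs
sudo mount -t nfs4 -o nfsvers=4.1 fs-XXXXXXXX.efs.us-east-1.amazonaws.com:/ /mnt/efs
sudo mkdir -p /mnt/efs/home
sudo chown -R 1000:1000 /mnt/efs/home
sudo umount /mnt/efs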

@cam72cam
Contributor

EFS Success!

Process:

Set up an EFS volume. It must be in the same VPC as your cluster. This can be changed in the AWS settings after it has been created.
The EFS volume will be created in the default security group in the VPC. As a temporary workaround, add your cluster master and nodes to the default VPC security group so they can access the EFS volume. Eventually we will set up proper security groups as part of this process.
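(For the record, the proper rule is just to allow NFS, TCP port 2049, from the nodes' security group to the security group on the EFS mount targets. A sketch with placeholder group ids:)

aws ec2 authorize-security-group-ingress \
    --group-id <efs-mount-target-sg> \
    --protocol tcp --port 2049 \
    --source-group <cluster-node-sg>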

Created test_efs.yaml

apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-persist
spec:
  capacity:
    storage: 123Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: fs-$EFS_ID.efs.us-east-1.amazonaws.com
    path: "/"

Created test_efs_claim.yaml

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: efs-persist
spec:
  storageClassName: ""
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 11Gi

kubectl --namespace=cmesh-test apply -f test_efs.yaml
kubectl --namespace=cmesh-test apply -f test_efs_claim.yaml

The sizes in these files don't mean what you think. There is no quota enforced with EFS **. In the future we want to set the efs PersistentVolume size to something ridiculously large like 8Ei and the PersistentVolumeClaim to 10GB (neither matters AFAICT). This is my rough understanding and could be incorrect.

A PersistentVolume defines a service which can perform a mount inside of a container. The PersistentVolumeClaim is a way of reserving a portion of the PersistentVolume and potentially locking access to it.

The storageClassName setting looks innocuous, but it is incredibly critical. The only PV in the cluster without a storage class is the one we defined above. In the future we should tag different PVs and use tag filters in the PVC instead of relying on a default of "".

We are going to configure JupyterHub to use the same "static" claim among all of the containers***. This means that all of our users will be using the same EFS share, which should be able to scale as high as we need.

We now add the following to config.yaml

singleuser:
  storage:
    type: "static"
    static:
      pvcName: "efs-persist"
      subPath: 'home/{username}'

type: static tells JupyterHub not to use a storage class and instead use the PVC specified below.
pvcName matches the claim name we specified before
subPath tells where on the supplied storage the mount point should be. In this case it will be "$EFS_ROOT/home/{username}"

It turns out there is a bug in jupyterhub where the default subPath does not work, and setting the subPath to "{username}" breaks in the same way.

At this point, if we tried to start our cluster, it would fail. The directory created on the mount at subPath will be created with uid:0 and gid:0. This means that when the single-user server is launched it won't be able to create any files, will complain, and then self-destruct.

What we need to do is tell the container to run our JupyterHub setup as root, then switch to the jovyan user before starting the single-user process. While we are running as root we can do our own chown to adjust the created directory's permissions.

First we merge the following into our config.yaml

singleuser:
  uid: 0
  fsGid: 0
  cmd: "start-singleuser.sh"

This tells JupyterHub to enter the container as root and run the start-singleuser.sh script. start-singleuser.sh calls a helper start.sh script which we will use later on.

This will get JupyterHub to provision the container and attempt to start it, but the process will still fail as the chown has not taken place.

In order for us to have a properly chowned directory at /home/jovyan mounted from $EFS_ROOT/home/{username}, we need to create our own Docker image****.

Here are some terse steps:

  • Create a Docker Hub account
  • Create a Docker Hub repo
  • Create a directory to store the build file
  • Create a Dockerfile inside that directory

FROM jupyter/base-notebook:281505737f8a

# pin jupyterhub to match the Hub version
# set via --build-arg in Makefile
ARG JUPYTERHUB_VERSION=0.8
RUN pip install --no-cache jupyterhub==$JUPYTERHUB_VERSION

USER root
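# The sed below injects "chown 1000:1000 /home/$NB_USER" into start.sh just before
# its "Handle username change" step, so the root-owned subPath directory becomes
# writable by the jovyan user when the container starts.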
RUN sed -i /usr/local/bin/start.sh -e 's,# Handle username change,chown 1000:1000 /home/$NB_USER \n # Handle username change,'
RUN cat /usr/local/bin/start.sh
USER $NB_USER

The base dockerfile came from 967b2d2#diff-aed8b29ee8beb1247469956c481040c2
Notice that we are using the older revision. The newer revision is broken in some awesome way that Yuvi needs to fix.
This script is fragile and should be done better in the future...
The first part of the Dockerfile runs as the default $NB_USER.
The middle section runs as root, since we need to modify the start.sh script (which will run as root when the container is started), before switching back to $NB_USER.
Many of the files referenced can be found in https://github.com/jupyter/docker-stacks/tree/master/base-notebook
sudo yum install docker
sudo docker login
sudo docker build ${directory_containing_dockerfile}
sudo docker tag ${image_id_in_cmd_output} $docker_username/$docker_repo # can also be found by sudo docker images
sudo docker push $docker_username/$docker_repo

Merge the following into config.yaml

singleuser:
  image:
    name: $docker_username/$docker_repo
    tag: latest

You may be able to do a helm upgrade, but I ended up purging and reinstalling via helm just to be safe.

At this point you should be all set with a semi-fragile (but functional) EFS-backed JupyterHub setup

Debugging tools: (all with --namespace=)

  • kubectl get pods # list pods
  • kubectl logs $podname # get a pod log, may not be anything if the crash happens soon enough
  • kubectl describe pod $podname # dumps a bunch of useful info about the pod
  • kubectl get pod $podname -o yaml # dumps the args used to create the pod. This stuff is the container creation stuff
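One more that was handy for this particular permissions issue (pod name is whatever kubectl get pods shows):

  • kubectl exec -it $podname -- ls -la /home/jovyan # check the uid/gid on the mounted home directory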
** fuse layer for fs quota
*** We may run into issues with a hundred containers all hitting the same EFS volume.  I suspect that AWS can more than handle that, but I have been wrong before.  If it can't handle that Yuvi has a WIP nfs server sharding system partially built that we could use.
**** I hope that the changes I made to the base container will be adopted by the project as it seems relatively harmless to have in the start script.  Even if it is harmful to others, I would still like it in there as a config option (if possible).

@yuvipanda
Collaborator Author

Thank you for getting this working, @cam72cam!

To summarize, this mostly works, except for the issue of permissions:

  1. When using subPath, Kubernetes creates the directory if it doesn't already exist
  2. However, this will always be created as root:root
  3. Since we want our users to run as non-root, this won't work for us and we have to use hacks to do chowns.

It'll be great if we can fix EFS or Kubernetes to have options around 'what user / group / mode should this directory be created as?'

@cam72cam
Contributor

Could we add the chown hack I put in my image to the start.sh script in the stacks repo?
https://github.com/jupyter/docker-stacks/blob/master/base-notebook/start.sh#L19

Would that break any existing users' setups?

cam72cam added a commit to cam72cam/docker-stacks that referenced this issue Jan 31, 2018
I ran into an issue when trying to get this to work with an NFS server which I did not have direct control over (EFS). As part of the PersistentVolumeClaim, there is no easy way to set the UID and GID of the created directory on the networked FS.

My only concern with this chown is that some user out there might be running jupyterhub in an odd configuration where $NB_USER is not supposed to have these exact permissions on the storage.  I think this is quite unlikely, but it is worth mentioning. 

I chronicled my experiences with working around this issue and setting up z2jh on EFS in jupyterhub/zero-to-jupyterhub-k8s#421 with @yuvipanda.
@manics
Member

manics commented Jan 31, 2018

@cam72cam
Contributor

I figure doing the chown ourselves resolves it for now (behind a config setting), and it can be removed once K8s finalizes if/how/when the subPath permissions should be set.

@cam72cam
Contributor

singleuser:
  image:
    name: jupyter/base-notebook
    tag: 29b68cd9e187
  extraEnv:
    CHOWN_HOME: 'yes'

Just confirmed the fix in my test env

@zcesur

zcesur commented Feb 28, 2018

I was able to use my NFS-backed persistent claim on Google Cloud as the user storage by following the steps @cam72cam outlined, so I can attest to his solution. Thanks for paving the way, guys!

@amanda-tan

Just to clarify, do we do a helm installation first using a config file with the start-singleuser.sh command inserted, and then do a helm upgrade using an updated config file with the singleuser image?

@cam72cam
Contributor

Either should work, though I'd recommend a clean install just to be safe.
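For reference, with the Helm 2 CLI that was current at the time, the two options look roughly like this (release name, namespace, and chart version are placeholders; this assumes the jupyterhub chart repo is already added):

# upgrade in place with the updated config.yaml
helm upgrade <release-name> jupyterhub/jupyterhub --version <chart-version> -f config.yaml

# or wipe and reinstall
helm delete <release-name> --purge
helm install jupyterhub/jupyterhub --version <chart-version> \
    --name <release-name> --namespace <namespace> -f config.yaml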

@choldgraf
Member

Hey all - it sounds like there's some useful information in this thread that hasn't made its way into Z2JH yet. Might I suggest that either:

  1. @cam72cam opens a PR to add a guide for NFS, similar to what's above
  2. If this isn't a solution we want to "officially" recommend yet for the reasons @yuvipanda mentions above, @cam72cam should write up a little blog post and we can link to this.

What do folks think?

@cam72cam
Contributor

cam72cam commented May 1, 2018

We are currently doing an internal alpha with a setup similar to the one mentioned above and working out any minor issues which come up. @mfox22 I'd be up for either, what do you think?

My biggest concern with how it works at the moment is that a clever user could look at the system mounts and figure out how to do a userspace NFS mount of someone else's directory. I think we could get around that by configuring the PV differently, but I still have a lot to learn in that regard.

@choldgraf
Member

Well if this is "useful functionality that may introduce some bugs because it's in an 'alpha' state" kinda functionality, maybe a blog post kinda thing is better? One reason we added https://zero-to-jupyterhub.readthedocs.io/en/latest/#resources-from-the-community was to make it easier for people to add more knowledge to the internet without needing to be "official" z2jh instructions. :-)

If you like, I'm happy to give a writeup a look-through, and if it ever gets to a point that your team is happy with, we can revisit bringing it into Z2JH?

@fifar

fifar commented Jul 23, 2019

@albertmichaelj So, I don't think you've tried my solution. I have the same requirement as yours: mounting one PVC multiple times on the same pod. The way I got the name home: when you execute kubectl describe pod $pod_name and check the "Mounts" and "Volumes" sections, you'll find the necessary hints. And as @stefansedich mentioned, PVC -> PV is an exclusive 1-1 mapping; we're reusing the existing mapping, which is home.
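Concretely (namespace and username are placeholders):

kubectl --namespace=<hub-namespace> describe pod jupyter-<username>
# under "Mounts:" and "Volumes:" you'll see the chart's user-storage volume,
# which is named "home"; that's the name to reuse in extraVolumeMounts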

@albertmichaelj

@fifar I thought I had tried your solution, but I didn't quite understand what you were suggesting. Just for the sake of posterity, I'll lay out the problem I was having. My initial config was (basically) this:

    storage:
      type: "static"
      static:
        pvcName: "nfs-home"
        subPath: 'home/{username}'
      extraVolumes:
        - name: home
          persistentVolumeClaim:
            claimName: "nfs-home"
      extraVolumeMounts:
        - name: home
          mountPath: "/mnt/data"
          subPath: "shared_data"

When I had this, I got the following error message when I try to spawn the pod:

Spawn failed: (422) Reason: error HTTP response headers: HTTPHeaderDict({'Audit-Id': '08312c49-cc6c-4d97-9ed3-bc862151b44c', 'Content-Type': 'application/json', 'Date': 'Wed, 24 Jul 2019 13:08:15 GMT', 'Content-Length': '372'}) HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Pod \"jupyter-albertmichaelj\" is invalid: spec.volumes[2].name: Duplicate value: \"home\"","reason":"Invalid","details":{"name":"jupyter-albertmichaelj","kind":"Pod","causes":[{"reason":"FieldValueDuplicate","message":"Duplicate value: \"home\"","field":"spec.volumes[2].name"}]},"code":422}

basically saying that I had a duplicate resource (which I did, two things named home). However, when I used the config:

    storage:
      type: "static"
      static:
        pvcName: "nfs-home"
        subPath: 'home/{username}'
      extraVolumeMounts:
        - name: home
          mountPath: "/mnt/data"
          subPath: "shared_data"

it works! The problem was that I had the extraVolumes entry for the home volume in there again (home is already implicitly included by the static storage config).

This now works great, and I can use a single PV and PVC for as many mounts as I'd like! I had been creating dozens of PVs and PVCs (which is not that hard to do with a helm template, but it is annoying) in order to mount multiple shared volumes (groups, data, whole class, etc...). This is much more elegant.
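For example, extra shared directories from the same claim are just more entries under extraVolumeMounts, all reusing the implicit home volume (the second mount below is a hypothetical extra shared directory):

    storage:
      type: "static"
      static:
        pvcName: "nfs-home"
        subPath: 'home/{username}'
      extraVolumeMounts:
        - name: home
          mountPath: "/mnt/data"
          subPath: "shared_data"
        - name: home
          mountPath: "/mnt/class"
          subPath: "class_data"   # hypothetical second shared directory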

Thanks @fifar.

@fifar

fifar commented Jul 24, 2019

@albertmichaelj Good to know it works for you. The config without extraVolumes is exactly my suggestion. Actually, your solution (the subPath in extraVolumeMounts) inspired me a lot when I started thinking about reusing one EFS volume for the same pod.

@sebastian-luna-valero

Hi,

I was also struggling to get z2jh up and running on-prem, and was inspired by:
https://raymondc.net/2018/12/07/kubernetes-hosted-nfs-client.html

My solution was to helm install the nfs-client chart:
https://github.com/kubernetes-incubator/external-storage/tree/master/nfs-client

and make it the default storage class on my kubernetes cluster. Specific steps:

# install nfs-client chart
helm install stable/nfs-client-provisioner --set nfs.server=kubeserver --set nfs.path=/home --name nfs-client --namespace jhub
# define it as default storage class
kubectl patch storageclass nfs-client -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
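Alternatively (or in addition), you can point the chart's dynamic storage at the provisioner explicitly in config.yaml instead of relying on the cluster default storage class:

singleuser:
  storage:
    dynamic:
      storageClass: nfs-client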

I hope that helps others as well.

Best regards,
Sebastian

PS1: This could help here:
#593

PS2: Similar issue here:
#1320

PS3: This solution works when you have your own NFS server deployed. If not, you should try this other chart instead:
https://github.com/helm/charts/tree/master/stable/nfs-server-provisioner

@akaszynski

Thanks to everyone who commented on this. I was having issues with long login times when using JupyterHub on Azure Kubernetes Service (AKS), and was able to take the login times from two minutes to 20 seconds by using NFS. For anyone who is interested, here's how I did it:

  • Create a NFS share on a VM on the same virtual network as my cluster
  • Create a persistent volume and persistent volume claim to that NFS
  • Update the config.yaml
# persistent volume
apiVersion: v1
kind: PersistentVolume
metadata:
  name: userdata-pv
  namespace: jup
spec:
  capacity:
    storage: 20000Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: 10.11.0.4
    path: "/userdata"
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: userdata-pvc
  namespace: jup
spec:
  storageClassName: ""
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 20Gi
# config.yaml
singleuser:
  storage:
    type: "static"
    static:
      pvcName: "userdata-pvc"
      subPath: 'home/{username}'
  uid: 0
  fsGid: 0
  cmd: "start-singleuser.sh"

@consideRatio
Member

@akaszynski was the delay related to attaching the default PVCs that were dynamically created by JupyterHub, or perhaps to their creation? Did you notice this big time difference between the first login ever and the second login, or was there a big delay even on second login and subsequent user pod startups?

@akaszynski

@consideRatio: The delay was for both. Maybe 1-2 minutes to create, 30-60 seconds to mount. When the end user expects things to load near instantly, having the progress bar hang there will drive them crazy!

I've even moved the hub-db-dir over to NFS as it was taking forever to create the hub pod every time I upgraded the cluster:

hub:
  extraVolumes:
    - name: hub-db-dir
      persistentVolumeClaim:
        claimName: userdata-pvc

@consideRatio
Member

@akaszynski thanks for sharing this insight! ❤️

@meeseeksmachine

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/efs-multi-user-home-directories/3032/3

@josibake

Just finished setting up user NFS storage and a shared NFS folder based on this excellent issue and the CHOWN stuff in the base image (thanks @cam72cam!). Just checking in to see if this is still being worked on, either from the K8s side or to make it simpler in the Z2JH setup? I plan on writing a blog post on my experience, just because there were small things that took me forever to figure out (like running with singleuser.uid: 0 to get CHOWN to work), but I definitely see this as an important use case.

@manics
Member

manics commented Mar 12, 2020

As far as I know no-one's actively working on it. The problem with NFS is its flexibility: there are lots of ways to set it up, ranging from managed public cloud fileservers to NFS provisioners you manage yourself, so it'd be difficult to add something to the Z2JH chart. It'd be nice to make the documentation clearer though.

Perhaps a good start for someone would be to collate all the information in this issue as it's grown quite a lot and there are several configurations mentioned.

One option might be a Discourse post? I think it's possible to create an editable wiki post (@consideRatio ?).

@consideRatio
Member

One option might be a Discourse post? I think it's possible to create an editable wiki post (@consideRatio ?).

Yepp! The first "post" in a "topic" can be made into a "wiki post" by users of a certain trust level. This trust is gained after being registered for 24 hours on the forum.

@josibake

Great! I am doing a write-up internally; I'll take a crack at condensing that into a topic on Discourse this weekend. Thanks!

@consideRatio
Member

@josibake ❤️ thanks for taking the time to work on this, I think it would be very valuable summarizing this!

@meeseeksmachine

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/multi-nfs-mount-points/4610/2

@meeseeksmachine

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/additional-storage-volumes/5012/2

@meeseeksmachine

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/custom-dockerimage-for-jupyterhub-on-kubernetes/6118/6

@meeseeksmachine

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/warning-unable-to-mount-volumes-for-pod/6404/2

@satra

satra commented Nov 11, 2020

i am still running into this issue on AWS EFS with the https://github.com/kubernetes-sigs/aws-efs-csi-driver . when the subPath is mounted into the pod, the directory is owned by root with the following permissions

drwxr-xr-x 2 root users 6144 Nov 11 03:18 satra

and mounted in the container as 127.0.0.1:/home/satra 8.0E 0 8.0E 0% /home/jovyan

i was expecting CHOWN_HOME to change this but it doesn't. i have to mount the EFS into a separate pod and change the permissions manually for this to work.

i'm using the config discussed in this issue and in the z2jh documentation.

  storage:
    type: "static"
    static:
      pvcName: "efs-claim"
      subPath: 'home/{username}'
  extraEnv:
    CHOWN_HOME: 'yes'
  uid: 0
  fsGid: 0
  cmd: "start-singleuser.sh"

@satra

satra commented Nov 11, 2020

following up on a tip from @consideRatio on the gitter channel i have finally been able to get this operational using initContainers. relevant storage and initContainers sections here:

https://github.com/ABCD-ReproNim/reprohub/blob/2eadfce61a48445a60e3fae8c1be5447c8ee2d5c/dandi-info/config.yaml.j2#L80

the uid/fsGid/CHOWN_HOME settings were removed and are not necessary. in addition i went with explicit mounts and volumes, with storage type set to none.

@meeseeksmachine

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/sql-operationalerror-with-jh-default-config/12207/5

@a3626a
Contributor

a3626a commented Feb 18, 2022

following up on a tip from @consideRatio on the gitter channel i have finally been able to get this operational using initContainers. relevant storage and initContainers sections here:

https://github.com/ABCD-ReproNim/reprohub/blob/2eadfce61a48445a60e3fae8c1be5447c8ee2d5c/dandi-info/config.yaml.j2#L80

the uid/fsGid/CHOWN_HOME settings were removed and are not necessary. in addition i went with explicit mounts and volumes, with storage type set to none.

I love this, because it lets us launch JupyterHub as a normal user, not root. I think this should be the recommended way. And I think it is better to use the -R option for chown.

  storage:
    type: none
    extraVolumes:
      - name: persistent-storage
        persistentVolumeClaim:
          claimName: efs-claim
    extraVolumeMounts:
      - name: persistent-storage
        mountPath: '/home/jovyan'
        subPath: 'home/{username}'
  initContainers:
    - name: nfs-fixer
      image: alpine
      securityContext:
        runAsUser: 0
      volumeMounts:
      - name: persistent-storage
        mountPath: /nfs
        subPath: 'home/{username}'
      command:
      - sh
      - -c
      - (chmod 0775 /nfs; chown -R 1000:100 /nfs)  

@meeseeksmachine

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/how-to-link-gcp-vm-disks-allocated-by-dynamic-pvc-hexhexhex-and-claim-username-again-after-irrevocably-losing-all-kubernetes-yaml/13412/7
