[ENH]: Guided create bucket finalized #33

Merged
merged 15 commits into from
Jun 14, 2019

Conversation

ljlamarche
Contributor

This isn't finalized yet; I'm mostly opening this PR so we're all up to date on progress. I added several "helper" functions to validate input. Right now they're all just methods of the ResenCmd class. We discussed pulling them all into separate files, correct? Does that still seem reasonable?

The user interface seems to function at least. Give it a try and feel free to correct things!
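For reference, here is a minimal sketch of what one of these validation helpers could look like as a standalone function instead of a ResenCmd method. It encodes the bucket-name rule that the create_bucket prompt states later in this thread (no spaces, starts with a letter, fewer than 20 characters). The function name `is_valid_bucket_name` is hypothetical and not taken from the resen code.

```python
import re

# Rule from the create_bucket prompt: no whitespace, must start with a
# letter, fewer than 20 characters (so at most 19 in total).
_BUCKET_NAME_RE = re.compile(r"[A-Za-z]\S{0,18}")


def is_valid_bucket_name(name):
    """Return True if name satisfies the rule stated in the prompt.

    Hypothetical helper for illustration; not resen's actual validator.
    """
    return _BUCKET_NAME_RE.fullmatch(name) is not None
```

A free function like this can be unit tested without constructing the command-line class, which is the main argument for moving the helpers out of ResenCmd.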

@ljlamarche ljlamarche added the discussion Let's talk about this before doing anything label Jun 10, 2019
@ljlamarche
Contributor Author

Oh, and I haven't tackled how we want to handle the start_jupyter command yet.

@ljlamarche
Contributor Author

When a user enters a local path for mounting, should we check to make sure it exists?

@asreimer
Contributor

asreimer commented Jun 11, 2019 via email

@ljlamarche
Contributor Author

Sounds good. Also, just a stylistic thing, but would anyone be bothered if we make jupyter lab the default for start_jupyter? Jupyter lab starts when you choose yes at the end of the create_bucket sequence, so it would probably make sense.

@asreimer asreimer changed the title Enh guided create bucket finalized [ENH]: Guided create bucket finalized Jun 12, 2019
@asreimer
Contributor

asreimer commented Jun 12, 2019

Minor problem found.

>>> Mount storage to /home/jovyan/work? (y/n): y
>>> Enter local path: /home/asreimer/test_path
>>> Enter bucket path: /home/jovyan/work
Invalid path. Must start with: /home/jovyan/work
>>> Enter bucket path: /home/jovyan/work/
Invalid path. Must start with: /home/jovyan/work

I'm working on it.

Edit: Context: The "problem" here is that this doesn't match the server experience. On the server, /home/jovyan/work is used as a persistent user workspace for things like scripts and Jupyter notebooks created by the user, whereas /home/jovyan/mount is a location where users can mount in data, such as networked data drives, or a directory with 10s to 100s of GB of data that shouldn't be persisted in the bucket when it is frozen (to facilitate upload to persistent archives like Zenodo).
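As an aside on the transcript above, where the base path itself (with or without a trailing slash) is rejected: a minimal sketch of a containment check that accepts both forms is shown below. It assumes bucket paths are POSIX paths inside the container. The helper name `is_within` is hypothetical and this is not the check resen actually uses.

```python
from pathlib import PurePosixPath


def is_within(path, base):
    """Return True if path equals base or lies somewhere below it.

    PurePosixPath drops trailing slashes, so "/a/b/" and "/a/b" compare
    equal. It does NOT collapse "..", so callers that care about
    "/a/b/../c" should normalize the path first.
    """
    path = PurePosixPath(path)
    base = PurePosixPath(base)
    return path == base or base in path.parents
```

Comparing path components rather than raw string prefixes also avoids accepting a sibling such as /home/jovyan/workspace when the base is /home/jovyan/work.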

@asreimer
Contributor

Next on the list:

  1. change start_jupyter such that it starts bucket as well, but so that it doesn't start another jupyter lab if one is already running
  2. remove start_bucket and stop_bucket command line
  3. add stop_jupyter command that stops jupyter lab and stops bucket.
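The intended behaviour in the list above can be sketched roughly as follows. Everything here is a hypothetical in-memory model (the `BucketState` class, the `launch` callable, and both function bodies are assumptions for illustration); the real commands talk to Docker.

```python
class BucketState:
    """Hypothetical stand-in for a bucket; not resen's real data model."""

    def __init__(self, name):
        self.name = name
        self.running = False      # is the container up?
        self.jupyter_pid = None   # PID of jupyter lab, if one is running


def start_jupyter(bucket, launch):
    """Start the bucket if needed, then jupyter lab only if none is running.

    `launch` is a callable that starts jupyter lab and returns its PID.
    Returns True if a new server was launched, False if one already existed.
    """
    if not bucket.running:
        bucket.running = True
    if bucket.jupyter_pid is not None:
        return False
    bucket.jupyter_pid = launch()
    return True


def stop_jupyter(bucket):
    """Stop jupyter lab (if any) and then stop the bucket."""
    bucket.jupyter_pid = None
    bucket.running = False
```

With start and stop folded into these two commands, the separate start_bucket and stop_bucket commands have nothing left to do, which is why item 2 removes them.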

@asreimer asreimer marked this pull request as ready for review June 13, 2019 11:45
@asreimer
Contributor

Finally done and ready for testing.

@asreimer asreimer removed their request for review June 13, 2019 11:46
@asreimer
Contributor

@ljlamarche, can you test this and merge if it works? Please start by removing any buckets you already have running.

@asreimer
Contributor

After this is merged, we need to update usage docs in #31

@pmreyes2
Contributor

@asreimer I am experiencing a bug:
First step:

>>> Enter bucket name: b0
Please choose a version of resen-core.
Available versions: 2019.1.0rc1
>>> Select a version: 2019.1.0rc1
...
>>> Start bucket and jupyterlab? (y/n): y
...adding core...
...adding mounts...
Bucket created successfully!
...starting jupyterlab...
Jupyter lab can be accessed in a browser at: http://localhost:9000/?token=XXXXX

Everything is working great.
Second step: From the resen command line I type:
[resen] >>> stop_jupyter b0
Then the resen tool hangs.
Issuing CTRL+C to have some more information on the issue:

^CTraceback (most recent call last):
  File "/Users/e30824/python/python3.6_env/bin/resen", line 11, in <module>
    load_entry_point('resen==19.1.0', 'console_scripts', 'resen')()
  File "/Users/e30824/python/python3.6_env/lib/python3.6/site-packages/resen/resencmd.py", line 446, in main
    ResenCmd(res).cmdloop(intro)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/cmd.py", line 138, in cmdloop
    stop = self.onecmd(line)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/cmd.py", line 217, in onecmd
    return func(arg)
  File "/Users/e30824/python/python3.6_env/lib/python3.6/site-packages/resen/resencmd.py", line 220, in do_stop_jupyter
    status = self.program.stop_jupyter(bucket_name)
  File "/Users/e30824/python/python3.6_env/lib/python3.6/site-packages/resen/Resen.py", line 83, in stop_jupyter
    return self.bucket_manager.stop_jupyter(bucket_name)
  File "/Users/e30824/python/python3.6_env/lib/python3.6/site-packages/resen/Resen.py", line 605, in stop_jupyter
    pid = self.get_jupyter_pid(bucket['docker']['container'])
  File "/Users/e30824/python/python3.6_env/lib/python3.6/site-packages/resen/Resen.py", line 618, in get_jupyter_pid
    result = self.dockerhelper.execute_command(container,'ps -ef',detach=False)
  File "/Users/e30824/python/python3.6_env/lib/python3.6/site-packages/resen/Resen.py", line 805, in execute_command
    result = container.exec_run(command,detach=detach)
  File "/Users/e30824/python/python3.6_env/lib/python3.6/site-packages/docker/models/containers.py", line 192, in exec_run
    demux=demux
  File "/Users/e30824/python/python3.6_env/lib/python3.6/site-packages/docker/utils/decorators.py", line 19, in wrapped
    return f(self, resource_id, *args, **kwargs)
  File "/Users/e30824/python/python3.6_env/lib/python3.6/site-packages/docker/api/exec_api.py", line 168, in exec_start
    return self._read_from_socket(res, stream, tty=tty, demux=demux)
  File "/Users/e30824/python/python3.6_env/lib/python3.6/site-packages/docker/api/client.py", line 415, in _read_from_socket
    return consume_socket_output(gen, demux=demux)
  File "/Users/e30824/python/python3.6_env/lib/python3.6/site-packages/docker/utils/socket.py", line 137, in consume_socket_output
    return six.binary_type().join(frames)
  File "/Users/e30824/python/python3.6_env/lib/python3.6/site-packages/docker/api/client.py", line 409, in <genexpr>
    gen = (data for (_, data) in gen)
  File "/Users/e30824/python/python3.6_env/lib/python3.6/site-packages/docker/utils/socket.py", line 94, in frames_iter_no_tty
    (stream, n) = next_frame_header(socket)
  File "/Users/e30824/python/python3.6_env/lib/python3.6/site-packages/docker/utils/socket.py", line 66, in next_frame_header
    data = read_exactly(socket, 8)
  File "/Users/e30824/python/python3.6_env/lib/python3.6/site-packages/docker/utils/socket.py", line 51, in read_exactly
    next_data = read(socket, n - len(data))
  File "/Users/e30824/python/python3.6_env/lib/python3.6/site-packages/docker/utils/socket.py", line 31, in read
    select.select([socket], [], [])
KeyboardInterrupt

@asreimer
Contributor

That output is meaningless to me. You'll need to debug this on your end.

@pmreyes2
Contributor

Well, don't worry about the output. The resen tool just hangs after stop_jupyter. When I start and stop jupyter myself (not through the guided command), everything works fine.

@asreimer
Contributor

Ok, but why is it hanging? It works perfectly on my system.

@pmreyes2
Contributor

pmreyes2 commented Jun 13, 2019

Yes, it hangs, i.e. no more command prompt.

Some other resen tool comments:

  • when trying to stop jupyter from a bucket that does not exist: great response:
[resen] >>> stop_jupyter b5
ERROR: Bucket with name: b5 does not exist!
ERROR: Bucket with name: b5 does not exist!
  • when trying to stop jupyter from a bucket that is in exited mode, the resen tool crashes
  • when trying to start jupyter from a bucket that does not exist, resen tool also crashes

@asreimer
Contributor

This branch works perfectly in Windows 10 with Docker Toolbox.

@asreimer
Contributor

I fixed the double stop_jupyter error. It was simple. I can't debug/fix the hanging problem you are experiencing with the information you have provided here. It also doesn't happen on my Windows 10 machine, so you have to figure this out. We'll see if @ljlamarche also has the problem or not.

@pmreyes2
Contributor

@asreimer Sure. I'll try to see what is really happening. I will also test it on a Linux server.

@pmreyes2
Contributor

pmreyes2 commented Jun 13, 2019

On linux I am getting another error when entering the local path:

>>> Mount storage to /home/jovyan/work? (y/n): y
>>> Enter local path: /home/pablo/test_storage
Traceback (most recent call last):
  File "/home/pablo/python/python3.5.2/bin/resen", line 9, in <module>
    load_entry_point('resen==19.1.0', 'console_scripts', 'resen')()
  File "/home/pablo/python/python3.5.2/lib/python3.5/site-packages/resen/resencmd.py", line 446, in main
    ResenCmd(res).cmdloop(intro)
  File "/usr/lib/python3.5/cmd.py", line 138, in cmdloop
    stop = self.onecmd(line)
  File "/usr/lib/python3.5/cmd.py", line 217, in onecmd
    return func(arg)
  File "/home/pablo/python/python3.5.2/lib/python3.5/site-packages/resen/resencmd.py", line 69, in do_create_bucket
    local_path = self.get_valid_local_path('>>> Enter local path: ')
  File "/home/pablo/python/python3.5.2/lib/python3.5/site-packages/resen/resencmd.py", line 388, in get_valid_local_path
    if os.path.isdir(path):
  File "/usr/lib/python3.5/genericpath.py", line 42, in isdir
    st = os.stat(s)
TypeError: argument should be string, bytes or integer, not PurePosixPath

Linux: 4.4.0-141-generic #167-Ubuntu SMP Wed Dec 5 10:40:15 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
python version: 3.5.2
docker version: Docker version 18.09.1, build 4c52b90

Now, if I run stop_jupyter bucket_name in the resen tool on a bucket without storage (because mounting didn't work) where I let jupyter start at the end of the guided create_bucket command, the resen tool also hangs (as it did on my Mac).
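A note on the TypeError in the traceback above: on Python 3.5 the os.path functions do not accept pathlib objects (support for path-like objects arrived in Python 3.6), so passing a PurePosixPath to os.path.isdir fails exactly as shown. A minimal sketch of a version-independent check follows; the helper name `is_existing_dir` is hypothetical and not the fix that was committed.

```python
import os
from pathlib import PurePosixPath


def is_existing_dir(path):
    """Return True if path names an existing directory.

    Converting to str first keeps the call valid on Python 3.5, where
    os.path.isdir() rejects pathlib objects.
    """
    return os.path.isdir(str(path))


# Usage: works the same for plain strings and pathlib objects.
print(is_existing_dir(PurePosixPath("/no/such/dir/for/this/example")))
```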

@ljlamarche
Contributor Author

Ok, I had a chance to test this. I have not been able to reproduce the hanging issues that @pmreyes2 keeps running into at all... The only issue I had was start_jupyter crashes resen if you enter a bucket that does not exist.

[resen] >>> start_jupyter bucket_does_not_exist
Traceback (most recent call last):
  File "/anaconda2/envs/python3.6/bin/resen", line 10, in <module>
    sys.exit(main())
  File "/anaconda2/envs/python3.6/lib/python3.6/site-packages/resen/resencmd.py", line 446, in main
    ResenCmd(res).cmdloop(intro)
  File "/anaconda2/envs/python3.6/lib/python3.6/cmd.py", line 138, in cmdloop
    stop = self.onecmd(line)
  File "/anaconda2/envs/python3.6/lib/python3.6/cmd.py", line 217, in onecmd
    return func(arg)
  File "/anaconda2/envs/python3.6/lib/python3.6/site-packages/resen/resencmd.py", line 193, in do_start_jupyter
    ind = self.program.bucket_manager.bucket_names.index(bucket_name)
ValueError: 'bucket_does_not_exist' is not in list

I think we just need to replicate whatever check is in stop_jupyter for this purpose, which produces a very nice output.

[resen] >>> stop_jupyter bucket_does_not_exist
ERROR: Bucket with name: bucket_does_not_exist does not exist!
ERROR: Bucket with name: bucket_does_not_exist does not exist!

(@asreimer , did you already fix the double error issue? Regardless, I think this is really NOT a big deal right now.)

Aside from that, there were just a couple really minor cosmetic things from the user interface perspective:

  1. help menu contains both EOF and quit, and help EOF produces the statement for quit. If there's not a reason to have EOF, we should probably do away with it.
  2. help for start_jupyter is missing the description that the rest of the commands have, i.e.
[resen] >>> help start_jupyter
Usage:
>>> start_jupyter bucket_name

vs

[resen] >>> help stop_jupyter
Usage:
stop_jupyter bucket_name : Stop jupyter on bucket bucket_name.
  3. Is resen-core 2019.1.0rc2 available? I thought I saw that it was being uploaded but I still only get the option for 2019.1.0rc1 in create_bucket.

Other than that, this looks great!

@@ -190,29 +174,52 @@ def do_status(self,args):

def do_start_jupyter(self,args):
"""Usage:
>>> start_jupyter bucket_name local_port bucket_port\t: Start a jupyter notebook server on port bucket_port available at local_port.
>>> start_jupyter bucket_name local_port bucket_port --lab\t: Start a jupyter lab server on port bucket_port available at local_port.
>>> start_jupyter bucket_name
Contributor Author

This doc string should conform to the format of similar commands.
start_jupyter bucket_name : Start jupyter on bucket bucket_name

Contributor

done

status = self.program.start_jupyter(bucket_name,local_port,bucket_port,lab=lab)
# get bucket infomrmation (ports and status)
# This stuff may be better suited to exist in some kind of "status query" inside of Resen.py
ind = self.program.bucket_manager.bucket_names.index(bucket_name)
Contributor Author

Here's the issue with the start_jupyter command. Checking to make sure that a bucket name exists happens in self.program.start_jupyter() on line 208, which is consistent with what's going on with stop_jupyter, but we're trying to get the port information up here.

Contributor

done

@ljlamarche
Contributor Author

Ok, I looked into the start_jupyter issue. The problem is that the code I hacked together in resencmd.py grabs the ports BEFORE calling start_jupyter(), so it circumvents the checking that start_jupyter() does at the top. I think we have 3 options here:

  1. Keep the code in resencmd.py as is and just wrap it in some kind of try/except statement.
  2. Actually write some kind of "bucket query" function in Resen.py that returns the ports and handles checking to make sure the bucket exists correctly.
  3. Move the entire business of detecting what ports to use into start_jupyter() in Resen.py
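To make option 2 concrete, here is a rough sketch of a "bucket query" that checks existence before returning port information, so the caller never hits the ValueError from list.index(). The class layout, the `get_bucket_ports` name, and the dictionary shape are all assumptions for illustration; only the error message format is taken from the output shown earlier in this thread.

```python
class BucketManager:
    """Hypothetical stand-in; the real BucketManager lives in Resen.py."""

    def __init__(self, buckets):
        # Assumed shape: {bucket_name: {"ports": [(local, bucket), ...]}}
        self.buckets = buckets

    def get_bucket_ports(self, bucket_name):
        """Return the port list for bucket_name, or None if it is unknown."""
        bucket = self.buckets.get(bucket_name)
        if bucket is None:
            print("ERROR: Bucket with name: %s does not exist!" % bucket_name)
            return None
        return bucket["ports"]


# Usage: the command layer can bail out early on a None result.
manager = BucketManager({"b0": {"ports": [(9000, 9000)]}})
ports = manager.get_bucket_ports("bucket_does_not_exist")
if ports is None:
    pass  # return to the prompt instead of crashing
```

Compared with option 1, this keeps the existence check in one place instead of relying on a try/except around code in resencmd.py.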

@ljlamarche
Contributor Author

@pmreyes2 and I figured out how to reproduce the hanging issue (at least on my mac). The problem occurs when you start a jupyter session and then actually DO something in the session. I wasn't getting the issue before because I was just checking that the jupyter page opened in my browser. To reproduce the problem:

  1. Start resen
$ resen
  2. Create bucket
[resen] >>> create_bucket
Please enter a name for your bucket.
Valid names may not contain spaces and must start with a letter and be less than 20 characters long.
>>> Enter bucket name: test
Please choose a version of resen-core.
Available versions: 2019.1.0rc1
>>> Select a version: 2019.1.0rc1
Local directories can be mounted to either /home/jovyan/work or /home/jovyan/mount/ in a bucket. The /home/jovyan/work location is a workspace and /home/jovyan/mount/ is intended for mounting in data. You will have rw privileges to everything mounted in work, but can specified permissions as either r or rw for directories in mount. Code and data created in a bucket can ONLY be accessed outside the bucket or after the bucket has been deleted if it is saved in a mounted local directory.
>>> Mount storage to /home/jovyan/work? (y/n): n
>>> Mount storage to /home/jovyan/mount? (y/n): n
>>> Start bucket and jupyterlab? (y/n): y
...adding core...
...adding mounts...
Bucket created successfully!
...starting jupyterlab...
Jupyter lab can be accessed in a browser at: http://localhost:9000/?token=8a835dfa4d19c036063da340e9f71681810a888816751bdd
  3. In jupyter lab, open a jupyter notebook and do something in it, e.g. 1+1
  4. Stop jupyter
[resen] >>> stop_jupyter test

This hangs.
If you skip step 3, it does NOT hang.

This behavior seems to happen regardless of whether you start jupyter by entering y at the end of the create_bucket prompt or you use start_jupyter later. I can start and stop jupyter as many times as I like successfully UNTIL the point where I actually enter something in a notebook, so somehow that is changing things.

@asreimer
Contributor

asreimer commented Jun 13, 2019

@pmreyes2 are you using your own fork to do the testing? It's very strange that you are the only one getting stuck. And it's even stranger that on your Mac it's getting stuck at the "ps -ef" execute command statement. Is that the same place it's getting stuck on the Ubuntu server (you didn't indicate that it's the exact same problem)?

I'll fix the other problems. @ljlamarche I have already fixed the double stop_jupyter problem. @pmreyes2, the pathlib problem you encountered seems to be related to the pathlib version on Python 3.5. Can you please post the output of pip freeze so we can check versions of pathlib?

@asreimer
Contributor

Ok, that's actually useful for debugging. Thanks for digging.

@asreimer
Contributor

asreimer commented Jun 13, 2019

@ljlamarche @pmreyes2 define "do something"? I tried 1+1 in a py36 notebook. I cannot reproduce the problem. Here's what I did:

  1. create_bucket, including start jupyter
  2. open a new notebook (py36) and execute "import matplotlib"
  3. stop_jupyter

and

  1. create_bucket, including start jupyter
  2. stop_jupyter
  3. start_jupyter
  4. open a new notebook (py36) and execute "import matplotlib"
  5. stop_jupyter

no hanging. I don't get a hang on Windows 10. I don't get a hang on Fedora 29 (my laptop). I cannot reproduce this. When I come in you will have to show me what you are doing.

edit: it doesn't matter what I "do" in the jupyter server, nothing hangs it...

@asreimer
Contributor

I have now fixed the crashing that occurs with start_jupyter when supplied with a non-existent bucket name.

I think I have fixed the pathlib problem that @pmreyes2 encountered on his ubuntu server.

Can @ljlamarche and @pmreyes2 confirm?

@pmreyes2
Contributor

pmreyes2 commented Jun 13, 2019

start_jupyter random_name test:

[resen] >>> start_jupyter IdontExist
ERROR: Bucket with name: IdontExist does not exist!

It's fixed.
The PurePosixPath (pathlib) problem is fixed too.

@ljlamarche
Contributor Author

I duplicated this:

  1. create_bucket, including start jupyter
  2. open a new notebook (py36) and execute "import matplotlib"
  3. stop_jupyter

and I DO get stop_jupyter hanging. Specifically:

  • I enter stop_jupyter B0 at the resen prompt.
  • A "Server Connection Error" box pops up over the JupyterLab window in my browser (this is expected).
  • I close the browser tab (confirm "Leave Page").
  • In the terminal, it has not returned to the resen prompt.

In the "normal" case (when I don't start a new notebook), entering stop_jupyter B0 returns to the resen prompt almost immediately, usually before I even have a chance to close the browser window.

I think we'll just have to show this to you when you get in - It looks like it's a weird mac thing.

@asreimer
Contributor

@ljlamarche do_EOF supports usage of ctrl+D to exit shell. ctrl+D is a standard way to exit shells (bash, sh, etc, python, ipython, etc).

Also, yes, 2019.1.0rc2 is available on dockerhub now, but let's worry about that after we get this PR sorted out...
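For anyone unfamiliar with how ctrl+D reaches the shell: cmd.Cmd.cmdloop() turns end-of-file on input into the command line "EOF", which is dispatched to do_EOF. The sketch below is a toy shell, not the ResenCmd class; giving do_EOF its own docstring is one possible way to keep help EOF from repeating the quit text.

```python
import cmd


class DemoShell(cmd.Cmd):
    """Toy shell illustrating quit and ctrl+D handling."""

    prompt = "[demo] >>> "

    def do_quit(self, arg):
        """quit : Exit the shell."""
        return True  # a true return value ends cmdloop()

    def do_EOF(self, arg):
        """EOF (ctrl+D) : Exit the shell, same as quit."""
        print()  # move to a fresh line after the ^D
        return self.do_quit(arg)


if __name__ == "__main__":
    DemoShell().cmdloop()
```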

@asreimer
Contributor

@ljlamarche you guys and your macs ;)

I'm guessing this is some strange behaviour with Docker Desktop? It doesn't explain how @pmreyes2 was getting it to hang on his ubuntu server though... I'm heading in soon.

@ljlamarche
Contributor Author

Ok, got it. The EOF thing is not a problem, and I agree 2019.1.0rc2 can wait!

@pmreyes2
Contributor

The following doesn't hang resen:

  1. create_bucket, including start jupyter
  2. open a new notebook (py36) and execute "import matplotlib"
  3. Quit jupyter from the browser: File --> Quit
  4. stop_jupyter bucketname

So, in this case resen is not handling the quit from jupyter, only the stopping of the bucket.

@asreimer
Contributor

asreimer commented Jun 13, 2019

Recap of private discussions that led to solution:

  1. There was a bug in BucketManager.execute_command where it wasn't passing the detach keyword to the DockerHelper.execute_command. That's been fixed now.
  2. There is a bug in jupyter, which I've filed a PR to fix. For now, we'll implement a patch for resen-core and I'll change how the stop_jupyter command works.
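The first bug in the recap is a common pattern: a wrapper accepts a keyword argument but never forwards it, so the inner call silently uses its own default. The sketch below illustrates that pattern with stub classes. The class names follow the recap, but the method bodies and the default value of detach are assumptions, not the actual resen code.

```python
class DockerHelper:
    """Stub standing in for the real helper; it just echoes its arguments."""

    def execute_command(self, container, command, detach=True):
        return {"container": container, "command": command, "detach": detach}


class BucketManager:
    def __init__(self, dockerhelper):
        self.dockerhelper = dockerhelper

    def execute_command_buggy(self, container, command, detach=True):
        # Bug pattern: detach is accepted here but never passed on, so the
        # helper always falls back to its own default.
        return self.dockerhelper.execute_command(container, command)

    def execute_command(self, container, command, detach=True):
        # Fixed: forward the keyword explicitly.
        return self.dockerhelper.execute_command(container, command, detach=detach)


manager = BucketManager(DockerHelper())
print(manager.execute_command_buggy("c", "ps -ef", detach=False)["detach"])
print(manager.execute_command("c", "ps -ef", detach=False)["detach"])
```

Under these assumed defaults, the buggy wrapper reports detach as True even though the caller asked for False, while the fixed wrapper honours the request.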

@asreimer
Contributor

@ljlamarche @pmreyes2 alright, the hanging bug is fixed now.

To test, you need to start_jupyter, and then in the container, modify the ~/envs/py36/lib/python3.6/site-packages/notebook/notebookapp.py by adding a not in front of check_pid in the 2 places it occurs in shutdown_server.

Then you can start and stop the jupyter server as much as you want. It will not hang again.

@pmreyes2
Contributor

pmreyes2 commented Jun 14, 2019

The hanging bug is fixed now and tested on my mac and on the linux server. @asreimer also tested on a windows machine. Closing the PR...
Next thing to do is updating the documentation of how to use the resen tool after the changes.

@pmreyes2 pmreyes2 closed this Jun 14, 2019
@pmreyes2 pmreyes2 reopened this Jun 14, 2019
@pmreyes2 pmreyes2 merged commit 8af1f37 into develop Jun 14, 2019
@pmreyes2 pmreyes2 deleted the enh_guided_create_bucket branch June 14, 2019 01:16
Labels
discussion Let's talk about this before doing anything
3 participants