Exposing kernel contents (file system) via comms #1006

Open
krassowski opened this issue Feb 1, 2024 · 4 comments

Comments

@krassowski
Member

During a jupyter-server meeting we discussed the possibility of enabling frontends to request and receive content from the kernel. This is distinct from the current contents manager APIs in jupyter-server, which get and put content in the jupyter server root.

For context, this sprouted from a discussion on a file ownership/path resolution endpoint proposed for jupyter server. The motivating use case was letting the frontend decide which API to hit to get a source file when the user clicks on a file name to open it (e.g. in a traceback):

Three concerns were raised about the above proposal:

  • it adds a new REST API addressing only one very specific use case
  • there is interest in exposing much more of the kernel contents to the frontend
  • if accepted, it would require the gateway to implement another step to pass the resolved path

Previously, another proposal was raised on the sidelines of a jupyter-server team conversation, involving exposing kernel contents via comms. In that proposal, a ContentManager-like API would be optionally implemented by the kernel and made available to the client via comms. This was proposed for ipylab by @bollwyvl:

It is not clear to me how this should be implemented, either at the code level or at the architectural level, so I would greatly appreciate more thoughts on how it could/should function. CC @Zsailer.
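For illustration only, here is a minimal sketch of what the kernel side of a ContentManager-like comm handler might look like. The message schema (`action`, `path`, `status`, `format`, `content`) and the function name `handle_contents_request` are hypothetical and do not come from ipylab or any existing proposal; only the standard library is used.

```python
import base64
import os
from typing import Any, Dict


def handle_contents_request(msg: Dict[str, Any]) -> Dict[str, Any]:
    """Answer one hypothetical 'contents' comm message from a frontend.

    Assumed request shape: {"action": "get", "path": "<path>"}.
    """
    action = msg.get("action")
    if action != "get":
        return {"status": "error", "reason": f"unsupported action: {action!r}"}

    # Resolve the path in the kernel's own filesystem view, so "~" expands
    # to the kernel user's home rather than the server's.
    path = os.path.abspath(os.path.expanduser(str(msg.get("path", ""))))
    if not os.path.isfile(path):
        return {"status": "error", "reason": "not a file", "path": path}

    with open(path, "rb") as f:
        raw = f.read()
    try:
        # Text files are returned as UTF-8 text.
        return {"status": "ok", "path": path, "format": "text",
                "content": raw.decode("utf-8")}
    except UnicodeDecodeError:
        # Anything that is not valid UTF-8 is returned base64-encoded.
        return {"status": "ok", "path": path, "format": "base64",
                "content": base64.b64encode(raw).decode("ascii")}
```

In a real kernel, a function like this would presumably be called from a comm's message callback with the reply sent back over the same comm. Access control, directory listing and writes are left out of the sketch.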

@krassowski
Member Author

The opposite problem (accessing files of a drive from the kernel) was also explored in jupyterlab. I think it is relevant to mention here:

@linlol

linlol commented Mar 5, 2024

Hi @krassowski, regarding the point you raised in jupyter-server/jupyter_server#1280:

This is the case even if frontend knows what the root_dir is. For example if root_dir is /home/my-username/server_root, the frontend does not know what is the expansion of ~ in the kernel space (it may well be /home/another-username/).

Would it work if we always expanded ~ to the real absolute path? The user can easily get that value from the frontend, and root_dir should be immutable after server startup.
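To make the concern concrete, here is a minimal sketch of mapping a kernel-side path onto a server-relative path. It assumes the kernel's home directory is known; the function `to_server_path` and its parameters are hypothetical, and it only does pure path arithmetic (no symlink resolution, no `~user` forms).

```python
from pathlib import PurePosixPath
from typing import Optional


def to_server_path(kernel_path: str, kernel_home: str, root_dir: str) -> Optional[str]:
    """Map a path as seen by the kernel to a path relative to root_dir.

    Returns None when the file lies outside root_dir, i.e. when the
    server-side contents API cannot serve it.
    """
    path = PurePosixPath(kernel_path)
    # Expand a leading "~" with the *kernel's* home, which may differ
    # from the home of the user running the server.
    if path.parts and path.parts[0] == "~":
        path = PurePosixPath(kernel_home, *path.parts[1:])
    try:
        return str(path.relative_to(root_dir))
    except ValueError:
        # Not located under root_dir (or not an absolute path at all).
        return None
```

With `root_dir` set to `/home/my-username/server_root`, the path `~/server_root/nb.py` maps to `nb.py` only if the kernel's home is `/home/my-username`. With a kernel home of `/home/another-username` the same input falls outside `root_dir`, so the result depends on a value that only the kernel knows for certain.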

@linlol

linlol commented Mar 5, 2024

Thanks @krassowski and @Zsailer,

This issue is definitely worth further discussion; let me briefly share our use case.

Our case

In our case, we build Jupyter into a Docker image and deploy it on Kubernetes via https://github.com/jupyterhub/zero-to-jupyterhub-k8s.

As a result, the library/internal code base is installed when the image is built, while root_dir is configured as an external volume mount (since we want each user to have an independent, persistent filesystem for their own code, data and notebooks).

This design should be quite common across investment banks and similar institutions (e.g. hedge funds), since almost every Python-based quant analytics team uses Jupyter with a dedicated filesystem mounted for each user (with root_dir set to that mounted volume). Public solutions such as Google Colab are probably similar. (Just a guess; please correct me if I am wrong.)

In this design, root_dir can never coincide with where we store code, which makes the new feature jupyterlab/jupyterlab#13390 introduced in 4.1 unavailable. Linking the code into root_dir is not acceptable either, since creating symbolic links across file systems would significantly degrade overall performance.

Proposal

Add a flag, as a traitlet configuration option, that tells jupyter-server whether it should search for and expose files outside root_dir through the existing file-handling API.

(Sorry if I am looking at this question from a limited scope. From my perspective we are extending an existing feature, so a few enhancements and a configuration option on the original feature should be enough.)
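As a rough sketch of how such a flag could gate path resolution, assuming POSIX paths: the names `resolve_request_path` and `allow_outside_root` are hypothetical and are not existing jupyter-server options. In jupyter-server the flag would presumably be declared as a configurable traitlets `Bool` rather than a plain function argument.

```python
import os


def resolve_request_path(root_dir: str, requested: str,
                         allow_outside_root: bool = False) -> str:
    """Return the absolute filesystem path for a contents request.

    `allow_outside_root` stands in for the proposed configuration flag.
    """
    root = os.path.realpath(root_dir)
    # Relative requests are anchored at root_dir; os.path.join keeps an
    # absolute request as-is on POSIX.
    target = os.path.realpath(os.path.join(root, requested))
    inside = os.path.commonpath([root, target]) == root
    if not inside and not allow_outside_root:
        # Default behaviour: nothing outside root_dir is exposed.
        raise PermissionError(f"{requested!r} is outside root_dir")
    return target
```

With the flag off, a request for library code installed in the image raises `PermissionError`; with it on, the same request resolves to the file's real path. Whether an opt-in like this is acceptable from a security standpoint is a separate question that the sketch does not address.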

@krassowski
Member Author

Thank you @linlol for nudging this!

Of note, kernels with a debugger already have a way to pass the contents of a file, wherever the user space is mounted. In such a setup, the only thing we need for jupyterlab/jupyterlab#13390 to work is information on who owns the file (and what its full path is). This is why the original proposal was along the lines of:
