Lay foundation to pass notebook names to kernel at startup. #656
@goanpeca pointed out to me that it could be a good idea to include the associated file in each execute request metadata in order to take into account multiple connected clients and renaming, though that's a more invasive change. |
As you said, it only makes sense in specific situations, the most obvious one being when the kernel is used by a single notebook. But in this case, it looks to me like it would be better handled by e.g. the Jupyter Server, which is well aware of the notebook's name and its associated kernel. It could execute some code in the kernel at startup and each time the notebook is renamed, something like: __notebook_name__ = "my_notebook.ipynb" |
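The server-side idea above can be sketched in a few lines. This is only an illustration: `build_rename_code` is a hypothetical helper, not part of Jupyter Server, and the local `exec` stands in for the kernel actually running the generated code. In a live deployment the string would presumably be sent as a silent execute request at startup and on each rename; that wiring is not shown.

```python
def build_rename_code(notebook_name: str) -> str:
    """Return Python source that sets __notebook_name__ (hypothetical helper)."""
    # repr() quotes and escapes the name, so quotes or backslashes in the
    # file name cannot break out of the string literal.
    return f"__notebook_name__ = {notebook_name!r}"


# Stand-in for the kernel's user namespace.
kernel_namespace = {}

# Executed once at startup...
exec(build_rename_code("my_notebook.ipynb"), kernel_namespace)
first_name = kernel_namespace["__notebook_name__"]

# ...and again whenever the notebook is renamed.
exec(build_rename_code('renamed "final".ipynb'), kernel_namespace)
second_name = kernel_namespace["__notebook_name__"]
```

As noted in the replies, this approach only works for Python kernels, which is one argument for the environment variable instead.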
Note that there is some notion of "temporary" filename for the debug protocol. |
This sounds like another item for the kernel parameterization cart, although one that is more of a system-owned parameter. I agree that using an environment variable is the way to go, but I'd like to better understand the use-case here. The referenced notebook issue talks of a high availability scenario, while others talk about creating files with differing suffixes, where the notebook file's basename is the common association.
The fact that this value can change after a kernel has started is troubling. At best, this is merely a hint, so if this were to be supported, I would recommend this environment variable be named to convey that aspect: I like the idea of providing system-owned environment variables and the If we were to proceed with this, I would suggest introducing the environment variable in the session manager directly and utilizing the existing |
While that is true for the Python kernel, it is not a cross-language solution. The advantage of using an env variable is that it's kernel-agnostic and does not risk breaking anything. I agree that
I'm not sure this is parameterization, but I see where you are coming from. I think that having this as an env variable is a cheap way to at least get started. Having the name can be used to better reference error messages (instead of having
I don't have any particular feeling about how to name the env variable: HINT is fine, and a JPY_ prefix sounds great. I want to avoid My goal here is to have a stopgap: the env variable will in most cases hold the right value, and if we call it INITIAL, as you pointed out, that's enough to say that it might not be correct. Once that is in, it works for all kernels, and then we can discuss what to add to the execute request metadata, for example file type (notebook, plain, other) and the filename or full path. That metadata can be interpreted by each kernel to provide a language-specific way to access those values; if that can be harmonized with the debugger, I'm all for it. Now I believe the first two questions we want to address are:
|
I think it's asked for often enough that having a path for this that's reasonably safe guarded makes sense.
I think scoping the solution to "When executing in situation A, then B will be available to the runtime" is sufficient. Solving perfectly for all cases is going to be a major headache with little upside. Good documentation and a clear naming convention would probably make the scope of use clear, imho.
What about instead talking about it as kernel session naming? When you launch a kernel the kernel can accept a session name that's then set to an env variable (or whatever is decided). e.g. |
I think couching this in terms of a session name is a great idea - thanks @MSeal. Such an approach removes all sorts of implications that other names might suggest. For my own understanding, is this value just the notebook filename or an actual path? This comment seems to imply that, at least initially, the kernel can expect the value of this env to be a tangible resource (and not just a contextual string).
|
My reasoning for a path instead of a filename is that enough people cd around in their notebooks that the filename alone might not be sufficient. But I agree that if we go with a session name this ambiguity is resolved, as we don't guarantee it's the current file.
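To make the path-versus-filename concern concrete, here is a small sketch of how a bare filename resolves differently once the working directory changes, while an absolute path does not. `resolve_hint` is a hypothetical helper and the directories are made up; `posixpath` is used so the result is the same on any platform.

```python
import posixpath


def resolve_hint(hint: str, cwd: str) -> str:
    """Resolve a session hint against a working directory (hypothetical helper)."""
    # posixpath.join discards cwd when hint is already absolute.
    return posixpath.normpath(posixpath.join(cwd, hint))


# A bare filename points somewhere else after the user runs `cd`.
before_cd = resolve_hint("nb.ipynb", "/home/user/project")
after_cd = resolve_hint("nb.ipynb", "/tmp")

# An absolute path is unaffected by the working directory.
absolute_before = resolve_hint("/home/user/project/nb.ipynb", "/home/user/project")
absolute_after = resolve_hint("/home/user/project/nb.ipynb", "/tmp")
```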
As a note, one of the use cases for which this was privately requested to me was reporting which files contain deprecated functions when using kernels scheduled on remote machines. There are hooks installed on a distributed system that log when deprecated functions are used and send users regular reports; the problem is that the kernels do not know the file names, so the logs are a bit useless, as the only thing they can mention is
Do we want to target this for 7.0? |
If I'm understanding things correctly, and assuming we went with an env similar to |
Unfortunately this does require upstream changes, because even if most functions take **kw, those end up being passed to Popen(), which does not like unknown keywords.
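The Popen() constraint can be illustrated with a minimal sketch. The `launch` function below is hypothetical and is not the jupyter_client launcher; it only shows the general pattern of removing an extra keyword from `**kw` and turning it into an environment variable before the remaining keywords reach `subprocess.Popen`. The variable name `JPY_KERNEL_SESSION_NAME` is the one proposed in this PR's description and is used here as an assumption.

```python
import os
import subprocess
import sys

# Passing an unknown keyword straight through fails before any process starts.
try:
    subprocess.Popen([sys.executable, "-c", "pass"], session_name="nb.ipynb")
    rejected = False
except TypeError:
    rejected = True


def launch(cmd, **kw):
    """Hypothetical launcher: consume session_name, forward the rest to Popen."""
    kw = dict(kw)  # work on a copy so the caller's dict is untouched
    session_name = kw.pop("session_name", None)
    env = dict(kw.pop("env", None) or os.environ)
    if session_name is not None:
        env["JPY_KERNEL_SESSION_NAME"] = session_name  # assumed variable name
    return subprocess.Popen(cmd, env=env, **kw)


child_code = "import os; print(os.environ['JPY_KERNEL_SESSION_NAME'])"
proc = launch(
    [sys.executable, "-c", child_code],
    session_name="work/nb.ipynb",
    stdout=subprocess.PIPE,
    text=True,
)
out, _ = proc.communicate()
```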
I've reworked this on top of the 7.x branch. I'm a bit worried about a couple of things.
I've sent jupyter/notebook#6180 to show how this can be used. It's also unclear where the name of the env variable should be documented. I can document it in jupyter_client, though I guess this is more interesting for end users, so maybe somewhere else?
This has been a controversial topic for some time:

jupyter/notebook#1000
https://forums.databricks.com/questions/21390/is-there-any-way-to-get-the-current-notebook-name.html
https://stackoverflow.com/questions/12544056/how-do-i-get-the-current-ipython-jupyter-notebook-name
https://ask.sagemath.org/question/36873/access-notebook-filename-from-jupyter-with-sagemath-kernel/

This is also sometimes critical for linters and tab completion to know the current name.

Of course the current answer is that the question is ill-defined: there might not be a file associated with the current kernel, there might be multiple files, files might not be on the same system, the name could change during execution, and there are many other gotchas.

This suggests adding a JPY_KERNEL_SESSION_NAME env variable, which is not too visible but gives an escape hatch that should mostly be correct unless the notebook is renamed or the kernel is attached to a new one.

To do so, this handles the new associated_file parameter in a few functions of the kernel manager. On jupyter_server this one-line change makes the notebook name available in typical local installs:

```diff
diff --git a/notebook/services/sessions/sessionmanager.py b/notebook/services/sessions/sessionmanager.py
index 92b2a7345..f7b4011ce 100644
--- a/notebook/services/sessions/sessionmanager.py
+++ b/notebook/services/sessions/sessionmanager.py
@@ -108,7 +108,9 @@ class SessionManager(LoggingConfigurable):
         # allow contents manager to specify kernels cwd
         kernel_path = self.contents_manager.get_kernel_path(path=path)
         kernel_id = yield maybe_future(
-            self.kernel_manager.start_kernel(path=kernel_path, kernel_name=kernel_name)
+            self.kernel_manager.start_kernel(
+                path=kernel_path, kernel_name=kernel_name, session_name=path
+            )
         )
         # py2-compat
         raise gen.Return(kernel_id)
```

Of course, only launchers that pass this value forward will allow the env variable to be set.

I'm thinking that various kernels may use this and expose it in different ways, like __notebook_name__ if it ends with `.ipynb` in ipykernel.
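As a rough sketch of the kernel-side idea in the last sentence, a Python kernel could read the variable at startup and expose it only when it looks like a notebook path. Nothing here is ipykernel's actual behaviour: `notebook_name_from_env` is a hypothetical helper, `user_ns` stands in for the kernel's user namespace, and the variable name is the one proposed in this description.

```python
import os

ENV_NAME = "JPY_KERNEL_SESSION_NAME"  # name proposed in this PR; an assumption


def notebook_name_from_env(environ=os.environ):
    """Return the session name if it looks like a notebook, else None."""
    name = environ.get(ENV_NAME, "")
    return name if name.endswith(".ipynb") else None


def populate_namespace(user_ns, environ=os.environ):
    """Set __notebook_name__ in user_ns only when a notebook name is known."""
    name = notebook_name_from_env(environ)
    if name is not None:
        user_ns["__notebook_name__"] = name
    return user_ns


# Launched from a notebook: the name is exposed.
with_notebook = populate_namespace({}, {ENV_NAME: "analysis/report.ipynb"})

# Launched from a console, or by a launcher that does not forward the value:
# nothing is set, so user code must treat the name as optional.
from_console = populate_namespace({}, {ENV_NAME: "console-1"})
not_forwarded = populate_namespace({}, {})
```

The value is captured once at startup, so it stays a hint: a later rename of the notebook would not be reflected.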
I agree that plumbing this as a specific (and new) keyword argument is painful. I guess I figured this would simply be incorporated in the existing
I think jupyter_client is the correct location for this. When/if we ever parameterize kernel launches, this would/should be viewed as a system parameter/property, and those should probably be documented at the lowest level (IMHO). |
There are many use cases where users want to know the current notebook name/path. This helps by adding a session identifier (so as not to claim this is the current notebook name), which by default is the full path to the notebook document that created the session. This will of course not work if the notebook gets renamed, but we can tackle this later. See also jupyter/jupyter_client#656, jupyter#6180. It will need to be ported to jupyter_server as well.
There are many use cases where users want to know the current notebook name/path. This helps by adding a session identifier (so as not to claim this is the current notebook name), which by default is the full path to the notebook document that created the session. This will of course not work if the notebook gets renamed, but we can tackle this later. See also jupyter/jupyter_client#656, jupyter/notebook#6180. It will need to be ported to jupyter_server as well. This is the mirror commit of jupyter/notebook#6279 on main notebook.
@@ -151,6 +151,9 @@ def pre_start_kernel(
        constructor_kwargs = {}
        if self.kernel_spec_manager:
            constructor_kwargs["kernel_spec_manager"] = self.kernel_spec_manager

        if 'session_name' in kwargs:
            constructor_kwargs['session_name'] = kwargs.copy().pop('session_name')
Maybe just kwargs['session_name'] ?
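The review suggestion can be checked with a short standalone snippet. The dictionary contents are made up for illustration; the point is that popping from a copy returns the same value as plain indexing and leaves the original `kwargs` unchanged, so the copy is unnecessary.

```python
kwargs = {"session_name": "nb.ipynb", "cwd": "/tmp"}
constructor_kwargs = {}

if "session_name" in kwargs:
    # Form used in the diff under review: copy the dict, pop from the copy.
    via_copy_pop = kwargs.copy().pop("session_name")
    # Form suggested by the reviewer: index the original directly.
    via_index = kwargs["session_name"]
    constructor_kwargs["session_name"] = via_index

# Either way, the original kwargs still contains the key afterwards.
still_present = "session_name" in kwargs
```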
From my experience:
The already-proposed-and-discussed implementation, which just passes a properly named envvar (containing the notebook path, not the name), suffices for the above use cases. I'd be very supportive of getting it incorporated.
@Carreau @kevin-bates do you think you converged on how this should work or are there any alternatives you still consider? |
@arogozhnikov - thanks for the ping. Reading back through this, I think we're good with using an The one thing I'd rather see is to introduce the |
ok, @Carreau
|
)
* Inject session identifier into environment variable.
  There are many use cases where users want to know the current notebook name/path. This helps by adding a session identifier (so as not to claim this is the current notebook name), which by default is the full path to the notebook document that created the session. This will of course not work if the notebook gets renamed, but we can tackle this later. See also jupyter/jupyter_client#656, jupyter/notebook#6180. It will need to be ported to jupyter_server as well. This is the mirror commit of jupyter/notebook#6279 on main notebook.
* Make start_kernel env extend os.environ (#859)
* Make start_kernel env extend os.environ
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Matthias Bussonnier <bussonniermatthias@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
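The "env extends os.environ" item can be sketched as follows. This is not the jupyter_server code: `build_kernel_env` is a hypothetical helper showing the general pattern of starting from the inherited environment and layering the session variable on top, so the kernel process still sees `PATH` and the rest. The variable name is the one from the jupyter_client PR description and may differ in released versions.

```python
import os


def build_kernel_env(extra=None, base=None):
    """Return base (default: os.environ) extended with extra (hypothetical helper)."""
    env = dict(os.environ if base is None else base)  # copy, never mutate
    env.update(extra or {})
    return env


inherited = {"PATH": "/usr/bin", "HOME": "/home/user"}
kernel_env = build_kernel_env(
    {"JPY_KERNEL_SESSION_NAME": "/home/user/nb.ipynb"},  # assumed variable name
    base=inherited,
)
```

Passing only the extra variables as the child's environment would drop everything inherited, which is the failure mode the second commit in this list appears to address.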
As this is still open I'll ask the question: why do I see something like:
with the UUID rather than the actual file name of the notebook? I was going to use the current notebook name to label artifacts that are preserved along with other outputs, but without the file name it's of no use, and there seems to be no other sane, portable way to do so (https://stackoverflow.com/questions/12544056/how-do-i-get-the-current-ipython-jupyter-notebook-name) :( P.S. I'm using Jupyter notebooks under JupyterLab.
Hi @emsi. Yeah, this was realized the other day as well and an issue opened here: jupyter-server/jupyter_server#1059. I would suggest we keep the discussion there. Seems like this issue should be closed since the desired behavior was implemented (in jupyter_server), just that something caused the desired behavior to be changed. Closing. |
This has been a controversial topic for some time:
jupyter/notebook#1000
https://forums.databricks.com/questions/21390/is-there-any-way-to-get-the-current-notebook-name.html
https://stackoverflow.com/questions/12544056/how-do-i-get-the-current-ipython-jupyter-notebook-name
https://ask.sagemath.org/question/36873/access-notebook-filename-from-jupyter-with-sagemath-kernel/
This is also sometimes critical for linters and tab completion to know
the current name.
Of course the current answer is that the question is ill-defined:
there might not be a file associated with the current kernel, there
might be multiple files, files might not be on the same system, the name
could change during execution, and there are many other gotchas.
This suggests adding a JPY_ASSOCIATED_FILE env variable, which is not
too visible but gives an escape hatch that should mostly be correct
unless the notebook is renamed or the kernel is attached to a new one.
To do so, this handles the new associated_file parameter in a few
functions of the kernel manager. On jupyter_server this one-line change
makes the notebook name available in typical local installs:
Of course, only launchers that pass this value forward will allow
the env variable to be set.
I'm thinking that various kernels may use this and expose it in
different ways, like __notebook_name__ if it ends with .ipynb in
ipykernel.