Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Client hooks #188

Merged
merged 2 commits into from
Jan 31, 2022
Merged

Client hooks #188

merged 2 commits into from
Jan 31, 2022

Conversation

devintang3
Copy link
Contributor

Continuation of #79

I rebased with master and added some tests. I occasionally get errors on test_many_parallel_notebooks which I don't think is related, but I'll also get it on test_error_cell_hooks in py38. It's an intermittent error that gets resolved if I add a sleep, so I think it's something related to the async task not running right away. Any guidance here would be appreciated.

I also removed py36 from tox since it looked like the minimum version is now Python3.7 and it was just hanging when I tried running it.

Copy link
Member

@davidbrochart davidbrochart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for this PR @devintang3.
Would you like to add these new features to the documentation also?

nbclient/client.py Outdated Show resolved Hide resolved
nbclient/client.py Outdated Show resolved Hide resolved
nbclient/client.py Outdated Show resolved Hide resolved
nbclient/tests/test_client.py Outdated Show resolved Hide resolved
nbclient/tests/test_client.py Outdated Show resolved Hide resolved
nbclient/util.py Outdated
loop = asyncio.get_event_loop()
hook_with_kwargs = partial(hook, **kwargs)
future = loop.run_in_executor(None, hook_with_kwargs)
asyncio.ensure_future(future)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand, does it mean that hooks are going to be launched in the background? BTW, it would be nice to test async hooks.
I think run_hook should be async:

async def run_hook(hook: Optional[Callable], **kwargs) -> None:
    await ensure_async(hook(**kwargs))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the intent is for the hook to be launched in the background.

@devintang3
Copy link
Contributor Author

Thanks for the suggestions. I've applied your recommendations as well as added more tests.
Some changes to note:

  • tests/util.py has been renamed to tests/test_util.py so that pytest can pick it up automatically
  • Added the package pytest-asyncio to requirements-dev.txt so I can test some async methods
  • Updated docs
  • The pre-commit hook was complaining about unused variables so I've removed them from test_client.py. I'm not quite sure what the purpose of the loop = asyncio.get_event_loop() was

nbclient/client.py Outdated Show resolved Hide resolved
nbclient/util.py Outdated Show resolved Hide resolved
@davidbrochart
Copy link
Member

Could you rebase on master? It looks like you merged master into your branch instead.

@@ -805,11 +865,13 @@ async def async_execute_cell(
self.allow_errors or "raises-exception" in cell.metadata.get("tags", [])
)

await run_hook(self.on_cell_start, cell=cell, cell_index=cell_index)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while we are at it, perhaps there could be a hook at the very start of this function, before even skipping non-executing cells, like on_cell_visit

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, that would allow pre-processing Markdown cells.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to just move on_cell_start to the start instead of creating a new hook?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would make sense, and we know if the cell is a Markdown cell with cell.cell_type.
But then maybe add on_cell_execute right before the execution? It won't be called if the cell is not a code cell or if the cell is skipped because of a tag.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved on_cell_start towards the top so it'll execute for non-code cells and added a new hook called on_cell_execute

Copy link
Contributor

@choldgraf choldgraf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a really nice extension for nbclient - I have a few comments and suggestions, mostly around the API design and the documentation to make it clear how to use it

docs/client.rst Outdated
Comment on lines 99 to 128
In addition to the two above, we also support traitlets for hooks. They are as
follows: ``on_execution_start``, ``on_cell_start``, ``on_cell_complete``,
``on_cell_error``. These traitlets allow specifying a ``Callable`` function,
which will run at certain points during the notebook execution and is executed asynchronously.
``on_execution_start`` will run when the notebook client is kicked off.
``on_cell_start`` will run right before each cell is executed.
``on_cell_complete`` will run right after the cell is executed.
``on_cell_error`` will run if there is an error in the cell.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
In addition to the two above, we also support traitlets for hooks. They are as
follows: ``on_execution_start``, ``on_cell_start``, ``on_cell_complete``,
``on_cell_error``. These traitlets allow specifying a ``Callable`` function,
which will run at certain points during the notebook execution and is executed asynchronously.
``on_execution_start`` will run when the notebook client is kicked off.
``on_cell_start`` will run right before each cell is executed.
``on_cell_complete`` will run right after the cell is executed.
``on_cell_error`` will run if there is an error in the cell.
Hooks before and after cell execution
~~~~~~~~~~~~~~~~~~~~~~~~
There are several configurable hooks that allow the user to execute code before and
after a cell is executed. Each one is configured with a function that will be called in its
respective place in the cell execution pipeline. These function calls are **asynchronous**.
Each is described below:
**Notebook-level hooks**
These hooks are called with a single extra parameter:
- ``notebook=NotebookNode``: the current notebook being executed.
Here is the available hook:
- ``on_execution_start`` will run when the notebook client is initialized, before any execution has happened.
**Cell-level hooks**
These hooks are called with two parameters:
- ``cell=NotebookNode``: a reference to the current cell.
- ``cell_index=int``: the index of the cell in the current notebook's list of cells
Here are the available hooks:
- ``on_cell_start`` will run right before each cell is executed.
- ``on_cell_complete`` will run after execution, if the cell is executed with no errors.
- ``on_cell_error`` will run if there is an error during cell execution.

This fleshes out the documentation a bit so that it is clearer which hooks operate on which parts of the document, and so that it's clearer what arguments are passed to the hooks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe worth mentioning that the hooks can be async, but don't have to be?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @choldgraf! Much appreciated.

I removed the line about functions can be async. If it can be async or sync, then I don't see much point in mentioning that

docs/client.rst Outdated
@@ -96,6 +96,15 @@ on both versions. Here the traitlet ``kernel_name`` helps simplify and
maintain consistency: we can just run a notebook twice, specifying first
"python2" and then "python3" as the kernel name.

In addition to the two above, we also support traitlets for hooks. They are as
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also document what kinds of arguments are passed to these functions, or what variables are available to them when they are called (e.g., if I wrote a function for this, how would I access the notebook? the current cell? the traceback of an error message if it failed to execute?)

@@ -426,6 +483,7 @@ async def async_start_new_kernel_client(self) -> KernelClient:
await self._async_cleanup_kernel()
raise
self.kc.allow_stdin = False
await run_hook(self.on_execution_start)
Copy link
Contributor

@choldgraf choldgraf Jan 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
await run_hook(self.on_execution_start)
await run_hook(self.on_execution_start, notebook=self.nb)

What about something like this so that people have access to the NotebookNode before execution?

nbclient/client.py Outdated Show resolved Hide resolved
@@ -426,6 +483,7 @@ async def async_start_new_kernel_client(self) -> KernelClient:
await self._async_cleanup_kernel()
raise
self.kc.allow_stdin = False
await run_hook(self.on_execution_start)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For symmetry's sake, why not also include two extra notebook-level hooks: on_notebook_complete and on_notebook_error, that map on to the two cell execution hooks but at a notebook level?

That way you would have the same basic pattern for both the notebook, and for each cell in the notebook:

  • hook for just before execution happens (on_notebook_start, on_cell_start)
  • hook for when execution completes (on_notebook_complete, on_cell_complete)
  • hook for when execution has an error (on_notebook_error, on_cell_error)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added on_notebook_error, although I'm not sure if I did that part correctly. The only time I've personally seen the notebook fail is when there's a RuntimeError, so that's where I've placed the hook.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair enough - and feel free to push back if you think that this is "too many hooks all at once" 😅 on the one hand I like symmetry and intentional design for extension points. On the other hand, I also believe in not building things until somebody actively wants it to be built :-) so if you think this is too much complexity, another option could be to just add comments for where you think the extra hooks could go, and wait to build them until a user explicitly asks for it.

This will enable tracking of execution process without subclassing the
way papermill does.
docs/client.rst Outdated Show resolved Hide resolved
docs/client.rst Outdated Show resolved Hide resolved
docs/client.rst Outdated Show resolved Hide resolved
nbclient/client.py Outdated Show resolved Hide resolved
nbclient/client.py Outdated Show resolved Hide resolved
nbclient/client.py Outdated Show resolved Hide resolved
nbclient/client.py Outdated Show resolved Hide resolved
]
loop = asyncio.get_event_loop()
loop.run_until_complete(asyncio.gather(*tasks))
[async_run_notebook(input_file.format(label=label), opts, res) for label in ("A", "B")]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You removed the execution of these tasks, can you explain why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was a mistake on my part when I was trying to investigate some failures. I've added those changes back in.

tasks = [async_run_notebook(input_file, opts, res) for i in range(4)]
loop = asyncio.get_event_loop()
loop.run_until_complete(asyncio.gather(*tasks))
[async_run_notebook(input_file, opts, res) for i in range(4)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

Copy link
Member

@davidbrochart davidbrochart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @devintang3, that looks good to me. A couple of notes, but I think this PR can be merged anyway:

  • maybe use in the tests: hooks = [MagicMock() for i in range(7)].
  • have all the hook help strings start with A callable which executes....

Run_hook is now async and renamed util to test_util so it gets picked up by pytest.
Also added new hooks: on_notebook_error, on_cell_execution
Updated docs
@devintang3
Copy link
Contributor Author

  • maybe use in the tests: hooks = [MagicMock() for i in range(7)].
  • have all the hook help strings start with A callable which executes....

That makes sense to me. I've updated the tests to do a loop and updated the help strings as well

@davidbrochart davidbrochart merged commit 20081a6 into jupyter:master Jan 31, 2022
@davidbrochart
Copy link
Member

Thanks @devintang3 !

@choldgraf
Copy link
Contributor

🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants