Client hooks #188

devintang3 · 2021-12-28T22:23:41Z

Continuation of #79

I rebased onto master and added some tests. I occasionally see errors in test_many_parallel_notebooks, which I don't think are related, but I also see them in test_error_cell_hooks on py38. The failure is intermittent and goes away if I add a sleep, so I suspect the async task is not running right away. Any guidance here would be appreciated.
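One plausible explanation for the intermittent failure (an assumption, not something confirmed in this thread) is that a task scheduled with asyncio.ensure_future does not start until the event loop regains control. A minimal sketch, using a made-up hook coroutine rather than the PR's code:

```python
import asyncio

calls = []


async def hook():
    # Hypothetical hook; it only records that it ran.
    calls.append("hook ran")


async def main():
    # ensure_future schedules the coroutine as a task but does not run it yet.
    asyncio.ensure_future(hook())
    before_yield = list(calls)

    # Yielding to the event loop once gives the scheduled task a chance to run.
    await asyncio.sleep(0)
    after_yield = list(calls)
    return before_yield, after_yield


before_yield, after_yield = asyncio.run(main())
print(before_yield, after_yield)
```

An assertion made right after scheduling would observe the "before" state, which fits with a sleep hiding the problem. Awaiting the hook directly removes that race.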

I also removed py36 from tox, since the minimum version now appears to be Python 3.7 and the py36 run just hung when I tried it.

davidbrochart

Thanks a lot for this PR @devintang3.
Would you like to add these new features to the documentation also?

nbclient/client.py

nbclient/tests/test_client.py

davidbrochart · 2022-01-03T16:11:37Z

nbclient/util.py

+        loop = asyncio.get_event_loop()
+        hook_with_kwargs = partial(hook, **kwargs)
+        future = loop.run_in_executor(None, hook_with_kwargs)
+    asyncio.ensure_future(future)


I don't understand, does it mean that hooks are going to be launched in the background? BTW, it would be nice to test async hooks.
I think run_hook should be async:

async def run_hook(hook: Optional[Callable], **kwargs) -> None:
    await ensure_async(hook(**kwargs))

Yeah, the intent is for the hook to be launched in the background.
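To make the async variant discussed above concrete, here is a minimal sketch of a run_hook that accepts both sync and async hooks. It is not necessarily what was merged: the None guard and the use of inspect.isawaitable in place of ensure_async are assumptions made so the example is self-contained, and the two example hooks are hypothetical.

```python
import asyncio
import inspect
from typing import Any, Callable, Optional


async def run_hook(hook: Optional[Callable[..., Any]], **kwargs: Any) -> None:
    # Assumption: an unset hook is represented by None and is simply skipped.
    if hook is None:
        return
    result = hook(**kwargs)
    # A sync hook has already run by now; an async hook returned an awaitable.
    if inspect.isawaitable(result):
        await result


seen = []


def sync_hook(cell, cell_index):
    seen.append(("sync", cell_index))


async def async_hook(cell, cell_index):
    seen.append(("async", cell_index))


async def main():
    await run_hook(None, cell={}, cell_index=0)
    await run_hook(sync_hook, cell={}, cell_index=1)
    await run_hook(async_hook, cell={}, cell_index=2)


asyncio.run(main())
print(seen)
```

Because the caller awaits run_hook, the hook has finished before execution continues, unlike the background-scheduling approach in the original diff.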

devintang3 · 2022-01-08T00:11:53Z

Thanks for the suggestions. I've applied your recommendations as well as added more tests.
Some changes to note:

- tests/util.py has been renamed to tests/test_util.py so that pytest picks it up automatically.
- Added the pytest-asyncio package to requirements-dev.txt so that I can test some async methods.
- Updated the docs.
- The pre-commit hook was complaining about unused variables, so I've removed them from test_client.py. I'm not quite sure what the purpose of loop = asyncio.get_event_loop() was.

nbclient/client.py

nbclient/util.py

davidbrochart · 2022-01-12T08:57:13Z

Could you rebase on master? It looks like you merged master into your branch instead.

chrisjsewell · 2022-01-13T08:15:41Z

nbclient/client.py

@@ -805,11 +865,13 @@ async def async_execute_cell(
            self.allow_errors or "raises-exception" in cell.metadata.get("tags", [])
        )

+        await run_hook(self.on_cell_start, cell=cell, cell_index=cell_index)


While we are at it, perhaps there could be a hook at the very start of this function, before non-executing cells are even skipped, something like on_cell_visit.

Good point, that would allow pre-processing Markdown cells.

Does it make sense to just move on_cell_start to the start instead of creating a new hook?

That would make sense, and we know if the cell is a Markdown cell with cell.cell_type.
But then maybe add on_cell_execute right before the execution? It won't be called if the cell is not a code cell or if the cell is skipped because of a tag.

I moved on_cell_start towards the top so it'll execute for non-code cells and added a new hook called on_cell_execute
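The ordering agreed on in this thread can be sketched with a stand-in for the cell execution function. This is an illustration under stated assumptions, not nbclient's code: visit_cell, the dict-based cells, and the recorder helper are hypothetical, and the "skip-execution" tag is used here only as an example of a skip tag.

```python
events = []


def recorder(name):
    # Hypothetical helper: builds a hook that logs its name and the cell index.
    def hook(cell, cell_index):
        events.append(f"{name}:{cell_index}")
    return hook


def visit_cell(cell, cell_index, on_cell_start=None, on_cell_execute=None,
               on_cell_complete=None, skip_tag="skip-execution"):
    # on_cell_start fires for every cell, including Markdown cells.
    if on_cell_start is not None:
        on_cell_start(cell=cell, cell_index=cell_index)

    tags = cell.get("metadata", {}).get("tags", [])
    if cell["cell_type"] != "code" or skip_tag in tags:
        return False  # not executed, so on_cell_execute is never called

    # on_cell_execute fires only right before a cell is actually executed.
    if on_cell_execute is not None:
        on_cell_execute(cell=cell, cell_index=cell_index)
    # A real client would send the source to the kernel here.
    if on_cell_complete is not None:
        on_cell_complete(cell=cell, cell_index=cell_index)
    return True


cells = [
    {"cell_type": "markdown", "source": "# Title"},
    {"cell_type": "code", "source": "1 + 1"},
    {"cell_type": "code", "source": "2 + 2", "metadata": {"tags": ["skip-execution"]}},
]
for index, cell in enumerate(cells):
    visit_cell(cell, index, recorder("start"), recorder("execute"), recorder("complete"))

print(events)
```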

choldgraf

I think this is a really nice extension for nbclient - I have a few comments and suggestions, mostly around the API design and the documentation to make it clear how to use it

choldgraf · 2022-01-13T20:28:23Z

docs/client.rst

+In addition to the two above, we also support traitlets for hooks. They are as
+follows: ``on_execution_start``, ``on_cell_start``, ``on_cell_complete``,
+``on_cell_error``. These traitlets allow specifying a ``Callable`` function,
+which will run at certain points during the notebook execution and is executed asynchronously.
+``on_execution_start`` will run when the notebook client is kicked off.
+``on_cell_start`` will run right before each cell is executed.
+``on_cell_complete`` will run right after the cell is executed.
+``on_cell_error`` will run if there is an error in the cell.
+


Suggested change

-In addition to the two above, we also support traitlets for hooks. They are as
-follows: ``on_execution_start``, ``on_cell_start``, ``on_cell_complete``,
-``on_cell_error``. These traitlets allow specifying a ``Callable`` function,
-which will run at certain points during the notebook execution and is executed asynchronously.
-``on_execution_start`` will run when the notebook client is kicked off.
-``on_cell_start`` will run right before each cell is executed.
-``on_cell_complete`` will run right after the cell is executed.
-``on_cell_error`` will run if there is an error in the cell.
+Hooks before and after cell execution
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+There are several configurable hooks that allow the user to execute code before and
+after a cell is executed. Each one is configured with a function that will be called in its
+respective place in the cell execution pipeline. These function calls are **asynchronous**.
+Each is described below:
+
+**Notebook-level hooks**
+
+These hooks are called with a single extra parameter:
+
+- ``notebook=NotebookNode``: the current notebook being executed.
+
+Here is the available hook:
+
+- ``on_execution_start`` will run when the notebook client is initialized, before any execution has happened.
+
+**Cell-level hooks**
+
+These hooks are called with two parameters:
+
+- ``cell=NotebookNode``: a reference to the current cell.
+- ``cell_index=int``: the index of the cell in the current notebook's list of cells.
+
+Here are the available hooks:
+
+- ``on_cell_start`` will run right before each cell is executed.
+- ``on_cell_complete`` will run after execution, if the cell is executed with no errors.
+- ``on_cell_error`` will run if there is an error during cell execution.

This fleshes out the documentation a bit so that it is clearer which hooks operate on which parts of the document, and so that it's clearer what arguments are passed to the hooks.
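As an illustration of the signatures described in the suggestion, hooks might be written as below. This is a sketch under assumptions: the hooks are called directly with plain dicts standing in for NotebookNode objects so that the example runs without a kernel, and the hook bodies are hypothetical. Since the hooks are described as traitlets, they would presumably be passed to the client as keyword arguments (for example on_cell_start=on_cell_start), but that call is not shown in this thread.

```python
log = []


def on_execution_start(notebook):
    # Notebook-level hook: receives the notebook as the keyword `notebook`.
    log.append(f"notebook with {len(notebook['cells'])} cells")


def on_cell_start(cell, cell_index):
    # Cell-level hook: receives `cell` and `cell_index` as keywords.
    log.append(f"start cell {cell_index} ({cell['cell_type']})")


def on_cell_error(cell, cell_index):
    log.append(f"error in cell {cell_index}: {cell['source']!r}")


# Stand-in notebook; a real NotebookNode also supports this key-based access.
nb = {"cells": [{"cell_type": "code", "source": "1 / 0"}]}

# The arguments are passed by keyword, so parameter names must match exactly.
on_execution_start(notebook=nb)
on_cell_start(cell=nb["cells"][0], cell_index=0)
on_cell_error(cell=nb["cells"][0], cell_index=0)

print(log)
```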

Maybe worth mentioning that the hooks can be async, but don't have to be?

Thanks @choldgraf! Much appreciated.

I removed the line saying that the functions can be async. If they can be either async or sync, I don't see much point in mentioning it.

choldgraf · 2022-01-13T20:29:13Z

docs/client.rst

@@ -96,6 +96,15 @@ on both versions. Here the traitlet ``kernel_name`` helps simplify and
 maintain consistency: we can just run a notebook twice, specifying first
 "python2" and then "python3" as the kernel name.

+In addition to the two above, we also support traitlets for hooks. They are as


We should also document what kinds of arguments are passed to these functions, or what variables are available to them when they are called (e.g., if I wrote a function for this, how would I access the notebook? the current cell? the traceback of an error message if it failed to execute?)

choldgraf · 2022-01-13T21:18:25Z

nbclient/client.py

@@ -426,6 +483,7 @@ async def async_start_new_kernel_client(self) -> KernelClient:
            await self._async_cleanup_kernel()
            raise
        self.kc.allow_stdin = False
+        await run_hook(self.on_execution_start)


Suggested change

-        await run_hook(self.on_execution_start)
+        await run_hook(self.on_execution_start, notebook=self.nb)

What about something like this so that people have access to the NotebookNode before execution?

nbclient/client.py

choldgraf · 2022-01-13T21:25:38Z

nbclient/client.py

@@ -426,6 +483,7 @@ async def async_start_new_kernel_client(self) -> KernelClient:
            await self._async_cleanup_kernel()
            raise
        self.kc.allow_stdin = False
+        await run_hook(self.on_execution_start)


For symmetry's sake, why not also include two extra notebook-level hooks, on_notebook_complete and on_notebook_error, that map onto the two cell execution hooks but at the notebook level?

That way you would have the same basic pattern for both the notebook and each cell in the notebook:

- a hook for just before execution happens (on_notebook_start, on_cell_start)
- a hook for when execution completes (on_notebook_complete, on_cell_complete)
- a hook for when execution has an error (on_notebook_error, on_cell_error)

Added on_notebook_error, although I'm not sure if I did that part correctly. The only time I've personally seen the notebook fail is when there's a RuntimeError, so that's where I've placed the hook.

fair enough - and feel free to push back if you think that this is "too many hooks all at once" 😅 on the one hand I like symmetry and intentional design for extension points. On the other hand, I also believe in not building things until somebody actively wants it to be built :-) so if you think this is too much complexity, another option could be to just add comments for where you think the extra hooks could go, and wait to build them until a user explicitly asks for it.
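The symmetric notebook-level pattern proposed in this thread could look roughly like the sketch below. It is a stand-in, not nbclient's implementation: run_notebook, the execute_cell callback, and the dict-based notebook are hypothetical, and catching RuntimeError mirrors the placement described in the reply above.

```python
def run_notebook(notebook, execute_cell, on_notebook_start=None,
                 on_notebook_complete=None, on_notebook_error=None):
    # Fires once, before any cell is executed.
    if on_notebook_start is not None:
        on_notebook_start(notebook=notebook)
    try:
        for cell_index, cell in enumerate(notebook["cells"]):
            execute_cell(cell, cell_index)
    except RuntimeError:
        # Fires when the run as a whole fails; the error still propagates.
        if on_notebook_error is not None:
            on_notebook_error(notebook=notebook)
        raise
    # Fires only if every cell ran without raising.
    if on_notebook_complete is not None:
        on_notebook_complete(notebook=notebook)


events = []
hooks = dict(
    on_notebook_start=lambda notebook: events.append("start"),
    on_notebook_complete=lambda notebook: events.append("complete"),
    on_notebook_error=lambda notebook: events.append("error"),
)


def ok(cell, cell_index):
    pass


def failing(cell, cell_index):
    raise RuntimeError("kernel died")


nb = {"cells": [{"cell_type": "code", "source": "1 + 1"}]}
run_notebook(nb, ok, **hooks)
try:
    run_notebook(nb, failing, **hooks)
except RuntimeError:
    events.append("re-raised")

print(events)
```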

docs/client.rst

nbclient/client.py

davidbrochart · 2022-01-20T08:20:10Z

nbclient/tests/test_client.py

-        ]
-        loop = asyncio.get_event_loop()
-        loop.run_until_complete(asyncio.gather(*tasks))
+        [async_run_notebook(input_file.format(label=label), opts, res) for label in ("A", "B")]


You removed the execution of these tasks, can you explain why?

I think this was a mistake on my part when I was trying to investigate some failures. I've added those changes back in.
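The reason the removed lines matter: a list comprehension over an async function only builds coroutine objects, and nothing runs until they are awaited or gathered. A minimal sketch, with a made-up run_one coroutine standing in for async_run_notebook:

```python
import asyncio

results = []


async def run_one(label):
    # Stand-in for async_run_notebook; it only records that it ran.
    results.append(label)


# This creates two coroutine objects but executes neither of them.
tasks = [run_one(label) for label in ("A", "B")]
ran_before_gather = list(results)


async def main():
    # gather schedules the coroutines and waits for both to finish.
    await asyncio.gather(*tasks)


asyncio.run(main())
print(ran_before_gather, results)
```

Without the gather step the test would pass trivially, since no notebook would ever be executed.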

davidbrochart · 2022-01-20T08:20:27Z

nbclient/tests/test_client.py

-    tasks = [async_run_notebook(input_file, opts, res) for i in range(4)]
-    loop = asyncio.get_event_loop()
-    loop.run_until_complete(asyncio.gather(*tasks))
+    [async_run_notebook(input_file, opts, res) for i in range(4)]


davidbrochart

Thanks a lot @devintang3, that looks good to me. A couple of notes, but I think this PR can be merged anyway:

- maybe use in the tests: hooks = [MagicMock() for i in range(7)].
- have all the hook help strings start with "A callable which executes...".

devintang3 · 2022-01-25T23:37:46Z

maybe use in the tests: hooks = [MagicMock() for i in range(7)].

have all the hook help strings start with A callable which executes....

That makes sense to me. I've updated the tests to do a loop and updated the help strings as well
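A loop-based test along the lines suggested above might look like this sketch. The seven hook names are an assumption pieced together from names mentioned in this thread, and fake_successful_run is a hypothetical stand-in for executing a notebook, so the example runs without a kernel.

```python
from unittest.mock import MagicMock

# Assumed set of seven hooks; the thread does not list them explicitly.
HOOK_NAMES = [
    "on_notebook_start", "on_notebook_complete", "on_notebook_error",
    "on_cell_start", "on_cell_execute", "on_cell_complete", "on_cell_error",
]
hooks = {name: MagicMock() for name in HOOK_NAMES}


def fake_successful_run(hooks):
    # Stand-in for executing a one-cell notebook that raises no errors.
    nb = {"cells": [{"cell_type": "code", "source": "1 + 1"}]}
    hooks["on_notebook_start"](notebook=nb)
    for index, cell in enumerate(nb["cells"]):
        hooks["on_cell_start"](cell=cell, cell_index=index)
        hooks["on_cell_execute"](cell=cell, cell_index=index)
        hooks["on_cell_complete"](cell=cell, cell_index=index)
    hooks["on_notebook_complete"](notebook=nb)


fake_successful_run(hooks)

# On a successful run, every hook except the error hooks fires exactly once.
for name, hook in hooks.items():
    if name.endswith("_error"):
        hook.assert_not_called()
    else:
        hook.assert_called_once()
```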

davidbrochart · 2022-01-31T22:14:34Z

Thanks @devintang3 !

choldgraf · 2022-02-01T00:29:20Z

🚀

devintang3 force-pushed the client-hooks branch from 85ed4b0 to 4f46196 · December 29, 2021 19:39

davidbrochart requested changes · Jan 3, 2022

devintang3 force-pushed the client-hooks branch from 4f46196 to 8b54421 · January 8, 2022 00:07

davidbrochart requested changes · Jan 12, 2022

devintang3 force-pushed the client-hooks branch 2 times, most recently from 207cac2 to 1de1dbd · January 13, 2022 01:31

davidbrochart mentioned this pull request · Jan 13, 2022: PoP for Inline variable insertion in markdown #160 (closed)

chrisjsewell reviewed · Jan 13, 2022

choldgraf reviewed · Jan 13, 2022

Add basic hooks during execution

c68b700

This will enable tracking of execution process without subclassing the way papermill does.

devintang3 force-pushed the client-hooks branch from 1de1dbd to 3892c04 · January 19, 2022 23:27

davidbrochart requested changes · Jan 20, 2022

devintang3 force-pushed the client-hooks branch from 3892c04 to 20f4289 · January 24, 2022 22:46

davidbrochart approved these changes · Jan 25, 2022

Rebased with master and added tests

51c3ece

run_hook is now async, and util was renamed to test_util so that it gets picked up by pytest. Also added new hooks: on_notebook_error, on_cell_execution. Updated docs.

devintang3 force-pushed the client-hooks branch from 20f4289 to 51c3ece · January 25, 2022 23:36

davidbrochart approved these changes · Jan 26, 2022

davidbrochart merged commit 20081a6 into jupyter:master · Jan 31, 2022

choldgraf mentioned this pull request · Feb 4, 2022: Refactor for clarity that deployer/tests aren't just developer tests 2i2c-org/infrastructure#971 (closed)
