-
Notifications
You must be signed in to change notification settings - Fork 914
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
create line magic to debug a node in notebook workflow #3510
Changes from all commits
04410b2
a79386b
a93e5cc
b251fb7
4b31eb9
6c47fe4
feba68b
6dd21fb
d0c0633
ba8fb09
7b041a9
702e017
36a0e2a
dbac325
ff10816
0145154
8a86403
474c0fc
be99c0e
72411be
b4be9a7
1819261
3c343b2
12e7f04
4e95bde
e1a6f39
17e47d1
e8c85c9
a46139e
9487911
6c69a36
d1bb8e1
80e06ad
43a70b6
68c7b97
d3163ed
9adf9ab
a92c462
c5fc1fc
688a4e6
87b43a2
25dec21
fae8aa7
70dc53a
7581c56
28ba1ba
0d69fc6
9d12c8d
78ff496
dbbe81b
e39942c
efb6fa8
de6198d
f4c69cd
c4cb40c
3e9cbf1
b16dda9
c9dc606
bb50792
d4d14d0
fa2951c
90fe58f
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,11 +4,13 @@ | |
""" | ||
from __future__ import annotations | ||
|
||
import inspect | ||
import logging | ||
import sys | ||
import typing | ||
import warnings | ||
from pathlib import Path | ||
from typing import Any | ||
from typing import Any, Callable | ||
|
||
import IPython | ||
from IPython.core.magic import needs_local_scope, register_line_magic | ||
|
@@ -19,11 +21,13 @@ | |
from kedro.framework.cli.utils import ENV_HELP, _split_params | ||
from kedro.framework.project import ( | ||
LOGGING, # noqa: F401 | ||
_ProjectPipelines, | ||
configure_project, | ||
pipelines, | ||
) | ||
from kedro.framework.session import KedroSession | ||
from kedro.framework.startup import _is_project, bootstrap_project | ||
from kedro.pipeline.node import Node | ||
|
||
logger = logging.getLogger(__name__) | ||
|
||
|
@@ -36,6 +40,9 @@ def load_ipython_extension(ipython: Any) -> None: | |
See https://ipython.readthedocs.io/en/stable/config/extensions/index.html | ||
""" | ||
ipython.register_magic_function(magic_reload_kedro, magic_name="reload_kedro") | ||
logger.info("Registered line magic 'reload_kedro'") | ||
ipython.register_magic_function(magic_load_node, magic_name="load_node") | ||
logger.info("Registered line magic 'load_node'") | ||
|
||
if _find_kedro_project(Path.cwd()) is None: | ||
logger.warning( | ||
|
@@ -178,3 +185,134 @@ def _find_kedro_project(current_dir: Path) -> Any: # pragma: no cover | |
current_dir = current_dir.parent | ||
|
||
return None | ||
|
||
|
||
@typing.no_type_check | ||
@magic_arguments() | ||
@argument( | ||
"node", | ||
type=str, | ||
help=("Name of the Node."), | ||
nargs="?", | ||
default=None, | ||
) | ||
def magic_load_node(node: str) -> None: | ||
"""The line magic %load_node <node_name> | ||
Currently it only supports Jupyter Notebook (>7.0) and Jupyter Lab. This line magic | ||
will generate code in multiple cells to load datasets from `DataCatalog`, import | ||
relevant functions and modules, node function definition and a function call. | ||
""" | ||
cells = _load_node(node, pipelines) | ||
from ipylab import JupyterFrontEnd | ||
|
||
app = JupyterFrontEnd() | ||
|
||
def _create_cell_with_text(text: str) -> None: | ||
# Noted this only works with Notebook >7.0 or Jupyter Lab. It doesn't work with | ||
# VS Code Notebook due to imcompatible backends. | ||
app.commands.execute("notebook:insert-cell-below") | ||
app.commands.execute("notebook:replace-selection", {"text": text}) | ||
|
||
for cell in cells: | ||
_create_cell_with_text(cell) | ||
|
||
|
||
def _load_node(node_name: str, pipelines: _ProjectPipelines) -> list[str]: | ||
"""Prepare the code to load dataset from catalog, import statements and function body. | ||
|
||
Args: | ||
node_name (str): The name of the node. | ||
|
||
Returns: | ||
list[str]: A list of string which is the generated code, each string represent a | ||
notebook cell. | ||
""" | ||
warnings.warn( | ||
"This is an experimental feature, only Jupyter Notebook (>7.0) & Jupyter Lab " | ||
"are supported. If you encounter unexpected behaviour or would like to suggest " | ||
"feature enhancements, add it under this github issue https://github.com/kedro-org/kedro/issues/3580" | ||
) | ||
node = _find_node(node_name, pipelines) | ||
node_func = node.func | ||
|
||
node_inputs = _prepare_node_inputs(node) | ||
imports = _prepare_imports(node_func) | ||
function_definition = _prepare_function_body(node_func) | ||
function_call = _prepare_function_call(node_func) | ||
|
||
cells: list[str] = [] | ||
cells.append(node_inputs) | ||
cells.append(imports) | ||
cells.append(function_definition) | ||
cells.append(function_call) | ||
return cells | ||
|
||
|
||
def _find_node(node_name: str, pipelines: _ProjectPipelines) -> Node: | ||
for pipeline in pipelines.values(): | ||
try: | ||
found_node: Node = pipeline.filter(node_names=[node_name]).nodes[0] | ||
return found_node | ||
except ValueError: | ||
continue | ||
# If reached the node was not found in the project | ||
raise ValueError( | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I wonder if we want to support There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This was briefly discussed in last TD. @AhdraMeraliQB originally have an implementation. It runs into issue if function is used twice, nodes are unique but functions do not have to. Fundamentally we want to fix this with the default node name but this was decided to be fixed later. @AhdraMeraliQB I can't find your original PR/commit, feel free to supplement There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. really good reasoning, perhaps we could support full classpath |
||
f"Node with name='{node_name}' not found in any pipelines. Remember to specify the node name, not the node function." | ||
) | ||
|
||
|
||
def _prepare_imports(node_func: Callable) -> str: | ||
"""Prepare the import statements for loading a node.""" | ||
python_file = inspect.getsourcefile(node_func) | ||
logger.info(f"Loading node definition from {python_file}") | ||
|
||
# Confirm source file was found | ||
if python_file: | ||
import_statement = [] | ||
with open(python_file) as file: | ||
# Parse any line start with from or import statement | ||
for line in file.readlines(): | ||
if line.startswith("from") or line.startswith("import"): | ||
import_statement.append(line.strip()) | ||
|
||
clean_imports = "\n".join(import_statement).strip() | ||
return clean_imports | ||
else: | ||
raise FileNotFoundError(f"Could not find {node_func.__name__}") | ||
|
||
|
||
def _prepare_node_inputs(node: Node) -> str: | ||
node_func = node.func | ||
signature = inspect.signature(node_func) | ||
|
||
node_inputs = node.inputs | ||
func_params = list(signature.parameters) | ||
|
||
statements = [ | ||
"# Prepare necessary inputs for debugging", | ||
"# All debugging inputs must be defined in your project catalog", | ||
] | ||
|
||
for node_input, func_param in zip(node_inputs, func_params): | ||
statements.append(f'{func_param} = catalog.load("{node_input}")') | ||
|
||
input_statements = "\n".join(statements) | ||
return input_statements | ||
|
||
|
||
def _prepare_function_body(func: Callable) -> str: | ||
source_lines, _ = inspect.getsourcelines(func) | ||
body = "".join(source_lines) | ||
return body | ||
|
||
|
||
def _prepare_function_call(node_func: Callable) -> str: | ||
"""Prepare the text for the function call.""" | ||
func_name = node_func.__name__ | ||
signature = inspect.signature(node_func) | ||
func_params = list(signature.parameters) | ||
|
||
# Construct the statement of func_name(a=1,b=2,c=3) | ||
func_args = ", ".join(func_params) | ||
body = f"""{func_name}({func_args})""" | ||
return body |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying this on IPython (after naming my node, see above) and got an error:
I would find it confusing if
kedro.ipython
didn't work on IPython, just because of the name. On the other hand, I don't think it makes sense for this to be in IPython at all, because the point is to bring the source code and be able to edit it on a cell - something that can't happen on a REPL.If this is to be a Jupyter-only extension, I think we should have it somewhere else. Maybe
kedro.jupyter
? Or even a separatekedro-jupyter
package?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, there are two different things here:
ipylab
, but I see it's there when the user doespip install kedro[jupyter]
👍🏽%load_node clean_statuses
on IPython and nothing happened:My node function is not in
dir()
:so I stand by my previous point: since this is pointless in IPython, I'd move it somewhere else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#3536
There is a separate ticket for IPython support. Originally I thought the same, but it appears that some people, particular anti-notebook user would still like to have IPython support. It will work slightly different because IPython doesn't have the concept of "cells". It will still be useful if data can get pre-load (or just the code to load the corresponding data from catalog). If you have comments about the design of the feature, feel free to drop comments over there instead.
Re:
kedro.ipython
vskedro.jupyter
- I agree it's a bit confusing, but I think it's okay to leave it inkedro.ipython
since ultimately I plan to add support for IPython. In additionally, we almost use IPython/Jupyter interchangeably because the feature parity is always the same, until now.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. I don't know how feasible this is, but if it's possible to show a warning like "IPython is not yet supported", that would be golden.