Remove tree from PythonSequentialLinter #3535
Merged: JCZuurmond merged 99 commits into main from fix/remove-tree-from-python-sequential-linter on Feb 4, 2025
Commits
b53e0fd Let append tree return None (JCZuurmond)
108dd7b Test bidirectionality of appended trees (JCZuurmond)
8a22d90 Rename append_tree to attach tree (JCZuurmond)
6c0ebcf Clean test for attaching child tree (JCZuurmond)
fd2c761 Rewrite test for module propagation (JCZuurmond)
2c46a01 Rewrite test for not implemented error (JCZuurmond)
ca30dcf Test append globals (JCZuurmond)
1d22084 Narrow not implemented test (JCZuurmond)
48ddf4e Test appending globals during attach tree (JCZuurmond)
1211dda Refactor `append_globals` to `extend_globals` (JCZuurmond)
043fc6b Test appending nodes sets parent on node (JCZuurmond)
b0e39ef Test appending nodes adds nodes to end of body (JCZuurmond)
51358f2 Move append_nodes method up (JCZuurmond)
460e88f Rename append_nodes to attach_nodes (JCZuurmond)
357a2e5 Narrow raising not implemented error test (JCZuurmond)
d9ffd91 Add docstring for attach nodes (JCZuurmond)
a3c023b Change defining sources in test (JCZuurmond)
1e7a4b9 Update constructing sources (JCZuurmond)
249857e Test PythonLinter with dummy advices (JCZuurmond)
ba04281 Test linting unparsable python code (JCZuurmond)
18e9c98 Test sequential linter with dummy advices (JCZuurmond)
fb8d6ee Test linting print(1) sets no globals (JCZuurmond)
96ef1ec Test linting with one global (JCZuurmond)
e5365ef Test linting with two globals (JCZuurmond)
aed3b12 Test linting separate code sources separates globals (JCZuurmond)
ee148eb Test appending globals sets global (JCZuurmond)
84b5f69 Remove SquentialLinter.make_tree (JCZuurmond)
d142f7c Refactor globals linter to fetch globals from body nodes (JCZuurmond)
00aad28 Sort globals for consistent testing (JCZuurmond)
4c1e79e Test dummy DFSA Python collector (JCZuurmond)
f860a67 Test dummy used table Python collector (JCZuurmond)
906ba87 Delete dead code `PythonSequentialLinter.process_child_cell` (JCZuurmond)
feb0a8c Format imports (JCZuurmond)
73991ae Remove Tree from python sequential linter (JCZuurmond)
4cf6d8a Fix type hinting for classmethod with child classes (JCZuurmond)
04fda6f Let tree loading return failure (JCZuurmond)
7e16589 Connect cells using parents (JCZuurmond)
d2db4b6 Pass inherited tree to notebook linter (JCZuurmond)
14b8c45 Format (JCZuurmond)
5c28fc9 Disable test that does not reflect a realistic scenario (JCZuurmond)
1b8f4b5 Pass run cell's tree as parent to the notebook it is running (JCZuurmond)
b7998bc Do not append child nodes to parents body (JCZuurmond)
70aa5bf Use type over Type from type hinting (JCZuurmond)
c81e6b1 Rename attach_nodes to attach_child_nodes in Python analyzer (JCZuurmond)
3fb49a7 Delete test for unrealistic scenario (JCZuurmond)
1ff9891 Rename method to parse trees (JCZuurmond)
273a40d Add tests for notebook linter (JCZuurmond)
7728719 Test for a table migration deprecation advice to be given (JCZuurmond)
772684c Test for notebook cells to consider only code above (JCZuurmond)
2f4c7fa Test inverse of previous commit (JCZuurmond)
5441ec5 Test inverse of reading table from other cell (JCZuurmond)
d6c0afc Format (JCZuurmond)
435f6c8 Let PythonSequentialLinter inherit correctly (JCZuurmond)
fd8f04d Remove PythonSequentialLinter initialization from NotebookLinter init (JCZuurmond)
7209aac Test NotebookLinter to lint parse failure (JCZuurmond)
f2dbc56 Remove redundant if statement (JCZuurmond)
2603d14 Format (JCZuurmond)
23e5a49 Rewrite load children from tree (JCZuurmond)
8cef0ef Remove redundant for-loops (JCZuurmond)
82f3ad1 Remove unused name method (JCZuurmond)
ce23e4f Move load tree from run cell up (JCZuurmond)
7176276 Rename methods for consistency (JCZuurmond)
768706c Rename Python tree cache for clarity (JCZuurmond)
1f44031 Always cache Python trees (JCZuurmond)
b80f2d3 Rename methods for clarity (JCZuurmond)
54895fb Rename variable (JCZuurmond)
3a8bb37 Add docstrings (JCZuurmond)
f46955e Remove redundant tree initialization (JCZuurmond)
be03c3b Return failures for each notebook cell (JCZuurmond)
3646140 Fix expected start and end line (JCZuurmond)
b799076 Fix type hint (JCZuurmond)
da3eb20 Move from_source_code class method to tester class (JCZuurmond)
cf5679d Change elif to if (JCZuurmond)
63a5d22 Merge branch 'main' into fix/remove-tree-from-python-sequential-linter (JCZuurmond)
8113828 Rename inherited tree to parent tree (JCZuurmond)
ae206b8 Test creating run cell from notebook (JCZuurmond)
4b1528f Test infer value from parent's child (JCZuurmond)
5b98b44 Test Python trees simulating notebook running other notebook (JCZuurmond)
2e7663f Test inferring value from grand parent (JCZuurmond)
9701f17 Test using variable from ran child notebook (JCZuurmond)
02e8cb3 Test infer from parent using extend globals (JCZuurmond)
6f8a819 Test infer from grand parent using extend globals (JCZuurmond)
47e3273 Fix test name (JCZuurmond)
e800b7e Test inferring from sibling tree (JCZuurmond)
b648b2f Test simulate using value from child notebook (JCZuurmond)
25549ec Test simulate using value from parent notebook (JCZuurmond)
09470e6 Test propagating module with extend globals (JCZuurmond)
2a16d90 Let NotebookLinter fail early while parsing (JCZuurmond)
5bebad4 Rewrite notebook linter to only extend globals (JCZuurmond)
f25174a Add test showing unresolvable node issue (JCZuurmond)
0c67f3e Pass tree globals to next cells tree (JCZuurmond)
a273997 Merge branch 'main' into fix/remove-tree-from-python-sequential-linter (JCZuurmond)
6360e7b Add assumption to docstring (JCZuurmond)
54607a6 Merge branch 'main' into fix/remove-tree-from-python-sequential-linter (JCZuurmond)
4afb7e9 Add new line to test sources (JCZuurmond)
2c0f32b Get first advice with next (JCZuurmond)
c5a7f28 Let process code node return failure (JCZuurmond)
02e2254 Merge branch 'main' into fix/remove-tree-from-python-sequential-linter (JCZuurmond)
35285e3 Merge branch 'main' into fix/remove-tree-from-python-sequential-linter (JCZuurmond)
253 changes: 119 additions & 134 deletions
src/databricks/labs/ucx/source_code/notebooks/sources.py
Large diffs are not rendered by default.
@@ -20,6 +20,7 @@
     Import,
     ImportFrom,
     Instance,
+    JoinedStr,
     Module,
     Name,
     NodeNG,
@@ -220,30 +221,31 @@ def __repr__(self):
     def attach_child_tree(self, tree: Tree) -> None:
         """Attach a child tree.

-        Attaching a child tree is a **stateful** operation for both the parent and child tree. After attaching a child
-        tree, a tree can be traversed starting from the parent or child tree. From both starting points all nodes in
-        both trees can be reached, though, the order of nodes will be different as that is relative to the starting
-        point.
+        1. Make parent tree of the nodes in the child tree
+        2. Extend parents globals with child globals
+
+        Attaching a child tree is a **stateful** operation for the child tree. After attaching a child
+        tree, the tree can be traversed starting from the child tree as a child knows its parent. However, the tree can
+        not be traversed from the parent tree as that node object does not contain a list with children trees.
         """
         if not isinstance(tree.node, Module):
             raise NotImplementedError(f"Cannot attach child tree: {type(tree.node).__name__}")
         tree_module: Module = cast(Module, tree.node)
-        self.attach_nodes(tree_module.body)
+        self.attach_child_nodes(tree_module.body)
         self.extend_globals(tree_module.globals)

-    def attach_nodes(self, nodes: list[NodeNG]) -> None:
-        """Attach nodes.
+    def attach_child_nodes(self, nodes: list[NodeNG]) -> None:
+        """Attach child nodes.

-        Attaching nodes is a **stateful** operation for both this tree's node, the parent node, and the child nodes.
-        After attaching the nodes, the parent node has the nodes in its body and the child nodes have this tree's node
-        as parent node.
+        Attaching a child tree is a **stateful** operation for the child tree. After attaching a child
+        tree, the tree can be traversed starting from the child tree as a child knows its parent. However, the tree can
+        not be traversed from the parent tree as that node object does not contain a list with children trees.
         """
         if not isinstance(self.node, Module):
             raise NotImplementedError(f"Cannot attach nodes to: {type(self.node).__name__}")
         self_module: Module = cast(Module, self.node)
         for node in nodes:
             node.parent = self_module
-            self_module.body.append(node)

     def extend_globals(self, globs: dict[str, list[NodeNG]]) -> None:
         """Extend globals by extending the global values for each global key.
@@ -559,6 +561,11 @@ def visit_importfrom(self, node: ImportFrom) -> None:
             return
         self._matched_nodes.append(node)

+    def visit_joinedstr(self, node: JoinedStr) -> None:
+        if self._node_type is not JoinedStr:
+            return
+        self._matched_nodes.append(node)
+
     def _matches(self, node: NodeNG, depth: int) -> bool:
         if depth >= len(self._match_nodes):
             return False
@@ -674,7 +681,8 @@ def collect_dfsas(self, source_code: str) -> Iterable[DirectFsAccess]:
     def collect_dfsas_from_tree(self, tree: Tree) -> Iterable[DirectFsAccessNode]: ...


-class PythonSequentialLinter(Linter, DfsaCollector, TableCollector):
+class PythonSequentialLinter(PythonLinter, DfsaPyCollector, TablePyCollector):
+    """A linter for sequencing python linters and collectors."""

     def __init__(
         self,
Review comment (on the PythonSequentialLinter class definition): Big clean up in this class!
@@ -685,74 +693,15 @@ def __init__(
         self._linters = linters
         self._dfsa_collectors = dfsa_collectors
         self._table_collectors = table_collectors
-        self._tree: Tree | None = None

-    def lint(self, code: str) -> Iterable[Advice]:
-        maybe_tree = self._parse_and_append(code)
-        if maybe_tree.failure:
-            yield maybe_tree.failure
-            return
-        assert maybe_tree.tree is not None
-        yield from self.lint_tree(maybe_tree.tree)
-
     def lint_tree(self, tree: Tree) -> Iterable[Advice]:
         for linter in self._linters:
             yield from linter.lint_tree(tree)

-    def _parse_and_append(self, code: str) -> MaybeTree:
-        maybe_tree = MaybeTree.from_source_code(code)
-        if maybe_tree.failure:
-            return maybe_tree
-        assert maybe_tree.tree is not None
-        self.append_tree(maybe_tree.tree)
-        return maybe_tree
-
-    def append_tree(self, tree: Tree) -> None:
-        self._make_tree().attach_child_tree(tree)
-
-    def append_nodes(self, nodes: list[NodeNG]) -> None:
-        self._make_tree().attach_nodes(nodes)
-
-    def append_globals(self, globs: dict) -> None:
-        self._make_tree().extend_globals(globs)
-
-    def process_child_cell(self, code: str) -> None:
-        this_tree = self._make_tree()
-        maybe_tree = MaybeTree.from_source_code(code)
-        if maybe_tree.failure:
-            # TODO: bubble up this error
-            logger.warning(maybe_tree.failure.message)
-            return
-        assert maybe_tree.tree is not None
-        this_tree.attach_child_tree(maybe_tree.tree)
-
-    def collect_dfsas(self, source_code: str) -> Iterable[DirectFsAccess]:
-        maybe_tree = self._parse_and_append(source_code)
-        if maybe_tree.failure:
-            logger.warning(maybe_tree.failure.message)
-            return
-        assert maybe_tree.tree is not None
-        for dfsa_node in self.collect_dfsas_from_tree(maybe_tree.tree):
-            yield dfsa_node.dfsa
-
     def collect_dfsas_from_tree(self, tree: Tree) -> Iterable[DirectFsAccessNode]:
         for collector in self._dfsa_collectors:
             yield from collector.collect_dfsas_from_tree(tree)

-    def collect_tables(self, source_code: str) -> Iterable[UsedTable]:
-        maybe_tree = self._parse_and_append(source_code)
-        if maybe_tree.failure:
-            logger.warning(maybe_tree.failure.message)
-            return
-        assert maybe_tree.tree is not None
-        for table_node in self.collect_tables_from_tree(maybe_tree.tree):
-            yield table_node.table
-
     def collect_tables_from_tree(self, tree: Tree) -> Iterable[UsedTableNode]:
         for collector in self._table_collectors:
             yield from collector.collect_tables_from_tree(tree)
-
-    def _make_tree(self) -> Tree:
-        if self._tree is None:
-            self._tree = Tree.new_module()
-        return self._tree
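The last hunk drops the private `_tree` and its helpers from `PythonSequentialLinter`, but keeps the loops that forward a tree to each linter and collector. A minimal sketch of that sequencing pattern follows. It uses the standard library `ast` module instead of the project's `Tree`, and plain strings instead of `Advice`; `TreeLinter`, `PrintLinter`, `AssignLinter` and `SequentialLinter` are hypothetical names for illustration only, not UCX classes.

```python
import ast
from collections.abc import Iterable
from typing import Protocol


class TreeLinter(Protocol):
    """Hypothetical interface: anything that can lint an already parsed tree."""

    def lint_tree(self, tree: ast.Module) -> Iterable[str]: ...


class PrintLinter:
    """Toy linter that reports calls to print()."""

    def lint_tree(self, tree: ast.Module) -> Iterable[str]:
        for node in ast.walk(tree):
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name) and node.func.id == "print":
                yield f"line {node.lineno}: print call"


class AssignLinter:
    """Toy linter that reports top-level assignments to simple names."""

    def lint_tree(self, tree: ast.Module) -> Iterable[str]:
        for node in tree.body:
            if isinstance(node, ast.Assign):
                for target in node.targets:
                    if isinstance(target, ast.Name):
                        yield f"line {node.lineno}: assigns {target.id}"


class SequentialLinter:
    """Runs each linter in order over the same tree and stores no tree itself."""

    def __init__(self, linters: list[TreeLinter]) -> None:
        self._linters = linters

    def lint(self, code: str) -> Iterable[str]:
        # Parse once; a parse failure is reported as advice instead of raising.
        try:
            tree = ast.parse(code)
        except SyntaxError as e:
            yield f"parse failure: {e.msg}"
            return
        yield from self.lint_tree(tree)

    def lint_tree(self, tree: ast.Module) -> Iterable[str]:
        for linter in self._linters:
            yield from linter.lint_tree(tree)


linter = SequentialLinter([PrintLinter(), AssignLinter()])
advices = list(linter.lint("a = 1\nprint(a)\n"))
repeated = list(linter.lint("a = 1\nprint(a)\n"))
failure = list(linter.lint("def (:"))
```

Because the sketch keeps no tree between calls, linting the same source twice yields the same advices; any state shared between sources has to be passed in by the caller.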
This is a notable change! It avoids duplicates as adding nodes to the body of the parent tree duplicates the nodes (over all trees)
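The duplication this comment describes can be illustrated with a toy model. The sketch below is not astroid and not the project's `Tree` class; `ToyModule`, `ToyNode`, `parse_toy`, `count_body_nodes` and the two attach functions are hypothetical names. It assumes only what the diff shows: attaching sets `node.parent`, the old version also appended each node to the parent's body, and globals are extended per key.

```python
from typing import Optional


class ToyModule:
    """Hypothetical stand-in for a module node: a body plus a globals mapping."""

    def __init__(self) -> None:
        self.body: list["ToyNode"] = []
        self.module_globals: dict[str, list["ToyNode"]] = {}


class ToyNode:
    """Hypothetical stand-in for a statement node that knows its parent."""

    def __init__(self, name: str, parent: Optional[ToyModule] = None) -> None:
        self.name = name
        self.parent = parent


def parse_toy(names: list[str]) -> ToyModule:
    """Build a module with one node per name, each registered as a global."""
    module = ToyModule()
    for name in names:
        node = ToyNode(name, parent=module)
        module.body.append(node)
        module.module_globals.setdefault(name, []).append(node)
    return module


def attach_with_append(parent: ToyModule, nodes: list[ToyNode]) -> None:
    """Old behavior in the diff: set the parent and append to the parent's body."""
    for node in nodes:
        node.parent = parent
        parent.body.append(node)


def attach_parent_only(parent: ToyModule, nodes: list[ToyNode]) -> None:
    """New behavior in the diff: set the parent, leave the parent's body alone."""
    for node in nodes:
        node.parent = parent


def extend_globals(module: ToyModule, globs: dict[str, list[ToyNode]]) -> None:
    """Extend the list of nodes stored under each global name."""
    for name, nodes in globs.items():
        module.module_globals.setdefault(name, []).extend(nodes)


def count_body_nodes(modules: list[ToyModule]) -> int:
    """Count nodes reachable through the bodies of all given modules."""
    return sum(len(module.body) for module in modules)


# Old behavior: the child's node is listed in two bodies.
old_parent, old_child = ToyModule(), parse_toy(["a"])
attach_with_append(old_parent, old_child.body)
old_count = count_body_nodes([old_parent, old_child])

# New behavior: the node stays in one body; the parent learns the name via globals.
new_parent, new_child = ToyModule(), parse_toy(["a"])
attach_parent_only(new_parent, new_child.body)
extend_globals(new_parent, new_child.module_globals)
new_count = count_body_nodes([new_parent, new_child])
```

In this model, walking the bodies of all trees visits the node twice under the old behavior and once under the new one, while the child node can still reach the parent through `node.parent`.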