Description
In the case of decision trees with only a single root node, the traverse function called by tuple_tree_conversion assumes the root node has children:
def traverse(tree, visited):
    for child in tree:
        visit = find_to_condition(visited, child)
        if len(child.children) > 0:
            visit = traverse(child.children, visited)
    return visit
This leads to the following exception for a tree that consists of only a single root node with no children:
Traceback (most recent call last):
  File "/home/anj/src/paper-decision-trees/tree_diff2/experiment2.py", line 190, in eval_keep_regrow
    similarity = rule_set_similarity(tuple_tree_conversion(full_clf), tuple_tree_conversion(batch_tree))
  File "/home/anj/src/paper-decision-trees/tree_diff2/tree_diff/tree_ruleset_conversion.py", line 69, in tuple_tree_conversion
    expected = link_dict_keys(traverse(tree.children, visited))
  File "/home/anj/src/paper-decision-trees/tree_diff2/tree_diff/tree_ruleset_conversion.py", line 38, in traverse
    return visit
UnboundLocalError: local variable 'visit' referenced before assignment
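The failure can be reproduced in isolation. The following is a minimal sketch, not the project code: traverse_like is a hypothetical stand-in with the same loop shape as traverse, and the assignment inside the loop replaces the real call to find_to_condition.

```python
def traverse_like(tree, visited):
    # Same shape as traverse: `visit` is only assigned inside the loop body.
    for child in tree:
        visit = visited  # hypothetical stand-in for find_to_condition(visited, child)
    return visit  # unbound if the loop body never ran


try:
    # A root node without children passes an empty list, so the loop is skipped.
    traverse_like([], {})
    raised = False
except UnboundLocalError:
    raised = True
```

Because the loop body never executes for an empty list, the name visit is never bound before the return statement reads it.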
The traverse function (and associated functions) also need code cleanup:
- The find_to_condition function called by traverse modifies the visited dictionary in place, so traverse could just return visited (which will be the same as visit).
- The find_to_condition and link_dict_keys functions create dictionary keys through string manipulation, which is difficult to make sense of. At a minimum they need documentation (code comments).
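The first cleanup point above would also remove the exception, since visited is always bound. The following is a sketch under assumptions rather than a patch: Node and the body of find_to_condition are hypothetical placeholders (the real function builds string keys), and only the in-place update of visited is taken from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for the project's tree node type."""
    name: str
    children: list = field(default_factory=list)


def find_to_condition(visited, child):
    """Placeholder: updates `visited` in place, as the real function is said to do."""
    visited[child.name] = [c.name for c in child.children]
    return visited


def traverse(tree, visited):
    """Walk `tree` (a list of nodes) depth-first, filling `visited` in place."""
    for child in tree:
        find_to_condition(visited, child)
        if child.children:
            traverse(child.children, visited)
    # `visited` is a parameter, so it is bound even when `tree` is empty.
    return visited


# Single root node with no children: no exception, `visited` comes back unchanged.
single_node_result = traverse(Node("root").children, {})

# Small tree: root -> (a -> b), c
root = Node("root", [Node("a", [Node("b")]), Node("c")])
nested_result = traverse(root.children, {})
```

Whether link_dict_keys handles the resulting empty dictionary sensibly for a single-node tree is not established here and would need to be checked separately.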
In the case of EFDT (which uses a different conversion function), rule_set_similarity throws a division by zero error when ruleset1 contains no rules.
def rule_set_similarity(ruleset1: Ruleset, ruleset2: Ruleset):
    …
    l = len(ruleset1.rules)
    …
    return sum(sim_d_list) / l
This leads to a ZeroDivisionError (the code has been wrapped in a try/except block as a temporary workaround):
Warn: caught division by zero in rule_set_similarity
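An explicit guard could replace the try/except workaround. The sketch below is illustrative only: mean_similarity and its empty_value parameter are hypothetical names, and returning 0.0 for an empty ruleset is an assumption, since the correct similarity for that case is a design decision this issue does not settle.

```python
def mean_similarity(sim_d_list, n_rules, empty_value=0.0):
    """Average the per-rule similarities, guarding against an empty ruleset.

    `empty_value` is an assumed convention for the zero-rule case.
    """
    if n_rules == 0:
        return empty_value
    return sum(sim_d_list) / n_rules


empty_case = mean_similarity([], 0)
normal_case = mean_similarity([0.5, 1.0], 2)
```

Making the empty case explicit keeps the choice of value visible at the call site instead of hiding it behind a caught exception.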
See the following notebook for a demonstration of the issue: https://github.com/a2i2/tree_diff/blob/re-evaluation/notebooks/Similarity%20Score%20Issues%20-%20Exception%20for%20trees%20with%20single%20node.ipynb