Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-106922: Support multi-line error locations in traceback (attempt 2) #112097

Merged
merged 6 commits into from
Dec 1, 2023

Conversation

williamwen42
Copy link
Contributor

@williamwen42 williamwen42 commented Nov 15, 2023

Attempt 2 at #109589 - the C changes are no longer required, so only python files and tests are touched. Otherwise, the method to output multi-line errors remains unchanged compared to the original PR.

@bedevere-app
Copy link

bedevere-app bot commented Nov 15, 2023

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@williamwen42
Copy link
Contributor Author

cc @pablogsal @1st1 @lysnikolaou

@pablogsal
Copy link
Member

@williamwen42 Seems that there are a bunch of doctests that still fail, can you take a look?

@williamwen42 williamwen42 force-pushed the multiline-traceback-attempt-2 branch from 9d9a510 to 4094c93 Compare November 17, 2023 19:32
Copy link
Member

@pablogsal pablogsal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks very good. Great work. I made a first pass over the code.

Lib/traceback.py Outdated Show resolved Hide resolved
Lib/traceback.py Outdated Show resolved Hide resolved
@@ -328,13 +332,27 @@ def _original_line(self):
self.line
return self._line

@property
def _dedented_lines(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also strips the line, not only dedents, no?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lib/traceback.py Outdated Show resolved Hide resolved
Lib/traceback.py Outdated Show resolved Hide resolved
Lib/traceback.py Outdated Show resolved Hide resolved
result = []

# only display first line, last line, and lines around anchor start/end
significant_lines = {0, len(all_lines) - 1}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not very convinced about this significant_lines treatment. I have the following concerns:

  • It increases the complexity of the traceback
  • There are plenty of cases where this hides seemingly arbitrary lines. Consider this code:
def foo(a,b,c,d,e,f,g):
    bar(b)

def bar(b):
    1/b

foo(
        1,
        0,
        3,
        4,
        5,
        6,
        7)

Currently this shows:

Traceback (most recent call last):
  File "/Users/pgalindo3/github/python/main/lel.py", line 7, in <module>
    foo(
    ~~~^
            1,
            ^^
    ...<4 lines>...
            6,
            ^^
            7)
            ^^
  File "/Users/pgalindo3/github/python/main/lel.py", line 2, in foo
    bar(b)
    ~~~^^^
  File "/Users/pgalindo3/github/python/main/lel.py", line 5, in bar
    1/b
    ~^~
ZeroDivisionError: division by zero

but it's not clear to me why we are hiding 4 lines there. Why are the arguments provided there not important? If you look closely what's causing the exception is that b is 0, but that's not being shown in the call to foo, but is the most important part of the call.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true, although I agree with the general direction of the code to hide some lines, in case the error spans many lines and the traceback becomes unreadable. Should we maybe think of another heuristic to figure out which lines to show and which not to show?

Copy link
Contributor Author

@williamwen42 williamwen42 Nov 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will need some kind of heuristic - I encountered a case where a traceback through a with statement results in the entire block being printed because the instruction being traced points to the entire block.

Perhaps we can show all lines only for binary ops/subscripts/function calls?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have an example of the with statement out of curiosity?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code:

class C:
    def __enter__(self):
        raise RuntimeError("fail")

    def __exit__(self, *excinfo):
        pass

with C():
    # comment
    # comment
    # comment
    # comment
    # comment
    # comment
    # comment
    # comment
    # comment
    pass

Output:

Traceback (most recent call last):
  File "/data/users/williamwen/cpython/local_test.py", line 8, in <module>
    with C():
    ...<9 lines>...
        pass
  File "/data/users/williamwen/cpython/local_test.py", line 3, in __enter__
    raise RuntimeError("fail")
RuntimeError: fail

So if we don't skip lines, we would print out all those comments.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hummm, I think this may be a bug in the code locations more than a problem in the implementation. But I am fine going with this version initially and refine it if people find it confusing. @lysnikolaou WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, let's go with this and we can refine it in the future if the need arises.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, looks like all the bytecode instruction for the with include all the comments:

[]
Instruction(opname='RESUME', opcode=151, arg=0, argval=0, argrepr='', offset=0, starts_line=0, is_jump_target=False, positions=Positions(lineno=0, end_lineno=1, col_offset=0, end_col_offset=0))
['class C:', '    def __enter__(self):', '        raise RuntimeError("fail")', '', '    def __exit__(self, *excinfo):']
Instruction(opname='PUSH_NULL', opcode=2, arg=None, argval=None, argrepr='', offset=2, starts_line=1, is_jump_target=False, positions=Positions(lineno=1, end_lineno=6, col_offset=0, end_col_offset=12))
['class C:', '    def __enter__(self):', '        raise RuntimeError("fail")', '', '    def __exit__(self, *excinfo):']
Instruction(opname='LOAD_BUILD_CLASS', opcode=71, arg=None, argval=None, argrepr='', offset=4, starts_line=None, is_jump_target=False, positions=Positions(lineno=1, end_lineno=6, col_offset=0, end_col_offset=12))
['class C:', '    def __enter__(self):', '        raise RuntimeError("fail")', '', '    def __exit__(self, *excinfo):']
Instruction(opname='LOAD_CONST', opcode=100, arg=0, argval=<code object C at 0x1049211b0, file "<string>", line 1>, argrepr='<code object C at 0x1049211b0, file "<string>", line 1>', offset=6, starts_line=None, is_jump_target=False, positions=Positions(lineno=1, end_lineno=6, col_offset=0, end_col_offset=12))
['class C:', '    def __enter__(self):', '        raise RuntimeError("fail")', '', '    def __exit__(self, *excinfo):']
Instruction(opname='MAKE_FUNCTION', opcode=132, arg=0, argval=0, argrepr='', offset=8, starts_line=None, is_jump_target=False, positions=Positions(lineno=1, end_lineno=6, col_offset=0, end_col_offset=12))
['class C:', '    def __enter__(self):', '        raise RuntimeError("fail")', '', '    def __exit__(self, *excinfo):']
Instruction(opname='LOAD_CONST', opcode=100, arg=1, argval='C', argrepr="'C'", offset=10, starts_line=None, is_jump_target=False, positions=Positions(lineno=1, end_lineno=6, col_offset=0, end_col_offset=12))
['class C:', '    def __enter__(self):', '        raise RuntimeError("fail")', '', '    def __exit__(self, *excinfo):']
Instruction(opname='PRECALL', opcode=166, arg=2, argval=2, argrepr='', offset=12, starts_line=None, is_jump_target=False, positions=Positions(lineno=1, end_lineno=6, col_offset=0, end_col_offset=12))
['class C:', '    def __enter__(self):', '        raise RuntimeError("fail")', '', '    def __exit__(self, *excinfo):']
Instruction(opname='CALL', opcode=171, arg=2, argval=2, argrepr='', offset=16, starts_line=None, is_jump_target=False, positions=Positions(lineno=1, end_lineno=6, col_offset=0, end_col_offset=12))
['class C:', '    def __enter__(self):', '        raise RuntimeError("fail")', '', '    def __exit__(self, *excinfo):']
Instruction(opname='STORE_NAME', opcode=90, arg=0, argval='C', argrepr='C', offset=26, starts_line=None, is_jump_target=False, positions=Positions(lineno=1, end_lineno=6, col_offset=0, end_col_offset=12))
[]
Instruction(opname='PUSH_NULL', opcode=2, arg=None, argval=None, argrepr='', offset=28, starts_line=8, is_jump_target=False, positions=Positions(lineno=8, end_lineno=8, col_offset=5, end_col_offset=6))
[]
Instruction(opname='LOAD_NAME', opcode=101, arg=0, argval='C', argrepr='C', offset=30, starts_line=None, is_jump_target=False, positions=Positions(lineno=8, end_lineno=8, col_offset=5, end_col_offset=6))
[]
Instruction(opname='PRECALL', opcode=166, arg=0, argval=0, argrepr='', offset=32, starts_line=None, is_jump_target=False, positions=Positions(lineno=8, end_lineno=8, col_offset=5, end_col_offset=8))
[]
Instruction(opname='CALL', opcode=171, arg=0, argval=0, argrepr='', offset=36, starts_line=None, is_jump_target=False, positions=Positions(lineno=8, end_lineno=8, col_offset=5, end_col_offset=8))
['with C():', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment']
Instruction(opname='BEFORE_WITH', opcode=53, arg=None, argval=None, argrepr='', offset=46, starts_line=None, is_jump_target=False, positions=Positions(lineno=8, end_lineno=18, col_offset=0, end_col_offset=8))
['with C():', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment']
Instruction(opname='POP_TOP', opcode=1, arg=None, argval=None, argrepr='', offset=48, starts_line=None, is_jump_target=False, positions=Positions(lineno=8, end_lineno=18, col_offset=0, end_col_offset=8))
[]
Instruction(opname='NOP', opcode=9, arg=None, argval=None, argrepr='', offset=50, starts_line=18, is_jump_target=False, positions=Positions(lineno=18, end_lineno=18, col_offset=4, end_col_offset=8))
['with C():', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment']
Instruction(opname='LOAD_CONST', opcode=100, arg=2, argval=None, argrepr='None', offset=52, starts_line=8, is_jump_target=False, positions=Positions(lineno=8, end_lineno=18, col_offset=0, end_col_offset=8))
['with C():', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment']
Instruction(opname='LOAD_CONST', opcode=100, arg=2, argval=None, argrepr='None', offset=54, starts_line=None, is_jump_target=False, positions=Positions(lineno=8, end_lineno=18, col_offset=0, end_col_offset=8))
['with C():', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment']
Instruction(opname='LOAD_CONST', opcode=100, arg=2, argval=None, argrepr='None', offset=56, starts_line=None, is_jump_target=False, positions=Positions(lineno=8, end_lineno=18, col_offset=0, end_col_offset=8))
['with C():', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment']
Instruction(opname='PRECALL', opcode=166, arg=2, argval=2, argrepr='', offset=58, starts_line=None, is_jump_target=False, positions=Positions(lineno=8, end_lineno=18, col_offset=0, end_col_offset=8))
['with C():', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment']
Instruction(opname='CALL', opcode=171, arg=2, argval=2, argrepr='', offset=62, starts_line=None, is_jump_target=False, positions=Positions(lineno=8, end_lineno=18, col_offset=0, end_col_offset=8))
['with C():', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment']
Instruction(opname='POP_TOP', opcode=1, arg=None, argval=None, argrepr='', offset=72, starts_line=None, is_jump_target=False, positions=Positions(lineno=8, end_lineno=18, col_offset=0, end_col_offset=8))
['with C():', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment']
Instruction(opname='LOAD_CONST', opcode=100, arg=2, argval=None, argrepr='None', offset=74, starts_line=None, is_jump_target=False, positions=Positions(lineno=8, end_lineno=18, col_offset=0, end_col_offset=8))
['with C():', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment']
Instruction(opname='RETURN_VALUE', opcode=83, arg=None, argval=None, argrepr='', offset=76, starts_line=None, is_jump_target=False, positions=Positions(lineno=8, end_lineno=18, col_offset=0, end_col_offset=8))
['with C():', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment']
Instruction(opname='PUSH_EXC_INFO', opcode=35, arg=None, argval=None, argrepr='', offset=78, starts_line=None, is_jump_target=False, positions=Positions(lineno=8, end_lineno=18, col_offset=0, end_col_offset=8))
['with C():', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment']
Instruction(opname='WITH_EXCEPT_START', opcode=49, arg=None, argval=None, argrepr='', offset=80, starts_line=None, is_jump_target=False, positions=Positions(lineno=8, end_lineno=18, col_offset=0, end_col_offset=8))
['with C():', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment']
Instruction(opname='POP_JUMP_FORWARD_IF_TRUE', opcode=115, arg=4, argval=92, argrepr='to 92', offset=82, starts_line=None, is_jump_target=False, positions=Positions(lineno=8, end_lineno=18, col_offset=0, end_col_offset=8))
['with C():', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment', '    # comment']
Instruction(opname='RERAISE', opcode=119, arg=2, argval=2, argrepr='', offset=84, starts_line=None, is_jump_target=False, positions=Positions(lineno=8, end_lineno=18, col_offset=0, end_col_offset=8))

@iritkatriel I think this qualifies as a bug. Fo example the CALL for C() in with C() has end_lineno setted to the last comment in the block.

Lib/traceback.py Outdated
# Binary ops, subscripts, and calls are expressions, so
# we can wrap them with parentheses to parse them as
# (possibly multi-line) expressions.
tree = ast.parse("(\n" + segment + "\n)")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I understand the problem here, can you provide some examples. Also, isn't this something we can solve with mode= "eval"?

Copy link
Contributor Author

@williamwen42 williamwen42 Nov 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that we might have to parse lines such as

x = (
    a +
    b
)

The anchor parsing code will only parse

    a +
    b

which is not a valid expression because of the newline, so not even using eval mode will help.
We need to wrap in parentheses so that the segment is a proper expression.

(
    a +
    b
)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But isn't this going to alter the line locations for single-line expressions?

  • Add the example to the comment so is clearer
  • Let's simplify to:
Suggested change
tree = ast.parse("(\n" + segment + "\n)")
tree = ast.parse(f"(\n{segment}\n)")

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But isn't this going to alter the line locations for single-line expressions?

My understanding is that there's some post-processing of the tree, which handles this, which looks okay to me.

Lib/traceback.py Outdated Show resolved Hide resolved
Lib/traceback.py Show resolved Hide resolved
@pablogsal
Copy link
Member

@lysnikolaou Can you take a look when you have some time? This is a considerable ammount of code and I think it needs extra eyes

Copy link
Member

@lysnikolaou lysnikolaou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This look very good in general indeed. Thanks @williamwen42! I left some comments, some require further discussion.

Lib/traceback.py Outdated
@property
def line(self):
if self._line is None:
if self.lineno is None:
return None
self._line = linecache.getline(self.filename, self.lineno)
return self._line.strip()
end_lineno = self.lineno if self.end_lineno is None else self.end_lineno
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably not needed, since it's being handled in __init__, and self.end_lineno can be used directly, right?

Lib/traceback.py Outdated Show resolved Hide resolved
Lib/traceback.py Outdated
Comment on lines 530 to 533
start_offset -= dedent_characters
end_offset -= dedent_characters
start_offset = max(0, start_offset)
end_offset = max(0, end_offset)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
start_offset -= dedent_characters
end_offset -= dedent_characters
start_offset = max(0, start_offset)
end_offset = max(0, end_offset)
start_offset = max(0, start_offset - dedent_characters)
end_offset = max(0, end_offset - dedent_characters)

Lib/traceback.py Outdated
segment = segment[start_offset:len(segment) - (len(all_lines[-1]) - end_offset)]

# attempt to parse for anchors
anchors: Optional[_Anchors] = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not put type hints in a completely untyped part of the codebase. Optional is also undefined here.

Lib/test/test_traceback.py Show resolved Hide resolved
@williamwen42 williamwen42 force-pushed the multiline-traceback-attempt-2 branch from 4094c93 to 425c874 Compare November 21, 2023 22:04
@williamwen42
Copy link
Contributor Author

@pablogsal @lysnikolaou can I get a re-review?

@pablogsal
Copy link
Member

Will make another pass this week. Thanks for your patience @williamwen42!

Copy link
Member

@pablogsal pablogsal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

This is a fantastic job @williamwen42 thanks a lot for your patience and for following along with us. You rock 🤘

I left a nit comment but after @lysnikolaou takes a look I think we can land it.

A heads up: it's possible that we may require some changes from now to the first beta if some users or core devs find some aspects of this confusing or not very legible, but we can address those once we have more concrete aspects to discuss.

@pablogsal
Copy link
Member

@lysnikolaou can you take a look when you have some time?

@lysnikolaou
Copy link
Member

@lysnikolaou can you take a look when you have some time?

On it.

Copy link
Member

@lysnikolaou lysnikolaou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's merge this! Thanks @williamwen42 for doing this. Great work! 🚀

@pablogsal pablogsal merged commit 939fc6d into python:main Dec 1, 2023
29 checks passed
@pablogsal
Copy link
Member

Merged! Congratulations @williamwen42! Thanks a lot for your contribution

aisk pushed a commit to aisk/cpython that referenced this pull request Feb 11, 2024
Glyphack pushed a commit to Glyphack/cpython that referenced this pull request Sep 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants