gh-90810: Use code.co_qualname to provide richer information #31150

P403n1x87 · 2022-02-05T21:30:42Z

Use the code.co_qualname field to provide richer information in output generation, such as tracebacks, profiles, debugging etc...

Issue: Use code.co_qualname to provide richer information #90810

sweeneyde · 2022-02-05T22:17:54Z

It might be easier to review and get this merged if it was a couple of smaller PRs that touch fewer files each. Maybe just change tb_printinternal in traceback.c (probably already making most of the desired impact for users) and the associated tests in this PR, then open other PRs as necessary? E.g. one for asyncio-related things, one for tracing/profilers, etc.

In particular, asyncore is deprecated and so it probably shouldn't change, and IIRC we usually like to separate IDLE changes on their own.

P403n1x87 · 2022-02-06T00:53:32Z

It might be easier to review and get this merged if it was a couple of smaller PRs that touch fewer files each.

👍 Makes sense and I was already thinking about this. I just wanted to explore how far this could have gone. I'll reduce this PR to just the changes in traceback.c and leave the rest to subsequent PRs.

terryjreedy

See issue for general comments.

One way to split the PR would be separate PRs for changes related to tracebacks, profiles, and 'other'.

Lib/test/test_logging.py

Lib/idlelib/stackviewer.py

bedevere-bot · 2022-02-06T02:50:42Z

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

P403n1x87 · 2022-02-06T18:14:26Z

I have made the requested changes; please review again

bedevere-bot · 2022-02-06T18:14:29Z

Thanks for making the requested changes!

@terryjreedy: please review the changes made to this pull request.

Lib/test/test_traceback.py

Python/traceback.c

Use the code.co_qualname field to provide richer information in tracing output generation.

Python/traceback.c

gvanrossum · 2022-03-17T22:06:53Z

I have some doubts. The number of tests you had to fix here makes me wonder how many people are parsing tracebacks and find that this breaks their code. And I'm not sure how valuable the added info is given that the line number already takes you to the right place. Then again, I am not sure how much this should weigh against the added convenience in certain cases.

PS. Now that review has started, please don't force-push any more. It confuses reviewers. We squash all PRs when they get merged into main, so it's not needed anyway. Thanks!

P403n1x87 · 2022-03-18T10:02:33Z

I have some doubts. The number of tests you had to fix here makes me wonder how many people are parsing tracebacks and find that this breaks their code. And I'm not sure how valuable the added info is given that the line number already takes you to the right place. Then again, I am not sure how much this should weigh against the added convenience in certain cases.

Indeed there's a considerable risk of breaking someone's code with this change if they relied on the string representation of tracebacks. Of course, I can only speak for myself, but I personally would find it more useful to see qualnames when looking at tracebacks as they would disambiguate, e.g., calls to __init__ and make it clearer what the code path is. Without that, one would need to hop on the source code to infer the extra information (which is likely going to be available, but with the "inconvenience" of having to look at it).

My bet is that Python has had moments like this in the past, so how was the impact of similar potentially breaking changes gauged? Can we do the same for this change?

iritkatriel · 2022-03-18T10:13:06Z

I have some doubts. The number of tests you had to fix here makes me wonder how many people are parsing tracebacks and find that this breaks their code. And I'm not sure how valuable the added info is given that the line number already takes you to the right place. Then again, I am not sure how much this should weigh against the added convenience in certain cases.

Indeed there's a considerable risk of breaking someone's code with this change if they relied on the string representation of tracebacks. Of course, I can only speak for myself, but I personally would find it more useful to see qualnames when looking at tracebacks as they would disambiguate, e.g., calls to __init__ and make it clearer what the code path is. Without that, one would need to hop on the source code to infer the extra information (which is likely going to be available, but with the "inconvenience" of having to look at it).

My bet is that Python has had moments like this in the past, so how was the impact of similar potentially breaking changes gauged? Can we do the same for this change?

As Guido said, the number of stdlib tests you needed to fix give you an indication (many libraries have test for tracebacks).

There is also another cost - people are used to reading tracebacks the way they are rendered now. This change will take some getting used to. It might create very verbose and noisy tracebacks which are harder to read.

Can you think of another example, other than __init__ calling __init__, where this improves clarity?

P403n1x87 · 2022-03-18T10:34:47Z

Can you think of another example, other than __init__ calling __init__, where this improves clarity?

Yes, and I think this is quite a common case. Suppose you have a base class and many subclasses of it within the same file. If they all implement/override the same method, it's not obvious from the traceback which (sub)classes are actually involved. This is akin to the similar problem one might have in profiling data, when the line number information is suppressed. Functions with the same name would get bunched up together, which is an unfortunate loss of granularity.

iritkatriel · 2022-03-18T11:00:07Z

Functions with the same name would get bunched up together, which is an unfortunate loss of granularity.

Sounds like a bug in the profiler. Which profiler does this?

P403n1x87 · 2022-03-18T11:41:09Z

Sounds like a bug in the profiler. Which profiler does this?

I don't know if the profilers shipped with Python itself suffer from this, but other tools, like Austin, currently have this issue and this is why I contributed #26941, which introduced the co_qualname field in PyCodeObject.

I would dare to say that, had this field always existed in PyCodeObject, then what I'm proposing in this PR would likely be the default behaviour. But I appreciate it if the community thinks it is a bit too late now to switch over.

pablogsal · 2022-03-18T11:49:43Z

My bet is that Python has had moments like this in the past, so how was the impact of similar potentially breaking changes gauged? Can we do the same for this change?

In general, backwards-incompatible changes are by default disallowed. If they are deemed justified (because the default behaviour is fundamentally broken, dangerous or incomplete or because the change makes things much much better) then normally they require a deprecation cycle or similar. There are exceptions to this rule if there is enough consensus, though,

In general, is important to have in mind how much work an incompatible change is going to produce in user code and 3rd party libraries. Given the popularity of the language and the number of packages, this can be translated into dozens or hundreds of hours (which is nor how we measure the impact, but helps to put things in perspective).

iritkatriel · 2022-03-18T11:56:10Z

I don't know if the profilers shipped with Python itself suffer from this, but other tools, like Austin, currently have this issue and this is why I contributed #26941, which introduced the co_qualname field in PyCodeObject.

The line number would not be enough?

pablogsal · 2022-03-18T12:41:06Z

I don't know if the profilers shipped with Python itself suffer from this, but other tools, like Austin, currently have this issue and this is why I contributed #26941, which introduced the co_qualname field in PyCodeObject.

The line number would not be enough?

Yeah, the combination of file name + line number should be unique when collapsing function information. That's what I use in the profilers I maintain.

gvanrossum · 2022-03-18T16:31:42Z

Disregarding the issue of broken profilers, I agree there is real value is seeing common function names disambiguated with class names. I've seen this enough to know it's not hypothetical; it's especially common to have several classes in a file that form a class hierarchy where some methods with common names (e.g. get, update) are overridden in each class.

Thinking more about the number of affected tests, this is probably worse in the stdlib given that (of course) we have tests that tracebacks look correct.

Anybody who is currently parsing traceback lines is probably using either whitespace to separate the pieces (they won't have to change a thing) or a regex. If a regex, it's probably \w. That would have to be replaced with [\w.]+. Not terrible.

Where to look for code that might be affected? I'd try test frameworks like pytest, and Jupyter.

willingc · 2023-04-26T22:07:11Z

Hi @P403n1x87, Thanks for submitting this PR. Since this PR is over a year old, I think it would be best to close this PR for now. If you have time at a later date, you can request to reopen it. ☀️

alexmojaki · 2023-11-12T19:06:30Z

Can you think of another example, other than __init__ calling __init__, where this improves clarity?

I was the last person to comment on the request to add co_qualname:

I think this would be very useful in tracebacks. If a traceback line had a more useful name like MyClass.__init__ instead of just __init__ or my_decorator.<locals>.wrapper instead of just wrapper it would provide useful context to the frame.

So there's another example: decorators are very commonly implemented with a local function typically named wrapper or similar, and it may be nice to know which decorator is in the stack without having to go to the line.

Anyway, since every other comment there is more than 10 years old and didn't seem to provide any other strong reasons to add co_qualname, I'm left wondering what was the final motivation that led to adding it, if not for what this PR does? Was it for the benefit of third party libraries? Does Python itself use co_qualname anywhere? Or was this PR always the hope/intention but the failing tests revealed a problem that changed people's minds?

(many libraries have test for tracebacks)

I've had to update tests that expected tracebacks to look a certain way multiple times due to changes in Python. A commn reason is exception messages changing. In 3.11, I had to account for the extra lines of ^ markers. So there's at least precedent for changing tracebacks despite the fact that downstream code had to adapt. This doesn't seem too difficult to deal with.

Anybody who is currently parsing traceback lines is probably using either whitespace to separate the pieces (they won't have to change a thing) or a regex. If a regex, it's probably \w. That would have to be replaced with [\w.]+. Not terrible.

The new regex should also include <>.

Where to look for code that might be affected? I'd try test frameworks like pytest, and Jupyter.

Both pytest and IPython have custom traceback rendering which doesn't rely on parsing tracebacks. I'm not sure what kind of thing you're imagining.

Searching for regexes yields enough results to suggest that it's a good strategy, but not so many to imply that this is a catastrophic problem (many search results are duplicates):

If PyCharm sees something that looks like a traceback in stderr, it turns the file path into links that open the file in an editor at the correct line. I've just checked, this still works with more complicated function names.

normally they require a deprecation cycle or similar.

Why not discuss something of that nature? For example this feature could initially require setting an environment variable to activate, giving people time to learn of its existence and report problems without unavoidably breaking anything.

P403n1x87 · 2023-11-14T08:46:01Z

Anyway, since every other comment there is more than 10 years old and didn't seem to provide any other strong reasons to add co_qualname, I'm left wondering what was the final motivation that led to adding it, if not for what this PR does? Was it for the benefit of third party libraries?

I made the proposal and the change to allow Austin to retrieve the qualified name of frames. Many profiles I was looking at had frames with the same name (__init__ was a very common one, but there was also the case of overridden methods in subclasses), which made it really hard to understand what was actually involved at a glance. I then thought that the same benefits could have been brought to CPython itself, as the current tracebacks lack in depth of information in a similar fashion.

Why not discuss something of that nature? For example this feature could initially require setting an environment variable to activate, giving people time to learn of its existence and report problems without unavoidably breaking anything.

I'd be up for having another go at this, if there is consensus on accepting this proposal. IMO @alexmojaki has a good point on the fact that tracebacks have changed format pretty frequently, and the extra information that one gets is probably worth the price of having to adjust for yet another output format.

P403n1x87 requested review from 1st1, asvetlov, iritkatriel, terryjreedy and vsajip as code owners February 5, 2022 21:30

the-knights-who-say-ni added the CLA signed label Feb 5, 2022

bedevere-bot added the awaiting review label Feb 5, 2022

terryjreedy requested changes Feb 6, 2022

View reviewed changes

Lib/test/test_logging.py Outdated Show resolved Hide resolved

Lib/idlelib/stackviewer.py Outdated Show resolved Hide resolved

bedevere-bot removed the awaiting review label Feb 6, 2022

bedevere-bot added the awaiting changes label Feb 6, 2022

P403n1x87 force-pushed the bpo-46652/use-qualname branch 2 times, most recently from d71fc8c to 7ed2c65 Compare February 6, 2022 18:14

bedevere-bot added awaiting change review and removed awaiting changes labels Feb 6, 2022

bedevere-bot requested a review from terryjreedy February 6, 2022 18:14

pablogsal reviewed Mar 17, 2022

View reviewed changes

Lib/test/test_traceback.py Show resolved Hide resolved

pablogsal reviewed Mar 17, 2022

View reviewed changes

Python/traceback.c Outdated Show resolved Hide resolved

bpo-46652: Use code.co_qualname to provide richer tracing

c4100f2

Use the code.co_qualname field to provide richer information in tracing output generation.

P403n1x87 force-pushed the bpo-46652/use-qualname branch from 7ed2c65 to c4100f2 Compare March 17, 2022 14:18

pablogsal reviewed Mar 17, 2022

View reviewed changes

Python/traceback.c Outdated Show resolved Hide resolved

use correct variable in check and branch

21a372e

ezio-melotti removed the CLA signed label Jul 13, 2022

willingc added awaiting changes and removed awaiting change review labels Apr 26, 2023

willingc closed this Apr 26, 2023

terryjreedy changed the title ~~bpo-46652: Use code.co_qualname to provide richer information~~ gh-90810: Use code.co_qualname to provide richer information Nov 13, 2023

P403n1x87 mannequin mentioned this pull request Nov 14, 2023

Use code.co_qualname to provide richer information #90810

Open

Uh oh!

gh-90810: Use code.co_qualname to provide richer information #31150

gh-90810: Use code.co_qualname to provide richer information #31150

Uh oh!

Conversation

P403n1x87 commented Feb 5, 2022 • edited by terryjreedy Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sweeneyde commented Feb 5, 2022

Uh oh!

P403n1x87 commented Feb 6, 2022

Uh oh!

terryjreedy left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

bedevere-bot commented Feb 6, 2022

Uh oh!

P403n1x87 commented Feb 6, 2022

Uh oh!

bedevere-bot commented Feb 6, 2022

Uh oh!

Uh oh!

Uh oh!

Uh oh!

gvanrossum commented Mar 17, 2022

Uh oh!

P403n1x87 commented Mar 18, 2022

Uh oh!

iritkatriel commented Mar 18, 2022 • edited by gvanrossum Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

P403n1x87 commented Mar 18, 2022 • edited by gvanrossum Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

iritkatriel commented Mar 18, 2022

Uh oh!

P403n1x87 commented Mar 18, 2022

Uh oh!

pablogsal commented Mar 18, 2022

Uh oh!

iritkatriel commented Mar 18, 2022

Uh oh!

pablogsal commented Mar 18, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gvanrossum commented Mar 18, 2022

Uh oh!

willingc commented Apr 26, 2023

Uh oh!

alexmojaki commented Nov 12, 2023

Uh oh!

P403n1x87 commented Nov 14, 2023

Uh oh!

Uh oh!

P403n1x87 commented Feb 5, 2022 •

edited by terryjreedy

Loading

iritkatriel commented Mar 18, 2022 •

edited by gvanrossum

Loading

P403n1x87 commented Mar 18, 2022 •

edited by gvanrossum

Loading

pablogsal commented Mar 18, 2022 •

edited

Loading