gh-119180: Set the name of the param to __annotate__ to "format" #124730

Merged
JelleZijlstra merged 7 commits into python:main from coformatname on Dec 30, 2024

JelleZijlstra
Member

@JelleZijlstra JelleZijlstra commented Sep 28, 2024

This is what Larry wants, and so it shall be. It's a bit of a hack,
but it's localized and not too bad.

@JelleZijlstra
Member Author

cc @larryhastings @carljm

@larryhastings
Contributor

larryhastings commented Sep 29, 2024

Please add three tests that use an annotation of format, which is defined in a closure, class scope, and module scope respectively.

@JelleZijlstra
Member Author

@larryhastings done.

@larryhastings
Contributor

It just hit me--do you mind adding a fourth that fails because format is not defined? I mean, let's cover all our bases here. Let no one accuse us of not doing a thorough job!

@JelleZijlstra
Member Author

Done. Because format() is a builtin, getting a NameError is a little bit involved.

Contributor

@larryhastings larryhastings left a comment

Really just needless munging on the comments, and a question. If you checked it in as-is it'd be fine.

Python/codegen.c Outdated Show resolved Hide resolved
if (size == -1) {
    return ERROR;
}
PyObject *new_names = PyTuple_New(size);
Contributor

This isn't a critique of your approach, but--I'm surprised you needed to go to all this effort. Why was it necessary to make a new tuple, write the new value for index 0, copy over the other values, and release the reference to the old tuple? I'm assuming the reference count of co_localsplusnames is currently 1; I would have asserted that, then overwritten the first entry. I grant you your approach is more conceptually hygienic, but in practice I assume the quick-and-dirty approach would work equally well.

What am I missing?

Member Author

While it is possible to mutate tuples in C code, it feels riskier. For example, maybe we'll make changes in the future that rely on tuples being immutable.
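
The copy-based approach discussed in this thread (make a new tuple, write the new value at index 0, copy over the other values, release the old tuple) can be sketched outside CPython. What follows is a toy model in plain C, not the code from this PR: ToyTuple, toy_tuple_new, toy_tuple_decref, and replace_first_by_copy are hypothetical names invented for illustration, and the explicit refcnt field only stands in for CPython's real reference counting.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a tuple of names. This is NOT CPython's PyTupleObject. */
typedef struct {
    int refcnt;          /* number of owners */
    size_t size;         /* number of slots */
    const char **items;  /* borrowed C strings, to keep the sketch small */
} ToyTuple;

/* Allocate a tuple with `size` empty slots and a single reference. */
ToyTuple *toy_tuple_new(size_t size)
{
    ToyTuple *t = malloc(sizeof(*t));
    if (t == NULL) {
        return NULL;
    }
    t->items = calloc(size ? size : 1, sizeof(*t->items));
    if (t->items == NULL) {
        free(t);
        return NULL;
    }
    t->refcnt = 1;
    t->size = size;
    return t;
}

/* Drop one reference and free the tuple when the last owner lets go. */
void toy_tuple_decref(ToyTuple *t)
{
    if (--t->refcnt == 0) {
        free(t->items);
        free(t);
    }
}

/* Replace *names with a fresh tuple whose slot 0 is `name`.
 * The old tuple is never written to; we only release our reference to it.
 * Returns 0 on success and -1 on failure (empty tuple or out of memory). */
int replace_first_by_copy(ToyTuple **names, const char *name)
{
    ToyTuple *old = *names;
    if (old->size == 0) {
        return -1;
    }
    ToyTuple *fresh = toy_tuple_new(old->size);
    if (fresh == NULL) {
        return -1;
    }
    fresh->items[0] = name;                  /* new value for index 0 */
    for (size_t i = 1; i < old->size; i++) { /* copy the remaining slots */
        fresh->items[i] = old->items[i];
    }
    *names = fresh;
    toy_tuple_decref(old);                   /* release the old tuple */
    return 0;
}
```

In this model, any other holder of the old tuple keeps seeing the original contents, which is the property the copy-based approach preserves.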

Contributor

I assure you, this is a long-standing CPython idiom. We've relied on "if there's only one reference to an object, and you own it, you may modify the object however you like" for decades now.

For fun I made a survey of CPython, literally examining every instance of PyTuple_SET_ITEM. (I didn't try the other spellings.) I found a bunch of sites where we do this. In nearly every instance the code is structured as follows:

if there's only one reference to the tuple (which we own)
    modify the tuple in place
else
    create a new tuple

(I'll append the list of such sites at the bottom of this comment.)
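
The pseudocode above can be made concrete with a small sketch. This is a toy model in plain C under stated assumptions, not code from CPython: RcTuple, rc_tuple_new, rc_tuple_decref, and rc_tuple_set_item are hypothetical names, and the refcnt == 1 test only approximates what checks such as Py_REFCNT(result) == 1 do in the real sources.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy refcounted tuple. A stand-in only, not CPython's tuple object. */
typedef struct {
    int refcnt;
    size_t size;
    const char **items;
} RcTuple;

/* Allocate a tuple with `size` empty slots and a single reference. */
RcTuple *rc_tuple_new(size_t size)
{
    RcTuple *t = malloc(sizeof(*t));
    if (t == NULL) {
        return NULL;
    }
    t->items = calloc(size ? size : 1, sizeof(*t->items));
    if (t->items == NULL) {
        free(t);
        return NULL;
    }
    t->refcnt = 1;
    t->size = size;
    return t;
}

/* Drop one reference and free the tuple when no owners remain. */
void rc_tuple_decref(RcTuple *t)
{
    if (--t->refcnt == 0) {
        free(t->items);
        free(t);
    }
}

/* Set slot `index` of *tp to `value`.
 * If we hold the only reference, modify the tuple in place;
 * otherwise build a new tuple and leave the shared one untouched.
 * Returns 0 on success, -1 on a bad index or allocation failure. */
int rc_tuple_set_item(RcTuple **tp, size_t index, const char *value)
{
    RcTuple *t = *tp;
    if (index >= t->size) {
        return -1;
    }
    if (t->refcnt == 1) {
        t->items[index] = value;  /* sole owner: nobody else can observe this */
        return 0;
    }
    RcTuple *fresh = rc_tuple_new(t->size);
    if (fresh == NULL) {
        return -1;
    }
    memcpy(fresh->items, t->items, t->size * sizeof(*t->items));
    fresh->items[index] = value;
    *tp = fresh;
    rc_tuple_decref(t);           /* give up our reference to the shared tuple */
    return 0;
}
```

With a single owner the call reuses the existing allocation; with a second owner it falls back to a copy, so the other owner's view is unchanged.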

Clearly these existing sites are optimizations; instead of destroying the old tuple and creating a fresh one, they're just reusing the existing tuple. They have a harder time of it because generally the tuple has been shown to the interpreter. In our case, we have a freshly compiled code object that hasn't been shown to the interpreter. So there's no chance anyone else has taken any references yet.

If we did change CPython so this was no longer viable, the developer making that change would have to fix all the sites I listed below, which they would probably find the same way I did--looking for all places where people set things in tuples. I don't think modifying the tuple directly would trip up such a future developer.

So, yeah, I really do think it'd be safe to modify the tuple in-place. Just to be totally safe, I'd check the reference count was 1 and raise if it wasn't. (It'd only happen if someone was hacking on compile.c or something, at which point they would deal with it. This would never raise in the wild.)
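
A stricter variant of the idea in the paragraph above, where the code refuses to touch the tuple unless the reference count is exactly 1, might look like the following. Again this is only a toy sketch in plain C: FixedTuple and fixed_tuple_rename_first are hypothetical, the fixed-size array is an assumption made to avoid allocation, and the -1 return merely stands in for whatever error a real implementation would raise.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy tuple with an explicit reference count and a small fixed capacity.
 * Illustrative only; unrelated to CPython's actual object layout. */
typedef struct {
    int refcnt;
    size_t size;
    const char *items[4];
} FixedTuple;

/* Overwrite slot 0 in place, but only when the caller is the sole owner.
 * Returns 0 on success, or -1 (leaving the tuple untouched) when the tuple
 * is empty or someone else also holds a reference. */
int fixed_tuple_rename_first(FixedTuple *t, const char *name)
{
    if (t->size == 0 || t->refcnt != 1) {
        return -1;  /* a real implementation would presumably raise here */
    }
    t->items[0] = name;
    return 0;
}
```

The guard turns a violated ownership assumption into an explicit failure instead of a silent mutation of shared state.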

I don't actually mind you doing it the hard way--we can ship it like this. It just seems needless. We have a longstanding idiom that lets us skip the laborious approach you took. But I'm not gonna fight you about it.


Places where CPython modifies tuples in-place:

compile.c does it a couple times in its internal cache objects. Never exposed to the user (I think).

zip_next in bltinmodule.c, uses _PyObject_IsUniquelyReferenced.

odictiter_iternext in odictobject.c, uses (Py_REFCNT(result) == 1).

enum_next_long in enumobject.c, uses if (Py_REFCNT(result) == 1).

dictiter_iternextitem in dictobject.c, uses _Py_IsOwnedByCurrentThread.

dictreviter_iter_lock_held in dictobject.c, uses Py_REFCNT(result) == 1.

intern_constants in codeobject.c, doesn't check ownership, this is in con->consts and I assume that's internal.

Five places in itertoolsmodule.c: pairwise_next, combinations_next, cwr_next, permutations_next, and zip_longest_next all use Py_REFCNT(result) == 1.

p.s. you should see the if-only-one-reference-modify-the-object shenanigans in the Unicode object!

Member Author

See #127058 where @markshannon proposes to deprecate existing tuple-mutation shenanigans. That strengthens my conviction that we shouldn't introduce a new tuple mutation here.

@bedevere-app

bedevere-app bot commented Sep 30, 2024

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

@JelleZijlstra
Member Author

I have made the requested changes; please review again

@bedevere-app

bedevere-app bot commented Oct 9, 2024

Thanks for making the requested changes!

@larryhastings: please review the changes made to this pull request.

@bedevere-app bedevere-app bot requested a review from larryhastings October 9, 2024 04:37
@JelleZijlstra
Member Author

@larryhastings would you mind taking another look here? It appears I can't merge while your review remains unresolved.

@JelleZijlstra JelleZijlstra merged commit 3480124 into python:main Dec 30, 2024
42 of 43 checks passed
@JelleZijlstra JelleZijlstra deleted the coformatname branch December 30, 2024 16:19