Tuples should be immutable and safe in C, as well as in Python. #127058
Comments
I'm filing this as a bug rather than an enhancement as it does cause bugs. Most notably: https://github.com/python/cpython/blob/main/Lib/test/crashers/gc_inspection.py
This requires a massive change to C extensions, but doesn't actually address the problem that "tuples should be immutable."
This would break backwards compatibility with C extensions. See some examples: https://github.com/search?q=%2FPyTuple_GET_ITEM.*%3D%3D.*NULL%2F&type=code
gc.get_referrers has a lot of issues, not just with tuples. As the linked file says:
It might take a while for people to stop using I think you need to refine your search for
Such as? Partially created tuples may not be the only culprit, but they are the main one, I think.
Deprecating commonly used APIs imposes a cost to C API extension authors even when we don't remove the existing API. Many extension authors will change their code in order to adopt the new best practices. If we don't have a foreseeable path to actually removing
My point is that it's a breaking change. The search does not capture all the uses that would break either.
As Armin wrote in #39117: "Expecting an object not to be seen before you first hand it out is extremely common, and get_referrers() breaks that assumption."
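The assumption described in that quote can be probed from pure Python. The sketch below is only an illustration, not the code from the linked crasher: it deliberately uses a list instead of a tuple so that it stays safe to run, and the names `producer`, `marker` and `seen` are made up for the example. It relies on a CPython implementation detail (that `list(iterable)` appends into the result list as it goes), so treat it as a demonstration of the idea rather than guaranteed behaviour.

```python
import gc

seen = []  # containers spotted while the result is still being built


def producer():
    marker = object()
    yield marker
    # By now the consumer has stored `marker` in the container it is
    # building, and that container has not been handed out to anyone yet.
    seen.extend([r for r in gc.get_referrers(marker) if type(r) is list])
    yield "second item"


result = list(producer())

# The list that list() eventually returned was already visible from
# inside the generator, before construction had finished.
observed_early = any(r is result for r in seen)
```

With a partially filled tuple in place of the list, the same inspection would expose an object that Python code expects to be immutable and fully initialized.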
We already have For creating tuples of unknown size, the best way is to create a list and then convert it to a tuple with Looking at the uses of
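The list-then-convert approach for tuples of unknown size can be tried without writing an extension. This is a rough sketch that calls the C API function `PyList_AsTuple` through `ctypes.pythonapi`; picking that function is an assumption about which conversion is meant here, and `tuple_from_unknown_size` is a hypothetical helper name. It only works on CPython.

```python
import ctypes

# PyList_AsTuple(list) returns a new, fully initialized tuple.
PyList_AsTuple = ctypes.pythonapi.PyList_AsTuple
PyList_AsTuple.argtypes = [ctypes.py_object]
PyList_AsTuple.restype = ctypes.py_object


def tuple_from_unknown_size(iterable):
    """Hypothetical helper: accumulate in a list, then convert once."""
    items = []
    for item in iterable:
        items.append(item)  # the list may grow; no tuple exists yet
    return PyList_AsTuple(items)


squares = tuple_from_unknown_size(n * n for n in range(5))
```

The point of the pattern is that the tuple only comes into existence once every item is known, so nothing can observe it half-filled.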
Not just the GC, but in the optimizer and in memory management. Knowing that tuples are genuinely immutable allows some useful savings and a few tricks. Mainly it allows us to make reasoned improvements. For example, when untracking tuples in the GC we should be able to assume that, thanks to immutability, a tuple must be younger than the objects it contains. Therefore a simple oldest-first scan should find all tuples that can be untracked. Sadly, this isn't the case. Likewise, we would like to know that tuples cannot contain themselves.
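Both properties already hold for tuples built in pure Python, which the following sketch tries to illustrate (the variable names are arbitrary). A tuple's items are fixed when it is created, so everything it refers to existed first, and the only way for it to reach itself is through a mutable object such as a list.

```python
inner = []      # created before the tuple
t = (inner,)    # the tuple is younger than everything it contains

# Items cannot be rebound after creation, so `t` cannot be made to
# contain itself directly.
try:
    t[0] = t
    direct_cycle = True
except TypeError:
    direct_cycle = False

# A cycle is still possible, but only by mutating the older list.
inner.append(t)
indirect_cycle = t[0][0] is t
```

C code that fills in or mutates a tuple after creating it is not bound by these rules, which is why the GC cannot rely on them today.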
(sorry I missed the comment on triaged)
* Use a small buffer, then list when constructing a tuple from an arbitrary sequence.
Bug report
Bug description:
[Apologies if this sounds a bit like a rant. I'm not blaming anyone. Just because something is the wrong choice now, doesn't mean it wasn't the right choice historically]
Tuples are immutable in Python, but we play all sorts of games in C with tuples, filling them with NULLs, mutating them and reusing them.
We do this in the mistaken belief that it improves performance.
But it doesn't. It makes the code base more complicated and fragile as we need to work around tuples that misbehave and do strange things. Any local performance gain is overwhelmed by slowdowns caused by the extra complexity in tuple code, the garbage collector and a few other places.
So let's fix this.
We need to:
* Add PyTuple_MakePair(). Pairs are by far the most common type of tuple that we play games with. By providing a fast way to create pairs, we can provide an upgrade path for C code that creates tuples in unsafe ways to do so safely and quickly.
* Deprecate PyTuple_New. I don't know when we'll be able to remove it, but we should deprecate it ASAP.
* Change PyTuple_New to fill the tuple with pointers to None instead of NULL. This doesn't fix the mutability issue, but it at least means the GC will only see valid objects. (This might break too much code, so we might just have to clearly document that tuples should be fully initialized in one go, before the tuple escapes the function it was created in.)
* Make sure CPython itself does not use PyTuple_New() or perform tuple shenanigans. We can't reasonably expect third-party package authors to follow the rules if we don't.
CPython versions tested on:
CPython main branch
Operating systems tested on:
No response
Linked PRs
Make PySequence_Tuple safer and probably faster. #127758