Skip to content

bpo-37270: Manage memory lifetime for all type-related objects. #14066

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

llllllllll
Copy link
Contributor

@llllllllll llllllllll commented Jun 13, 2019

Remove internal usages of PyMethodDef and PyGetSetDef.

For PyMethodDef, change PyCFunctionObject to replace the PyMethodDef* member
with a PyCFunctionBase member. The PyCFunctionBase is a new struct to hold the
managed values of a PyMethodDef. This type is shared between PyCFunction and the
various callable descriptor objects. A PyCFunctionBase is like a PyMethodDef but
replaces the char* members with PyObject* members.

For PyGetSetDef, inline the members on the resulting PyGetSetDescrObject,
replacing all char* members with PyObject* members. The memory for the closure
is not managed, adding support for that would likely require an API change and
can be done in a future change.

For the tp_name field, instead of setting it directly to the value of
PyType_Spec.name, set it to the result of PyUnicode_AsUTF8(ht_name), where
ht_name is the PyUnicode object created from the original spec name. This is the
same trick used to properly manage this pointer for heap types when the name
is reassigned.

https://bugs.python.org/issue37270

@llllllllll
Copy link
Contributor Author

llllllllll commented Jun 14, 2019

I can reproduce the error locally and have begun working on a fix. I realized we also need to handle the PyMemberDef objects. structseq actually almost has this problem, because it uses PyType_FromSpec to make types, and there is a special case to copy the PyMemberDef fields into a PyHeapType. Unfortunately, this doesn't copy the name nor doc, so only static strings may be used there. I got through applying a similar transformation as PyGetSetDef to PyMemberDef, but some of the GC code and structseq repr code actually iterates this. While walking the type's dict is probably a reasonable performance degradation on structseq repr, I think that would be too much for tp_visit. For that reason, I plan on changing the PyHeapTypeObject's internal PyMemberDef[] storage into space for and array of offsets which contain object fields. In subtype_visit and the clear code, it currently loops over tp_members filtering on the type and only checking the object members. This loop should get a little faster if it can unconditionally visit all of the members of a potentially smaller array.

@jdemeyer
Copy link
Contributor

Remove internal usages of PyMethodDef and PyGetSetDef.

Could these two be separated in two different PRs?

In interested in particular to review the part dealing with PyMethodDef (but note that I'm not a core developer, so I cannot make an official review).

@llllllllll
Copy link
Contributor Author

They could be broken up, but there is little value in doing just one of the three (PyMethodDef, PyGetSetDef, and now also PyMemberDef). The goal is to not be forced to leak memory when making types. I could do a better job of breaking this into 3 commits though.

@jdemeyer
Copy link
Contributor

jdemeyer commented Jun 14, 2019

They could be broken up, but there is little value in doing just one of the three

I agree with that, I never suggested to do just one of the three. I suggested breaking it up because I think that 3 simpler PRs have a higher chance of getting reviewed than 1 big one. (at least, this is true if I would be the reviewer)

I could do a better job of breaking this into 3 commits though.

If you make 3 commits, you might as well make 3 PRs.

@eduardo-elizondo
Copy link
Contributor

The only thing that you need is to guarantee that Py_tp_members will be alive throughout the life-cycle of the type. This is easily achieved by storing the data within the type itself. I just responded in the capi-sig list into how you can do that. It has been done with Py_tp_members before, it just needs to be updated to work with Py_tp_methods: https://mail.python.org/archives/list/capi-sig@python.org/thread/RNNJRKKWDHM5FUERFLSWL6JSI2UVURIS/

Btw your change breaks the continuous builds :)

@llllllllll llllllllll force-pushed the typeobject-memory-management branch 2 times, most recently from 4d1fa96 to e481956 Compare June 18, 2019 01:07
Remove internal usages of PyMethodDef and PyGetSetDef.

For PyMethodDef, change PyCFunctionObject to replace the PyMethodDef* member
with a PyCFunctionBase member. The PyCFunctionBase is a new struct to hold the
managed values of a PyMethodDef. This type is shared between PyCFunction and the
various callable descriptor objects. A PyCFunctionBase is like a PyMethodDef but
replaces the char* members with PyObject* members.

For PyGetSetDef, inline the members on the resulting PyGetSetDescrObject,
replacing all char* members with PyObject* members. The memory for the closure
is *not* managed, adding support for that would likely require an API change and
can be done in a future change.

For the tp_name field, instead of setting it directly to the value of
PyType_Spec.name, set it to the result of PyUnicode_AsUTF8(ht_name), where
ht_name is the PyUnicode object created from the original spec name. This is the
same trick used to properly manage this pointer for heap types when the __name__
is reassigned.
@llllllllll llllllllll force-pushed the typeobject-memory-management branch from e481956 to fb1bc5e Compare June 18, 2019 03:11
Copy link
Contributor

@eduardo-elizondo eduardo-elizondo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are trying to do too many things at once. This change is not reviewable as is, you are mixing in code formatting, new C-APIs, changes to current C-APIs, etc. You need to break the code up into smaller chunks. Also, use PEP8 to guide your code style.

Btw, this breaks backwards compatibility both in the ABI and in functionality. Your code change must not break either of these, it should only be additive. It is indeed possible to fix your problem while maintaining these invariants.

@jdemeyer
Copy link
Contributor

Btw, this breaks backwards compatibility both in the ABI and in functionality.

Which ABI does it break?

@jdemeyer
Copy link
Contributor

This will likely conflict with my #13781, so I would prefer if that PR could be merged first.

@llllllllll
Copy link
Contributor Author

Which ABI does it break?

The only interface I believe this affects is iterating over the tp_members, tp_getset or tp_member arrays after the type has been created from PyType_FromSpec*. I am not sure that is actually a public interface of the type, or guaranteed to be pointers into the objects that back the given descr types. For an example of code that used this and needed to change, see the structseq changes here.

I see that it is large change, but I don't believe this is part of the stable ABI. Also, we can't just copy the arrays backing the tp_members because we need to also copy the char* members. The current implementation for tp_members assumes the names are pointers to static storage, or somehow will outlive the type. If we need to manage the lifetime of the name and doc members, it seems like a better interface to just use PyObject* there and use an interned unicode object.

@brettcannon brettcannon added the type-feature A feature request or enhancement label Jun 21, 2019
Copy link
Contributor

@jdemeyer jdemeyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that #13781 is merged, this needs to be rebased to master.

I remind you that I agree with the general approach here, but that the PR would better be split up in smaller pieces.

@csabella
Copy link
Contributor

@llllllllll, please resolve the merge conflicts. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting core review type-feature A feature request or enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants