bpo-45947: Place dict and values pointer at fixed (negative) offset just before GC header. #29879

Merged
merged 21 commits into from
Dec 7, 2021
21 commits
ee59772
Remove unsafe _PyObject_GC_Calloc function.
markshannon Aug 5, 2021
72b71cc
Place __dict__ immediately before GC header for variable sized object…
markshannon Aug 5, 2021
cd22dac
Restore documented behavior of tp_dictoffset.
markshannon Nov 29, 2021
4c83f77
Merge branch 'main' into regular-dict-placement
markshannon Nov 29, 2021
8a8593c
Fix up lazy dict creation logic to use managed dict pointers.
markshannon Nov 29, 2021
34b5cea
Manage values pointer, placing them directly before managed dict poin…
markshannon Nov 30, 2021
e8c74ab
Refactor a bit.
markshannon Nov 30, 2021
1bf13b0
Fix specialization of managed values.
markshannon Nov 30, 2021
a025dfb
Convert hint-based load/store attr specialization target managed dict…
markshannon Nov 30, 2021
5a012a8
Specialize LOAD_METHOD for managed dict objects.
markshannon Nov 30, 2021
e7734b8
Merge branch 'main' into regular-dict-placement
markshannon Dec 1, 2021
14d41ab
Use newer API internally.
markshannon Dec 1, 2021
123171a
Add NEWS.
markshannon Dec 1, 2021
48d6a58
Use inline functions instead of magic constants.
markshannon Dec 1, 2021
79e61bf
Remove unsafe _PyObject_GC_Malloc() function.
markshannon Dec 1, 2021
ce0f65b
Remove invalid assert.
markshannon Dec 2, 2021
98ddaed
Add comment explaning use of Py_TPFLAGS_MANAGED_DICT.
markshannon Dec 3, 2021
0f376b5
Use inline function, not magic constant.
markshannon Dec 7, 2021
d724812
Tidy up struct layout a bit.
markshannon Dec 7, 2021
9435bae
Tidy up gdb/libpython.py.
markshannon Dec 7, 2021
302f46f
Fix whitespace.
markshannon Dec 7, 2021
1 change: 0 additions & 1 deletion Include/cpython/object.h
@@ -270,7 +270,6 @@ struct _typeobject {

destructor tp_finalize;
vectorcallfunc tp_vectorcall;
Py_ssize_t tp_inline_values_offset;
};

/* The *real* layout of a type object when allocated on the heap */
3 changes: 0 additions & 3 deletions Include/cpython/objimpl.h
@@ -90,9 +90,6 @@ PyAPI_FUNC(int) PyObject_IS_GC(PyObject *obj);
# define _PyGC_FINALIZED(o) PyObject_GC_IsFinalized(o)
#endif

PyAPI_FUNC(PyObject *) _PyObject_GC_Malloc(size_t size);
Member

_PyObject_GC_Malloc is part of the stable ABI. AFAIK the function cannot be removed.

Member Author

It's not part of the stable ABI. It starts with an underscore.
https://www.python.org/dev/peps/pep-0384/#excluded-functions

Member

Misc/stable_abi.txt defines it as stable ABI:

function _PyObject_GC_Malloc
    added 3.2
    abi_only

On the other hand, no public macro in the Python/C API uses it, so I doubt it is actually part of the stable ABI.

As far as this repo shows, only one package among the top 4000 PyPI packages uses it.
https://github.com/hpyproject/top4000-pypi-packages/search?q=_PyObject_GC_Malloc

Member

I sent an email to python-dev asking for clarification of the status of this function and others that start with _ but are listed in Misc/stable_abi.txt. (It is not listed in Doc/data/stable_abi.dat.)

PyAPI_FUNC(PyObject *) _PyObject_GC_Calloc(size_t size);


/* Test if a type supports weak references */
#define PyType_SUPPORTS_WEAKREFS(t) ((t)->tp_weaklistoffset > 0)
23 changes: 22 additions & 1 deletion Include/internal/pycore_object.h
@@ -168,6 +168,15 @@ _PyObject_IS_GC(PyObject *obj)
// Fast inlined version of PyType_IS_GC()
#define _PyType_IS_GC(t) _PyType_HasFeature((t), Py_TPFLAGS_HAVE_GC)

static inline size_t
_PyType_PreHeaderSize(PyTypeObject *tp)
{
return _PyType_IS_GC(tp) * sizeof(PyGC_Head) +
_PyType_HasFeature(tp, Py_TPFLAGS_MANAGED_DICT) * 2 * sizeof(PyObject *);
}

void _PyObject_GC_Link(PyObject *op);

// Usage: assert(_Py_CheckSlotResult(obj, "__getitem__", result != NULL));
extern int _Py_CheckSlotResult(
PyObject *obj,
@@ -185,7 +194,19 @@ extern int _PyObject_StoreInstanceAttribute(PyObject *obj, PyDictValues *values,
PyObject *name, PyObject *value);
PyObject * _PyObject_GetInstanceAttribute(PyObject *obj, PyDictValues *values,
PyObject *name);
PyDictValues ** _PyObject_ValuesPointer(PyObject *);

static inline PyDictValues **_PyObject_ValuesPointer(PyObject *obj)
{
assert(Py_TYPE(obj)->tp_flags & Py_TPFLAGS_MANAGED_DICT);
return ((PyDictValues **)obj)-4;
}

static inline PyObject **_PyObject_ManagedDictPointer(PyObject *obj)
{
assert(Py_TYPE(obj)->tp_flags & Py_TPFLAGS_MANAGED_DICT);
return ((PyObject **)obj)-3;
}

PyObject ** _PyObject_DictPointer(PyObject *);
int _PyObject_VisitInstanceAttributes(PyObject *self, visitproc visit, void *arg);
void _PyObject_ClearInstanceAttributes(PyObject *self);
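
The offsets -4 and -3 used by the two inline functions above follow from the pre-header layout if a GC head is assumed to occupy two pointer-sized words. A minimal sketch of that layout, using hypothetical `demo_*` stand-ins rather than CPython's real structs (and assuming `sizeof(void *) == sizeof(uintptr_t)` with no padding):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not CPython's real structs. */
typedef struct { uintptr_t gc_next; uintptr_t gc_prev; } demo_gc_head;
typedef struct { size_t refcnt; void *type; } demo_object;

/* Assumed layout of one allocation for a GC type with a managed dict. */
typedef struct {
    void *values;      /* word -4 relative to the object pointer */
    void *dict;        /* word -3 */
    demo_gc_head gc;   /* words -2 and -1 */
    demo_object ob;    /* the address handed out as the object pointer */
} demo_block;

/* Same shape as _PyType_PreHeaderSize(): each flag is 0 or 1. */
static size_t demo_pre_header_size(int is_gc, int managed_dict)
{
    return (size_t)is_gc * sizeof(demo_gc_head)
         + (size_t)managed_dict * 2 * sizeof(void *);
}

/* Step back a fixed number of words from the object pointer. */
static void **demo_values_pointer(demo_object *ob)
{
    return ((void **)ob) - 4;
}

static void **demo_dict_pointer(demo_object *ob)
{
    return ((void **)ob) - 3;
}
```

With a zero-initialised `demo_block b`, `demo_dict_pointer(&b.ob)` should compare equal to `&b.dict`, and `demo_pre_header_size(1, 1)` should equal `offsetof(demo_block, ob)`.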
6 changes: 6 additions & 0 deletions Include/object.h
@@ -334,6 +334,12 @@ given type object has a specified feature.

#ifndef Py_LIMITED_API

/* Placement of dict (and values) pointers are managed by the VM, not by the type.
* The VM will automatically set tp_dictoffset. Should not be used for variable sized
* classes, such as classes that extend tuple.
*/
#define Py_TPFLAGS_MANAGED_DICT (1 << 4)

/* Set if instances of the type object are treated as sequences for pattern matching */
#define Py_TPFLAGS_SEQUENCE (1 << 5)
/* Set if instances of the type object are treated as mappings for pattern matching */
1 change: 0 additions & 1 deletion Lib/test/test_stable_abi_ctypes.py
@@ -817,7 +817,6 @@ def test_available_symbols(self):
"_PyErr_BadInternalCall",
"_PyObject_CallFunction_SizeT",
"_PyObject_CallMethod_SizeT",
"_PyObject_GC_Malloc",
"_PyObject_GC_New",
"_PyObject_GC_NewVar",
"_PyObject_GC_Resize",
4 changes: 2 additions & 2 deletions Lib/test/test_sys.py
@@ -1421,8 +1421,8 @@ def delx(self): del self.__x
check((1,2,3), vsize('') + 3*self.P)
# type
# static type: PyTypeObject
fmt = 'P2nPI13Pl4Pn9Pn12PIPP'
s = vsize(fmt)
fmt = 'P2nPI13Pl4Pn9Pn12PIP'
s = vsize('2P' + fmt)
check(int, s)
# class
s = vsize(fmt + # PyTypeObject
@@ -0,0 +1,3 @@
Place pointers to dict and values immediately before GC header. This reduces
number of dependent memory loads to access either dict or values from 3 to
1.
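
One plausible reading of the "3 to 1" claim, sketched with hypothetical `demo_*` types (not CPython's real definitions): finding the dict through the type needs a chain of loads where each address depends on the previous result, while a fixed negative offset needs only one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not CPython's real structs. */
typedef struct { ptrdiff_t dictoffset; } demo_type;
typedef struct { demo_type *type; } demo_object;

typedef struct {
    void *pre[4];      /* values, dict, gc_next, gc_prev (assumed order) */
    demo_object ob;
    void *old_dict;    /* old-style slot, located through type->dictoffset */
} demo_block;

/* Offset-through-type lookup: three loads, each depending on the last. */
static void *demo_load_dict_via_type(demo_object *ob)
{
    demo_type *tp = ob->type;              /* load 1: the type pointer */
    ptrdiff_t off = tp->dictoffset;        /* load 2: needs tp */
    return *(void **)((char *)ob + off);   /* load 3: needs off */
}

/* Fixed-offset lookup: the address is known from ob alone, so one load. */
static void *demo_load_dict_fixed(demo_object *ob)
{
    return *(((void **)ob) - 3);
}
```

The old path is modelled here on the documented meaning of `tp_dictoffset`; the exact loads the interpreter performed before this change are an assumption.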
3 changes: 0 additions & 3 deletions Misc/stable_abi.txt
@@ -1577,9 +1577,6 @@ function _PyObject_CallFunction_SizeT
function _PyObject_CallMethod_SizeT
added 3.2
abi_only
function _PyObject_GC_Malloc
added 3.2
abi_only
function _PyObject_GC_New
added 3.2
abi_only
17 changes: 12 additions & 5 deletions Modules/_testcapimodule.c
@@ -5861,6 +5861,7 @@ test_tstate_capi(PyObject *self, PyObject *Py_UNUSED(args))
}


static PyObject *negative_dictoffset(PyObject *, PyObject *);
static PyObject *test_buildvalue_issue38913(PyObject *, PyObject *);
static PyObject *getargs_s_hash_int(PyObject *, PyObject *, PyObject*);

@@ -5929,14 +5930,15 @@ static PyMethodDef TestMethods[] = {
#if (defined(__linux__) || defined(__FreeBSD__)) && defined(__GNUC__)
{"test_pep3118_obsolete_write_locks", (PyCFunction)test_pep3118_obsolete_write_locks, METH_NOARGS},
#endif
{"getbuffer_with_null_view", getbuffer_with_null_view, METH_O},
{"PyBuffer_SizeFromFormat", test_PyBuffer_SizeFromFormat, METH_VARARGS},
{"test_buildvalue_N", test_buildvalue_N, METH_NOARGS},
{"getbuffer_with_null_view", getbuffer_with_null_view, METH_O},
{"PyBuffer_SizeFromFormat", test_PyBuffer_SizeFromFormat, METH_VARARGS},
{"test_buildvalue_N", test_buildvalue_N, METH_NOARGS},
{"negative_dictoffset", negative_dictoffset, METH_NOARGS},
{"test_buildvalue_issue38913", test_buildvalue_issue38913, METH_NOARGS},
{"get_args", get_args, METH_VARARGS},
{"get_args", get_args, METH_VARARGS},
{"test_get_statictype_slots", test_get_statictype_slots, METH_NOARGS},
{"test_get_type_name", test_get_type_name, METH_NOARGS},
{"test_get_type_qualname", test_get_type_qualname, METH_NOARGS},
{"test_get_type_qualname", test_get_type_qualname, METH_NOARGS},
{"test_type_from_ephemeral_spec", test_type_from_ephemeral_spec, METH_NOARGS},
{"get_kwargs", (PyCFunction)(void(*)(void))get_kwargs,
METH_VARARGS|METH_KEYWORDS},
@@ -7629,6 +7631,11 @@ PyInit__testcapi(void)
return m;
}

static PyObject *
negative_dictoffset(PyObject *self, PyObject *Py_UNUSED(ignored))
{
return PyType_FromSpec(&HeapCTypeWithNegativeDict_spec);
}

/* Test the C API exposed when PY_SSIZE_T_CLEAN is not defined */

64 changes: 29 additions & 35 deletions Modules/gcmodule.c
@@ -69,10 +69,10 @@ module gc
#define NEXT_MASK_UNREACHABLE (1)

/* Get an object's GC head */
#define AS_GC(o) ((PyGC_Head *)(o)-1)
#define AS_GC(o) ((PyGC_Head *)(((char *)(o))-sizeof(PyGC_Head)))

/* Get the object given the GC head */
#define FROM_GC(g) ((PyObject *)(((PyGC_Head *)g)+1))
#define FROM_GC(g) ((PyObject *)(((char *)(g))+sizeof(PyGC_Head)))

static inline int
gc_is_collecting(PyGC_Head *g)
@@ -2231,28 +2231,14 @@ PyObject_IS_GC(PyObject *obj)
return _PyObject_IS_GC(obj);
}

static PyObject *
_PyObject_GC_Alloc(int use_calloc, size_t basicsize)
void
_PyObject_GC_Link(PyObject *op)
{
PyGC_Head *g = AS_GC(op);
assert(((uintptr_t)g & (sizeof(uintptr_t)-1)) == 0); // g must be correctly aligned
Member

Wow, you can use & on pointers. I didn't know that. :-)

Member Author

Only after you've cast it to an int. You can do anything with a cast 🙂
NULL is usually just 0 cast to a pointer, after all.

Member

Oh, somehow I thought uintptr_t was a pointer. :-/
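
For readers following the exchange above, the alignment test in the assert can be sketched in isolation. `uintptr_t` is an unsigned integer type wide enough to hold a pointer value, so bitwise operators apply once the pointer has been converted. The helper name `demo_is_word_aligned` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 when p sits on a sizeof(uintptr_t) boundary, else 0.
 * The mask trick relies on sizeof(uintptr_t) being a power of two. */
static int demo_is_word_aligned(const void *p)
{
    return ((uintptr_t)p & (sizeof(uintptr_t) - 1)) == 0;
}
```

Given `uintptr_t words[2]`, both `&words[0]` and `&words[1]` should pass the check, while `(const char *)&words[0] + 1` should fail it.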


PyThreadState *tstate = _PyThreadState_GET();
GCState *gcstate = &tstate->interp->gc;
if (basicsize > PY_SSIZE_T_MAX - sizeof(PyGC_Head)) {
return _PyErr_NoMemory(tstate);
}
size_t size = sizeof(PyGC_Head) + basicsize;

PyGC_Head *g;
if (use_calloc) {
g = (PyGC_Head *)PyObject_Calloc(1, size);
}
else {
g = (PyGC_Head *)PyObject_Malloc(size);
}
if (g == NULL) {
return _PyErr_NoMemory(tstate);
}
assert(((uintptr_t)g & 3) == 0); // g must be aligned 4bytes boundary

g->_gc_next = 0;
g->_gc_prev = 0;
gcstate->generations[0].count++; /* number of allocated GC objects */
@@ -2266,26 +2252,32 @@ _PyObject_GC_Alloc(int use_calloc, size_t basicsize)
gc_collect_generations(tstate);
gcstate->collecting = 0;
}
PyObject *op = FROM_GC(g);
return op;
}

PyObject *
_PyObject_GC_Malloc(size_t basicsize)
{
return _PyObject_GC_Alloc(0, basicsize);
}

PyObject *
_PyObject_GC_Calloc(size_t basicsize)
static PyObject *
gc_alloc(size_t basicsize, size_t presize)
{
return _PyObject_GC_Alloc(1, basicsize);
PyThreadState *tstate = _PyThreadState_GET();
if (basicsize > PY_SSIZE_T_MAX - presize) {
return _PyErr_NoMemory(tstate);
}
size_t size = presize + basicsize;
char *mem = PyObject_Malloc(size);
if (mem == NULL) {
return _PyErr_NoMemory(tstate);
}
((PyObject **)mem)[0] = NULL;
((PyObject **)mem)[1] = NULL;
PyObject *op = (PyObject *)(mem + presize);
_PyObject_GC_Link(op);
return op;
}

PyObject *
_PyObject_GC_New(PyTypeObject *tp)
{
PyObject *op = _PyObject_GC_Malloc(_PyObject_SIZE(tp));
size_t presize = _PyType_PreHeaderSize(tp);
PyObject *op = gc_alloc(_PyObject_SIZE(tp), presize);
if (op == NULL) {
return NULL;
}
@@ -2303,8 +2295,9 @@ _PyObject_GC_NewVar(PyTypeObject *tp, Py_ssize_t nitems)
PyErr_BadInternalCall();
return NULL;
}
size_t presize = _PyType_PreHeaderSize(tp);
size = _PyObject_VAR_SIZE(tp, nitems);
op = (PyVarObject *) _PyObject_GC_Malloc(size);
op = (PyVarObject *)gc_alloc(size, presize);
if (op == NULL) {
return NULL;
}
@@ -2333,6 +2326,7 @@ _PyObject_GC_Resize(PyVarObject *op, Py_ssize_t nitems)
void
PyObject_GC_Del(void *op)
{
size_t presize = _PyType_PreHeaderSize(((PyObject *)op)->ob_type);
PyGC_Head *g = AS_GC(op);
if (_PyObject_GC_IS_TRACKED(op)) {
gc_list_remove(g);
@@ -2341,7 +2335,7 @@ PyObject_GC_Del(void *op)
if (gcstate->generations[0].count > 0) {
gcstate->generations[0].count--;
}
PyObject_Free(g);
PyObject_Free(((char *)op)-presize);
}

int
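
The pairing between `gc_alloc()` and the updated `PyObject_GC_Del()` can be sketched without any CPython dependencies: allocate `presize + basicsize` bytes, hand out the address just past the pre-header, and subtract the same `presize` again before freeing. The `demo_*` names are hypothetical, plain `malloc`/`free` stand in for `PyObject_Malloc`/`PyObject_Free`, and zeroing the whole pre-header is a simplification of the two explicit NULL stores in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Allocate an object body preceded by presize bytes of pre-header.
 * Returns a pointer to the body, or NULL on overflow or allocation failure. */
static void *demo_alloc(size_t basicsize, size_t presize)
{
    if (basicsize > SIZE_MAX - presize) {
        return NULL;                 /* presize + basicsize would overflow */
    }
    char *mem = malloc(presize + basicsize);
    if (mem == NULL) {
        return NULL;
    }
    memset(mem, 0, presize);         /* clear the values/dict/GC words */
    return mem + presize;            /* callers only ever see this address */
}

/* Free must recover the original malloc'ed address first. */
static void demo_free(void *op, size_t presize)
{
    free((char *)op - presize);
}
```

The caller has to pass the same `presize` to both functions, which is why `PyObject_GC_Del()` in the diff recomputes it from the object's type instead of freeing the GC head address directly.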