
C API Next Level Changes

Simon Cross edited this page Nov 19, 2020 · 6 revisions

What Needs to Change and Why

This document presents further details on the changes envisioned in Taking the C API to the Next Level.

Not return borrowed references

A function returns a borrowed reference to a Python object obj if, instead of returning a new reference to obj, it loans the caller an existing reference.

This saves the caller from having to call Py_DECREF(obj), but at a cost: the lifetime of the reference becomes part of the API, and the Python implementation cannot know when the caller has finished using the borrowed reference.

As a simple example, imagine that a Python module contained t = (1, 2, 3) and that a Python implementation wished to efficiently store that tuple as int t[] = {1, 2, 3}. PyTuple_GetItem returns a borrowed reference, so calling PyTuple_GetItem(obj_t, 0) would force the implementation to create a new integer object, and a reference to it, that could never be freed, even though the caller would likely only need it for a short time.
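The tuple example can be sketched as a toy model in plain C. Nothing here is CPython or HPy code: BoxedInt, CompactTuple, tuple_get_item_borrowed, tuple_get_item_new, box_release and live_boxes are all hypothetical names invented for illustration. The sketch assumes an implementation that stores raw C ints and has to allocate a box whenever C code asks for an item.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical boxed integer, standing in for a Python int object. */
typedef struct {
    int value;
} BoxedInt;

/* Hypothetical compact tuple: raw C ints, no per-item objects. */
typedef struct {
    const int *items;
    size_t len;
} CompactTuple;

/* Number of boxes currently allocated (for illustration only). */
static int live_boxes = 0;

/* Allocate a box for one item of the compact tuple. */
static BoxedInt *box_new(int value) {
    BoxedInt *box = malloc(sizeof *box);
    if (box == NULL) {
        return NULL;
    }
    box->value = value;
    live_boxes++;
    return box;
}

/* Borrowed-reference style: the caller never reports that it is done,
 * so the implementation has no safe point at which to free the box. */
static BoxedInt *tuple_get_item_borrowed(const CompactTuple *t, size_t i) {
    assert(i < t->len);
    return box_new(t->items[i]);
}

/* New-reference style: same allocation, but the contract says the
 * caller must hand the box back with box_release(). */
static BoxedInt *tuple_get_item_new(const CompactTuple *t, size_t i) {
    assert(i < t->len);
    return box_new(t->items[i]);
}

/* Release a box obtained from tuple_get_item_new(). */
static void box_release(BoxedInt *box) {
    if (box != NULL) {
        free(box);
        live_boxes--;
    }
}
```

In this model the two getters do identical work. The only difference is the contract: with the new-reference getter the implementation learns when the temporary box can be discarded.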

Not steal references

A function steals a reference to a Python object obj when it takes over the responsibility of freeing the reference from the caller.

This exposes the lifetime of the stolen reference as part of the API.

For example, PyList_SetItem steals the reference to the item passed to it. The caller might then continue to use the reference (even though they shouldn't) and rely on the reference continuing to be valid for as long as the list exists.

Stolen references also make it harder to write correct code. Instead of being able to easily check where references are freed by reading the C code, one must also remember the long list of API functions that steal a reference. For example, PyList_SetItem steals a reference, but PyList_Insert and PyList_Append do not.
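The difference between a stealing and a non-stealing call can be sketched with a toy reference-counted list in plain C. This is a simplified model, not the CPython implementation: Obj, List, obj_incref, obj_decref, list_set_item_steal and list_append are hypothetical names, the list has a fixed capacity, and objects are never actually freed (only the counter changes).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object carrying only a reference count. */
typedef struct {
    int refcount;
} Obj;

#define LIST_CAP 8

/* Hypothetical fixed-capacity list of object pointers. */
typedef struct {
    Obj *items[LIST_CAP];
    size_t len;
} List;

static void obj_incref(Obj *o) {
    o->refcount++;
}

static void obj_decref(Obj *o) {
    assert(o->refcount > 0);
    o->refcount--;
}

/* Stealing setter, modelled on PyList_SetItem: the list takes over the
 * caller's reference, so there is no incref here. The reference is
 * consumed even when the index is out of range. */
static int list_set_item_steal(List *l, size_t i, Obj *item) {
    if (i >= l->len) {
        obj_decref(item);
        return -1;
    }
    if (l->items[i] != NULL) {
        obj_decref(l->items[i]); /* drop the list's reference to the old item */
    }
    l->items[i] = item;
    return 0;
}

/* Non-stealing append, modelled on PyList_Append: the list acquires its
 * own reference and the caller keeps theirs. */
static int list_append(List *l, Obj *item) {
    if (l->len >= LIST_CAP) {
        return -1;
    }
    obj_incref(item);
    l->items[l->len++] = item;
    return 0;
}
```

After list_append the caller still owns a reference and must release it. After list_set_item_steal the caller owns nothing and must not release the item. The call sites look almost identical, which is why the difference is easy to get wrong.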

Not expose reference counting as part of the API

The current API exposes reference counting via Py_INCREF and Py_DECREF. Implementing the semantics of this API requires maintaining a counter for each object -- i.e. emulating reference counting.

It also requires references to be long-lived -- a reference must be valid for as long as the reference count is non-zero (i.e. for the object's entire lifetime).

It would be better to use an interface that allowed the caller of the API to explicitly communicate its own requirements via obj = Py_I_Need_A_New_Reference(...) and Py_I_Am_Done_With_This_Reference(obj) API functions. These would allow for shorter lived references that can be freed as soon as an individual caller is done with them.
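One way such an interface might look is sketched below as a toy model in plain C. It is an illustration under stated assumptions, not a proposed API: need_new_reference and done_with_reference are hypothetical stand-ins for the placeholder names above, and Obj, Ref, ref_table, MAX_REFS and live_references are likewise invented. The sketch assumes each reference is a slot in a small table owned by the runtime, so the runtime can see exactly which references are still open.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REFS 16

/* Hypothetical runtime object. */
typedef struct {
    int value;
} Obj;

/* Hypothetical reference: an index into the runtime's table, not a pointer. */
typedef struct {
    int slot;
} Ref;

/* Table of open references, owned by the runtime. NULL means free. */
static Obj *ref_table[MAX_REFS];

/* Open a new reference to target. Returns slot -1 if the table is full. */
static Ref need_new_reference(Obj *target) {
    Ref r = { -1 };
    assert(target != NULL);
    for (int i = 0; i < MAX_REFS; i++) {
        if (ref_table[i] == NULL) {
            ref_table[i] = target;
            r.slot = i;
            break;
        }
    }
    return r;
}

/* Close a reference. The slot can be reused immediately. */
static void done_with_reference(Ref r) {
    assert(r.slot >= 0 && r.slot < MAX_REFS);
    assert(ref_table[r.slot] != NULL);
    ref_table[r.slot] = NULL;
}

/* Number of references currently open. */
static int live_references(void) {
    int n = 0;
    for (int i = 0; i < MAX_REFS; i++) {
        if (ref_table[i] != NULL) {
            n++;
        }
    }
    return n;
}
```

In this model two callers that ask for a reference to the same object receive two distinct Ref values, and each is closed independently. No per-object counter is needed, and a reference lives only as long as the caller that opened it requires.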

Not rely on pointers for object identity

Not expose the memory layout of Python objects as part of the API

Not expose static types

Expose Python (the language), not a specific Python implementation version

Expose constructs generally useful in C

Provide an explicit execution context
