Description
Currently, the Python C API is designed to fit the current CPython implementation exactly. This makes changing CPython really complicated, and it forces other Python implementations to mimic the CPython implementation exactly, likely missing nice optimization opportunities.
Example: PyUnicode_AsUTF8() relies on the fact that a Unicode object caches the UTF-8 encoded string, and the returned pointer becomes a dangling pointer once the Python str object is deleted. This API is unsafe.
Example: PySequence_Fast() is designed for tuple and list types, and 3rd party C extensions cannot implement their own "object array view" protocol. The API returns a Python object and expects that Py_DECREF(result); will immediately delete the "fast sequence". The function is inefficient for 3rd party types which could directly provide an "object array view", since currently a temporary Python list is created.
I would like to decouple the C API from its implementation: have a more generic API to "close" a resource returned by a function.
typedef struct {
    void (*close_func) (void *data);
    void *data;
} PyResource;
PyAPI_FUNC(void) PyResource_Close(PyResource *res);
#ifdef Py_BUILD_CORE
PyAPI_FUNC(void) _PyResource_DECREF(void *data);
#endif
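The declarations above do not show what PyResource_Close() does. Here is a minimal stand-alone sketch of one plausible behavior, assuming it calls the callback once and then clears the structure. It does not need Python.h: Resource, resource_close() and count_close() are hypothetical names that only model the proposed API, and the "reset after close" behavior is my assumption, not something the proposal specifies.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-alone model of the proposed PyResource structure. */
typedef struct {
    void (*close_func)(void *data);
    void *data;
} Resource;

/* Assumed behavior: call the callback (if any) exactly once, then reset
   the structure so that closing it a second time is a no-op. */
static void
resource_close(Resource *res)
{
    assert(res != NULL);
    if (res->close_func != NULL) {
        res->close_func(res->data);
    }
    res->close_func = NULL;
    res->data = NULL;
}

/* Demo callback: counts how many times it has been invoked. */
static void
count_close(void *data)
{
    *(int *)data += 1;
}
```

With this model, the caller never needs to know what the callback does: it may drop a reference, free memory, or do nothing at all.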
Example of a hypothetical PyUnicode_AsUTF8Resource() API which gives the caller control over how long the returned string remains valid.
void _PyResource_DECREF(void *data)
{
    PyObject *obj = _PyObject_CAST(data);
    Py_DECREF(obj);
}

const char *
PyUnicode_AsUTF8Resource(PyObject *unicode, PyResource *res)
{
    res->close_func = _PyResource_DECREF;
    res->data = Py_NewRef(unicode);
    return PyUnicode_AsUTF8AndSize(unicode, NULL);
}
Usage of PyUnicode_AsUTF8Resource():
PyObject *unicode = PyUnicode_FromString("temporary string");
PyResource res;
const char* utf8 = PyUnicode_AsUTF8Resource(unicode, &res);
Py_DECREF(unicode); // delete the object!!!
// ... code using utf8 ...
PyResource_Close(&res);
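This code is valid even though the caller drops its reference with Py_DECREF(unicode): the resource holds its own strong reference, so the Python str object and its UTF-8 buffer stay alive until PyResource_Close() is called.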
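To make the lifetime argument concrete without Python.h, here is a small stand-alone model. ToyStr, toy_new(), toy_decref(), toy_as_utf8_resource(), resource_close() and the toy_live_objects counter are all hypothetical names: a toy reference-counted string stands in for the Python str object, and the resource keeps it alive after the caller drops its own reference.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    void (*close_func)(void *data);
    void *data;
} Resource;

/* Toy reference-counted string standing in for a Python str object. */
typedef struct {
    int refcnt;
    char *utf8;
} ToyStr;

static int toy_live_objects = 0;  /* ToyStr objects not yet freed */

static ToyStr *
toy_new(const char *text)
{
    size_t size = strlen(text) + 1;
    ToyStr *obj = malloc(sizeof(ToyStr));
    if (obj == NULL) {
        return NULL;
    }
    obj->utf8 = malloc(size);
    if (obj->utf8 == NULL) {
        free(obj);
        return NULL;
    }
    memcpy(obj->utf8, text, size);
    obj->refcnt = 1;
    toy_live_objects++;
    return obj;
}

/* Drop one reference; free the object when the last one goes away. */
static void
toy_decref(void *data)
{
    ToyStr *obj = data;
    if (--obj->refcnt == 0) {
        free(obj->utf8);
        free(obj);
        toy_live_objects--;
    }
}

/* The resource takes its own reference, so the buffer outlives the caller's. */
static const char *
toy_as_utf8_resource(ToyStr *obj, Resource *res)
{
    obj->refcnt++;
    res->close_func = toy_decref;
    res->data = obj;
    return obj->utf8;
}

static void
resource_close(Resource *res)
{
    assert(res != NULL);
    if (res->close_func != NULL) {
        res->close_func(res->data);
    }
    res->close_func = NULL;
    res->data = NULL;
}
```

In this model, calling toy_decref() on the caller's reference right after toy_as_utf8_resource() leaves the object alive; it is only freed by resource_close().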
This example uses reference counting, but PyUnicode_AsUTF8Resource() could instead allocate a copy of the string on the heap, and the close function would just free the memory.
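The heap-copy alternative could look roughly like the following stand-alone sketch. It does not use Python.h, and Resource, resource_close() and string_as_resource_copy() are hypothetical names. Leaving the resource empty on allocation failure is my assumption about error handling, not part of the proposal.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    void (*close_func)(void *data);
    void *data;
} Resource;

static void
resource_close(Resource *res)
{
    assert(res != NULL);
    if (res->close_func != NULL) {
        res->close_func(res->data);
    }
    res->close_func = NULL;
    res->data = NULL;
}

/* Return a private heap copy of `src`. The resource owns the copy, so the
   caller only has to close the resource; it never calls free() itself. */
static const char *
string_as_resource_copy(const char *src, Resource *res)
{
    size_t size = strlen(src) + 1;
    char *copy = malloc(size);
    if (copy == NULL) {
        /* Assumed error convention: empty resource, NULL result. */
        res->close_func = NULL;
        res->data = NULL;
        return NULL;
    }
    memcpy(copy, src, size);
    res->close_func = free;  /* closing the resource frees the copy */
    res->data = copy;
    return copy;
}
```

The calling code is identical to the reference counting version: get the string, use it, close the resource. That is the point of hiding the release strategy behind close_func.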
UPDATE: I renamed PyResource_Release() to PyResource_Close(): "closing a resource" is a commonly used expression.