-
-
Notifications
You must be signed in to change notification settings - Fork 31.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
C API: Add PyResource API: generic API to "close a resource" #106592
Comments
See also "Function must not return a pointer to content without an explicit resource management" issue: capi-workgroup/problems#57 |
I created "C API: Add PyObject_AsObjectArray() function: get tuple/list items as PyObject** array" which uses this proposed PyResource API: issue #106593. |
Proof-of-Concept PR to show how this API can be implemented and used: PR #106596. |
This needs a non-opaque struct, which is a pretty heavy addition. Does it need to be opaque? Should it be versioned? |
IMO, this should be a PEP, as it affects many various kinds of objects, and it should get feedback from alternate implementations. |
PyResource structure members can be public. Maybe it's better to make it private, but I don't think that it's a requirement. If we want to make it private, we can declare "fake members" to make the structure big enough for our needs, and use a different structure in the implementation. I would like to avoid having to allocate PyResource structure on the heap memory, so just write |
For comparison with other "opaque handle APIs", you can see PyResource_Release() as:
The difference here is that PyResource usage is associated with a pointer which is not opaque :-)
I propose an API to discuss it and see if it would help fixing a class of bugs: prevent the risk of dangling pointer. The Python C API has multiple functions returning a pointer without explicit lifetime management (no function to release the resource):
There are different questions here:
For example, Python 3.12 has private
This API creates a buffer which must be released with PyResource API would avoid the need to have a specialized _Py_FreeCharPArray() function: it would reduce the size of the C API and make the code "more regular. Pseudo-code: PyResource res;
argv = _PySequence_BytesToCharpArray(converted_args, &res);
// ... use argv ...
PyResource_Release(&res); Internally, maybe I don't think that PyResource_Release() should be used too broadly. For example, if a function allocates a string with PyMem_Malloc(), IMO it's fine to request the caller to call PyMem_Free() on it when it's done with it. The advantage of generalizing PyResource usage is that somehow it's a good hint that the function result is a new resource which may need to be released. |
Another question for the PEP: Why not use |
If you take PyUnicode_AsUTF8() as an example, how would the API look like to encode a Python str object to a For me, Py_buffer is kind of overcomplicated for such function: it supports memory layout with shapes, strides, suboffset, and it relies on a |
Py_buffer buffer;
int result = PyUnicode_AsUTF8Buffer(string_obj, &buffer);
if(result < 0) goto error;
char *some_destination = malloc(buffer.len);
memcpy(some_destination, buffer.buf, buffer.len);
PyBuffer_Release(buffer);
Set |
If such API is added, it would be interesting to convert _PyType_Name() to a public function: this function is quite useful. |
* Add PyResource structure and PyResource_Close() function * Add Include/pyresource.h * Add PyResource_Close() to the stable ABI * Add PyBytes_AsStringRes() function.
* Add PyResource structure and PyResource_Close() function. Add them to the stable ABI. * Add Include/pyresource.h. * Add PyBytes_AsStringRes() function.
* Add PyResource structure and PyResource_Close() function. Add them to the stable ABI. * Add Include/pyresource.h. * Add PyBytes_AsStringRes() function.
I wrote a draft PEP: https://discuss.python.org/t/pep-draft-add-pyresource-callback-c-api-to-close-resources/30331 |
Sadly, I'm not confident that this API is solving a "real world" problem. It seems like there is little demand (me) to change the API. We may revisit the topic once of the concerned API would cause more troubles. For now, I prefer to give up on the issue. I closed my draft PEP PR and I close this issue. |
Currently, the Python C API is designed to fit exactly the current CPython implementation. It makes CPython changes really complicated, and it forces other Python implementation to mimick exactly the exact CPython implementation, likely missing nice optimization opportunities.
Example:
PyUnicode_AsUTF8()
relies on the fact that a Unicode object caches the UTF-8 encoded string, and the returned pointer becomes a dangling pointer once the Python str object is deleted. This API is unsafe.Example:
PySequence_Fast()
is designed for tuple and list types, but 3rd party C extensions cannot implement their own "object arrary view" protocol, the API returns a Python object and expects thatPy_DECREF(result);
will immediately delete the "fast sequence". The function is inefficient for 3rd party types which could directly provide an "object array view", since currently a temporary Python list is created.I would like to decouple the C API from its implementation: have a more generic API to "close a function".
Example of an hypothetical
PyUnicode_AsUTF8Resource()
API which gives the caller control on when the returned string is valid.Usage of
PyUnicode_AsUTF8Resource()
:This code is valid even if the Python str object is deleted by
Py_DECREF(unicode);
.This example uses reference counting, but PyUnicode_AsUTF8Resource() can create allocate a copy on the heap, and the close function would just free the memory.
UPDATE: I renamed PyResource_Release() to PyResource_Close(): "closing" a resource expression sounds commonly used.
Linked PRs
The text was updated successfully, but these errors were encountered: