Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C API] Removed private _PyArg_Parser API has no replacement in Python 3.13 #112136

Closed
vstinner opened this issue Nov 16, 2023 · 18 comments
Closed

Comments

@vstinner
Copy link
Member

vstinner commented Nov 16, 2023

Copy of @tacaswell's message:

aio-libs/multidict is also using _PyArg_Parser and friends (e.g. https://github.com/aio-libs/multidict/blob/18d981284b9e97b11a4c0cc1e2ad57a21c82f323/multidict/_multidict.c#L452-L456 but there are many)

The _PyArg_Parser API is used by Argument Clinic to generate efficient code parsing function arguments.

Since Python 3.12, when targeting Python internals, Argument Clinic initializes _PyArg_Parser.kwtuple with singletons objects using the _Py_SINGLETON(name) API. This API cannot be used outside Python.

Private functions with a _PyArg_Parser argument:

  • _PyArg_ParseStackAndKeywords()
  • _PyArg_ParseTupleAndKeywordsFast()
  • _PyArg_UnpackKeywords()
  • _PyArg_UnpackKeywordsWithVararg()

cc @serhiy-storchaka

Linked PRs

@vstinner
Copy link
Member Author

A code search on _PyArg_Parser in PyPI top 8,000 projects (2023-11-15) found 5 projects:

  • multidict (6.0.4)
  • multiprocess (0.70.15)
  • mypy (1.7.0)
  • orjson (3.9.10)
  • pickle5 (0.0.12)

@nickelpro
Copy link

nickelpro commented Nov 16, 2023

These APIs are effectively required for using METH_FASTCALL | METH_KEYWORDS which is a documented part of the API.

Notably METH_FASTCALL is part of the Stable ABI and requires the not-mentioned-here-but-related _PyArg_ParseStack().

"Private, but documented, and partially considered part of the stable ABI" is not the same thing as "Private". Removing these functions leaves no API to interact with these documented call mechanisms. Unless you intend to remove METH_FASTCALL and its derivatives from the documentation and the stable ABI, something must replace these private functions.

Anecdotally, we make extensive use of _PyArg_ParseStack() and _PyArg_ParseStackAndKeywords() internally at NYU in high-traffic CPython methods. I don't think a code search of PyPI is a responsible way to gauge the existence of C API consumers.

@vstinner
Copy link
Member Author

What I'm trying to do here is to list projects using these projects, see how they use the API, and which public API is needed.

Anecdotally, we make extensive use of _PyArg_ParseStack() and _PyArg_ParseStackAndKeywords() internally at NYU in high-traffic CPython methods.

Do you have some examples? Which calling convention do you use? METH_FASTCALL, METH_FASTCALL | METH_KEYWORDS, and/or something else?

"Private, but documented, and partially considered part of the stable ABI" is not the same thing as "Private". Removing these functions leaves no API to interact with these documented call mechanisms.

This situation is unfortunate. METH_FASTCALL was created as a private API but apparently, it was adopted outside Python before it was standardized by PEP 590 – Vectorcall: a fast calling protocol for CPython.

As the author of METH_FASTCALL, I would like to add a clean public, documented and tested API to make these calling convention fully usable :-)

  • METH_FASTCALL
  • METH_FASTCALL | METH_KEYWORDS
  • METH_METHOD | METH_FASTCALL | METH_KEYWORDS

In terms of functions, so far the following functions were mentioned:

  • _PyArg_ParseStack()
  • _PyArg_ParseStackAndKeywords() -- use _PyArg_Parser
  • _PyArg_ParseTupleAndKeywordsFast() -- use _PyArg_Parser
  • _PyArg_UnpackKeywords() -- use _PyArg_Parser
  • _PyArg_UnpackKeywordsWithVararg() -- use _PyArg_Parser

Is the number of arguments always a signed Py_ssize_t type? Or do we have to care about PyVectorcall_NARGS() which uses an unsigned size_t type? If I understood correctly, unsigned size_t is only used to call a function. But inside a function implementation, the calling convention is to pass a number of arguments as a signed Py_ssize_t.

@nickelpro
Copy link

nickelpro commented Nov 16, 2023

Do you have some examples? Which calling convention do you use? METH_FASTCALL, METH_FASTCALL | METH_KEYWORDS, and/or something else?

I don't think we do anything fancy, basically the same straightforward code you would see pop out of argument clinic:

// Tiny METH_FASTCALL example function
static PyObject *print_func(PyObject *self,
    PyObject *const *args, Py_ssize_t nargs) {
  const char *str;
  if(!_PyArg_ParseStack(args, nargs, "s", &str))
    return NULL;
  puts(str);
  Py_RETURN_NONE;
}

or

// Tiny METH_FASTCALL | METH_KEYWORDS example function
static PyObject* prefix_print(PyObject* self, PyObject* const* args,
    Py_ssize_t nargs, PyObject* kwnames) {
  static const char* _keywords[] = {"", "prefix", NULL};
  static _PyArg_Parser _parser = {.keywords = _keywords,
      .format = "s|s:prefix_print"};

  const char* str;
  const char* prefix = NULL;
  if(!_PyArg_ParseStackAndKeywords(args, nargs, kwnames, &_parser, &str,
         &prefix))
    return NULL;

  if(prefix)
    printf("%s", prefix);
  puts(str);

  Py_RETURN_NONE;
}

for METH_FASTCALL and METH_FASTCALL | METH_KEYWORDS respectively. I imagine if we converted the latter usage to vectorcall it would be identical with the exception of replacing nargs with PyVectorcall_NARGS(nargsf), so vectorcall vs METH_FASTCALL isn't really the problem here unless I'm completely missing some public API for parsing vectorcall arguments.

A quick Ctrl-F shows no one using kwtuple yet, but I think kwtuple is a basically good feature (and we use pre-initialized global string objects in some other places), so I would be sad to see that capacity go away as well.

I would like to add a clean public, documented and tested API to make these calling convention fully usable

Ya 100%. I understand I'm just some random and you're the core dev here, just wanted to raise the objection that breaking this functionality without having that replacement ready doesn't seem like a win to me. The APIs are already internal, there's no stability contract, but if they exist anyway in an unbroken state, what's the harm in leaving them accessible until a replacement is available?

As a practical matter if these disappeared we would be replicating the declarations in our own headers for the Python versions where no solutions existed, and that seems like a pointless maintenance burden to push out into the world.

Is the number of arguments always a signed Py_ssize_t type? Or do we have to care about PyVectorcall_NARGS() which uses an unsigned size_t type?

PyVectorcall_NARGS() accepts a size_t but returns an ssize_t, from the point of view of the parser nargs is always signed.

@webknjaz
Copy link
Contributor

@vstinner in case of multidict, here's the PR that started using this private API aio-libs/multidict#681. For now, I reverted it under Python 3.13: aio-libs/multidict#909.
The original PR implies there's a 2.2x difference in performance, so it'd be useful to get an equivalent replacement.

@AdamWill
Copy link
Contributor

blender also uses _PyArg_Parser extensively: https://projects.blender.org/blender/blender/search?q=_PyArg_Parser . I am not enough of a C coder to know how easy it would be to port all those uses to the public argument parsing API.

vstinner added a commit to vstinner/cpython that referenced this issue Jul 2, 2024
Restore removed private _PyArg_Parser structure and private
_PyArg_ParseTupleAndKeywordsFast() function.

Recreate Include/cpython/modsupport.h header file.
vstinner added a commit to vstinner/cpython that referenced this issue Jul 2, 2024
Restore removed private _PyArg_Parser structure and private
_PyArg_ParseTupleAndKeywordsFast() function.

Recreate Include/cpython/modsupport.h header file.
@vstinner
Copy link
Member Author

vstinner commented Jul 2, 2024

@AdamWill:

blender also uses _PyArg_Parser extensively: https://projects.blender.org/blender/blender/search?q=_PyArg_Parser . I am not enough of a C coder to know how easy it would be to port all those uses to the public argument parsing API.

I wrote PR gh-121262 to restore the private _PyArg_Parser structure and the private _PyArg_ParseTupleAndKeywordsFast() function. Is it enough for Blender?

vstinner added a commit to vstinner/cpython that referenced this issue Jul 2, 2024
Restore the private _PyArg_Parser structure and the private
_PyArg_ParseTupleAndKeywordsFast() function, previously removed
in Python 3.13 alpha 1.

Recreate Include/cpython/modsupport.h header file.
@AdamWill
Copy link
Contributor

AdamWill commented Jul 3, 2024

@vstinner on a naive grep it looks like it should be enough, but blender at least also uses _PySet_NextEntry, so I can't get a complete compile done to verify unless that's put back in or it's ported over to the generic iterator stuff (which I'm not a good enough C coder to do myself).

@vstinner
Copy link
Member Author

vstinner commented Jul 3, 2024

@vstinner on a naive grep it looks like it should be enough

Good. I was only asking for "PyArg" APIs.

vstinner added a commit that referenced this issue Jul 3, 2024
Restore the private _PyArg_Parser structure and the private
_PyArg_ParseTupleAndKeywordsFast() function, previously removed
in Python 3.13 alpha 1.

Recreate Include/cpython/modsupport.h header file.
@vstinner
Copy link
Member Author

vstinner commented Jul 3, 2024

I restored the removed private _PyArg_Parser structure and the private _PyArg_ParseTupleAndKeywordsFast() function in the change f8373db. I close the issue.

@vstinner vstinner closed this as completed Jul 3, 2024
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jul 3, 2024
Restore the private _PyArg_Parser structure and the private
_PyArg_ParseTupleAndKeywordsFast() function, previously removed
in Python 3.13 alpha 1.

Recreate Include/cpython/modsupport.h header file.
(cherry picked from commit f8373db)

Co-authored-by: Victor Stinner <vstinner@python.org>
vstinner added a commit that referenced this issue Jul 3, 2024
gh-112136: Restore removed _PyArg_Parser (GH-121262)

Restore the private _PyArg_Parser structure and the private
_PyArg_ParseTupleAndKeywordsFast() function, previously removed
in Python 3.13 alpha 1.

Recreate Include/cpython/modsupport.h header file.
(cherry picked from commit f8373db)

Co-authored-by: Victor Stinner <vstinner@python.org>
@nickelpro
Copy link

I think this has been prematurely closed as no solution has been provided for the other _PyArg_Parser API functions, which are not 1-to-1 clones of _PyArg_ParseTupleAndKeywordsFast()

@vstinner
Copy link
Member Author

vstinner commented Jul 3, 2024

Ok, I reopen the issue.

@vstinner vstinner reopened this Jul 3, 2024
noahbkim pushed a commit to hudson-trading/cpython that referenced this issue Jul 11, 2024
Restore the private _PyArg_Parser structure and the private
_PyArg_ParseTupleAndKeywordsFast() function, previously removed
in Python 3.13 alpha 1.

Recreate Include/cpython/modsupport.h header file.
estyxx pushed a commit to estyxx/cpython that referenced this issue Jul 17, 2024
Restore the private _PyArg_Parser structure and the private
_PyArg_ParseTupleAndKeywordsFast() function, previously removed
in Python 3.13 alpha 1.

Recreate Include/cpython/modsupport.h header file.
@vstinner
Copy link
Member Author

vstinner commented Sep 9, 2024

Oh, I lost track of this issue and we are now past Python 3.13 rc2 release! The next release should be 3.13 final.

Anecdotally, we make extensive use of _PyArg_ParseStack() and _PyArg_ParseStackAndKeywords() internally at NYU in high-traffic CPython methods.

Can't you use the internal C API for that? Something like:

#define Py_BUILD_CORE_MODULE
#include <Python.h>
#include <pycore_modsupport.h>

@nickelpro
Copy link

nickelpro commented Sep 10, 2024

Yep, that's what we're doing.

I'm more raising the issue because there should be a blessed replacement. _PyArg_ParseTupleAndKeywordsFast() is three steps away from _PyArg_ParseStackAndKeywords(), but the latter is the more natural fit for METH_FASTCALL functions. It would be weird for the former to be the only canonized replacement going forward.

@vstinner
Copy link
Member Author

I'm more raising the issue because there should be a blessed replacement.

The _PyArg_Parser structure has multiple issues:

  • It has many members.
  • It's quite complex.
  • It requires _PyOnceFlag atomic type: I don't think that it's available in the limited C API.
  • Keyword names are expressed as bytes strings and Python objects: the same structure is used for "initialization" and for "runtime".
  • Member such as next should not be accessible in the public structure (must be private instead).

I failed to design a better replacement. So far, no one else proposed a better replacement.

@vstinner
Copy link
Member Author

I propose to close the issue and leave the situation as it is. You should opt-in for the internal C API if you want to use these APIs sadly. For now, there is no good replacement in the public C API.

@nickelpro
Copy link

If the official upstream answer is "do not replace" then I agree there's nothing left to do here

@vstinner
Copy link
Member Author

I would be very interested to see a public API to replace these internal C APIs. But so far, nobody managed to propose a better replacement. Time to time, I'm still trying to design a replacement, but I didn't find anything satisfying myself.

Maybe a less optimized API should be designed to leak less implementation details.

For now, I close the issue.

vstinner added a commit to vstinner/cpython that referenced this issue Nov 21, 2024
pycore_modsupport.h no longer needs pycore_lock.h.
vstinner added a commit that referenced this issue Nov 21, 2024
pycore_modsupport.h no longer needs pycore_lock.h.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants