bpo-43693: Add _PyCode_New() and do some related cleanup. #26258
I agree that we should add a non-API function for creating code objects and Passing a struct as a single parameter to a function means that you have to ensure that all future versions can handle trailing zeroes.
PyObject *make_thing(struct thing *t);
Version 1:
Version 2:
Any client code written for version 1 will compile for version 2 and then crash mysteriously when run.
I see your point but consider it less of an issue for an internal-only API like this. For the most part this PR is about addressing the pain points I ran into while making changes around PyCodeObject:
I'm okay with dialing back the changes if you think they aren't worth it, but they definitely are, based on my experience thus far with PyCodeObject.
I think we all agree that the API for making PyCodeObjects shouldn't have been public in the first place. But it is 😞 The accessor macros, Ultimately we want something like https://github.com/markshannon/peps/blob/pep-mappable-pyc-file/pep-066x.rst#runtime-objects
23ec578
to
2a3de5d
Compare
Please remove construction of a CodeObject using a structure. As I explained earlier it isn't safe.
Hi, I found a few things worth taking a look at :)
That only applies for public APIs. This is a purely internal API. The compiler will catch the case you worry about in that explanation.
It allows us something not entirely unlike keyword arguments in C, which I find a big gain to readability.
No, the compiler won't catch it. It is legal to leave trailing fields off a struct assignment.
@markshannon, calling Regardless, there isn't a lot of value to a long discussion about this. If you feel strongly about this then I'll change it to a function with many parameters.
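To make the point of contention concrete, here is a minimal sketch of what C does when a struct initializer omits members. The `struct thing` layout and both helper functions are hypothetical and are not taken from the PR; the only claim is that omitted members are implicitly zero-initialized and that this is valid C.

```c
/* Hypothetical "version 2" of a constructor struct: `flags` was added
   after some callers had already been written against "version 1". */
struct thing {
    int argcount;
    int nlocals;
    int flags;      /* new in "version 2" */
};

/* Stand-in for a constructor that consumes the struct. */
static int
thing_flags(const struct thing *t)
{
    return t->flags;
}

/* A caller written for "version 1": it never mentions `flags`.
   This still compiles; the omitted member is implicitly zero. */
static int
legacy_caller(void)
{
    struct thing t = {
        .argcount = 2,
        .nlocals = 5,
    };
    return thing_flags(&t);
}
```

Whether a zero in the new member is harmless or a latent bug depends on what the constructor does with it, which is the crux of the disagreement in this thread.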
This PR is too big for me to be confident that I'm not missing something important.
Why does it need to be so large?
There are a lot of changes to frameobject.c, typeobject.c and ceval.c that seem unrelated.
Several of which look like they will degrade performance.
PyObject *co_consts; /* list (constants used) */
PyObject *co_names; /* list of strings (names used) */
co_code, co_consts and co_names are the hottest fields in the code object. Could you make sure that they are near the start of the object? Moving the metadata section after this section should do.
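One way to check a layout request like this is with `offsetof`. The sketch below uses a hypothetical, heavily simplified struct (not the real `PyCodeObject`) and an assumed 64-byte budget, chosen only as a typical cache-line size, to show how one could confirm that the hot fields sit close to the start of the object.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a code object. */
struct code_sketch {
    void *ob_header[2];     /* stand-in for the object header */
    void *co_code;          /* hot fields first */
    void *co_consts;
    void *co_names;
    int co_flags;           /* metadata afterwards */
    int co_firstlineno;
};

/* Byte offset just past the last hot field. */
static size_t
hot_fields_end(void)
{
    return offsetof(struct code_sketch, co_names) + sizeof(void *);
}
```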
@@ -24,9 +27,147 @@ struct _PyOpcache {
    char optimized;
};

// We would use an enum if C let us specify the storage type.
You can still use an enum to define the values though, even if you store them in a char.
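A minimal sketch of that suggestion follows. The names are hypothetical, only loosely modeled on the `CO_FAST_*` flags that appear in the diff: the enum supplies the named values, while a one-byte typedef is used for storage.

```c
/* Hypothetical kinds; the values are illustrative only. */
enum fast_local_kind {
    FAST_LOCAL = 0x01,
    FAST_CELL  = 0x02,
    FAST_FREE  = 0x04,
};

/* The enum names the values; the stored representation is one byte. */
typedef unsigned char fast_local_kind_t;

struct slot_info {
    fast_local_kind_t kind;
};

/* Test a stored kind against an enum constant. */
static int
slot_is_cell(const struct slot_info *s)
{
    return (s->kind & FAST_CELL) != 0;
}
```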
}

static inline bool
_PyCode_CodeIsValid(PyCodeObject *co)
This doesn't check for validity, at all. A full validity check would be large and slow. Maybe rename to reflect what it does?
}

static inline _Py_CODEUNIT *
_PyCode_GetInstructions(PyCodeObject *co)
This is going to be misleading with quickening, so could you drop it for now.
Plus these accessor functions have an annoying tendency to end up in the public API, historically.
PyObject *filename, PyObject *name, int firstlineno,
PyObject *linetable, PyObject *exceptiontable)
static int
check_code(struct _PyCodeConstructor *con, enum check_mode mode)
Using boolean parameters (and check_mode is a boolean) that change behavior is generally regarded as bad API design.
I think it would be better if the legacy form just cleared the exception then called PyErr_BadInternalCall().
@@ -649,6 +649,9 @@ frame_dealloc(PyFrameObject *f)
static inline Py_ssize_t
frame_nslots(PyFrameObject *frame)
{
    assert(frame->f_valuestack - frame->f_localsptr ==
           // fastlocals + builtins + globals + locals
           frame->f_code->co_nfastlocals + 3);
OK, I see it is just misnamed, and the comment is misleading. Cell and free variables are also included.
@@ -917,146 +920,28 @@ PyFrame_New(PyThreadState *tstate, PyCodeObject *code,
    return f;
}

/* Convert between "fast" version of locals and dictionary version.
Why all these changes? This seems unrelated to adding _PyCode_New().
@@ -8830,71 +8831,50 @@ static int
super_init_without_args(PyFrameObject *f, PyCodeObject *co,
                        PyTypeObject **type_p, PyObject **obj_p)
{
    if (co->co_argcount == 0) {
    if (!_PyCode_HasFastlocals(co, CO_FAST_POSONLY | CO_FAST_POSORKW)) {
Why? This is performance critical code, and this looks a lot slower.
}

int
_PyCode_OffsetFromIndex(PyCodeObject *co, int index, _PyFastLocalKind kind)
I don't like the use of this function everywhere.
If inlined, it should be no slower than the code it is replacing, but what is the point?
index + co->co_nlocals is clearer to me than _PyCode_OffsetFromIndex(co, index, CO_FAST_CELL).
index is far, far clearer than _PyCode_OffsetFromIndex(co, index, CO_FAST_LOCAL).
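For readers weighing the two spellings, here is a rough sketch of what such a helper amounts to. The struct, enum, and function names are hypothetical simplifications, and the layout (plain locals first, then cells) is an assumption for illustration rather than the PR's actual implementation.

```c
/* Hypothetical, simplified layout: plain locals first, then cells. */
struct code_layout {
    int nlocals;
    int ncellvars;
};

enum var_kind { VAR_LOCAL, VAR_CELL };

/* Helper in the style under discussion. Declaring it `static inline`
   in a header lets the compiler fold it into each caller. */
static inline int
offset_from_index(const struct code_layout *co, int index, enum var_kind kind)
{
    return kind == VAR_CELL ? index + co->nlocals : index;
}
```

With this layout, `offset_from_index(co, i, VAR_CELL)` computes the same value as the open-coded `i + co->nlocals`, so the trade-off is readability and coupling rather than the result.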
if (co->co_flags & CO_VARKEYWORDS) {
    kwdict = PyDict_New();
    if (kwdict == NULL)
        goto fail;
    i = total_args;
    int offset = _PyCode_OffsetFromIndex(co, total_args, CO_FAST_LOCAL);
This is performance critical code and this is going to be slower unless _PyCode_OffsetFromIndex is inlined.
When you're done making the requested changes, leave the comment: |
If this PR is too big to review well enough then I'll split it up.
The key change here is the addition of _PyCode_New(), which is an internal-only API to replace use of PyCode_NewWithPosOnlyArgs(). The problem with that older API is that it has so many parameters. This becomes especially painful when modifying its signature. The new API uses a single struct to hold the values needed to create a new PyCodeObject instance.

Other changes:
- co_nfastlocals, co_ncellvars, and co_nfreevars; this matters because I plan on replacing co_varnames, etc. with a single consolidated tuple
- co_varnames, etc. from ceval.c to codeobject.c (as functions); this reduced coupling makes it easier to change how PyCodeObject works internally
- PyCodeObject fields so they are grouped a bit more logically

There is still code in ceval.c (_PyEval_MakeFrameVector() specifically) and pyframeobject.c that is coupled to the internal structure of PyCodeObject, but this PR is big enough already. 🙂

https://bugs.python.org/issue43693