-
-
Notifications
You must be signed in to change notification settings - Fork 31.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-46841: Use inline caching for attribute accesses #31640
Conversation
Python/ceval.c
Outdated
/* XXX: Remove this next line to make test_asyncio very angry! */ \ | ||
DEOPT_IF(LOAD_##attr_or_method == LOAD_ATTR, LOAD_##attr_or_method); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All tests pass if we deopt every LOAD_ATTR_MODULE
right here. Removing this line activates some very strange failures!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Casting a _PyLoadMethodCache
to a _PyAttrCache
may not be valid.
(We're also going to want to hit this with the buildbots once it's working.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the change to use arrays in the structs. It's much clearer that my _m1
field.
But, be careful of casting one struct to another, and be aware that the C compiler treats arrays as simple pointers.
@@ -124,7 +124,7 @@ def jabs_op(name, op, entries=0): | |||
def_op('BUILD_LIST', 103) # Number of list items | |||
def_op('BUILD_SET', 104) # Number of set items | |||
def_op('BUILD_MAP', 105) # Number of dict entries | |||
name_op('LOAD_ATTR', 106) # Index in name list | |||
name_op('LOAD_ATTR', 106, 4) # Index in name list |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The existence of the LOAD_MODULE_ATTR_OR_METHOD
macro implies that the LOAD_ATTR
and LOAD_METHOD
caches should have the same layout. Yet, this has size 4 and LOAD_METHOD
has size 10.
This may be the source of your problem.
Are you using the two version numbers consistently?
_Py_CODEUNIT dict_offset; | ||
_Py_CODEUNIT keys_version[2]; | ||
_Py_CODEUNIT descr[4]; | ||
} _PyLoadMethodCache; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that LOAD_METHOD_MODULE
and LOAD_ATTR_MODULE
share code, maybe change this to:
typedef struct {
_PyAttrCache attr;
_Py_CODEUNIT keys_version[2];
_Py_CODEUNIT descr[4];
} _PyLoadMethodCache;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather leave it as-is. Otherwise, accessing the attr
methods, including the counter
, is a bit awkward.
In response to your other comments: I don't think we're doing anything illegal here, since we only use one consistent cache layout for each specialized form (meaning, _PyLoadMethodCache
and _AttrCache
members never alias each other). The exception is the counter
member for all instructions, but C guarantees that it the first member of a struct will always be at offset 0
. So no issues there.
Python/ceval.c
Outdated
/* XXX: Remove this next line to make test_asyncio very angry! */ \ | ||
DEOPT_IF(LOAD_##attr_or_method == LOAD_ATTR, LOAD_##attr_or_method); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Casting a _PyLoadMethodCache
to a _PyAttrCache
may not be valid.
When you're done making the requested changes, leave the comment: |
Okay, that was officially the wildest bug I've ever dealt with. The problematic bytecode sequence (buried deep in
Quickened:
After specializating the first
Finally, after specializing the second
Did you catch that? The cached index was turned into a The code that did that is here. Basically, the low byte of the cached index just happened to be What it really did, though, was change the module attribute cached index from Phew. |
This whole incident highlights one limitation with the new inline caching approach: it's no longer safe to peek backwards at previous instructions, since you might actually be looking into a cache entry instead. I don't see how we can keep I'll also go over the existing specialization code and see if we're doing this anywhere else. |
It also highlights the need for a |
Looping in @sweeneyde, since I might need to get rid of |
I didn't see the macro benchmark needle move too much when it was added, so if it's architecturally necessary to, I have no problem with it being scrapped for now. |
🤖 New build scheduled with the buildbot fleet by @brandtbucher for commit 452c78c 🤖 If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again. |
Buildbot failures look unrelated. |
LOAD_ATTR, LOAD_ATTR_INSTANCE_VALUE, LOAD_ATTR_WITH_HINT | ||
); | ||
if (err < 0) { | ||
return -1; | ||
} | ||
if (err) { | ||
if (_Py_OPCODE(instr[0]) == LOAD_ATTR_INSTANCE_VALUE) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a common pair and it saves a POP/PUSH pair and an incref/decref pair, so I'd like to revisit this in the future.
Do we have an issue for that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussion: faster-cpython/ideas#291
Issue: https://bugs.python.org/issue46823
CC @markshannon. These three share a lot of code, so it makes sense to convert them together.
Marking asDO-NOT-MERGE
sinceLOAD_ATTR_MODULE
is somehow causing assertion failures and unclosed resources intest_asyncio
:Any ideas on why that might be are definitely appreciated... I've been staring at this for a while and am completely at a loss.The rest of the PR is ready for review, though.https://bugs.python.org/issue46841