-
-
Notifications
You must be signed in to change notification settings - Fork 30.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-119866: Spill before escaping calls in all cases. #124392
GH-119866: Spill before escaping calls in all cases. #124392
Conversation
…ntax error, just a tool limitation.
I've addressed all the issues. I've increased the number of functions that we treat as non-escaping, but not every function mentioned above. I think it safer to do this than to try to mark everything that is non-escaping and risk making a mistake, or any of those functions changing in the future such that it does escape. Since the cost of spilling is low, spilling around such maybe-non-escaping calls has no measurable performance impact. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to approve this on the condition that usage of DEAD
is properly documented in a follow-up PR. I don't think it's so clear when to use it correctly and when it might be misused.
Alternatively, you can add the documentation in, since you need to fix the failing test cases anyways. |
I will leave the documentation of |
self.assertListEqual(gc.get_referrers(exc), []) | ||
|
||
asyncio.run(main()) | ||
self.assertListEqual(gc.get_referrers(exc), [main_coro]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because we are spilling the stack pointer, so the GC can see the locals of the generator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes to asyncio look fine to me, but maybe a comment (along the lines of what you explained to Kumar) would be helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm far from an expert on the cases generator, but I didn't see any obvious issues there. A few notes:
- It should be an error if somebody uses a name after it has been killed by either
DEAD(...)
orINPUTS_DEAD()
. - It should be an error to kill a name twice in the same path.
- Can you remind me why
DEAD(...)
/INPUTS_DEAD()
needs to be explicit in the code, and can't just be inferred from the location of the last use? I vaguely remember discussing this at the sprint, but I forget why. - Can we handle scalars-under-arrays in the analyzer, so we don't have these awkward length-one arrays?
I'm approving this PR because I think it probably needs to happen and I trust that this is the best way of doing this. But I think in general, we might need to take a step back and consider the huge amount of complexity the has accumulated in bytecodes.c
, and the thousands of lines of code in the cases generator that process it. The DSL was originally introduced to reduce the boilerplate and complexity in the interpreter loop... I think it still does a good job of this, as evidenced by this PR. But the mental load when reading and editing this file has crept up incrementally with time, and I'm worried that rate is accelerating. It's beyond the scope of this issue, but we should probably consider if the DSL should be reworked to better support the needs of the cases generator.
Thanks for tackling this, by the way. It was a big project and it's really cool to see it working correctly (the newly-failing test was neat to see). |
Because variables hold references to objects the last use doesn't kill the variable. |
Theoretically yes, but mixing scalars and arrays is awkward. We want to be able to move scalars into registers, but having gaps in the in-memory stack makes things complicated. |
#125046 for the other issues. |
Also fixes #123391
This PR spills the stack pointer across escaping call, also writing the contents of the stack to memory where necessary.
How it works:
Adds a new Storage class which combines the state of the stack and the local variables within a micro-op.
Adds copying and merging functionality to that class, and the underlying stack and locals, to support tracking state across divergent flow.
Splits and merges the storage in if statements.
Tracks the state of conceptual stack and local variables, ensuring the necessary values are written to memory when needed.
The code generator needs to tell when inputs are dead, in order to known when the stack_pointer should be reduced when a call escapes. We can track explicit
PyStackRef_CLOSE
andDECREF_INPUTS
, but many cases are implicit.For these cases we add the
DEAD
macro to mark the variable as dead.To simplify parsing, the code generator also enforces PEP 7 rules for braces.
A few of the changes in bytecodes.c are a result of this change.
Both interpreter performance and JIT performance show no slowdown.
Replaces #123397