Tier2 peephole optimization: remove extraneous _CHECK_STACK_SPACE ops #116168
Comments
This might need some labels. I also looked around for a parent ticket, but didn't find anything particularly obvious |
I will link to #115419 |
I don't know anything about the plan here, so this may not be news. :-) Couldn't you check for the needed stack space once at the beginning of the executor? And if there's a loop, lift it out of there. (And if it fits within the stack space of the code object already, drop it altogether.) |
Makes sense. Let's go with consolidating them first then another follow-up PR for lifting. Lifting stuff out of loops requires |
Is anyone working on this? Otherwise I'd like to give it a go. |
@gvanrossum I think @lazorchakp might be working on this, @lazorchakp do you mind if Guido takes over? If you have something with an implementation up but no tests that's okay, you can just submit a draft PR and Guido can probably take over if you don't mind. |
I'd also be happy to mentor @lazorchakp ! |
Sorry everyone - I've been busier than anticipated and haven't made concrete progress on this yet. If it's alright though, I'd still like to try it, and I should be able to get a draft up tonight if that works. @Fidget-Spinner has given me some guidance over email, but I'd gladly accept as much mentoring as you both are willing to provide 🙂 That being said, @gvanrossum if you'd rather go for it yourself, that's quite alright! Just let me know. |
No, I'd love to have you do it! My guess is that the critical trick should be to change the uop that checks for stack space to take a number indicating the amount of stack space needed, rather than digging that value out of the code object (which must be retrieved from the function object, which must be dug out of the stack at position Maybe this number can be put in its operand field by the initial trace projection pass (which needs the code object anyway). Then a later dedicated pass can combine stack checks as long as they are within the same "scope" (i.e., |
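A minimal Python sketch of the suggestion above, not the actual CPython change: the stack check carries the required space in its operand, filled in during trace projection, so the check itself never has to look at the function or code object. The `Uop` class, `project_stack_check`, and `has_stack_space` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Uop:
    name: str
    operand: int = 0  # hypothetical: stack space needed, recorded at projection time


def project_stack_check(framesize: int) -> Uop:
    """Projection already has the code object, so it can record the frame size directly."""
    return Uop("_CHECK_STACK_SPACE", operand=framesize)


def has_stack_space(uop: Uop, free_slots: int) -> bool:
    """The check only compares its operand against the remaining space."""
    return free_slots >= uop.operand


check = project_stack_check(framesize=12)
print(has_stack_space(check, free_slots=20))
print(has_stack_space(check, free_slots=8))
```

With the size stored in the operand, a later pass can reason about several checks purely from the trace, which is what makes combining them feasible.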
That sounds great. It makes sense to me that we'd now need |
I'm at the point where I have the operand for each
CASE 1 - sequential function calls:
proposed optimization:
CASE 2 - nested function calls:
proposed optimization:
If these cases are correct, I should be able to combine everything into a single _CHECK_STACK_SPACE uop per trace, as @gvanrossum mentioned above. |
Yes, both cases look right. |
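The uop listings for the two cases did not survive in this thread, so the following is only a sketch under an assumption: sequential calls reuse the same space, so they need the larger of the two frame sizes, while nested calls are live at the same time, so they need the sum. Both fall out of tracking the peak depth. The event format and `combined_space` are hypothetical.

```python
def combined_space(events):
    """Return the peak stack space needed by a sequence of frame events.

    Each event is ("push", size) or ("pop", size); sizes are made-up frame sizes.
    """
    current = 0
    peak = 0
    for kind, size in events:
        if kind == "push":
            current += size
            peak = max(peak, current)
        elif kind == "pop":
            current -= size
        else:
            raise ValueError(f"unknown event: {kind!r}")
    return peak


# CASE 1: two calls one after the other (assumed shape).
sequential = [("push", 5), ("pop", 5), ("push", 8), ("pop", 8)]
# CASE 2: the second call happens inside the first (assumed shape).
nested = [("push", 5), ("push", 8), ("pop", 8), ("pop", 5)]

print(combined_space(sequential))  # 8
print(combined_space(nested))      # 13
```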
I have this mostly working, but I'm stuck on one critical piece: actually using the new Does anyone have ideas about how I might be able to modify |
@lazorchakp if the instruction will not fit the format in tier 1, you can add a new uop _CHECK_STACK_SPACE_COMBINED or something. Then manually swap out for that uop in the optimizer analysis. |
Tier 1 should be unaffected. IIRC there are #ifdefs you can check for. |
I don't know what the "eyes" reaction means. Do you need more info? If so please ask another question. :-) |
Ah sorry, didn't want to generate too much noise - I'll be more clear in the future. I just restarted work on this for the evening and these comments are enough to unblock me for now. I won't hesitate to ask another question if I get stuck somewhere! |
I'm getting ready to open the PR here and am wrapping up some final tests. One of my new tests fails on This is it, if anyone's interested (under

    def test_many_nested(self):
        # overflow the trace_stack
        def dummy_a(x):
            return x
        def dummy_b(x):
            return dummy_a(x)
        def dummy_c(x):
            return dummy_b(x)
        def dummy_d(x):
            return dummy_c(x)
        def dummy_e(x):
            return dummy_d(x)
        def dummy_f(x):
            return dummy_e(x)
        def dummy_g(x):
            return dummy_f(x)
        def dummy_h(x):
            return dummy_g(x)
        def testfunc(n):
            a = 0
            for _ in range(n):
                a += dummy_h(n)
            return a

        self._run_with_optimizer(testfunc, 32)

error:
|
Seems like a bug. Please open an issue with the repro. I will fix it. Thanks! |
@Fidget-Spinner I opened #117180. Thanks, and lmk if you need more details. |
Still have to iterate with the pipeline a bit in my draft PR. I'll move it to Open and send an update here when it's officially ready for review. |
Just opened #117242 for review |
This merges all `_CHECK_STACK_SPACE` uops in a trace into a single `_CHECK_STACK_SPACE_OPERAND` uop that checks whether there is enough stack space for all calls included in the entire trace.
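As a rough illustration of that commit message, and not the code from the PR, here is a Python sketch that rewrites a trace so only one check remains. It assumes each `_CHECK_STACK_SPACE` operand holds the size of the frame about to be pushed, and it uses `_PUSH_FRAME` and `_POP_FRAME` as stand-in names for frame entry and exit; the tuple representation and `merge_stack_checks` are hypothetical.

```python
def merge_stack_checks(trace):
    """Collapse every stack check in `trace` into one check for the peak depth.

    `trace` is a list of (uop_name, operand) tuples; a new list is returned.
    """
    merged = []
    first_check = None  # index in `merged` of the single check we keep
    pending = 0         # frame size announced by the most recent check
    frames = []         # sizes of frames pushed inside this trace
    current = peak = 0
    for name, operand in trace:
        if name == "_CHECK_STACK_SPACE":
            pending = operand
            if first_check is None:
                first_check = len(merged)
                merged.append((name, operand))  # patched after the loop
            continue  # later checks are dropped
        if name == "_PUSH_FRAME":
            frames.append(pending)
            current += pending
            peak = max(peak, current)
        elif name == "_POP_FRAME" and frames:
            current -= frames.pop()
        merged.append((name, operand))
    if first_check is not None:
        merged[first_check] = ("_CHECK_STACK_SPACE_OPERAND", peak)
    return merged


trace = [
    ("_CHECK_STACK_SPACE", 5), ("_PUSH_FRAME", 0),
    ("_CHECK_STACK_SPACE", 8), ("_PUSH_FRAME", 0),
    ("_POP_FRAME", 0), ("_POP_FRAME", 0),
    ("_CHECK_STACK_SPACE", 4), ("_PUSH_FRAME", 0), ("_POP_FRAME", 0),
]
result = merge_stack_checks(trace)
print(result[0])
```

The sketch ignores anything that would force a real optimizer to give up on merging; it only shows the bookkeeping for a well-behaved trace.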
This is done -- thanks again! |
Implement a new peephole optimization for the tier 2 optimizer that removes `_CHECK_STACK_SPACE` if we see that this check is already present.
Linked PRs
_CHECK_STACK_SPACE uops #117242