Significantly speedup ESP on large expressions that contain many strings #3467
Thanks, this is an exciting improvement! I'll need to study the code some more and see how it affects perf on my codebase where I previously saw terrible performance.
I ran this over our codebase, and fixed a bug. This is the outer loop, previously
Thanks for the report! Still seeing somewhat bad performance on an internal file with a large expression (nested calls, dictionaries, lots of strings). I think I reported #2314 based on a similar file. On this PR branch:
However, on 22.12.0 (compiled), the same file in preview mode takes:
So this PR clearly makes things better, but the performance penalty on ESP is still big enough (probably; I haven't tried with mypyc) that we should have a conversation about whether the tradeoff is acceptable. That can wait though.
for i, leaf in enumerate(LL):
string_indices = []
idx = 0
while is_valid_index(idx):
Probably best for another PR, but there may be an optimization opportunity here. is_valid_index(idx) is basically equivalent to idx < len(LL), but it does so through some indirection and a nested function. According to Jukka, mypyc isn't good at compiling nested functions, so this code may get a good speedup if we just use idx < len(LL).
I tried this (JelleZijlstra@9b1f0b1) but didn't see a significant difference under ESP, though I didn't do rigorous benchmarking.
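For readers following this thread, the suggestion can be illustrated with a toy sketch. It is not Black's code: the leaves are plain strings, the "starts with a double quote" test for a string leaf is an assumption, and both function names are hypothetical. It only shows that a nested bounds-check helper and a direct idx < len(LL) comparison drive the same loop when idx never goes negative.

```python
from typing import List


def collect_with_nested_helper(LL: List[str]) -> List[int]:
    """Scan LL using a nested bounds-check helper, as in the reviewed loop."""

    def is_valid_index(idx: int) -> bool:
        # Hypothetical stand-in for the helper discussed above.
        return 0 <= idx < len(LL)

    string_indices = []
    idx = 0
    while is_valid_index(idx):
        # Assumption: a leaf is a string literal if it starts with a quote.
        if LL[idx].startswith('"'):
            string_indices.append(idx)
        idx += 1
    return string_indices


def collect_with_direct_check(LL: List[str]) -> List[int]:
    """Same scan, with the bounds check inlined as idx < len(LL)."""
    string_indices = []
    idx = 0
    while idx < len(LL):
        if LL[idx].startswith('"'):
            string_indices.append(idx)
        idx += 1
    return string_indices
```

Whether the inlined form is measurably faster once compiled with mypyc is exactly what the reply above could not confirm, so this sketch makes no performance claim.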
Description
Previously, when ESP 1) merges strings in StringMerger; 2) strips parens in StringParenStripper, it finds and does one transform for one string group. Each transform will create a new line with 1) one group of strings merged; 2) one group of strings' parens stripped. This new line is then re-checked by those transformers. This is O(n^2) complexity.
Since these transformers won't cause line breaks, the same transforms can be done in one pass, which creates only one new line. The new approach is O(n).
Tested on the example from #3340 (not compiled, since I'm failing to use cibuildwheel to compile with mypyc on my machines):
Fixes #3340, since this no longer triggers recursion limit errors.
Hopefully this also solves #2314
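The complexity argument in the description can be sketched with a toy model. This is not the StringMerger implementation: a "line" here is just a list of token strings, a token counts as a string literal if it starts with a double quote, and every function name is hypothetical. The first strategy merges one group per new line and rescans from the start; the second merges every group in a single pass.

```python
from typing import List, Optional, Tuple


def is_string(tok: str) -> bool:
    # Assumption: string literals are tokens that start with a double quote.
    return tok.startswith('"')


def merge_group(tokens: List[str]) -> str:
    # Join adjacent literals: ['"a"', '"b"'] becomes '"ab"'.
    return '"' + "".join(t[1:-1] for t in tokens) + '"'


def merge_first_group(line: List[str]) -> Optional[List[str]]:
    """Merge only the first run of two or more adjacent strings, if any."""
    i = 0
    while i < len(line):
        if not is_string(line[i]):
            i += 1
            continue
        j = i
        while j < len(line) and is_string(line[j]):
            j += 1
        if j - i >= 2:
            return line[:i] + [merge_group(line[i:j])] + line[j:]
        i = j
    return None


def merge_repeatedly(line: List[str]) -> Tuple[List[str], int]:
    """One transform per new line, then re-check: quadratic in the worst case."""
    scans = 0
    while True:
        scans += 1
        new_line = merge_first_group(line)
        if new_line is None:
            return line, scans
        line = new_line


def merge_single_pass(line: List[str]) -> List[str]:
    """Merge every group while walking the line once: linear."""
    out: List[str] = []
    i = 0
    while i < len(line):
        if not is_string(line[i]):
            out.append(line[i])
            i += 1
            continue
        j = i
        while j < len(line) and is_string(line[j]):
            j += 1
        out.append(merge_group(line[i:j]) if j - i >= 2 else line[i])
        i = j
    return out
```

With k mergeable groups, merge_repeatedly builds k intermediate lines and scans k + 1 times, while merge_single_pass builds one. The single-pass form is only safe under the condition stated in the description, that these transforms do not introduce line breaks.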