Cache the scope name prefix to prevent scope traversal in a tight loop #708
Codecov Report
@@           Coverage Diff           @@
##             main     #708   +/-   ##
=======================================
  Coverage   94.80%   94.80%
=======================================
  Files         247      247
  Lines       25701    25711   +10
=======================================
+ Hits        24365    24375   +10
  Misses       1336     1336
@@ -667,6 +657,7 @@ def __init__(
         self.name = name
         self.node = node
         self._scope_overwrites = {}
+        self._name_prefix = self._make_name_prefix()
Pyre is unhappy with this line. Any suggestions, @zsol?
Seems like a bug to me; might be worth raising with the Pyre folks internally.
Why join(filter(None, ...)) all the time? I don't think any of those list items can be None. Seems easier to just use f-strings everywhere.
But this is only really important in get_qualified_names_for.
libcst/metadata/scope_provider.py
-        return {QualifiedName(".".join(parts), QualifiedNameSource.LOCAL)}
+        return {
+            QualifiedName(
+                ".".join(filter(None, [self.scope._name_prefix, full_name])),
Neither of these can be None, and this is a pretty hot function. This should be much faster:
- ".".join(filter(None, [self.scope._name_prefix, full_name])),
+ f"{self.scope._name_prefix}.{full_name}",
The filter(None,...) also removes empty strings, which is what I need here. Happy to consider other ways to do it.
f"{a}.{b}" if a else b is about 4x faster still.
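To make the trade-off in this thread concrete, here is a minimal sketch comparing the three formatting variants discussed above. The function names and the sample inputs are hypothetical and chosen only for illustration; they are not taken from LibCST. The point is that the plain f-string changes behavior when the prefix is empty, while the conditional f-string keeps the empty-string handling that filter(None, ...) provides.

```python
import timeit


def join_filter(prefix: str, name: str) -> str:
    # Variant from the PR: filter(None, ...) drops falsy items,
    # which includes empty strings as well as None.
    return ".".join(filter(None, [prefix, name]))


def plain_fstring(prefix: str, name: str) -> str:
    # First suggestion: always emits the dot, even for an empty prefix.
    return f"{prefix}.{name}"


def conditional_fstring(prefix: str, name: str) -> str:
    # Follow-up suggestion: skip the dot when the prefix is empty.
    return f"{prefix}.{name}" if prefix else name


if __name__ == "__main__":
    # Rough timing only; absolute numbers depend on machine and Python version.
    for fn in (join_filter, plain_fstring, conditional_fstring):
        elapsed = timeit.timeit(lambda: fn("Cls.method", "x"), number=200_000)
        print(f"{fn.__name__}: {elapsed:.3f}s")
```

With an empty prefix, plain_fstring produces a name with a leading dot, which is why the unconditional f-string is not a drop-in replacement here.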
Maybe worth putting a |
e81678c to 8860e95
0.4.7 - 2022-07-12
Fixed
- Fix get_qualified_names_for matching on prefixes of the given name by @lpetre in Instagram/LibCST#719
Added
- Implement lazy loading mechanism for expensive metadata providers by @Chenguang-Zhu in Instagram/LibCST#720

0.4.6 - 2022-07-04
New Contributors
- @superbobry made their first contribution in Instagram/LibCST#702
Fixed
- convert_type_comments now preserves comments following type comments by @superbobry in Instagram/LibCST#702
- QualifiedNameProvider optimizations
  - Cache the scope name prefix to prevent scope traversal in a tight loop by @lpetre in Instagram/LibCST#708
  - Faster qualified name formatting by @lpetre in Instagram/LibCST#710
  - Prevent unnecessary work in Scope.get_qualified_names_for_ by @lpetre in Instagram/LibCST#709
- Fix parsing of parenthesized empty tuples by @zsol in Instagram/LibCST#712
- Support whitespace after ParamSlash by @zsol in Instagram/LibCST#713
- [parser] bail on deeply nested expressions by @zsol in Instagram/LibCST#718

0.4.5 - 2022-06-17
New Contributors
- @zzl0 made their first contribution in Instagram/LibCST#704
Fixed
- Only skip supported escaped characters in f-strings by @zsol in Instagram/LibCST#700
- Escaping quote characters in raw string literals causes a tokenizer error by @zsol in Instagram/LibCST#668
- Corrected a code example in the documentation by @zzl0 in Instagram/LibCST#703
- Handle multiline strings that start with quotes by @zzl0 in Instagram/LibCST#704
- Fixed a performance regression in libcst.metadata.ScopeProvider by @lpetre in Instagram/LibCST#698

0.4.4 - 2022-06-13
New Contributors
- @adamchainz made their first contribution in Instagram/LibCST#688
Added
- Add package links to PyPI by @adamchainz in Instagram/LibCST#688
- native: add overall benchmark by @zsol in Instagram/LibCST#692
- Add support for PEP-646 by @zsol in Instagram/LibCST#696
Updated
- parser: use references instead of smart pointers for Tokens by @zsol in Instagram/LibCST#691
Summary
We don't need to recompute this name prefix every time we look up a qualified name for a node. The value does not change once the scope is constructed, so it can be computed once in the constructor and reused.
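The idea can be sketched with a toy scope class. This is an illustration under stated assumptions, not LibCST's implementation: ToyScope, its parent attribute, and qualified_name are hypothetical names, and only _name_prefix and _make_name_prefix mirror identifiers that appear in the diff. The prefix is built once in the constructor from the parent's already-cached prefix, so later lookups do no scope traversal.

```python
from typing import Optional


class ToyScope:
    """Hypothetical stand-in for a scope; not LibCST's Scope class."""

    def __init__(self, name: Optional[str], parent: Optional["ToyScope"] = None) -> None:
        self.name = name
        self.parent = parent
        # Computed once at construction; the parent chain is never walked again.
        self._name_prefix = self._make_name_prefix()

    def _make_name_prefix(self) -> str:
        # Reuse the parent's cached prefix instead of traversing all ancestors.
        parent_prefix = self.parent._name_prefix if self.parent is not None else ""
        if not self.name:
            return parent_prefix
        return f"{parent_prefix}.{self.name}" if parent_prefix else self.name

    def qualified_name(self, local_name: str) -> str:
        # Hot path: a single string format, no traversal.
        prefix = self._name_prefix
        return f"{prefix}.{local_name}" if prefix else local_name


if __name__ == "__main__":
    module = ToyScope(None)                # unnamed top-level scope
    cls = ToyScope("Cls", parent=module)
    method = ToyScope("method", parent=cls)
    print(method.qualified_name("x"))
    print(module.qualified_name("x"))
```

This only works because a scope's name and parent are fixed at construction; if either could change later, the cached prefix would need invalidating.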
Context
There is a certain type of file that causes the scope provider to take an extraordinary amount of time.
These files tend to fit a specific pattern:
Here is an example: https://github.com/FreeOpcUa/python-opcua/blob/0.98.11/opcua/server/standard_address_space/standard_address_space_part9.py
I've been running these w/ yappi to figure out where the scope provider might be improved.
Test Plan
Previously this file would take ~70 seconds on my laptop (Apple M1 Pro, 32GB RAM). Now it takes ~30 seconds.
Here are the top hits from yappi (sorted by tsub), before and after:
Before
After