Make import astroid faster (again) #2161

Closed
Pierre-Sassoulas opened this issue May 1, 2023 · 5 comments · Fixed by #2210 or #2303
Labels
Enhancement ✨ Improvement to a component topic-performance

@Pierre-Sassoulas
Member

Pierre-Sassoulas commented May 1, 2023

Steps to reproduce

python3 -X importtime -c 'import astroid'
scalene --html pylint/__main__.py --- pylint/constants.py

[Image: import_astroid_too_slow]

import time: self [us] | cumulative | imported package
import time:       157 |        157 |   _io
import time:        47 |         47 |   marshal
import time:       351 |        351 |   posix
import time:       965 |       1518 | _frozen_importlib_external
import time:        96 |         96 |   time
import time:       167 |        262 | zipimport
import time:       122 |        122 |     _codecs
import time:       726 |        848 |   codecs
import time:       497 |        497 |   encodings.aliases
import time:      1443 |       2787 | encodings
import time:       156 |        156 | encodings.utf_8
import time:        66 |         66 | _signal
import time:        36 |         36 |     _abc
import time:       252 |        288 |   abc
import time:       548 |        835 | io
import time:        41 |         41 |       _stat
import time:       146 |        186 |     stat
import time:       738 |        738 |     _collections_abc
import time:       106 |        106 |       genericpath
import time:       255 |        360 |     posixpath
import time:      3296 |       4579 |   os
import time:       127 |        127 |   _sitebuiltins
import time:       273 |        273 |     apport_python_hook
import time:       132 |        404 |   sitecustomize
import time:      1287 |       6395 | site
import time:       127 |        127 |       itertools
import time:       110 |        110 |       keyword
import time:        51 |         51 |         _operator
import time:       219 |        270 |       operator
import time:       133 |        133 |       reprlib
import time:        46 |         46 |       _collections
import time:      1002 |       1684 |     collections
import time:       199 |        199 |     types
import time:        41 |         41 |     _functools
import time:       509 |       2432 |   functools
import time:       529 |        529 |       enum
import time:        67 |         67 |         _sre
import time:       203 |        203 |           sre_constants
import time:       300 |        502 |         sre_parse
import time:      1262 |       1830 |       sre_compile
import time:        54 |         54 |       _locale
import time:       139 |        139 |       copyreg
import time:       569 |       3119 |     re
import time:       135 |        135 |     token
import time:       848 |       4101 |   tokenize
import time:       190 |        190 |     warnings
import time:       269 |        459 |   importlib
import time:       124 |        124 |       __future__
import time:       117 |        117 |       collections.abc
import time:       434 |        434 |         contextlib
import time:      1557 |       1990 |       typing
import time:       351 |        351 |         astroid.util
import time:      1682 |       1682 |         astroid.typing
import time:       892 |       2924 |       astroid.exceptions
import time:       593 |        593 |                 _weakrefset
import time:       912 |       1504 |               weakref
import time:        91 |         91 |                   org
import time:        40 |        131 |                 org.python
import time:        38 |        168 |               org.python.core
import time:       253 |       1925 |             copy
import time:       154 |        154 |                 _ast
import time:      1483 |       1636 |               ast
import time:       143 |        143 |                   _opcode
import time:       285 |        428 |                 opcode
import time:       432 |        859 |               dis
import time:       240 |        240 |               importlib.machinery
import time:       134 |        134 |               linecache
import time:      2600 |       5468 |             inspect
import time:      1872 |       9264 |           dataclasses
import time:       408 |       9672 |         pprint
import time:       281 |        281 |         astroid.context
import time:       670 |        670 |             importlib._abc
import time:       217 |        886 |           importlib.util
import time:       131 |        131 |               fnmatch
import time:       120 |        120 |                 _winapi
import time:        57 |         57 |                 nt
import time:        47 |         47 |                 nt
import time:        46 |         46 |                 nt
import time:        93 |         93 |                 nt
import time:       439 |        800 |               ntpath
import time:        54 |         54 |               errno
import time:       102 |        102 |                 urllib
import time:      1342 |       1444 |               urllib.parse
import time:      1043 |       3470 |             pathlib
import time:       286 |       3756 |           astroid.const
import time:       123 |        123 |             astroid.interpreter
import time:       183 |        306 |           astroid.interpreter._import
import time:       384 |        384 |                 traceback
import time:       225 |        225 |                   _string
import time:      3610 |       3834 |                 string
import time:       910 |        910 |                 threading
import time:       116 |        116 |                 atexit
import time:     10004 |      15246 |               logging
import time:       678 |        678 |               sysconfig
import time:       234 |        234 |               astroid.interpreter._import.util
import time:       587 |        587 |               _sysconfigdata__x86_64-linux-gnu
import time:      2256 |      19000 |             astroid.modutils
import time:       985 |      19985 |           astroid.interpreter._import.spec
import time:       335 |        335 |           astroid.transforms
import time:       502 |      25767 |         astroid.manager
import time:       898 |        898 |         astroid.nodes.as_string
import time:       134 |        134 |         astroid.nodes.const
import time:       240 |        240 |         astroid.nodes.utils
import time:       876 |      37865 |       astroid.nodes.node_ng
import time:      7112 |      50129 |     astroid.nodes._base_nodes
import time:       506 |        506 |       astroid.decorators
import time:       822 |        822 |         astroid.interpreter.objectmodel
import time:       728 |       1550 |       astroid.bases
import time:      3685 |       5740 |     astroid.nodes.node_classes
import time:       121 |        121 |         astroid.filter_statements
import time:       112 |        112 |         astroid.nodes.scoped_nodes.utils
import time:       291 |        522 |       astroid.nodes.scoped_nodes.mixin
import time:       103 |        103 |         astroid.interpreter.dunder_lookup
import time:       657 |        759 |       astroid.nodes.scoped_nodes.scoped_nodes
import time:       283 |       1564 |     astroid.nodes.scoped_nodes
import time:       527 |      57958 |   astroid.nodes
import time:       803 |        803 |       typing_extensions
import time:       230 |       1032 |     astroid.constraint
import time:    119122 |     119122 |       astroid.raw_building
import time:       247 |     119368 |     astroid.helpers
import time:       436 |        436 |     astroid.objects
import time:       237 |        237 |       astroid.arguments
import time:       306 |        543 |     astroid.protocols
import time:       600 |     121978 |   astroid.inference
import time:        93 |         93 |   astroid.__pkginfo__
import time:        84 |         84 |   astroid.astroid_manager
import time:        87 |         87 |     astroid.brain
import time:       189 |        275 |   astroid.brain.helpers
import time:       967 |        967 |     textwrap
import time:       370 |        370 |       astroid._ast
import time:       511 |        880 |     astroid.rebuilder
import time:       275 |       2120 |   astroid.builder
import time:       123 |        123 |   astroid.inference_tip
import time:        99 |         99 |   astroid.brain.brain_numpy_utils
import time:        49 |         49 |     math
import time:        27 |         27 |       _bisect
import time:       141 |        167 |     bisect
import time:        28 |         28 |     _random
import time:        26 |         26 |     _sha512
import time:       418 |        687 |   random
import time:     12916 |     203319 | astroid

Current behavior

Currently, astroid's import time accounts for 25+% of the parse time for single-file analysis in pylint (common with parallelized pre-commit hooks), and 50+% with --disable=all. This is a follow-up to #1320, where the import time was already decreased by 60%. Importing astroid.raw_building now takes about 60% of the remaining import time; astroid.nodes._base_nodes (3.5%) and astroid.nodes.node_classes (1.7%) are the next biggest offenders that we control.
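
For anyone who wants to reproduce the ranking without reading the log by eye, here is a minimal sketch that parses `-X importtime` output and sorts modules by self time. The helper names (`parse_importtime`, `top_offenders`) are made up for this example, and the sample is a four-line excerpt of the log above; shares are computed against the cumulative time of the final `astroid` line.

```python
import re

# Matches lines such as "import time:       157 |        157 |   _io".
_LINE = re.compile(r"^import time:\s+(\d+)\s+\|\s+(\d+)\s+\|(\s+)(\S.*)$")


def parse_importtime(text):
    """Return (self_us, cumulative_us, depth, module) tuples from -X importtime output."""
    rows = []
    for line in text.splitlines():
        match = _LINE.match(line)
        if match is None:
            continue  # skips the header line and any unrelated stderr output
        self_us, cumulative_us, indent, name = match.groups()
        depth = (len(indent) - 1) // 2  # nesting is printed as two spaces per level
        rows.append((int(self_us), int(cumulative_us), depth, name.strip()))
    return rows


def top_offenders(rows, limit=3):
    """Rank modules by self time, with each one's share of the overall total."""
    total = rows[-1][1]  # the last line is the top-level import
    ranked = sorted(rows, key=lambda row: row[0], reverse=True)[:limit]
    return [
        (name, self_us, round(100 * self_us / total, 1))
        for self_us, _, _, name in ranked
    ]


# Excerpt of the log above (hypothetical usage; pipe real stderr in instead).
sample = """\
import time: self [us] | cumulative | imported package
import time:      7112 |      50129 |     astroid.nodes._base_nodes
import time:      3685 |       5740 |     astroid.nodes.node_classes
import time:    119122 |     119122 |       astroid.raw_building
import time:     12916 |     203319 | astroid
"""

for name, self_us, share in top_offenders(parse_importtime(sample)):
    print(f"{name}: {self_us} us ({share}%)")
```

The real output goes to stderr, so something like `python3 -X importtime -c 'import astroid' 2> import.log` captures it for parsing.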

Expected behavior

Faster astroid import to make pylint more responsive.

@DanielNoord
Collaborator

Doesn't raw_building as a side effect also build the AST for the stdlib?

@Pierre-Sassoulas
Member Author

I think the problem might lie in

_astroid_bootstrapping()

def _astroid_bootstrapping() -> None:
    """astroid bootstrapping the builtins module"""
    # this boot strapping is necessary since we need the Const nodes to
    # inspect_build builtins, and then we can proxy Const
    builder = InspectBuilder()

def _set_proxied(const) -> nodes.ClassDef:
    # TODO : find a nicer way to handle this situation;
    return _CONST_PROXY[const.value.__class__]

@DanielNoord
Collaborator

Yeah, that is the building of the stdlib ast.

@Pierre-Sassoulas
Member Author

Pierre-Sassoulas commented May 1, 2023

I tried doing it lazily with:

diff --git a/astroid/manager.py b/astroid/manager.py
index 7f62fd42..f88b7360 100644
--- a/astroid/manager.py
+++ b/astroid/manager.py
@@ -87,6 +87,11 @@ class AstroidManager:
 
     @property
     def builtins_module(self) -> nodes.Module:
+        if "builtins" not in self.astroid_cache:
+            from astroid.raw_building import _astroid_bootstrapping
+
+            _astroid_bootstrapping()
+
         return self.astroid_cache["builtins"]

     def visit_transforms(self, node: nodes.NodeNG) -> InferenceResult:

diff --git a/astroid/raw_building.py b/astroid/raw_building.py
index 45eeb10b..4cb38438 100644
--- a/astroid/raw_building.py
+++ b/astroid/raw_building.py
@@ -724,6 +724,3 @@ def _astroid_bootstrapping() -> None:
             )
             builder.object_build(klass, _type)
             astroid_builtin[_type.__name__] = klass
-
-
-_astroid_bootstrapping()

Importing astroid.raw_building is faster, but the overall import time seems equivalent (maybe slightly faster, but definitely not noticeably). The builtins are probably still built at import time anyway, because we instantiate an AstroidManager (?). I don't know if it's possible or reasonable to instantiate the manager in the calling code.
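
To make the pattern in the diff easier to reason about in isolation, here is a minimal sketch of the same lazy-bootstrap idea outside astroid. `LazyManager` and `fake_bootstrap` are hypothetical stand-ins, not astroid's API. The point it illustrates is that the deferral only pays off if nothing reads the property while the package is being imported, which may be why the overall time barely moved here.

```python
class LazyManager:
    """Hypothetical stand-in for a manager that defers an expensive bootstrap."""

    def __init__(self, bootstrap):
        self.cache = {}
        self._bootstrap = bootstrap  # callable that fills cache["builtins"]

    @property
    def builtins_module(self):
        # Pay the bootstrap cost on first access instead of at import time.
        if "builtins" not in self.cache:
            self._bootstrap(self.cache)
        return self.cache["builtins"]


calls = []


def fake_bootstrap(cache):
    """Stands in for the real, slow builder and records each invocation."""
    calls.append("bootstrap")
    cache["builtins"] = object()


manager = LazyManager(fake_bootstrap)
assert calls == []  # constructing the manager does no work

first = manager.builtins_module
second = manager.builtins_module
assert first is second  # the cached object is reused
assert calls == ["bootstrap"]  # the bootstrap ran exactly once
```

If any module-level code touched `manager.builtins_module` during import, the cost would simply move from one module's self time to another's, and the cumulative total would stay about the same.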

@Pierre-Sassoulas
Member Author

The import time was more than halved on my machine (after importing once). The first import takes 330ms before and 300ms after, but subsequent ones take 150ms before and 60ms after.
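
A rough way to compare first and subsequent imports is to time the import in a fresh interpreter several times. This is only a sketch: `time_import` is a made-up helper, `json` is used as a stand-in so the snippet runs without astroid installed, and it does not flush the OS file cache, so the first run is only "cold" if the module has not been read recently.

```python
import subprocess
import sys


def time_import(module, runs=3):
    """Import `module` in a fresh interpreter `runs` times; return seconds per run."""
    snippet = (
        "import time\n"
        "start = time.perf_counter()\n"
        f"import {module}\n"
        "print(time.perf_counter() - start)\n"
    )
    timings = []
    for _ in range(runs):
        result = subprocess.run(
            [sys.executable, "-c", snippet],
            capture_output=True,
            text=True,
            check=True,
        )
        # Take the last line in case the module prints something on import.
        timings.append(float(result.stdout.strip().splitlines()[-1]))
    return timings


timings = time_import("json")  # swap in "astroid" to measure the real thing
for index, seconds in enumerate(timings, start=1):
    print(f"run {index}: {seconds * 1000:.1f} ms")
```

Each run starts a new process, so module-level work is repeated every time; any difference between runs comes from caches outside the interpreter (file system cache, compiled bytecode on disk).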

The final result (hot cache) for reference:

import time:       119 |        119 |   _io
import time:        27 |         27 |   marshal
import time:       239 |        239 |   posix
import time:       351 |        734 | _frozen_importlib_external
import time:        82 |         82 |   time
import time:       138 |        219 | zipimport
import time:        40 |         40 |     _codecs
import time:       238 |        278 |   codecs
import time:       304 |        304 |   encodings.aliases
import time:       441 |       1021 | encodings
import time:       140 |        140 | encodings.utf_8
import time:        65 |         65 | _signal
import time:        24 |         24 |     _abc
import time:        99 |        122 |   abc
import time:       119 |        240 | io
import time:        29 |         29 |       _stat
import time:        56 |         84 |     stat
import time:       658 |        658 |     _collections_abc
import time:        36 |         36 |       genericpath
import time:        63 |         99 |     posixpath
import time:       292 |       1131 |   os
import time:        63 |         63 |   _sitebuiltins
import time:       269 |        269 |         warnings
import time:       120 |        388 |       importlib
import time:        53 |        440 |     importlib.machinery
import time:       137 |        137 |       importlib._abc
import time:       128 |        128 |           itertools
import time:        95 |         95 |           keyword
import time:        44 |         44 |             _operator
import time:       189 |        233 |           operator
import time:       116 |        116 |           reprlib
import time:        54 |         54 |           _collections
import time:       595 |       1218 |         collections
import time:       229 |        229 |           types
import time:        39 |         39 |           _functools
import time:       408 |        675 |         functools
import time:       468 |       2360 |       contextlib
import time:       101 |       2597 |     importlib.util
import time:      1735 |       1735 |           enum
import time:        80 |         80 |             _sre
import time:       376 |        376 |               re._constants
import time:       370 |        746 |             re._parser
import time:       125 |        125 |             re._casefix
import time:       551 |       1500 |           re._compiler
import time:       123 |        123 |           copyreg
import time:      8565 |      11921 |         re
import time:       127 |      12047 |       fnmatch
import time:        46 |         46 |       errno
import time:        81 |         81 |         urllib
import time:       865 |        945 |       urllib.parse
import time:        50 |         50 |         _winapi
import time:        40 |         40 |         nt
import time:        37 |         37 |         nt
import time:        48 |         48 |         nt
import time:        31 |         31 |         nt
import time:        34 |         34 |         nt
import time:       253 |        490 |       ntpath
import time:       694 |      14220 |     pathlib
import time:       415 |      17670 |   __editable___astroid_3_0_0a5_dev0_finder
import time:       121 |        121 |     apport_python_hook
import time:       185 |        305 |   sitecustomize
import time:      4083 |      23251 | site
import time:       126 |        126 |     token
import time:       907 |       1032 |   tokenize
import time:       110 |        110 |       __future__
import time:       114 |        114 |       collections.abc
import time:       119 |        119 |         _typing
import time:      1871 |       1989 |       typing
import time:       214 |        214 |         astroid.util
import time:      1517 |       1517 |         astroid.typing
import time:       526 |       2256 |       astroid.exceptions
import time:       177 |        177 |                 _weakrefset
import time:      4525 |       4701 |               weakref
import time:       116 |        116 |                   org
import time:        42 |        157 |                 org.python
import time:        43 |        200 |               org.python.core
import time:       239 |       5140 |             copy
import time:        85 |         85 |                 _ast
import time:      1277 |       1361 |               ast
import time:        41 |         41 |                   _opcode
import time:       325 |        365 |                 opcode
import time:       781 |       1146 |               dis
import time:       129 |        129 |               linecache
import time:      2861 |       5496 |             inspect
import time:       492 |      11127 |           dataclasses
import time:       257 |      11383 |         pprint
import time:       847 |        847 |         astroid.context
import time:       208 |        208 |           astroid.const
import time:        90 |         90 |             astroid.interpreter
import time:       157 |        247 |           astroid.interpreter._import
import time:       765 |        765 |                   textwrap
import time:       441 |       1205 |                 traceback
import time:        31 |         31 |                   _string
import time:      1088 |       1119 |                 string
import time:       477 |        477 |                 threading
import time:        36 |         36 |                 atexit
import time:      1675 |       4510 |               logging
import time:       468 |        468 |               sysconfig
import time:       147 |        147 |               astroid.interpreter._import.util
import time:       462 |        462 |               _sysconfigdata__x86_64-linux-gnu
import time:      1361 |       6945 |             astroid.modutils
import time:       751 |       7696 |           astroid.interpreter._import.spec
import time:       395 |        395 |           astroid.transforms
import time:       451 |       8995 |         astroid.manager
import time:       401 |        401 |         astroid.nodes.as_string
import time:        99 |         99 |         astroid.nodes.const
import time:       220 |        220 |         astroid.nodes.utils
import time:      6258 |      28199 |       astroid.nodes.node_ng
import time:      1284 |      33950 |     astroid.nodes._base_nodes
import time:       153 |        153 |       astroid.decorators
import time:       590 |        590 |         astroid.interpreter.objectmodel
import time:       443 |       1032 |       astroid.bases
import time:      3515 |       4698 |     astroid.nodes.node_classes
import time:       104 |        104 |         astroid.filter_statements
import time:        78 |         78 |         astroid.nodes.scoped_nodes.utils
import time:       284 |        465 |       astroid.nodes.scoped_nodes.mixin
import time:        97 |         97 |         astroid.interpreter.dunder_lookup
import time:       791 |        888 |       astroid.nodes.scoped_nodes.scoped_nodes
import time:       210 |       1562 |     astroid.nodes.scoped_nodes
import time:       282 |      40491 |   astroid.nodes
import time:       275 |        275 |     astroid.constraint
import time:       321 |        321 |       astroid.objects
import time:       269 |        269 |       astroid.raw_building
import time:       229 |        818 |     astroid.helpers
import time:       142 |        142 |       astroid.arguments
import time:       659 |        801 |     astroid.protocols
import time:       638 |       2530 |   astroid.inference
import time:       104 |        104 |   astroid.__pkginfo__
import time:        80 |         80 |   astroid.astroid_manager
import time:        82 |         82 |     astroid.brain
import time:       246 |        327 |   astroid.brain.helpers
import time:       480 |        480 |       astroid._ast
import time:       643 |       1122 |     astroid.rebuilder
import time:       332 |       1454 |   astroid.builder
import time:       118 |        118 |   astroid.inference_tip
import time:        89 |         89 |   astroid.brain.brain_numpy_utils
import time:        62 |         62 |     math
import time:        28 |         28 |       _bisect
import time:       127 |        155 |     bisect
import time:        26 |         26 |     _random
import time:        28 |         28 |     _sha512
import time:       412 |        679 |   random
import time:      7837 |      54737 | astroid

The biggest import times now are:

self [us] | cumulative | imported package
     6258 |      28199 | astroid.nodes.node_ng
     1284 |      33950 | astroid.nodes._base_nodes
      282 |      40491 | astroid.nodes
     7837 |      54737 | astroid

Which is to be expected, as there's a lot inside those modules (and it's now shorter than importing re).
