Improve import time #326

Open
Sachaa-Thanasius opened this issue Mar 30, 2025 · 3 comments

Sachaa-Thanasius commented Mar 30, 2025

importlib_resources currently takes a while to import. From rough local testing with whatever Python versions I have installed, the figures are something like this:

| OS | Python Implementation | Python Version(s) | Import Time (ms) |
| --- | --- | --- | --- |
| Windows 10 | CPython | 3.9-3.13 | ~27-37 |
| Windows 10 | PyPy | 3.10 | ~89 |
| WSL 2 | CPython | 3.9-3.13 | ~8.5-12.5 |

(The numbers were obtained by cloning the main branch and running <python> -m timeit -n 1 -r 1 -- "import importlib_resources" a bunch without a venv, just raw import without site-packages baggage. If need be, I can get more precise results.)
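For anyone wanting to reproduce figures of this kind, here is a minimal sketch of one way to time an import in a fresh interpreter. It is an assumption about methodology, not the exact procedure used for the tables in this thread, and `time_import` is a hypothetical helper name.

```python
import subprocess
import sys


def time_import(module, python=sys.executable, repeats=5):
    """Return the best-of-`repeats` import time for `module`, in milliseconds.

    Hypothetical helper: each run starts a new interpreter so the module
    is never already present in sys.modules.
    """
    snippet = (
        "import time; start = time.perf_counter(); "
        f"import {module}; "
        "print((time.perf_counter() - start) * 1000)"
    )
    timings = []
    for _ in range(repeats):
        result = subprocess.run(
            [python, "-c", snippet],
            capture_output=True,
            text=True,
            check=True,
        )
        timings.append(float(result.stdout.strip()))
    # The minimum is the least noisy estimate of the import's own cost.
    return min(timings)


if __name__ == "__main__":
    print(f"json: {time_import('json'):.2f} ms")
```

For a per-module breakdown rather than a single total, CPython's `-X importtime` option prints the time spent in each imported module.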

For context on my perspective, most of the standard library modules that I've interacted with take a fraction of that time.

To improve this, I've been messing about and managed to get the figures down by at least 7x:

| OS | Python Implementation | Python Version(s) | Import Time (ms) |
| --- | --- | --- | --- |
| Windows 10 | CPython | 3.9-3.13 | ~3.5-6.5 |
| Windows 10 | PyPy | 3.10 | ~5.5 |
| WSL 2 | CPython | 3.9-3.13 | ~0.6-1.5 |

This improvement mostly comes from the following:

  1. Replacing the inspect usage with a cheaper pattern that other stdlib modules use.
  2. Isolating typing-related symbols and annotations-related symbols so that they are only imported when deferred annotations using them are evaluated (with e.g. inspect.get_annotations or typing.get_type_hints).
  3. Isolating less-used but very expensive modules/submodules such that they are only imported on demand, i.e. only when deferred annotations containing them are evaluated or functions using them are called.
  4. Consolidating small submodules to avoid reading and compiling several extra files (also, it makes the structure more like importlib_metadata's, imo).
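As a rough illustration of items 1 and 3 (not the code from the branch), the sketch below shows two patterns of this kind. It assumes the `inspect` usage in question is a caller lookup such as `inspect.stack()`, and that the "cheaper pattern" is frame peeking via `sys._getframe`, as `collections.namedtuple` does; the thread does not confirm either. Both function names are hypothetical.

```python
import sys


def caller_module_name(depth=1):
    """Return the __name__ of the module `depth` frames above this function.

    Assumption: a stand-in for an inspect.stack()-based lookup that avoids
    importing inspect at all.
    """
    try:
        return sys._getframe(depth).f_globals.get("__name__", "__main__")
    except (AttributeError, ValueError):
        # sys._getframe is an implementation detail and may be missing,
        # or the stack may be shallower than `depth`.
        return None


def read_archive_names(path):
    """Hypothetical helper showing an on-demand import (item 3)."""
    # Deferred import: the cost of zipfile is paid on the first call,
    # not when the enclosing module is imported.
    import zipfile

    with zipfile.ZipFile(path) as archive:
        return archive.namelist()
```

The trade-off is that a function-level import pays a small `sys.modules` lookup on every call, which is usually negligible next to the import cost it avoids for users who never call the function.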

That's all without a few tricks employed by other standard library modules like importlib._bootstrap and importlib.util, e.g. avoiding importing functools and contextlib in exchange for slightly more verbose code. Doing those would make the improvement at least 10x.
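To make the "slightly more verbose code" trade-off concrete, here is a minimal sketch of a hand-written context manager that needs no `contextlib` import. The class name `prepended_sys_path` is hypothetical and is not taken from importlib_resources or the branch; it only illustrates the general shape of the substitution.

```python
import sys


class prepended_sys_path:
    """Temporarily put `path` at the front of sys.path.

    Class-based equivalent of a @contextlib.contextmanager generator,
    written out by hand so contextlib never has to be imported.
    """

    def __init__(self, path):
        self.path = path

    def __enter__(self):
        sys.path.insert(0, self.path)
        return self.path

    def __exit__(self, exc_type, exc_value, traceback):
        try:
            sys.path.remove(self.path)
        except ValueError:
            # Someone else already removed the entry; nothing to undo.
            pass
        return False  # never swallow exceptions from the with-block


if __name__ == "__main__":
    with prepended_sys_path("/hypothetical/plugin/dir") as entry:
        print(sys.path[0] == entry)
```

The generator version would be about half as long, which is the readability cost being weighed against the saved import.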

Considering this is a "foundational" library (meant to partially replace the widely used pkg_resources) and is also in the standard library, and the potential for improvement is so large with a pretty small amount of effort, I was wondering if there would be interest in improving this, and if so, whether it would be acceptable for me to attempt that by upstreaming some of my work?

P.S. The branch I've been experimenting on is here if you're curious, though it does too many things (e.g. fixes some bugs, completes the typing, adds unnecessary personalization) for me to consider making a PR with it directly. Makes more sense to do things piecemeal.

@Sachaa-Thanasius
Author

Just following up to gauge interest in making this happen.

cc @jaraco

Member

jaraco commented Apr 6, 2025

Thanks for the proposal! I'm open to these improvements, especially where the value exceeds the drawbacks. Here's what I recommend:

  • Perform independent changes in separate commits. This way, we can capture the specific performance benefits of each change and make choices about the tradeoffs of each commit separately.
  • Consider deferring the more invasive changes for later commits, as they're more likely to be rewritten or declined.
  • Where the changes aren't protected by regression tests, I'll often suggest including comments to protect the change (so it isn't "optimized" away by a subsequent change).
  • For very independent changes, consider contributing them as separate PRs.

> importlib_resources currently takes a while to import.

It's all relative. This library dramatically improved import time over pkg_resources.

> If need be, I can get more precise results.

No need, but let's definitely collect per-commit differences in some environment (CPython, any OS).

> 4. Consolidating small submodules to avoid reading and compiling several extra files (also, it makes the structure more like importlib_metadata's, imo).

I'll be resistant to this change, especially if it degrades the cognitive benefits of separation of concerns, but feel free to propose it.

> I was wondering if there would be interest in improving this, and if so, whether it would be acceptable for me to attempt that by upstreaming some of my work?

Definitely! And once contributed here, it will get rolled into CPython as well (as importlib.resources). Thanks for contributing here as it provides a richer way to contribute faster.

Feel free to contribute any bug fixes as well (in separate PRs).

@Sachaa-Thanasius
Author

I'll get on it, then. We'll see how this goes.
