
gh-103193: cache calls to inspect._shadowed_dict in inspect.getattr_static #104267

Merged · 4 commits into python:main · May 7, 2023

Conversation


@AlexWaygood AlexWaygood commented May 7, 2023

EDIT: These performance numbers are out of date with the current PR; read further down the thread for the up-to-date numbers.

This dramatically speeds up calls to inspect.getattr_static.

Here are benchmark results on main, using @sobolevn's benchmark script from #103193 (comment):

type[Foo]                :   88 ±  0 ns
Foo                      :  158 ±  0 ns
type[Bar]                :   89 ±  0 ns
Bar                      :  158 ±  0 ns
WithParentClassX         :  224 ±  0 ns
Baz                      :  205 ±  0 ns
WithParentX              :  271 ±  1 ns
type[Missing]            :  252 ±  0 ns
Missing                  :  207 ±  0 ns
Slotted                  :  200 ±  1 ns
Method                   :  160 ±  1 ns
StMethod                 :  158 ±  0 ns
ClsMethod                :  158 ±  0 ns

And here are results with this PR:

type[Foo]                :   54 ±  0 ns
Foo                      :   85 ±  0 ns
type[Bar]                :   53 ±  0 ns
Bar                      :   85 ±  0 ns
WithParentClassX         :  102 ±  0 ns
Baz                      :   96 ±  0 ns
WithParentX              :  113 ±  0 ns
type[Missing]            :  110 ±  0 ns
Missing                  :   98 ±  0 ns
Slotted                  :  137 ±  0 ns
Method                   :   85 ±  0 ns
StMethod                 :   85 ±  0 ns
ClsMethod                :   85 ±  0 ns

With this PR, inspect.getattr_static is fast enough that even isinstance() calls like this, with "pathological" runtime-checkable protocols, are faster than they were on Python 3.11:

Pathological protocol
from typing import Protocol, runtime_checkable

@runtime_checkable
class Foo(Protocol):
    a: int
    b: int
    c: int
    d: int
    e: int
    f: int
    g: int
    h: int
    i: int
    j: int
    k: int
    l: int
    m: int
    n: int
    o: int
    p: int
    q: int
    r: int
    s: int
    t: int
    u: int
    v: int
    w: int
    x: int
    y: int
    z: int

class Bar:
    def __init__(self):
        for attrname in 'abcdefghijklmnopqrstuvwxyz':
            setattr(self, attrname, 42)

isinstance(Bar(), Foo)
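To reproduce a comparison like this, a minimal timing harness along the following lines can be used (a sketch only: the member count is cut down to three for brevity, and `per_call` and the iteration count are illustrative, not from the PR):

```python
import timeit
from typing import Protocol, runtime_checkable

@runtime_checkable
class Foo(Protocol):
    a: int
    b: int
    c: int

class Bar:
    def __init__(self):
        # give the instance every attribute the protocol requires
        for attrname in 'abc':
            setattr(self, attrname, 42)

bar = Bar()
# average seconds per isinstance() check against the protocol
per_call = timeit.timeit('isinstance(bar, Foo)',
                         globals={'bar': bar, 'Foo': Foo},
                         number=100_000) / 100_000
print(f'{per_call * 1e9:.0f} ns per call')
```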

This approach makes me a little nervous. I can think of two plausible reasons why adding a cache here might not be a good idea:

  1. The cache could hold references to the klass argument even after all other references to the class have been deleted, which could be unexpected behaviour for a low-level function like getattr_static.
  2. Whether or not a __dict__ attribute is shadowed could conceivably change at some point during the lifetime of a class.

However, I think we're okay on both points.

For objection (1): _shadowed_dict is only ever called on type objects. The vast majority of classes are defined once in the global namespace and never deleted, so it shouldn't be an issue that the cache is holding strong references to the type objects. (If we do think this is an issue, I also experimented with a version of this PR that uses a weakref.WeakKeyDictionary as a cache. It also sped things up, but not by nearly as much.)

For objection (2): It doesn't seem to be possible, once a class has been created, to change the class's __dict__ attribute, at least from Python code. So for any given class klass, _shadowed_dict(klass) should always return the same result.
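This claim can be checked directly from the REPL (a quick demonstration, not part of the PR):

```python
class A:
    pass

# The __dict__ attribute of a class can't be rebound once the class exists...
try:
    A.__dict__ = {}
except AttributeError:
    rebind_blocked = True

# ...and the mappingproxy it exposes rejects item assignment, so the
# '__dict__' entry itself can't be replaced with a shadowing descriptor.
try:
    A.__dict__['__dict__'] = 'shadow'
except TypeError:
    item_assignment_blocked = True

print(rebind_blocked, item_assignment_blocked)
```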

@carljm, what do you think?

@AlexWaygood AlexWaygood added the performance Performance or resource usage label May 7, 2023
@AlexWaygood AlexWaygood requested a review from carljm May 7, 2023 14:59
carljm commented May 7, 2023

So for any given class klass, _shadowed_dict(klass) should always return the same result.

The __dict__ can't change, but the __mro__ can :/

>>> class A: pass
...
>>> class B: pass
...
>>> class C(A): pass
...
>>> C.__mro__
(<class '__main__.C'>, <class '__main__.A'>, <class 'object'>)
>>> C.__bases__ = (B,)
>>> C.__mro__
(<class '__main__.C'>, <class '__main__.B'>, <class 'object'>)

So I don't think this is safe.
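A runnable illustration of the hazard, using a toy function rather than the stdlib helper: a cache keyed on the class itself keeps serving the pre-reassignment answer after __bases__ changes.

```python
import functools

@functools.lru_cache()
def cached_mro(klass):
    # stand-in for any per-class cached result derived from the MRO
    return klass.__mro__

class A: pass
class B: pass
class C(A): pass

before = cached_mro(C)      # (C, A, object)
C.__bases__ = (B,)          # legal: A and B share the same instance layout
after = cached_mro(C)       # cache hit: still the old tuple
assert after == before      # the cache never noticed the change
assert C.__mro__ != after   # but the real MRO is now (C, B, object)
```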

AlexWaygood commented May 7, 2023

The __dict__ can't change, but the __mro__ can :/

Well, it's good that we're discovering holes in getattr_static's test coverage :)

What about something like this, where _shadowed_dict is only cached for the lifetime of a single getattr_static call? It's slower than the current version of this patch, as it turns out that the call to .cache_clear() is pretty expensive. But it's still noticeably faster than main:

# add @lru_cache() to _shadowed_dict, like in this PR currently

def getattr_static(obj, attr, default=_sentinel):
    try:
        # existing getattr_static implementation here
    finally:
        _shadowed_dict.cache_clear()
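The same pattern with a toy function (hypothetical names, just to show the cache's lifetime under this try/finally scheme):

```python
import functools

calls = []  # records only the invocations that miss the cache

@functools.lru_cache()
def double(x):
    calls.append(x)
    return x * 2

def outer(values):
    try:
        return [double(v) for v in values]
    finally:
        double.cache_clear()  # the cache lives for one outer() call only

outer([1, 1, 2])
assert calls == [1, 2]      # the repeated 1 was served from the cache
outer([1])
assert calls == [1, 2, 1]   # cleared between calls, so 1 is recomputed
```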

carljm commented May 7, 2023

How does the performance comparison look if you make _shadowed_dict take the mro tuple rather than the class? I.e. call sites would turn into _shadowed_dict(_static_getmro(klass)).

I think if you did that, then it would be safe to put an LRU cache on _shadowed_dict.

I think it should be a bounded cache, though, not unbounded.

The pathological scenario would be someone creating lots of transient classes inside a function, calling getattr_static on instances of them, and expecting the transient classes to be GCed. It seems pretty unlikely that someone is doing all of that, but I think we should at least ensure that memory use in such a case is bounded.

AlexWaygood commented May 7, 2023

I think it should be a bounded cache, though, not unbounded.

@functools.lru_cache() creates a cache with size 128 by default, so my PR already proposes a bounded cache ;)

>>> functools.lru_cache(lambda: 1).cache_info()
CacheInfo(hits=0, misses=0, maxsize=128, currsize=0)

@AlexWaygood

How does the performance comparison look if you make _shadowed_dict take the mro tuple rather than the class? I.e. call sites would turn into _shadowed_dict(_static_getmro(klass)).

It's not as fast as this PR, but it's a lot faster than main. I'll switch to that approach.

Comment on lines +1809 to +1810
def _shadowed_dict(klass):
    return _shadowed_dict_from_mro_tuple(_static_getmro(klass))

@AlexWaygood AlexWaygood May 7, 2023


I considered inlining the calls to _static_getmro, but I thought this made it more readable. It also preserves git blame better, and having to go through this wrapper function doesn't seem to have much impact on performance.

@AlexWaygood AlexWaygood marked this pull request as ready for review May 7, 2023 16:22

AlexWaygood commented May 7, 2023

Here are the new performance numbers using the getattr_static benchmark:

type[Foo]                :   61 ±  0 ns
Foo                      :  100 ±  0 ns
type[Bar]                :   61 ±  0 ns
Bar                      :  100 ±  0 ns
WithParentClassX         :  129 ±  0 ns
Baz                      :  118 ±  0 ns
WithParentX              :  146 ±  0 ns
type[Missing]            :  148 ±  0 ns
Missing                  :  126 ±  1 ns
Slotted                  :  162 ±  2 ns
Method                   :  105 ±  0 ns
StMethod                 :  106 ±  0 ns
ClsMethod                :  100 ±  0 ns

The same benchmark on main:

type[Foo]                :   90 ±  0 ns
Foo                      :  163 ±  0 ns
type[Bar]                :   90 ±  0 ns
Bar                      :  162 ±  0 ns
WithParentClassX         :  229 ±  0 ns
Baz                      :  209 ±  0 ns
WithParentX              :  276 ±  0 ns
type[Missing]            :  258 ±  0 ns
Missing                  :  213 ±  0 ns
Slotted                  :  203 ±  0 ns
Method                   :  162 ±  0 ns
StMethod                 :  163 ±  0 ns
ClsMethod                :  163 ±  0 ns

With this PR now, isinstance() checks against runtime-checkable protocols are faster than they were on 3.11 as long as the protocol has 13 members or fewer. On main, they are only faster than on 3.11 with 6 members or fewer.

On main, the results of the typing_runtime_protocols pyperformance benchmark are 262 us +- 11 us. With this PR, they are 178 us +- 9 us.

@AlexWaygood AlexWaygood added type-feature A feature request or enhancement stdlib Python modules in the Lib dir 3.12 bugs and security fixes labels May 7, 2023
@AlexWaygood

(Skipping news because, if this patch is correct, it shouldn't change behaviour at all other than improving performance. And the news entry in https://github.com/python/cpython/pull/103195/files should already cover the performance boost.)

@AlexWaygood AlexWaygood changed the title gh-103193: cache calls to inspect._shadowed_dict gh-103193: cache calls to inspect._shadowed_dict in inspect.getattr_static May 7, 2023
@AlexWaygood AlexWaygood merged commit 1b19bd1 into python:main May 7, 2023
@AlexWaygood AlexWaygood deleted the simple-lru-cache branch May 7, 2023 17:45
@AlexWaygood

Thanks @carljm :D

I listed you as a co-author on the commit; I don't think I would have thought of using the mro tuple as a cache key on my own :)

jbower-fb pushed a commit to jbower-fb/cpython-jbowerfb that referenced this pull request May 8, 2023
…getattr_static` (python#104267)

Co-authored-by: Carl Meyer <carl@oddbird.net>