Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-44953: Add vectorcall for itemgetter and attrgetter #27828

Merged
merged 5 commits into from
Feb 10, 2022

Conversation

sweeneyde
Copy link
Member

@sweeneyde sweeneyde commented Aug 19, 2021

@sweeneyde
Copy link
Member Author

This was inspired by GH-27782, which does the same, and more, for methodcaller.

@sweeneyde
Copy link
Member Author

Benchmarks for this change:

itemgetter: Mean +- std dev: [operator_main] 45.3 ns +- 1.3 ns -> [operator_vec] 29.5 ns +- 0.7 ns: 1.54x faster
attrgetter: Mean +- std dev: [operator_main] 61.6 ns +- 1.7 ns -> [operator_vec] 43.8 ns +- 0.9 ns: 1.41x faster

@sweeneyde sweeneyde requested a review from markshannon August 19, 2021 09:40
@rhettinger rhettinger added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Aug 20, 2021
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @rhettinger for commit f9fc83a 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Aug 20, 2021
@github-actions
Copy link

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale Stale PR or inactive for long period of time. label Sep 20, 2021
@sweeneyde
Copy link
Member Author

The __vectorcalloffset__ behavior was added in GH-20026. I don't think I understand that design decision rather than adding a new PyType_Slot.slot value called something like Py_tp_vectorcall, but it appears to be what we have. The corresponding behavior of __dictoffset__ and __weaklistoffset__ goes back to 2019 in GH-16076. Are there any plans to change these, or is there anything holding this up?

@github-actions github-actions bot removed the stale Stale PR or inactive for long period of time. label Oct 4, 2021
@sweeneyde
Copy link
Member Author

sweeneyde commented Jan 3, 2022

Any plans for this? It may also be relevant to bpo-46148 about optimizing Pathlib.

@sweeneyde
Copy link
Member Author

Does anyone with a commit bit want to merge this? This is the kind of callable that's often called in loops.

@sweeneyde
Copy link
Member Author

I re-ran some benchmarks, and they still look good. I plan to merge this in about 24 hours if there are no more objections.

from operator import itemgetter, attrgetter
from itertools import repeat
from collections import namedtuple, deque
from pyperf import Runner

class DefaultClass:
    def __init__(self, a, b):
        self.a = a
        self.b = b

class SlotsClass:
    __slots__ = "a", "b"
    def __init__(self, a, b):
        self.a = a
        self.b = b

NamedTuple = namedtuple("NT", ["a", "b"])
MAP_LOOPS = 10_000

attr_classes = {
    'DefaultClass': DefaultClass,
    'SlotsClass': SlotsClass,
    'NamedTuple': NamedTuple,
}

item_classes = {
    'tuple': tuple,
    'list': list,
    'dict': dict.fromkeys,
}

namespace = {
    'IG': itemgetter(1),
    'AG': attrgetter('a'),
    'repeat': repeat,
    'deque': deque,
} | attr_classes | item_classes

runner = Runner()

for classname in attr_classes:
    runner.timeit(
        name=f"{classname}-1",
        setup=f"obj = {classname}(11, 22)",
        stmt="AG(obj)",
        globals=namespace
    )
    runner.timeit(
        name=f"{classname}-map",
        setup=f"obj = {classname}(11, 22)",
        stmt=f"deque(map(AG, repeat(obj, {MAP_LOOPS})), maxlen=0)",
        globals=namespace,
        inner_loops=MAP_LOOPS,
    )

for classname in item_classes:
    runner.timeit(
        name=f"{classname}-1",
        setup=f"obj = {classname}((1, 2, 3, 4, 5))",
        stmt="IG(obj)",
        globals=namespace,
    )
    runner.timeit(
        name=f"{classname}-map",
        setup=f"obj = {classname}((1, 2, 3, 4, 5))",
        stmt=f"deque(map(IG, repeat(obj, {MAP_LOOPS})), maxlen=0)",
        globals=namespace,
        inner_loops=MAP_LOOPS,
    )

Results:

Benchmark main_bench PR-27828
tuple-map 27.6 ns 13.2 ns: 2.09x faster
dict-map 39.4 ns 24.2 ns: 1.62x faster
NamedTuple-map 40.0 ns 24.8 ns: 1.61x faster
list-map 38.2 ns 23.8 ns: 1.61x faster
SlotsClass-map 42.8 ns 28.1 ns: 1.53x faster
DefaultClass-map 45.5 ns 31.0 ns: 1.47x faster
tuple-1 56.7 ns 41.4 ns: 1.37x faster
NamedTuple-1 70.1 ns 52.7 ns: 1.33x faster
dict-1 69.4 ns 52.2 ns: 1.33x faster
SlotsClass-1 75.1 ns 56.6 ns: 1.33x faster
list-1 68.7 ns 52.1 ns: 1.32x faster
DefaultClass-1 76.9 ns 59.4 ns: 1.29x faster
Geometric mean (ref) 1.48x faster

@sweeneyde sweeneyde merged commit 0a14506 into python:main Feb 10, 2022
@sweeneyde sweeneyde deleted the itemgetter_vectorcall branch February 10, 2022 22:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants