Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-126156: Improve performance of creating Morsel objects #126157

Merged
merged 10 commits into from
Oct 31, 2024

Conversation

bdraco
Copy link
Contributor

@bdraco bdraco commented Oct 30, 2024

Replaces the manually constructed loop with using the dict.update on a pre-constructed dict

This works out to 3.8x faster, benchmark below

Replaces the manually constructed loop with a call to `dict.update`
@bdraco bdraco marked this pull request as ready for review October 30, 2024 01:48
Copy link
Contributor

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have benchmarks using pyperf instead of timeit please? In addition can you tell us whether benchmarks are using a debug build, a release build or a PGO build? (the last one would be the realistic one). Thanks!

Lib/http/cookies.py Outdated Show resolved Hide resolved
@ZeroIntensity
Copy link
Member

ZeroIntensity commented Oct 30, 2024

I doubt we'll see much of a speedup on pyperformance, this is a tiny change. The timeit benchmark is more important to me here.

@picnixz
Copy link
Contributor

picnixz commented Oct 30, 2024

I doubt we'll see much of a speedup on pyperformance

I wasn't talking about pyperformance but with pyperf. I want to know whether it's a release or debug build as well (this matters a lot). In addition, this could help us in quantizing the precise improvements in a NEWS entry (using the geometric mean, not just max-comparisons).

Lib/http/cookies.py Outdated Show resolved Hide resolved
@bdraco
Copy link
Contributor Author

bdraco commented Oct 31, 2024

Python 3.14.0a1 (v3.14.0a1:8cdaca8b25e, Oct 15 2024, 15:51:57) [Clang 16.0.0 (clang-1600.0.26.3)] on darwin (python mac installer)

python3.14 morel_bench_pypref.py
.....................
Morsel: Mean +- std dev: 451 ns +- 8 ns
.....................
FastMorsel: Mean +- std dev: 112 ns +- 2 ns

Python 3.13.0rc2 (v3.13.0rc2:ec610069637, Sep 6 2024, 17:57:29) [Clang 15.0.0 (clang-1500.3.9.4)] on darwin` (python mac installer)

python3.13 morel_bench_pypref.py
.....................
Morsel: Mean +- std dev: 494 ns +- 7 ns
.....................
FastMorsel: Mean +- std dev: 159 ns +- 2 ns

Python 3.11.8 (main, Feb 6 2024, 21:21:21) [Clang 15.0.0 (clang-1500.1.0.2.5)] on darwin (homebrew)

.....................
Morsel: Mean +- std dev: 553 ns +- 7 ns
.....................
FastMorsel: Mean +- std dev: 181 ns +- 2 ns

Benchmark

from http.cookies import Morsel
import timeit
import pyperf


class FastMorselWithUpdate(dict):
    """A class to hold ONE (key, value) pair.

    In a cookie, each such pair may have several attributes, so this class is
    used to keep the attributes associated with the appropriate key,value pair.
    This class also includes a coded_value attribute, which is used to hold
    the network representation of the value.
    """

    # RFC 2109 lists these attributes as reserved:
    #   path       comment         domain
    #   max-age    secure      version
    #
    # For historical reasons, these attributes are also reserved:
    #   expires
    #
    # This is an extension from Microsoft:
    #   httponly
    #
    # This dictionary provides a mapping from the lowercase
    # variant on the left to the appropriate traditional
    # formatting on the right.
    _reserved = {
        "expires": "expires",
        "path": "Path",
        "comment": "Comment",
        "domain": "Domain",
        "max-age": "Max-Age",
        "secure": "Secure",
        "httponly": "HttpOnly",
        "version": "Version",
        "samesite": "SameSite",
    }

    _reserved_defaults = dict.fromkeys(_reserved, "")

    _flags = {"secure", "httponly"}

    def __init__(self):
        # Set defaults
        self._key = self._value = self._coded_value = None

        # Set default attributes
        dict.update(self, self._reserved_defaults)


runner = pyperf.Runner()
runner.bench_func(name="Morsel", func=Morsel)
runner.bench_func(name="FastMorsel", func=FastMorselWithUpdate)

Copy link
Contributor

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that those numbers are for a release + PGO build and they look great! Overall, using a class variable for defaults reduces the variation in the running time so it's better as well IMO. The numbers I got are as follows:

Using dict.update(self, self._reserved_defaults)

.....................
Morsel (ref): Mean +- std dev: 484 ns +- 11 ns
.....................
Morsel (new): Mean +- std dev: 124 ns +- 3 ns
+-----------+--------+----------------------+
| Benchmark | ref    | new                  |
+===========+========+======================+
| Morsel    | 484 ns | 124 ns: 3.89x faster |
+-----------+--------+----------------------+

Using self |= self._reserved_defaults

.....................
Morsel (ref): Mean +- std dev: 477 ns +- 13 ns
.....................
Morsel (new): Mean +- std dev: 106 ns +- 2 ns
+-----------+--------+----------------------+
| Benchmark | ref    | new                  |
+===========+========+======================+
| Morsel    | 477 ns | 106 ns: 4.52x faster |
+-----------+--------+----------------------+

So we can be a bit more faster if we use the |= operator instead (I think we can also update the NEWS entry to include the improvement factor which is 4.5 in non-debug optimized builds).

@bdraco
Copy link
Contributor Author

bdraco commented Oct 31, 2024

I assume that those numbers are for a release + PGO build and they look great!

Yes, they are

So we can be a bit more faster if we use the |= operator instead (I think we can also update the NEWS entry to include the improvement factor which is 4.5 in non-debug optimized builds).

I made the adjustment and updated the NEWS entry

@picnixz
Copy link
Contributor

picnixz commented Oct 31, 2024

Ah! I just wondered something. Is there any reason to use dict.update in order to by-pass a custom __ior__/update actually? because that's what the current implementation does (namely bypassing a custom update).

Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Typically, we don't like to accept changes for performance increases on pure-Python code, but I think this one is small enough that it's ok.

@bdraco
Copy link
Contributor Author

bdraco commented Oct 31, 2024

Ah! I just wondered something. Is there any reason to use dict.update in order to by-pass a custom __ior__/update actually? because that's what the current implementation does (namely bypassing a custom update).

AFAICT it would only be a concern for .update() since custom __ior__ isn't implemented in Morsel

Running the change as aio-libs/aiohttp#9594 as well

@picnixz
Copy link
Contributor

picnixz commented Oct 31, 2024

LGTM. Typically, we don't like to accept changes for performance increases on pure-Python code

Isn't it assuming that there is no C implementation (and I don't think there is, or is there?)?

AFAICT it would only be a concern for .update() since custom ior isn't implemented in Morsel

Yes, but what about a user overriding it? maybe we should keep dict.update in the constructor?

@bdraco
Copy link
Contributor Author

bdraco commented Oct 31, 2024

I guess someone might subclass it so I reverted to the dict.update call

@picnixz picnixz self-requested a review October 31, 2024 16:06
Copy link
Contributor

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for that and sorry for the back and forth!

…OSqv0.rst

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
@bdraco
Copy link
Contributor Author

bdraco commented Oct 31, 2024

Thanks for that and sorry for the back and forth!

No worries. It was quite constructive. Thanks.

@hauntsaninja
Copy link
Contributor

hauntsaninja commented Oct 31, 2024

we don't like to accept changes for performance increases on pure-Python code

@ZeroIntensity we do accept performance improvements in pure Python code. They just a) need to either be substantial improvements or in very hot code, b) not hurt readability, c) not meaningfully change semantics. We also consider ourselves to be not perfect judges of "not meaningfully change semantics" (this is e.g. an important reason why we reject cosmetic changes / refactors)

This case is motivated by real-world profiling in addition to having a meaningful result on microbenchmark, and is arguably more readable. All to say, thanks for a nice change! :-)

@hauntsaninja hauntsaninja merged commit dd3c0fa into python:main Oct 31, 2024
40 checks passed
@ZeroIntensity
Copy link
Member

@hauntsaninja Yeah, agreeing with all of that. I more meant that it's common for performance increases in pure-Python to not meet those requirements, and this PR is small enough that it's not a problem. Sorry for the confusion!

@bdraco
Copy link
Contributor Author

bdraco commented Oct 31, 2024

Thanks

@bdraco bdraco deleted the morsel_init branch October 31, 2024 20:48
picnixz pushed a commit to picnixz/cpython that referenced this pull request Dec 8, 2024
…thon#126157)

Replaces the manually constructed loop with a call to `dict.update`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants