Description
deepcopy()
can be surprisingly slow when called with empty containers like lists, tuples, dicts, sets or frozensets.
Adding a fast path for this case similar to #114266 would significantly speed up such cases by about 4x - 28x while having little impact on the default path and not adding too much complexity. With such a patch the following benchmarking script would show a significant speedup compared to main
:
import pyperf
runner = pyperf.Runner()
setup = """
import copy
a = {"list": [1, 2 ,3 ,4], "t": (1, 2, 3), "str": "hello", "subdict": {"a": True}}
t = ()
fs = frozenset()
l = []
s = set()
d = {}
deep = [[], (), {}, set(), frozenset()]
"""
runner.timeit(name="deepcopy dict", stmt=f"b = copy.deepcopy(a)", setup=setup)
runner.timeit(name="deepcopy empty tuple", stmt=f"b = copy.deepcopy(t)", setup=setup)
runner.timeit(name="deepcopy empty frozenset", stmt=f"b = copy.deepcopy(fs)", setup=setup)
runner.timeit(name="deepcopy empty list", stmt=f"b = copy.deepcopy(l)", setup=setup)
runner.timeit(name="deepcopy empty set", stmt=f"b = copy.deepcopy(s)", setup=setup)
runner.timeit(name="deepcopy empty dict", stmt=f"b = copy.deepcopy(d)", setup=setup)
runner.timeit(name="deepcopy multiple empty containers", stmt=f"b = copy.deepcopy(deep)", setup=setup)
deepcopy dict: Mean +- std dev: [baseline] 1.86 us +- 0.06 us -> [optimize-empty-copy] 2.02 us +- 0.02 us: 1.09x slower
deepcopy empty tuple: Mean +- std dev: [baseline] 285 ns +- 2 ns -> [optimize-empty-copy] 48.4 ns +- 0.9 ns: 5.89x faster
deepcopy empty frozenset: Mean +- std dev: [baseline] 1.47 us +- 0.11 us -> [optimize-empty-copy] 49.9 ns +- 1.5 ns: 29.44x faster
deepcopy empty list: Mean +- std dev: [baseline] 323 ns +- 2 ns -> [optimize-empty-copy] 82.7 ns +- 2.5 ns: 3.91x faster
deepcopy empty set: Mean +- std dev: [baseline] 1.46 us +- 0.10 us -> [optimize-empty-copy] 85.4 ns +- 4.9 ns: 17.04x faster
deepcopy empty dict: Mean +- std dev: [baseline] 326 ns +- 4 ns -> [optimize-empty-copy] 83.3 ns +- 2.6 ns: 3.91x faster
deepcopy multiple empty containers: Mean +- std dev: [baseline] 4.13 us +- 0.04 us -> [optimize-empty-copy] 1.16 us +- 0.02 us: 3.56x faster
Geometric mean: 5.48x faster
This might conflict with @eendebakpt efforts in #91610 or could be something that should be added to the proposed C version as well.
For context, I noticed this when using pydantic models with mutable default values where pydantic would deep copy the default value upon class instantiation. E.g.:
class Foo(pydantic.BaseModel):
bar: list[int] = []
To be fair the proper fix in this case would be not to use a mutable default value in pydantic and switch to pydantic.Field(default_factory=list)
similar to dataclasses instead which is much faster. But potentially there might be other scenarios where deepcopying empty iterables might be common.
I'm happy to make a PR unless it conflicts with the efforts going on in #91610.