Description
Although this would be beneficial for all int use cases, I am currently motivated by range / enumerate / itertools.count performance.
What dominates performance in all 3 mentioned utilities is PyLongObject creation:
# PY3.12
from collections import deque
consume = deque(maxlen=0).extend
INT_RANGE = list(range(100_000))
%timeit consume(range(100_000)) # 2.3 ms
%timeit consume(iter(INT_RANGE)) # 0.6 ms
Also, there are FREELISTs:
S1="
import collections as coll
BHOLE = coll.deque(maxlen=0).extend
"
S2="
import collections as coll
BHOLE = coll.deque(maxlen=0).extend
a = list(range(100_000)) # This consumes FREELIST
"
PYEXE="./cpython/main/python.exe"
$PYEXE -m timeit -s "$S1" 'BHOLE(range(100_000))' # 1.2 ms
$PYEXE -m timeit -s "$S2" 'BHOLE(range(100_000))' # 1.2 ms
$PYEXE -m timeit -s "$S2" 'BHOLE(iter(a))' # 0.3 ms
I checked that the FREELIST is utilised when using S1 and is not when using S2.
python/cpython#126865 suggests a 10-20% performance improvement, but I cannot see any big difference in performance between the case where objects are re-used and the case where they are not. Either way, although a 10-20% improvement is great, this proposal would provide an improvement of a different order.
For example, here is the performance difference when using pre-stored ints:
S="
from collections import deque
from itertools import chain
chain = chain.from_iterable
consume = deque(maxlen=0).extend
"
$PYEXE -m timeit -s "$S" "consume(range(100_000))" # 1.2 ms
$PYEXE -m timeit -s "$S" "consume(chain([range(256)] * (100_000 // 256)))" # 0.5 ms
So I would dare to suggest increasing _PY_NSMALLPOSINTS to 1000 or 10_000.
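For values inside the cached range, it could provide up to about 4x better performance (1.2 ms vs 0.3 ms in the measurements above).
The extra memory would be on the order of tens to hundreds of KB (a small int takes about 28 bytes on a 64-bit build).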
Given the extensive usage of the mentioned objects (and int in general), I suspect this could have an observable benefit at a quite reasonable cost.
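To put a rough number on the cost, here is a minimal sketch that estimates the memory needed for a larger cache. It uses sys.getsizeof as a proxy for the per-object size, which is an assumption: the real cache is a static array inside the interpreter, so alignment and padding may make the actual figure somewhat different. The helper name estimate_small_int_cache_bytes is made up for illustration.

```python
import sys


def estimate_small_int_cache_bytes(n_small_pos_ints: int) -> int:
    """Approximate bytes needed to keep the ints 0 .. n-1 permanently alive.

    sys.getsizeof() is only a proxy for the size of a statically
    allocated int object, so treat the result as an order of magnitude.
    """
    return sum(sys.getsizeof(i) for i in range(n_small_pos_ints))


if __name__ == "__main__":
    # 257 is the current value of _PY_NSMALLPOSINTS (ints 0..256);
    # 1_000 and 10_000 are the values suggested in this issue.
    for n in (257, 1_000, 10_000):
        kib = estimate_small_int_cache_bytes(n) / 1024
        print(f"{n:>6} cached ints: ~{kib:.1f} KiB")
```

Comparing the printed figures for 257 against 1_000 and 10_000 shows how much memory the proposal would add on top of the existing cache.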
If this is feasible, the exact number needs to be determined. It should not be too hard to find out what cache size would cover a given percentage of use cases.
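One possible way to pick the number, sketched under assumptions: given a sample of int values observed in real workloads (how that sample is collected is left open here), compute the smallest cache size N such that a chosen fraction of the non-negative values falls in [0, N). The function name cache_size_for_coverage and the synthetic sample are hypothetical and only illustrate the calculation.

```python
import math
from typing import Iterable


def cache_size_for_coverage(values: Iterable[int], coverage: float) -> int:
    """Smallest N such that at least `coverage` of the non-negative
    values in `values` lie in the half-open range [0, N).

    Negative values are ignored, since this sketch is only concerned
    with the positive side of the cache (_PY_NSMALLPOSINTS).
    """
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in the interval (0, 1]")
    nonneg = sorted(v for v in values if v >= 0)
    if not nonneg:
        return 0
    # Number of observed values that must be served from the cache.
    needed = math.ceil(coverage * len(nonneg))
    # The needed-th smallest value must itself be cached, hence the + 1.
    return nonneg[needed - 1] + 1


if __name__ == "__main__":
    # Synthetic sample: mostly small loop counters plus a few large values.
    sample = list(range(2_000)) * 5 + [10**6, 10**9, 2**40]
    for target in (0.5, 0.9, 0.99):
        n = cache_size_for_coverage(sample, target)
        print(f"{target:.0%} coverage needs a cache of {n} ints")
```

Running this against samples from benchmarks or real applications would give a data-driven candidate for _PY_NSMALLPOSINTS instead of a guess.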