gh-138946: list.sort enhancement proposal: Adaptivity for binarysort #138947
Conversation
Since you asked for more ideas... Tim and I once talked about things like this here: #116939
This is pretty much what I have done to incorporate it so as not to damage performance of non-target cases. In many ways it resembles the galloping approach. It switches on/off and grows a "time-off" parameter on failed attempts.
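The PR's actual switch isn't shown in the thread, but a minimal self-contained sketch of such an on/off mechanism could look like this (the function name, the `hi - lo > i // 2` failure test, and the doubling schedule are illustrative assumptions, not the PR's code):

```python
import bisect

def backoff_binary_insertion_sort(a):
    # Sketch of the on/off idea, not the PR's code: spend one comparison
    # to narrow the binary search to the side of the previous insertion
    # point `last`; when the guess pays off poorly, switch it off for
    # `timeoff` elements and double `timeoff` on each consecutive
    # failure (the galloping-style "time-off" growth).
    timeoff = 1   # current penalty for a failed guess
    cooldown = 0  # elements left before the guess is retried
    last = 0      # previous insertion point
    for i in range(1, len(a)):
        pivot = a[i]
        lo, hi = 0, i                # a[0:i] is the sorted prefix
        if cooldown:
            cooldown -= 1            # adaptivity is off: plain search
        else:
            if pivot < a[last]:
                hi = last            # pivot lies left of `last`
            else:
                lo = last + 1        # pivot lies right of `last`
            if hi - lo > i // 2:     # guess barely narrowed the range
                cooldown = timeoff   # failure: switch off for a while...
                timeoff *= 2         # ...and grow the time-off parameter
            else:
                timeoff = 1          # success: reset the penalty
        idx = bisect.bisect_right(a, pivot, lo, hi)
        a[idx + 1:i + 1] = a[idx:i]  # shift the tail right by one
        a[idx] = pivot
        last = idx
    return a
```

On already-sorted data the guess always succeeds and costs one comparison per element; on data where it keeps failing, the exponentially growing cooldown bounds the wasted work.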
I have a simpler Python implementation of the adaptive algorithm. I made it look like the C implementation, kind of.

```python
def adaptative_binary_insertion_sort(a, n=0, ok=0):
    n = n or len(a)
    last = 0
    for ok in range(ok + 1, n):
        pivot = a[ok]
        L = 0
        R = ok - 1  # Ensures that pivot will not compare with itself.
        # M is the index of the element that will be compared
        # with the pivot. So start from the last moved element.
        M = last
        while L <= R:
            if pivot < a[M]:
                R = M - 1
                last = M  # Stores the index of the last moved element.
            else:
                L = M + 1
                last = L  # Stores the index of the last moved element.
            M = (L + R) >> 1
        if last < ok:  # Don't move the element to its existing location.
            for M in range(ok, last, -1):
                a[M] = a[M - 1]
            a[last] = pivot  # Move pivot to its final position.
```

It's so simple, I think it can be implemented by modifying a few lines of the original.
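A quick sanity check (my own, not from the thread) confirms the sketch sorts correctly, and that on already-sorted input it spends exactly one comparison per element, since every search starts from the previous insertion point:

```python
import random

class Counted:
    """Tiny int wrapper that counts __lt__ calls, for illustration."""
    calls = 0
    def __init__(self, v):
        self.v = v
    def __lt__(self, other):
        Counted.calls += 1
        return self.v < other.v

data = [random.random() for _ in range(1_000)]
expected = sorted(data)
adaptative_binary_insertion_sort(data)
assert data == expected  # agrees with the built-in sort

a = [Counted(i) for i in range(1_000)]
adaptative_binary_insertion_sort(a)
print(Counted.calls)  # 999: one comparison per inserted element
```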
Discrepancies that are >1-2% are most likely a fluke related to my machine or something similar. I took my PR and reverted it so that the code exactly matches main. Although the 2 versions are now identical, I am still getting similar discrepancies in timings:

```
macOS-11.7.10-x86_64-i386-64bit-Mach-O | CPython: 3.15.0a0
50 repeats, 1,000 times | 2025-09-21T14:09:59
Units: ns

                        main              code exactly      adaptivity
                                          matches main
sorted(WORST30)         557 ± 7           557 ± 13          591 ± 5
sorted(WORST100)        5,520 ± 21        6,037 ± 104       5,437 ± 31
sorted(WORST640)        66,458 ± 445      68,031 ± 414      69,969 ± 431
sorted(WORST6400)       728,827 ± 4,817   760,257 ± 8,552   764,258 ± 7,773
sorted(BEST)            79,658 ± 699      82,632 ± 656      42,973 ± 362
```

Which one should the adaptivity column be compared against: the 1st (main) or the 2nd (code exactly matches main)?
Not sufficiently happy with this.
But let it percolate in the background. I tried to stay out of this for a change, to let you find your way through the minefield. You did good! And it's probably best put on hold for now, but it can also be picked up again. Hope sprints eternal 😄.
Yeah, let's keep this open. For the time being, the only improvement that would not cause any harm is shaving off ~1% of comparisons from `binarysort`: the first "insort" can be factored out and put before the loop. Detecting such a tiny difference is close to impossible, but nevertheless:
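For concreteness, here is what "factor the first insort out" could look like in a Python rendering (my interpretation; whether and how this yields the ~1% saving depends on details of the C binarysort that this toy version doesn't reproduce):

```python
def binary_insertion_sort_factored(a):
    n = len(a)
    if n < 2:
        return a
    # First "insort", factored out: the sorted prefix is just a[0:1],
    # so a single comparison settles it and no search loop is needed.
    if a[1] < a[0]:
        a[0], a[1] = a[1], a[0]
    # The main loop then starts from a guaranteed prefix of length 2.
    for ok in range(2, n):
        pivot = a[ok]
        L, R = 0, ok - 1
        while L <= R:
            M = (L + R) >> 1
            if pivot < a[M]:
                R = M - 1
            else:
                L = M + 1
        a[L + 1:ok + 1] = a[L:ok]  # shift right; no-op when L == ok
        a[L] = pivot
    return a
```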
V1 Info (outdated)

Currently, adaptivity is simple. The last insertion index is kept in `last`; for each new element, `diff = abs(new_idx - last)` is computed and `last` is then updated (`last += diff`) inside `binarysort`.
It is primarily targeted at data already sorted to a significant degree (e.g. stock price data). However, it so happens that it handles some other patterns as well, e.g. `[-1, 1, -2, 2, -3, 3, ...]`: `diff` will always be the full length of the sorted part, so it will be jumping from one end to the other in 1 step. See the demonstration below.
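A small demonstration, instrumenting the earlier Python sketch to record where each element lands: on the alternating pattern the insertion point flips between the two ends of the sorted part.

```python
def insertion_points(a):
    """Adaptive binary insertion sort, returning each insertion index."""
    points = []
    last = 0
    for ok in range(1, len(a)):
        pivot, L, R, M = a[ok], 0, ok - 1, last
        while L <= R:
            if pivot < a[M]:
                R, last = M - 1, M
            else:
                L = M + 1
                last = L
            M = (L + R) >> 1
        points.append(last)
        a[last + 1:ok + 1] = a[last:ok]  # no-op when last == ok
        a[last] = pivot
    return points

print(insertion_points([-1, 1, -2, 2, -3, 3, -4, 4]))
# [1, 0, 3, 0, 5, 0, 7] -- alternating between the two ends
```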
Microbenchmarks

For optimised comparisons this has little effect. As can be seen, the worst case is small random data. But in the same way that small data feels the biggest adverse effect, the positive effect is also the largest there, as a greater (or all) portion of the data is sorted using `binarysort` alone.

However, the impact is non-trivial for costly comparisons, and `list.__lt__` is probably the fastest of all the possible ones; for a pure-Python user-implemented `__lt__`, the impact would be greater.

V3: Getting closer to the desirable result.
Raw integers & floats (specialised comparison functions)
Above wrapped into lists
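Presumably the "wrapped into lists" setup looks something like this (my reconstruction, not the author's benchmark script): one-element lists force every comparison through `list.__lt__` instead of the specialised per-type compare, making each comparison far more expensive, so saved comparisons matter more.

```python
import random
import timeit

ints = [random.randrange(10**6) for _ in range(10_000)]
wrapped = [[x] for x in ints]  # same ordering, costlier comparisons

# Raw ints take listsort's specialised integer-comparison fast path;
# the wrapped variant pays a full rich-comparison call per compare.
print(timeit.timeit(lambda: sorted(ints), number=100))
print(timeit.timeit(lambda: sorted(wrapped), number=100))
```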
Related issue: list.sort enhancement proposal: Adaptivity for binarysort #138946