Use a faster sort algorithm pdqsort for Array.Sort #68436
Conversation
Tagging subscribers to this area: @dotnet/area-system-runtime
internal static ref T UnguardedAccess<T>(Span<T> span, int index)
{
    // return ref Unsafe.Add(ref MemoryMarshal.GetReference(span), index);
    // TODO: investigate where we can use the above line with confidence instead of this.
    return ref span[index];
}

[MethodImpl(MethodImplOptions.AggressiveInlining)]
internal static Span<T> UnguardedSlice<T>(Span<T> span, int begin, int end)
{
    return MemoryMarshal.CreateSpan(ref UnguardedAccess(span, begin), end - begin);
}
Using unsafe code throughout a complex algorithm like sort is a major liability. The existing sort intentionally uses safe, bounds-checked access, except in a few carefully audited places.
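To make the tradeoff concrete, here is a minimal sketch (hypothetical helper names, not code from this PR) contrasting bounds-checked span access with the unguarded pattern used by the helpers above:

using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

class BoundsCheckDemo
{
    // Safe access: the JIT keeps a bounds check; a bad index throws
    // IndexOutOfRangeException instead of corrupting memory.
    static int ReadChecked(Span<int> span, int index) => span[index];

    // Unguarded access: no bounds check, so a buggy index silently reads
    // adjacent memory. This is the liability being weighed in the review.
    static int ReadUnchecked(Span<int> span, int index) =>
        Unsafe.Add(ref MemoryMarshal.GetReference(span), index);

    static void Main()
    {
        Span<int> data = stackalloc int[] { 3, 1, 2 };
        Console.WriteLine(ReadChecked(data, 1));   // 1
        Console.WriteLine(ReadUnchecked(data, 1)); // 1, but with no safety net
    }
}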
Changing the unstable Array.Sort algorithm is a semi-breaking change. It generates a ton of follow-up work in projects out there in updating their test baselines. It would be best to switch to a stable sort algorithm that we can keep tweaking for performance in a non-breaking way. What's the best stable sort algorithm available, and how does it compare with this and with what we have today? I understand that we would be leaving some perf on the table with a stable sort algorithm, but we would be gaining predictability that has a lot of value.
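To make the baseline concern concrete, here is a small sketch (illustrative data, not from the PR). The ordering difference is only observable when two elements compare equal yet remain distinguishable; Array.Sort is documented as unstable, so that relative order is an implementation detail that test baselines may have captured, whereas a stable sort such as LINQ's OrderBy pins it.

using System;
using System.Linq;

class StabilityDemo
{
    static void Main()
    {
        // "a" and "c" compare equal by Key but remain distinguishable by Id.
        var items = new[] { (Key: 1, Id: "a"), (Key: 0, Id: "b"), (Key: 1, Id: "c") };

        // Array.Sort is documented as unstable: "a" and "c" may come out in either
        // order, and that order can change whenever the algorithm changes.
        var unstable = ((int Key, string Id)[])items.Clone();
        Array.Sort(unstable, (x, y) => x.Key.CompareTo(y.Key));

        // Enumerable.OrderBy is documented as stable: "a" always precedes "c",
        // no matter how the underlying sort is implemented.
        var stable = items.OrderBy(t => t.Key).ToArray();
        Console.WriteLine(string.Join(" ", stable.Select(t => t.Id))); // b a c
    }
}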
It seems fluxsort allocates, which may be undesirable in certain cases.
Subsequently changing to an even better stable algorithm could break baselines again, right?
Just curious, are there algorithms that take advantage of vectorization? I don't see it mentioned there.
@jkotas Aren't we already using an unstable algorithm? We used introsort before, which is already unstable.
@jkotas Since we already use an unstable sort algorithm, I would like to land this PR (pdqsort). The previous failure was caused by a mistake where I forgot to handle the null case while comparing.
It's unstable but deterministic. @jkotas' concern is that a lot of code, especially tests, has taken a dependency on the ordering, and we don't want to churn it only to churn it again.
No. All stable sort algorithms produce the same results. If we switch to a stable algorithm, switching to an even better stable algorithm would not change observable behavior.

Yes, you can write specialized sort algorithms that take advantage of vectorization and parallelization. They are good candidates for a standalone NuGet package, not for the good-enough sort algorithm in the core library.
I would like to see the CI results first. If there are no test failures related to this change, I think it's okay to ship it.
I see CircleInConvex is failing because the cmp function is malformed:

static int cmp(Point a, Point b)
{
    if (a.X < b.X || a.X == b.X && a.Y < b.Y)
        return 1;
    else
        return 0;
}

The above cmp function results in indeterminate behavior, which is strongly coupled to introsort's behavior (code really shouldn't assume Array.Sort has an introsort implementation). Note that if I change to any sort algorithm other than introsort, such as heap sort or insertion sort, CircleInConvex also fails. I think instead of keeping the original sorting algorithm to make the test pass, we should change the cmp function of the test.
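For reference, a sketch of the kind of fix being suggested. The Point type below is a stand-in (the actual test's type is not shown here): a well-formed Comparison<T> returns negative, zero, or positive and agrees with itself when the arguments are swapped, so the result no longer depends on which sort algorithm runs.

using System;

struct Point
{
    public long X, Y; // assumed field types; the real test's Point is not shown
    public Point(long x, long y) { X = x; Y = y; }
}

class ComparerFix
{
    // Well-formed comparison: negative/zero/positive, antisymmetric and transitive,
    // so every correct Array.Sort implementation produces the same order.
    static int cmp(Point a, Point b)
    {
        int byX = a.X.CompareTo(b.X);
        return byX != 0 ? byX : a.Y.CompareTo(b.Y);
    }

    static void Main()
    {
        var pts = new[] { new Point(2, 1), new Point(1, 3), new Point(1, 2) };
        Array.Sort(pts, cmp);
        foreach (var p in pts) Console.WriteLine($"({p.X}, {p.Y})"); // (1, 2) (1, 3) (2, 1)
    }
}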
Ah yes, I was thinking that "stable" means it always orders elements that compare identically in some predictable order. But it actually means it always keeps elements that compare identically in the same order they were provided.
Test failures are unrelated.
CircleInConvex is a good test fix. It may be good to submit it as a separate PR.
I will leave it for @dotnet/area-system-runtime owners to figure out how to drive this behavior change through. For example, based on past experience, this sort algorithm change would generate a bunch of work for Bing.

I think we can get people behind something like this: "We are sorry for changing the sort algorithm again. This is the last time. The sort algorithm is going to be stable from now on, and improvements are not going to have observable impact going forward (as long as your compare function is correct)." On the other hand, "We are sorry for changing the sort algorithm again, and we are likely going to do it at least one more time soon, so prepare to do the work on your test baselines twice" will be harder to get people behind.

There are likely going to be asks for keeping the current sort algorithm under a switch, but that is not particularly great for the small binary sizes that are important for other form factors. The whole thing gets complicated.
Most methods of pdqsort can be shared with introsort (insertion sort and heap sort), so I think we can provide a runtime switch to fall back to the old introsort behavior without introducing an observable code-size regression.
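A sketch of how such a switch might look. The switch name and the sort method names are invented for illustration; no such switch exists in the runtime today:

using System;

static class ArraySortCompat
{
    // Hypothetical switch name, for illustration only.
    private const string LegacySortSwitch = "System.Array.UseLegacyIntrosort";

    internal static bool UseLegacyIntrosort =>
        AppContext.TryGetSwitch(LegacySortSwitch, out bool enabled) && enabled;
}

// Hypothetical call site inside the shared sort entry point:
//   if (ArraySortCompat.UseLegacyIntrosort) IntroSort(keys, comparer);
//   else                                    PdqSort(keys, comparer);
// An app would opt in via AppContext.SetSwitch("System.Array.UseLegacyIntrosort", true)
// or the equivalent runtimeconfig.json entry.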
Does stability matter for native types anyway, as you can't tell the difference between two ints with the same value? Could pdqsort be used for a known list of types?
I do not have much else to say beyond what I have said in #68436 (comment). If you agree that we want to end up with a stable sorting algorithm, the intermediate steps create extra work.
Timsort is a good stable sort for references, so you could have Timsort for general sorting and pdqsort for primitives where stability doesn't matter. The only downside is that it can allocate, but you can likely use a degree of stack allocation for small cases and only switch to dynamic allocation if needed.
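On the "degree of stack allocation" point, a minimal sketch (the threshold and names are illustrative, not from the runtime) of the usual pattern: take a stackalloc scratch buffer for small inputs and rent from ArrayPool only when needed. The merge sort body is a placeholder.

using System;
using System.Buffers;

static class StableSortBuffer
{
    public static void SortStable(Span<int> keys)
    {
        const int StackLimit = 256; // illustrative cutoff

        int[]? rented = null;
        // Small inputs: scratch space on the stack. Larger inputs: rent from the pool.
        Span<int> scratch = keys.Length <= StackLimit
            ? stackalloc int[StackLimit]
            : (rented = ArrayPool<int>.Shared.Rent(keys.Length));

        try
        {
            MergeSortInto(keys, scratch[..keys.Length]);
        }
        finally
        {
            if (rented is not null)
                ArrayPool<int>.Shared.Return(rented);
        }
    }

    // Placeholder: a real stable merge sort would merge runs through 'scratch'.
    private static void MergeSortInto(Span<int> keys, Span<int> scratch) => keys.Sort();
}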
When the algorithm becomes stable, people will take a dependency on that behavior (justified or not). This means that it can never again be made unstable. It's a one-way street. This feels like .NET 1.0 locking in a specific ... It's also like the .NET 1.0 ...
I'm just trying to get clarity on what an unstable sort means for primitive structs, seeing as you can't tell the difference between two integers with the same value. I can certainly see the difference if you're using an extracted key to sort different values, but using an unstable sort for Array.Sort<int>(array) seems safe from the stability perspective, as you can't tell. Whether it's worth having a faster unstable sort for this kind of case is another question!
I'm agreeing with @GSPP. Once we switch to a stable sort algorithm and document the behavior, we will never be able to go back to the unstable world again because people will start to rely on the stable behavior.
That is an intentional and desirable option. Developers are already reliant on the "unstable" sort having not changed and therefore being viewed as "stable", when in actuality it's simply deterministic.
Then we can consider adding a ...
As per the above, if any break is taken we are more interested in taking a one-time break to make the default sorting algorithm stable. This allows us to improve perf over time without continuously hurting developers who may be dependent on a particular behavior. A faster but unstable sorting algorithm could then be discussed independently, but I imagine it's better for that to simply be provided via a 3rd party package maintained by the community.
As for the other question that popped up, there is indeed no difference between "stable" and "unstable" for primitive types, as there is no way to disambiguate one value from another equal value.

That being said, having a significantly different algorithm for primitive types needs to be worth the cost. That is, it needs to be shown that the additional code complexity and long-term maintenance of the additional algorithm is worthwhile. This will likely require showing not only perf numbers but also showing that such sequences occur "frequently enough" that this shouldn't just be handled by some 3rd party package maintained by the community.
Unless the app provides an IComparer that treats some values as equivalent.
Sure, but that's not comparing primitives; that's using a custom comparer, which is effectively treated as a "custom type" at that point.
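A small sketch (illustrative comparer, not from the discussion) of where that line falls: with a raw int[] the question is unobservable, but as soon as an IComparer<int> maps distinct values to the same rank, the unstable order leaks out.

using System;
using System.Collections.Generic;

class InstabilityVisibility
{
    // Treats values as equivalent when they have the same parity, so distinct
    // ints can compare equal and their relative order becomes observable.
    sealed class ByParity : IComparer<int>
    {
        public int Compare(int x, int y) => (x & 1).CompareTo(y & 1);
    }

    static void Main()
    {
        // Plain int[]: two equal ints are indistinguishable, so no caller can
        // observe whether the sort was stable or not.
        int[] plain = { 3, 1, 3, 2 };
        Array.Sort(plain);
        Console.WriteLine(string.Join(" ", plain)); // 1 2 3 3

        // With a comparer that collapses distinct values, the order of 4 vs 2
        // (both "even") is an implementation detail of the unstable sort.
        int[] keyed = { 4, 3, 2, 1 };
        Array.Sort(keyed, new ByParity());
        Console.WriteLine(string.Join(" ", keyed)); // e.g. "4 2 3 1" or "2 4 1 3"
    }
}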
Based on the paper https://drops.dagstuhl.de/opus/volltexte/2016/6389/pdf/LIPIcs-ESA-2016-38.pdf and the C++ implementation https://github.com/orlp/pdqsort.
I haven't implemented the branchless partition properly yet; that can be improved in the future.
Benchmark: https://gist.github.com/hez2010/6b52929ee1755788c34818972c46aefb
Fixes #68325