-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Implement stable MergeSort for Array.prototype.sort #5724
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| if (!scriptContext->IsJsBuiltInEnabled()) | ||
| { | ||
| builtinFuncs[BuiltinFunction::JavascriptArray_IndexOf] = library->AddFunctionToLibraryObject(arrayPrototype, PropertyIds::indexOf, &JavascriptArray::EntryInfo::IndexOf, 1); | ||
| /* No inlining Array_Sort */ library->AddFunctionToLibraryObject(arrayPrototype, PropertyIds::sort, &JavascriptArray::EntryInfo::Sort, 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not added this to the builtinFuncs object - I'm unsure if that is necessary - seems to be just for referencing methods from intl.js?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
builtinFuncs is, I think, how the JIT gets information for CallDirecting native functions -- "inlining", technically, but not in the same way as inlining a JS function.
| __chakraLibrary.raiseFunctionArgument_NeedFunction = platform.raiseFunctionArgument_NeedFunction; | ||
| __chakraLibrary.functionBind = platform.builtInJavascriptFunctionEntryBind; | ||
| __chakraLibrary.objectDefineProperty = _objectDefineProperty; | ||
| __chakraLibrary.toString = platform.String; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not certain if this is the right way to access String()? (copied logic from intl.js)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
String() is actually not the same as the specification's ToString() operation; the former has a special case for Symbols. ToString(symbol) will throw; String(symbol) will not. Attempting to sort an array with any symbols in it using the default comparator should throw, and it does at least in V8:
> ([ Symbol(), Symbol(), Symbol() ]).sort()
TypeError: Cannot convert a Symbol value to a string
at Array.sort (native)
I'm not sure there's a clean way to trigger ToString() coercion other than
let stringifiedValue = "" + valueToStringify;| } | ||
| }); | ||
|
|
||
| platform.registerChakraLibraryFunction("DefaultMergeSort", function(array, length) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method handles the no comparator function case. I made it a whole separate function to make the logic simpler.
As the default comparison is to compare strings I make an array of Strings of every element in the input array first - as converting for each comparison was slower.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The specification says to do a ToString() coercion on the two items to be compared for every comparison. Can ToString() have side effects? If so your optimization will be an observable spec violation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the other hand, no particular sorting algorithm is specified and I suppose it's entirely possible for each item to only be passed to the comparator once so maybe it's not meaningfully observable after all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ToString() can have side effects, but I believe that that's listed as a situation where the resulting data structure is implementation-defined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Penguinwizzard What I meant was, the specification says the default compare function should do ToString() on each operand each time it’s called, while @rhuanjl claims to have ToString()’d the whole array in advance to compare the stringified items directly, and I wasn’t sure if that was a spec violation or not.
It’s probably fine in practice since the actual order and number of comparisons is already implementation-defined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pretty sure it's fine per spec BUT - it's still not fast enough I think I need to make my mechanism for default compare better before you can merge this - whilst the results with a provided compare method are comparable (per the numbers in the gist above) currently there is a large regression with the default compare.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default compare "intentionally" (by design) causes numbers to sort as strings -- obviously a custom comparison function will achieve better performance. What do you have in mind to improve the default comparator? Seems like the ToString operation would be the bottleneck.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Running it through a profiler at the moment the bottleneck actually seems to be the less than operation being used on the strings I'm not sure if either there isn't a Jit fast path for it OR it isn't being hit- I got similar performance scrapping the cacheing and implementing a method in JsBuiltinEngineInterfaceObject.cpp to convert to string and compare - if I have a method that just compares combined with cacheing I may be a little better - if that doesn't work I have a few other ideas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the < operator doesn't seem to have any fast path for strings. I tried a small JS file that used < on strings a bunch and it generated code that:
- Checks whether both operands are ints (they never were, but I guess maybe it pays to be optimistic sometimes)
- If not, branches to a helper block
- In the helper block, calls Op_Less, which will check whether the operands are numbers again and then switch based on their type
- Checks whether there was an implicit call and bails out if so
I don't think this speed regression should block merging this PR, because I expect most callers pass a comparator function anyway, and stable output is a very nice upgrade. I could imagine two options for improvement:
- Expose a native method for string-string comparisons and call it here
- Change the lowerer to generate nicer code for comparisons between likely-string values (which could benefit code outside this method too)
Code for < operation:
GLOBOPT INSTR: BrLt_A $L9, s28[LikelyCanBeTaggedValue_String].var, s32[LikelyCanBeTaggedValue_String].var #0000 Func # (#1.1), #2 Bailout: #0009 Func (#1.1), #2 (BailOutOnImplicitCalls)
1E4D781120F s107(r15).i64 = MOV s28(r10)[LikelyCanBeTaggedValue_String].var
1E4D7811212 s107(r15).i64 = SHR s107(r15).i64, 48 (0x30).i8
1E4D7811216 s108(rbx).i64 = MOV s32(r11)[LikelyCanBeTaggedValue_String].var
1E4D7811219 s108(rbx).i64 = SHR s108(rbx).i64, 32 (0x20).i8
1E4D781121D s107(r15).i64 = OR s107(r15).i64, s108(rbx).i64
1E4D7811220 CMP s107(r15).i32, 65537 (0x10001).i32
1E4D7811227 JNE $L24
1E4D781122D CMP s28(r10)[LikelyCanBeTaggedValue_String].i32, s32(r11)[LikelyCanBeTaggedValue_String].i32
1E4D7811230 JLT $L9
1E4D7811232 $L25:
1E4D7811232 s162(rbx).i32 = XOR s162(rbx).i32, s162(rbx).i32 Func # (#1.1), #2
The helper at L24 looks like this:
1E4D7811584 $L24: [helper]
1E4D7811584 s28<-32>.var = MOV (r10).var Func # (#1.1), #2
1E4D7811588 s32<-40>.var = MOV (r11).var Func # (#1.1), #2
1E4D781158C s168<-80>.i64 = MOV (r8).i64 Func # (#1.1), #2
1E4D7811590 s167<-72>.i64 = MOV (r9).i64 Func # (#1.1), #2
1E4D7811594 s166<-64>.i64 = MOV (rdx).i64 Func # (#1.1), #2
1E4D7811598 s165<-56>.i64 = MOV (rcx).i64 Func # (#1.1), #2
1E4D781159C s164<-48>.i64 = MOV (rax).i64 Func # (#1.1), #2
1E4D78115A0 s109(rbx).var = MOV s32(r11)[LikelyCanBeTaggedValue_String].var Func # (#1.1), #2
1E4D78115A3 s110(r15).var = MOV s28(r10)[LikelyCanBeTaggedValue_String].var Func # (#1.1), #2
1E4D78115A6 arg3(s113)(r8).u64 = MOV 0x01DCD5DCC300 (ScriptContext).u64
1E4D78115B0 arg2(s114)(rdx).var = MOV s109(rbx).var
1E4D78115B3 arg1(s115)(rcx).var = MOV s110(r15).var
1E4D78115B6 s116(rax).u64 = MOV Op_Less.u64
1E4D78115C0 s99(rbx).u64 = MOV s99<-24>.u64
1E4D78115C4 [s99(rbx).u64 <0x01DCD5DC1FE0 (&ImplicitCallFlags)>].u8 = MOV 1 (0x1).i8
1E4D78115C7 s28<-32>.var = MOV s28(r10).var
1E4D78115CB NOP 3 (0x3).i8
1E4D78115CE s32<-40>.var = MOV s32(r11).var
1E4D78115D2 s112(rax).i64 = CALL s116(rax).u64 Func # (#1.1), #2
1E4D78115D5 s111(rbx).i64 = MOV s112(rax).i64
1E4D78115D8 s99(r11).u64 = MOV s99<-24>.u64 Func # (#1.1), #2
1E4D78115DC CMP [s99(r11).u64 <0x01DCD5DC1FE0 (&ImplicitCallFlags)>].u8, 1 (0x1).i8 Func # (#1.1), #2
1E4D78115E0 (rax).i64 = MOV s164<-48>.i64
1E4D78115E4 (rcx).i64 = MOV s165<-56>.i64
1E4D78115E8 (rdx).i64 = MOV s166<-64>.i64
1E4D78115EC (r9).i64 = MOV s167<-72>.i64
1E4D78115F0 (r8).i64 = MOV s168<-80>.i64
1E4D78115F4 (r11).var = MOV s32<-40>.var
1E4D78115F8 (r10).var = MOV s28<-32>.var
1E4D78115FC JEQ $L26 Func # (#1.1), #2
1E4D78115FE $L27: [helper]
1E4D78115FE s28<-32>.var = MOV s28(r10).var
1E4D7811602 s32<-40>.var = MOV s32(r11).var
1E4D7811606 (rdx).i64 = MOV s111(rbx).i64 Func # (#1.1), #2
1E4D7811609 (rcx).u64 = MOV 0x01E4D75EDC60 (BailOutRecord).u64 Func # (#1.1), #2
1E4D7811613 (rax).u64 = MOV SaveAllRegistersAndBranchBailOut.u64 Func # (#1.1), #2
1E4D781161D CALL (rax).u64 Func # (#1.1), #2 Bailout: #0009 Func (#1.1), #2 (BailOutOnImplicitCalls)
1E4D7811620 JMP $L23
1E4D7811625 $L26: [helper]
1E4D7811625 TEST s111(rbx).i64, s111(rbx).i64
1E4D7811628 JNE $L9 Func # (#1.1), #2
1E4D781162E JMP $L25
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might also want to add the capability to know that the array is entirely strings; then we'd be able to do this particular operation significantly more quickly, and it wouldn't surprise me if there's a good number of rwc situations where arrays of densely packed strings are common.
|
|
||
| assert.isTrue(getterCalled); | ||
| assert.areEqual([101,11,22,,77,16], arr, '77 and 16 are not part of the sort so they are not sorted'); | ||
| assert.areEqual([101,11,101,,77,16], arr, '77 and 16 are not part of the sort so they are not sorted'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test case involves a getter which provides a value each time it's called but does not set anything back. The order of comparisons in your sort therefore effects what values end up in your sorted array - hence the change of output.
The test was meant to check that 77 and 16 which are added to the end of the array by the getter aren't sorted - that is still the case.
test/Array/array_sort3.js
Outdated
| write(arr); | ||
|
|
||
| function comparefn(x,y) { arr[0]="test"; return x - y; } | ||
| function comparefnOne(x,y) { arr[0]="test"; return x - y; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test case had two identically named compare functions therefore hoisting meant only the second one was used. Renamed so both are used.
| try { | ||
| var propName = 1; | ||
| Object.defineProperty(Array.prototype, propName, { | ||
| Object.defineProperty(Object.prototype, propName, { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test case was testing incorrect behaviour see #5719
ab369a9 to
3f290d8
Compare
| while (position < length) { | ||
| const left = position; | ||
| let right = position + bucketSize; | ||
| right < length ? right : length; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does this line do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks like a no-op; it evaluates to either right; or length; which will have no effect. Remnant from refactoring maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this you walk off the end of the array unless its length is a power of 2. Perhaps I should add comments explaining the algorithm?
Example - array of length 7:
-
step 1 - the first loop does pairs:
0-1, 2-3, 4-5 (you'll note that it actually runs with i = 1 -> i < length always looking at i and i-1 so no risk of walking off the array) -
step 2 (now in this loop) - bucket size = 4:
0-4, 5-8
BUT if you run with right = 8 you would access array[7] which doesn't exist - this condition limits it so you look at 5-7 instead which is fine. -
step 3 - bucket size = 8:
0-8
Again you need right capped at length which is 7.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thing is, you have a bare ternary here:
// this is the entire statement
right < length ? right : length;Which is basically the same thing as writing
0;Did you mean to assign the result of the ternary to something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok yes it's an error It's meant to be: right = right < length ? right : length;
I'm confused that this doesn't a fail a test with this error here. I'm either covering the same case with a condition somewhere else - which I cannot currently see or there's no array in a test case with a length that isn't a power of 2.
EDIT: OK yes this error introduces a bug not currently caught by a test case:
let arr = [0, 1, 2, 3, 4, 7, 9];
Object.defineProperty(Array.prototype, 7, { get: function () { print ("SURPRISE"); return 5;}});
print (arr.sort((a, b) => b - a));"SURPRISE" will be printed and the value 5 will be added into the array - I guess it's an odd case unique to an incorrect implementation of merge sort - I'll obviously update the code - unsure if it's worth adding a test case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm hoping that fixing this issue will also improve performance, because indexing out of bounds on arrays is very slow, and putting undefined inputs into the comparator function is likely deoptimizing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed and I've simplified some of the related logic a little.
My tests don't show any consistent improvement for fixing this - though I imagine that numerous sorts of small arrays would show an improvement whereas I have been testing with sorting just 6 large arrays.
|
Looking at ways to make this faster - around 10% of runtime is spent in arrayCreateDataPropertyOrThrow which is only necessary to avoid hitting setters or read only values placed in the array prototype chain - are there any alternatives I could use to do that? |
|
You mentioned that you've tried CreateObject(null) to avoid the prototype issues; have you also tried removing the prototypes from the temporary arrays, like |
|
I can verify experimentally that stripping the prototype from an array represents a 20-25% speedup for writing new entries into that selfsame array--and a much smaller but still measurable boost for reading. Test code (run in miniSphere using let n = 1e7;
let a = [];
let b = [];
b.__proto__ = null;
SSj.instrument(() => {
for (let i = 0; i < n; ++i)
a[i] = 812;
}, "10m x WRITE a[]")();
SSj.instrument(() => {
for (let i = 0; i < n; ++i)
a[i];
}, "10m x READ a[]")();
SSj.instrument(() => {
for (let i = 0; i < n; ++i)
b[i] = 812;
}, "10m x WRITE b[]")();
SSj.instrument(() => {
for (let i = 0; i < n; ++i)
b[i];
}, "10m x READ b[]")();Results (i7-6700HQ): |
|
On further testing, it turns out that the pattern above reverses if the arrays are pre-allocated using |
|
Interesting, thanks @fatcerberus. I think builtins can use |
|
|
||
| // perform a merge but only if it's necessary | ||
| if (mid < length && compareArray[mid] < compareArray[mid - 1]) { | ||
| right = right < length ? right : length; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think it would be a little cleaner to move let right = position + bucketSize down here so all the right computation is visually together and the right variable is scoped to the smallest block that uses it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
sethbrenith
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
![]()
|
unless/until we implement a JIT fast-path for Regarding insertion sort - I've heard of sort implementations that use mergeSort/quickSort at large scale, and insertion sort segments once they drop below some predetermined size. I'm not sure it'd be worth either the added complexity or the effort to test, but I would be curious to see what the results would be given that you've already confirmed insertion sort being faster for smaller arrays. |
|
Thanks for the feedback so far.
|
|
The main problem we have with JS built-ins right now is that we generate poor code if the built-in is not inlined and uses different type of objects. The JIT'd code for these needs to be generic enough to allow all of these different objects (or arrays in this case) and the resulting code is slow. In your perf test, you have int and float arrays. We can generate ok code for this, but if you call sort on strings or objects after that, the code will degrade. We've avoided this for now by making sure all built-ins are inlined, and assuming the types are stable at the call-site most of the times. This also has the benefit of getting callback functions inlined (note that it probably was in your case because all call-sites had the same compare function). Insertion sort is a lot smaller than merge sort and could probably be inlined, getting all these benefits. As for the default compare case, I'm assuming people using it are passing an array of strings. We could handle this by optimizing toString() calls on strings, and avoid the extra code bloat. |
2a57d9e to
1d340d0
Compare
|
I have made the following updates based on points above:
I assume this could do with some more serious/robust performance testing and so remains not mergeable as is. Additionally if anyone uses the default sort method there really needs to be a faster path for the stringA < stringB operation. Please let me know if you have any other feedback/points for me to update for or alternatively if this is a bad time for this and I should leave it/close it. |
| let j = 0, k = 0, l = 0; | ||
| while (i < length) { | ||
| const item = compArray[i]; | ||
| l = i; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that l comes after ijk, but it's also really hard to read. Could you use a different name please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, I would prefer more descriptive names too, like i->numSorted, j->lowerBound, k->insertionPoint, l->upperBound.
In reply to: 221307258 [](ancestors = 221307258)
test/Array/array_qsortr_random.js
Outdated
| // Test 1000 random arrays of 1000 elements, print out the failures. | ||
| stressTestSort(1000, 1000); | ||
| // Test 1000 random arrays of 3000 elements, print out the failures. | ||
| stressTestSort(1000, 3000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to see this test expanded to cover all of the cases where our behavior could vary: longer and shorter than the insertion/merge cutoff point, numbers and objects and strings, with and without comparators. For the object cases, part of the verification could be checking that the output was stable. We might have to knock down the iteration count to 100 or so to keep the test quick enough, but I think this could provide a good level of validation.
| let value; | ||
| while (i < length) { | ||
| value = array[i]; | ||
| compArray[i++] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would we gain any noticeable improvement on arrays of strings using default comparison if we tried to detect that case and avoided creating these extra objects? It would mean an extra walk through the array up-front doing typeof checks, but I imagine that would still be faster than allocating all of these temporary objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But would typeof be detectable e.g. via a proxy - per spec array.prototype.sort doesn’t use typeof. I suppose I could make a native method that does this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don’t think typeof can trigger any side effects but I’d have to double-check the specification to be sure.
|
I’ve pushed an update to:
|
|
I've added a further update to reverse the merge direction - this reduces the memory requirement and number of operations for sorts of arrays with lengths that are not powers of 2. My only remaining concern is the performance in the following 3 cases:
Possible solutions:
My preference is option 2 as I think it will have the best overall performance + obviously will use the code I've written - I'm happy to write the additional code for this option, obviously the downside is that it will increase the complexity of the sort implementation as there will be native and Js versions. |
|
Perf is not usually a target concern for NoJIT, so I think it would be "fine" to take the perf hit in that case and just disable the tests in NoJIT, and that would at least get this PR completed. Then we can follow up with a native fallback and detect the cases where it should be used. |
d9549b5 to
ae8d6aa
Compare
|
Aside from the merge conflicts which should be fairly easy to solve is this blocked based on anything else? Performance questions perhaps? Please let me know if there's anything further I should do on this. |
test/Array/array_sort_random.js
Outdated
|
|
||
| for (let i = 1; i < size; ++i) | ||
| { | ||
| if (array[i-1].value > array[i].value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also check that if array[i-1] === array[i], then array[i-1].index < array[i].index (to verify that stability is maintained while moving things around).
| { | ||
| if (sorted[i-1] > sorted[i]) | ||
| { | ||
| throw new Error ("Default sort has not sorted by strings"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The output of a randomized test is only actionable if it contains the input array data. The case on line 82 currently prints out the input array, but we need to extend that pattern to apply to all of the test cases. We could JSON.stringify the array to get nice printable data for the object test case.
| }); | ||
|
|
||
| platform.registerChakraLibraryFunction("CreateCompareArray", function(array, length) { | ||
| let useCompareArray = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: use a ;
|
Thanks for your patience, and great work on this @rhuanjl! I have a few minor comments, but otherwise this looks good to me. Once those comments are addressed, I'll merge the change unless somebody else says not to within two days. Benchmark results on my machine (test-pogo-x64), for whoever is curious: |
….sort Merge pull request #5724 from rhuanjl:mergeSort Picking up on the discussion in #5661 This PR implements a stable bottom up Merge Sort as a JsBuiltin for arrays of any length up to 2^32 (well I hit out of memory trying to allocate an array with length above 2^29 but in theory). I'm not sure if it's good enough to merge as is but would appreciate feedback. **EDIT:** I've made some large edits to the below to reflect changes made. **Issues to consider:** 1. **Performance - DefaultCompare** - My Default Compare sort is very slow despite cacheing all the string conversions at the start it. The string less than operation is a significant bottle neck - I have tried: - a native chakraLibrary method to compare strings - this was about the same performance as using less than - using charCodeAt in a loop - this was also about the same performance as using less than 1. **Insertion sort** - I have included an insertion sort directly in the Array.prototype.sort function used for short arrays - could consider what the best cut off is before switching to mergeSort instead - currently length of 2048 is used. 1. **Memory usage** - My implementation of merge sort needs a buffer array with length up to half the length of the array being sorted. 1. **Scope** - I've not looked at the sort method for TypedArrays obviously stabilising that doesn't make sense (though may be worth looking at its performance on xplat as it uses the earlier mentioned slow xplat qsort for arrays of any length) 1. **Tests** - I've consolidated most of the pre-existing array sort tests and also added a test for sorting a variety of random arrays and ensuring that the sort is both correct and stable 1. **General** - see other comments I've added below... fixes: #5661 fixes: #5719
|
This doesn’t seem to be fully stable just yet. Test case:
|
|
@mathiasbynens this is still only in the master branch of chakra - it's not in 1.11 (Note, I'm an external contributor and do not know if or when the CC team are planning to include this in a release) |
|
@rhuanjl Ah, great! I saw the release being cut days after this PR was merged, and assumed the patch was part of the release. |
|
Semver-patch releases don’t cut from master - they are bug fixes only. |
Picking up on the discussion in #5661 This PR implements a stable bottom up Merge Sort as a JsBuiltin for arrays of any length up to 2^32 (well I hit out of memory trying to allocate an array with length above 2^29 but in theory).
I'm not sure if it's good enough to merge as is but would appreciate feedback.
EDIT: I've made some large edits to the below to reflect changes made.
Issues to consider:
Performance - DefaultCompare - My Default Compare sort is very slow despite cacheing all the string conversions at the start it. The string less than operation is a significant bottle neck - I have tried:
Insertion sort - I have included an insertion sort directly in the Array.prototype.sort function used for short arrays - could consider what the best cut off is before switching to mergeSort instead - currently length of 2048 is used.
Memory usage - My implementation of merge sort needs a buffer array with length up to half the length of the array being sorted.
Scope - I've not looked at the sort method for TypedArrays obviously stabilising that doesn't make sense (though may be worth looking at its performance on xplat as it uses the earlier mentioned slow xplat qsort for arrays of any length)
Tests - I've consolidated most of the pre-existing array sort tests and also added a test for sorting a variety of random arrays and ensuring that the sort is both correct and stable
General - see other comments I've added below...
fixes: #5661
fixes: #5719