-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
slices: add Sort, SortStable, IsSorted, BinarySearch, and func variants #47619
Comments
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Changes that speed up the internal implementation probably don't need to go through the proposal process. I think this proposal should focus exclusively on the API additions.
Similarly, I think a generic counterpart to At that point, I think it would be reasonable to further deprecate Similar to the above, I think we should include a search function which operates on a slice but uses a custom comparison func (again, better naming needed).
The majority of my use cases for If we do all that, then the shape of the (non-deprecated) sorting API essentially looks like this:
Finally, even without my suggested modifications, this proposal will leave the sort API in a fairly messy state: something like ~half of the functions/types will be deprecated, and it's hard to come up with good names for the new functions because all the good names are already used by the old ones. I wonder if we should consider deprecating the entire package and introducing a new sorting package instead. |
Thanks, fixed.
These are reasonable suggestions, and I'd like to see what others think. It's also reasonable to start with a minimal approach, adding just the APIs presented in this proposal, and then add more variants in later releases when we have gained more experience with generics in the stdlib (and perhaps some naming conventions emerge).
This is a good point and something we considered, but it seems like a large step to take in 1.18; I'll be happy to hear other opinions as well, of course. |
🚴 :
The other name suggestions seem fine. |
One thing Python has that's there is not a great way to do in Go is sort with a key func. If you're sorting using some expensive transform, it might be nice to run the transform O(N) times instead of O(N log N) times.
|
I am not convinced about the Of suffix convention when the T is not essentially required. Package sort is the completely general sort. It seems like sort of slices would fit better in package slices - slices.Sort. What about:
? |
Maybe a third package like |
I don't think slices would have a dependency on sort at all. |
Ah, missed the part about it having a separate implementation. Still, putting it in slices doesn't feel right. I would expect to find it in or around sort. I suppose there would be a lot of deprecation messages pointing to the new location regardless. To add an additional 👍 to the Key request I've written a lot of less functions that were just |
This is a good point, Russ. It will, though, put us in the uncomfortable (IMHO) situation of having both |
If |
Here's a solution that makes no one happy: |
A note about this: as the background section of this proposal mentions, a generic implementation of slice sorting will be used to reimplement the existing |
I opened #47657 for |
This proposal has been added to the active column of the proposals project |
A common operation that I do is iterating over a map in sorted manner (usually of a string key). #47649 provides: func Keys[K comparable, V any](m map[K]V) []K Which we can combine with the above proposal to do: ks := maps.Keys(m)
sort.SliceOf(ks)
for _, k := range ks {
...
} I wonder if we can simplify this by having for _, k := range sort.SliceOf(maps.Keys(m)) {
...
} The detriment of returning the slice is that it makes it ambiguous from the function signature whether |
What about having maps.SortedKeys? Seems like a fairly common need. |
The The other direction seems more palatable. We could add func MapKeysOf[K constraints.Ordered, V any](m [K]V) []K However, it might be weird that it's signature and operation is different than the similarly named |
@dsnet I also find that I need this a lot. However, returning the input just to allow fewer newlines to be typed (which I guess would be called "chaining" if we were talking about methods) isn't the prevailing style. Also, as you say, it makes the signature less obvious; people will be much more likely to call Personally, your second example seems pretty good to me:
If we want to shorten it, I agree with @carlmjohnson that we should add to
Regarding adding this to I also think in the future for a lot of these use cases I might instead use a sorted container instead (a map that maintains the keys in sorted order; presumably backed by a tree). However, that requires more API design work (especially around iteration) and I expect we won't see anything like that in the standard library before 1.19. |
I am not sure we should do the extra return value - it will look like maybe it returns a different slice. I still believe we should think about slices.Sort instead of sort.NewSliceName; see #47619 (comment). |
Having these functions in the We can just deprecate In your list of new functions:
Is |
@ianlancetaylor it's the optimal implementation I'm after, the reflection overhead in the existing |
The implementation that @ianlancetaylor posted is probably still more efficient than I'm not sure how feasible it is as I'm not particularly familiar with the code, but you could probably make a copy of the |
@anacrolix Here is an unofficial implement. Benchmark based on eliben's
|
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
I agree with having separate versions of the implementation for Here is two implementations using same sorting algorithm:
The generic version is 2x faster in the benchmark, significantly faster than the interface-based one. Also here is another issue(#50154 ) that suggests using pdqsort in the |
Another benchmark: https://gist.github.com/PeterRK/e2fa03a2657d4a406ee289c80612380b Result:
|
The existing codegen strategy in sort.go relied on parsing the sort.go source with go/ast and a combination of an AST rewrite + code text rewrite with regexes to generate zfuncversion -- the same sort functionality with a different variant of data. In preparation for implementing golang#47619, we need a more robust codegen strategy. To generate variants required for the generic sort functions in the slices package, we'd need significanly more complicated AST rewrites, which would make genzfunc.go much heavier. Instead, redo the codegen strategy to use text/template instead of AST rewrites. gen_sort_variants.go now contains the code for the underlying sort functions, and generates multiple versions of them based on Variant configuration structs. With this approach, adding new variants to generate generic sort functions for the slices package becomes trivial. See the discussion in golang#47619 for more details on the design decisions. Change-Id: I8af784c41b1dc8ef92aaf6321359e8faa5fe106c
The existing codegen strategy in sort.go relied on parsing the sort.go source with go/ast and a combination of an AST rewrite + code text rewrite with regexes to generate zfuncversion -- the same sort functionality with a different variant of data. In preparation for implementing golang#47619, we need a more robust codegen strategy. To generate variants required for the generic sort functions in the slices package, we'd need significanly more complicated AST rewrites, which would make genzfunc.go much heavier. Instead, redo the codegen strategy to use text/template instead of AST rewrites. gen_sort_variants.go now contains the code for the underlying sort functions, and generates multiple versions of them based on Variant configuration structs. With this approach, adding new variants to generate generic sort functions for the slices package becomes trivial. See the discussion in golang#47619 for more details on the design decisions. Change-Id: I8af784c41b1dc8ef92aaf6321359e8faa5fe106c
The existing codegen strategy in sort.go relied on parsing the sort.go source with go/ast and a combination of an AST rewrite + code text rewrite with regexes to generate zfuncversion -- the same sort functionality with a different variant of data. In preparation for implementing golang#47619, we need a more robust codegen strategy. To generate variants required for the generic sort functions in the slices package, we'd need significanly more complicated AST rewrites, which would make genzfunc.go much heavier. Instead, redo the codegen strategy to use text/template instead of AST rewrites. gen_sort_variants.go now contains the code for the underlying sort functions, and generates multiple versions of them based on Variant configuration structs. With this approach, adding new variants to generate generic sort functions for the slices package becomes trivial. See the discussion in golang#47619 for more details on the design decisions. Change-Id: I8af784c41b1dc8ef92aaf6321359e8faa5fe106c
Change https://golang.org/cl/378134 mentions this issue: |
Implements golang/go#47619 in the exp/slices package as a testing ground prior to inclusion in the standard library. Relies on the modified sorting function code generator proposed in https://go-review.googlesource.com/c/go/+/353069 to automatically generate the code of the sorting functions. Benchmark comparing sort.Ints with the generic Sort function added in this CL to sort a slice of int: name old time/op new time/op delta Sort-8 12.0ms ± 1% 6.5ms ± 1% -46.02% (p=0.000 n=9+10) Benchmark comparing sort.Sort with SortFunc to sort a slice of struct pointers based on one field in the struct: name old time/op new time/op delta SortStructs-8 18.6ms ± 2% 15.9ms ± 3% -14.43% (p=0.000 n=10+10) Change-Id: Ic301aae7e5b8f99144e39b8a77fde897779588ed Reviewed-on: https://go-review.googlesource.com/c/exp/+/378134 Reviewed-by: Ian Lance Taylor <iant@golang.org> Trust: Cody Oss <codyoss@google.com> Trust: Jeremy Faller <jeremy@golang.org>
With https://go-review.googlesource.com/c/exp/+/378134, the functions outlined in #47619 (comment) are now implemented in the One caveat: we chose to exclude Should this proposal remain open until |
Let's leave this open for consideration for 1.19. I'll put it on hold for now. |
I ran a detail benchmark including this implement.
|
It's common to sort struct array in my experience. In such case, generic func api can be slower than legacy api on AMD CPU. See benchmark result on EPYC-7K83 machine.
Performance benefit of reference compare fcunc is significant.
Pointer array will be sorted faster than struct array only when size is small engouth to cache. |
The existing codegen strategy in sort.go relied on parsing the sort.go source with go/ast and a combination of an AST rewrite + code text rewrite with regexes to generate zfuncversion -- the same sort functionality with a different variant of data. In preparation for implementing #47619, we need a more robust codegen strategy. To generate variants required for the generic sort functions in the slices package, we'd need significanly more complicated AST rewrites, which would make genzfunc.go much heavier. Instead, redo the codegen strategy to use text/template instead of AST rewrites. gen_sort_variants.go now contains the code for the underlying sort functions, and generates multiple versions of them based on Variant configuration structs. With this approach, adding new variants to generate generic sort functions for the slices package becomes trivial. See the discussion in #47619 for more details on the design decisions. Change-Id: I8af784c41b1dc8ef92aaf6321359e8faa5fe106c Reviewed-on: https://go-review.googlesource.com/c/go/+/353069 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Trust: Than McIntosh <thanm@google.com>
I have opened a new proposal #57433 to move golang.org/x/exp/slices into the standard library. Closing this proposal as completed. Thanks for all the discussion. |
Update, Sep 15 2021: the current proposal is at #47619 (comment). - @rsc
This proposal is for use with #43651 (and also relies on the accompanying
constraints
package described in #45458). We propose adding a new generic function for sorting slices in thesort
package and using this code to implementsort.Ints
,sort.Float64s
andsort.Strings
much more efficiently. Similarly, we will do this for other functionality in thesort
package.If this proposal is accepted, the changes will be included with the first release of Go that implements #43651 (we currently expect that that will be Go 1.18).
Part 1 - exposing generic sort functions in the
sort
package APIWe propose exposing a generic sort function for sorting a slice as part of the public API of the
sort
package. Specifically, a new exported function:Typical invocation for some slice of an ordered type (like []int64) will look like this (type inference will obviate the need to specify a type):
Sorts the provided slice in-place, similarly to today’s
sort.Slice
. The name of this function is subject to discussion.With this function in the
sort
package, we can add deprecation notices to existing concrete sort functions:sort.Ints
,sort.Float64s
,sort.Strings
- though these functions will remain in the package in line with Go 1 compatibility guarantee.Part 2 - generic functions for additional
sort
package functionalityWe propose to provide a generic version of the current
sort.IsSorted
function:Efficient stable sorts:
Binary search:
Background - generic sort implementation
An internal sorting function:
Can be using the same implementation as the internal
quickSort_func
today, with the type parameter replacing theinterface{}
and having to provide aless
function. Early benchmarks show that this approach is 3-4x faster than the currentsort.Ints
. This function can be added and used to implement exported functions likesort.Ints
orsort.Strings
even if we decide not to add new APIs via this proposal.The text was updated successfully, but these errors were encountered: