cmd/compile: detect and optimize slice insertion idiom append(sa, append(sb, sc...)...) #31592
I would say this optimization is needed more than the one made in Go 1.11 (#21266), because this idiom is used more widely. In fact, even before Go 1.11 there was already a slice-extending method that is more efficient than the form optimized in Go 1.11. Evidence:
package main
import (
	"testing"
)

type T = int

var sx = []T{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16}

func SliceGrow_OneLine(base []T, newCapacity int) []T {
	return append(base, make([]T, newCapacity-cap(base))...)
}

func SliceGrow_VerboseCopy(base []T, newCapacity int) []T {
	m := make([]T, newCapacity)
	copy(m, base)
	return m
}

var sa []T

func Benchmark_SliceGrow_OneLine(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sa = SliceGrow_OneLine(sx, 100)
	}
}

var sc []T

func Benchmark_SliceGrow_VerboseCopy(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sc = SliceGrow_VerboseCopy(sx, 100)
	}
}
Benchmark result:
|
/cc @randall77 @griesemer |
cc @martisch There are a lot of variants of this. E.g. I'm genuinely not sure how many of these we want to recognize in the compiler vs, say, writing a blog post illustrating how to write these efficiently or wait for generics and implement them in the standard library. |
I see at least 3 patterns here:
I think the first 2 should be the easiest to implement and, subjectively, are also the most used. If others don't think the first 2 are a no-go, I can have a go at it :) (but if someone else is eager to work on it, don't hold off on it) |
I'd like to try this, but I think it could be generalized, as the original comment mentioned. For nested appends, one would need at most 1 call to
init {
	// continue walking the node tree and place the args
	// in temporaries s1, s2, s3, ...
	newcap := len(s1) + len(s2) + len(s3) + ...
	if newcap > cap(s1) {
		s1 = growslice(s1, newcap)
	}
	idx := len(s1) // appended data starts after the existing elements
	s1 = s1[:newcap]
	copy(s1[idx:], s2)
	idx += len(s2)
	if cap(s2) >= len(s3) {
		copy(s2[:cap(s2)], s3)
	}
	copy(s1[idx:], s3)
	idx += len(s3)
	if cap(s3) >= len(s4) {
		copy(s3[:cap(s3)], s4)
	}
	...
}
s
This only allocates for the resulting slice and mimics the behavior of actually calling
I should mention that I think I understand most of what's going on in the slice generation except for this bit: go/src/cmd/compile/internal/gc/walk.go, lines 2891 to 2895 in 904f046
|
Careful with
I would still suggest solving this for the simple case first, to better understand the complexities involved. Then we can see whether the generalization is worth the additional complexity and is actually used in production code. |
To provide some statistics (source: gocorpus with all repositories checked):
For comparison, here are some results on the same corpus:
Which is understandable: 2-argument append is far more common than any other form, but this tells us by how much (~51 times). (The search patterns are written in gogrep syntax, if you're curious.) Just out of curiosity, does slices.Insert perform the required operation with performance comparable to what we could achieve by teaching the compiler to make the suggested rewrite? |
No, it doesn't. Generics don't help here.
func Insert[S ~[]E, E any](s S, i int, v ...E) S {
	tot := len(s) + len(v)
	if tot <= cap(s) {
		s2 := s[:tot]
		copy(s2[i+len(v):], s[i:])
		copy(s2[i:], v)
		return s2
	}
	s2 := make(S, tot)
	copy(s2, s[:i])
	copy(s2[i:], v)
	copy(s2[i+len(v):], s[i:])
	return s2
}
That's why I proposed to add
On the other hand, the generic code does provide a chance to simplify the potential compiler optimization. |
Excessive zeroing is a known issue. I think it's a separate fruit on its own. |
Mostly, except for the case handled by a Go 1.15 optimization:
var s = make([]T, n)
copy(s, x)
copy(s[len(x):], y)
In the above code, the elements within
For the complexities in reality, the optimization has many limitations:
That means, currently, the above generic code could be a little faster:
func Insert[S ~[]E, E any](s S, i int, v ...E) S {
	tot := len(s) + len(v)
	if tot <= cap(s) {
		s2 := s[:tot]
		copy(s2[i+len(v):], s[i:])
		copy(s2[i:], v)
		return s2
	}
	x := s[:i]
	s2 := make(S, tot)
	copy(s2, x)
	copy(s2[i:], v)
	copy(s2[i+len(v):], s[i:])
	return s2
} |
I decided to close this issue, in favor of
The one-line trick might produce incomplete data when the result slice length overflows. |
|
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
yes
What did you do?
What did you expect to see?
Small performance difference.
What did you see instead?