proposal: spec: allow cap(make([]T, m, n)) > n #24204
Comments
It would be hard to "gofix" this in some hypothetical "go1to2" tool. We'd need to conservatively rewrite all `make([]T, n)` calls into `make([]T, n)[:n:n]` (and likewise for the three-argument form) to preserve the old exact-capacity behavior.
It might be an interesting experiment to hack this into the compiler/runtime and do a regression test and see what breaks (@dsnet). Yes, it'd be ugly, but the gofix rule would at least be very simple. As I noted in #24163, this optimization could be made opt-in (instead of opt-out) with an appropriate runtime function. There's another option in this arena, which is to say "roughly" rather than == or >= in the spec, as we do with maps. The compiler and runtime could use this freedom along with PGO or introspection to make better decisions, treating the info from the user as just a hint to be used when no other info was available.
The three-index slice syntax was introduced to extend precise control to the slice's capacity, not just its length.
Fortunately the bitwise complement operator in Go is `^` rather than `~` …
That becomes rather clumsy though, because to properly round you'll need to incorporate the element size too, not just the element count: the size class applies to the allocation's byte size, so users would have to round `n * elemSize` up to a size class and then divide back by the element size. (And of course the runtime doesn't export its size-class table in the first place.)
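For concreteness, here is a hedged sketch of the kind of arithmetic users would be stuck with. `roundUpToSizeClass` and its table are hypothetical: the runtime does not export its size classes, so a few entries from runtime/sizeclasses.go are copied here purely for illustration.

```go
package main

import "fmt"

// A toy subset of the runtime's size-class table (runtime/sizeclasses.go).
var sizeClasses = []uintptr{8, 16, 32, 48, 64, 80, 96, 112, 128}

// roundUpToSizeClass returns the smallest size class >= bytes,
// or bytes itself if it exceeds the toy table.
func roundUpToSizeClass(bytes uintptr) uintptr {
	for _, c := range sizeClasses {
		if bytes <= c {
			return c
		}
	}
	return bytes
}

func main() {
	const elemSize = 8 // e.g. unsafe.Sizeof(int64(0))
	n := uintptr(13)
	// Round the *byte* size up, then convert back to an element count.
	capacity := roundUpToSizeClass(n*elemSize) / elemSize
	s := make([]int64, n, capacity)
	fmt.Println(len(s), cap(s)) // 13 14: the 104-byte request lands in the 112-byte class
}
```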
That control is important, but seems exceedingly rarely used. Looking through Google's internal Go code base, I've found only a couple uses of full slice expressions, and they're all of the same form, clipping the capacity of the original slice. I strongly suspect the majority of users don't care about how much capacity is allocated for them by `make` and `append`.
(Don't get offended.) This proposal basically proposes to cripple the language and make it less predictable (really, imagine debugging a bug caused by this) in order to accommodate a very minor, CPU-architecture-specific performance optimization. This mistake has been made many times in many languages. Don't make Go one of them.
I have recently written some code that would be broken by this change. Easy to fix, of course, but it'd be interesting to know whether this is more common. The gist of it was:
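A plausible reconstruction (hypothetical; the names `commonPrefix`, `computeSuffix`, and `Type` are borrowed from @bcmills's reply below) is code that relies on the exact-capacity guarantee to force `append` to copy:

```go
var commonPrefix = make([]byte, N) // filled in during init; cap is exactly N today

func f(arg Type) []byte {
	// Correct only while cap(commonPrefix) == len(commonPrefix): append
	// must then allocate a fresh array, so calls to f never scribble
	// over commonPrefix or over each other's results.
	return append(commonPrefix, computeSuffix(arg)...)
}
```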
I think that's strongly overstating the potential downsides. Here's a simple experiment: just uniformly overallocate every slice allocated by `make` and see what happens.
This breaks a couple standard library and regress tests, but they all appear to be tests verifying that `cap` comes out exactly as requested. There was one hairy crash in runtime/trace, which I've now diagnosed. I would argue the real issue here is that the runtime is using an inconsistent mix of `len(allp)` and `cap(allp)`.
It didn't seem too bad. :)
Now that you have it working, if you change it to round up instead, are there any notable stdlib benchmark changes? Interestingly, `allp` cap issues were also what I hit with CLs 21813 and 22197, which changed append behavior, which is further evidence that just using …
Btw, if someone really wants to make this happen, I would suggest a slight modification, with the requirements as follows:

- `cap(make([]T, n)) >= n` (capacity not specified: the implementation may pad)
- `cap(make([]T, m, n)) == n` (capacity explicitly requested: you get exactly that)

So, when you don't specify the capacity explicitly, the compiler can optimize that for you. This would be sort of acceptable, I guess.
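Under that variant, the two forms would behave like this (illustrative only; the comments describe the proposed rule, not current Go):

```go
s := make([]int, 10)     // capacity unspecified: cap(s) >= 10, padding allowed
t := make([]int, 10, 12) // capacity explicit: cap(t) == 12, exactly as asked
```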
Fortunately, there are ways to express this without touching `make` itself. You could add a new package with a helper function for it, or you could add a three-arg "bounded `make`" that takes lower and upper capacity bounds …
I don't think you care about the lower/upper-bound at all. You either care about the exact capacity, or you don't, in which case the compiler can insert anything and it's fine.
@slrz You could also write that code as

```go
var commonPrefix = make([]byte, N)

func f(arg Type) []byte {
	return append(append([]byte{}, commonPrefix...), computeSuffix(arg)...)
}
```

If #18605 and #12854 are accepted, that would simplify further:

```go
func f(arg Type) []byte {
	return append({}, commonPrefix..., computeSuffix(arg)...)
}
```

That seems much more difficult to get wrong.
A related question for this proposal: is it OK for slice literals to have cap > len? E.g. can `cap([]int{1, 2, 3})` be greater than 3?
I tend to feel that if you ask for an explicit capacity, as in `make([]T, m, n)`, you should get exactly what you asked for. I think there is an argument for not precisely specifying the capacity when it is only implied, as in `make([]T, n)`.
It occurs to me that even if …
Change https://golang.org/cl/111646 mentions this issue.
Now that CL 109816 is in, you can sorta kinda approximate this in user code:

```go
s := append([]T(nil), make([]T, m)...)[:0]
```

It's more expensive than a plain `make`, but CL 111646 shows it in action, along with some benchmark results. You can see that there is a pretty significant improvement in some cases. And the regressions in other cases would go away if this were a language change rather than a compiler optimization of a pretty awkward bit of code. Might be interesting for Go 1.12 to look at other …
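A small illustration of the pattern (the exact capacity is implementation-dependent and not guaranteed):

```go
package main

import "fmt"

func main() {
	// append grows via the runtime's size classes, so the resulting
	// capacity may exceed 100; the final [:0] keeps that capacity
	// while resetting the length.
	s := append([]byte(nil), make([]byte, 100)...)[:0]
	fmt.Println(len(s), cap(s)) // e.g. "0 112" on current gc; not guaranteed

	for i := 0; i < cap(s); i++ {
		s = append(s, byte(i)) // no reallocation up to the padded capacity
	}
}
```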
Interesting! I had not thought about this issue in relation to https://golang.org/cl/109517 "cmd/compile: optimize append(x, make([]T, y)...) slice extension". I will have a look at recognizing `append([]T(nil), make([]T, m)...)[:0]` and rewriting it into a `make` with rounded-up capacity. Others can then use that CL to do their own experiments and see if it helps, and we do not have to change the spec for it. If it turns out to be helpful in comparison to the complexity it adds to the compiler, we could submit the CL to recognize `append([]T(nil), make([]T, m)...)[:0]` or the like, making a spec change unnecessary to cover the optimization use case until this proposal has been accepted.
Change https://golang.org/cl/111736 mentions this issue.
If we want to require users to opt in to flexible capacity allocation, I think CL 111736 is the right solution: just recognize an existing valid code pattern like `append([]T(nil), make([]T, m)...)[:0]` and optimize it. My hypothesis with this proposal, though, was that most code would benefit from allowing the runtime to automatically increase the capacity of initial slice allocations, just like how it automatically increases it for appends.
Provided it can be done on a backwards-compatible, opt-in basis, this proposal looks worthwhile to me. Although I haven't always been comfortable with some of the uses proposed for `...`, `make([]T, n, ...)` looks ideal here.
`make([]T, n, ...)` means "at least n allocated". What about "at least n+100", for example? `make([]T, n, n+100...)` or `make([]T, n, 100+n...)` does look a bit weird, though. (And I am not sure about compiler complications.)
@tandr or you could write `make([]T, n+100, ...)`. (That said, I'm not convinced new syntax is merited here.)
Thanks @josharian, it is indeed a possibility. As I see it, for us it mostly comes up like this: we have places in our code where we know an upper bound, but not a lower one, for a slice or map, and we start filling it up by filtering the values from another (streaming) source. It is not "common" but it does happen. Cases where the map or slice got expanded past the initially known capacity mostly got collapsed into using the default capacity, i.e. a plain `make` with no capacity hint …
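A minimal sketch of that pattern, assuming the upper bound comes from the length of the source:

```go
// filter keeps the elements of src matching keep. The upper bound on
// the result (len(src)) is known up front; the final length is not.
func filter(src []int, keep func(int) bool) []int {
	out := make([]int, 0, len(src))
	for _, v := range src {
		if keep(v) {
			out = append(out, v)
		}
	}
	return out
}
```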
This change as originally proposed does not comply with http://golang.org/design/28221-go2-transitions#language-redefinitions, so I don't think it could be adopted as-is: the spec clearly states that the capacity of the resulting slice is exactly `n`.
That said, if we get generics you could envision a generic library function that encapsulates some idiom that the compiler can recognize:

```go
package slices

// Make allocates and returns a slice of n elements of type T, with an arbitrary capacity ≥ n.
func [type T]Make(n int) []T {
	return append([]T(nil), make([]T, n)...)
}
```

That gives call sites like

```go
var s = slices.Make[T](n)
```

which to me doesn't seem appreciably worse than any of the other syntax alternatives mentioned above, and does not require any additional language or tooling changes beyond generics per se.
I ran into this recently, in the context of knowing a size I wanted to request, where it would definitely be preferable to actually use the whole space (I was just building buffers, so there's no benefit to me in a subslice of a buffer followed by unused space). I'd been thinking about how to express this, and in something more like C, I'd probably advocate for …
And here I am running into exactly this again, having totally forgotten about it since the last time, which was... apparently a bit over a month ago. I spent a little time noodling on concepts like, say, … One alternative that would require a bit of work, but would be more unambiguously new, would be to use a 4-argument make, so (type, len, cap, extra), but it's not obvious that "extra" could carry anything meaningful -- really, the only practical cases are (1) I want an exact cap, (2) I want a cap rounded up to the size class. If I want more than one size class of rounding, I should probably just specify a larger goal anyway. So really, I think that of the things that I've seen, …
With #45955 we have:

```go
// Grow grows the slice's capacity, if necessary, to guarantee space for
// another n elements. After Grow(n), at least n elements can be appended
// to the slice without another allocation. If n is negative or too large to
// allocate the memory, Grow will panic.
func Grow[S constraints.Slice[T], T any](s S, n int) S
```

which I assume has the freedom to allocate a larger capacity than exactly `len(s) + n`. Would this suffice for what this issue is aiming to accomplish?
**Problem**

FYI, this is a real issue for me: I am implementing an optimization pass, and what I write fuses slice operations together. It rewrites

```go
a = append(append(a, b...), c...)
```

into:

```go
n := len(a) // remember the old length before growing
a = slices.Grow(a, len(b)+len(c))
a = a[:n+len(b)+len(c)]
copy(a[n:], b)
copy(a[n+len(b):], c)
```

which is faster, does only one allocation, and is smaller. Doing that with slices that come from `make` runs into the exact-capacity guarantee, which is observable:

```go
a := make([]int, 10)
b := append(a, 1) // b MUST be a copy of a, because the required capacity is 10+1 and a's capacity is 10
doStuff(a)
fmt.Println(b[0]) // 0, even if doStuff does a[0] = 1
```

That forbids the optimization that we want here:

```go
a := make([]int, 10, 11)
b := a[:11]       // take a reference instead of a copy
b[10] = 1
doStuff(a)        // doStuff does a[0] = 1
fmt.Println(b[0]) // it is now 1, modified by doStuff (wrong)
```

**Solutions**

On the solutions side of things, this is a breaking change to the spec; however, I have never seen code that relies on that guarantee. I do like @bcmills's solution (#24204 (comment)).
@dsnet

```go
s := slices.Grow([]Type(nil), c)[:l]
```

But that is very unergonomic compared to the intended pattern:

```go
// assume data is a slice of whatever type
data = slices.Grow(data, len(more))
for _, v := range more {
	data = append(data, process(v))
}
```
Was thinking about this again, and there may be a way (that AFAICT has not been proposed yet) to get the same benefits without language changes: modifying `append`. Let's say the user has initially allocated the slice with `make([]T, n)`, whose backing allocation was rounded up to a size class: when a later append needs more capacity, the runtime could first check whether the allocation tail past `cap` is unused and, if so, extend the slice in place instead of reallocating. This would AFAIK preserve all current semantics (I am not aware of any guarantees made by append on the capacity after an append) while allowing use of the whole allocation if possible; it would not require users to know specific incantations, and it would have the potential benefit of reducing the amount of unused space in allocation tails (because slice capacities would tend to end up sizeclass-aligned after repeated appends). The biggest problem would be figuring out how to know whether the allocation tail is really unused, or if the user has manually shrunk the capacity (and therefore the tail can't really be used), but it doesn't seem like an impossible problem.
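Concretely (a sketch: the second comment describes the proposed behavior, not current Go; the 112-byte figure assumes gc's current size classes, where 112 is a real class):

```go
s := make([]byte, 100) // the allocator actually hands out 112 bytes today
fmt.Println(cap(s))    // 100: the spec hides the 12-byte tail

s = append(s, 1) // today: reallocates a larger array and copies
// Under the idea above, this append would instead extend in place,
// leaving cap(s) == 112 with no new allocation.
```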
@CAFxX I don't think we can quite do that, unfortunately, due to the problem you mention at the end.

When we append to a slice whose length equals its capacity, we don't know whether the space beyond the capacity is actually free. We could figure that out by recording the actual allocated size in the heap metadata somehow. But that brings up the additional problem of multiple threads simultaneously appending to slices that share the same backing array. It may be doable, but doesn't seem easy (coding-wise), and possibly seems expensive (runtime-wise).
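A minimal illustration of that aliasing hazard (the second half is hypothetical behavior):

```go
a := make([]byte, 8, 8) // len == cap: any append must grow

// Safe today, even from two goroutines: each append sees len == cap,
// so each allocates its own fresh, larger backing array and only
// reads from a's array.
go func() { _ = append(a, 1) }()
go func() { _ = append(a, 2) }()

// If append could instead extend a's allocation tail in place, both
// goroutines would write element 8 of the same allocation: a data race.
```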
Yes, the simplest (but untested and somewhat still incomplete) idea I could come up with so far is storing a small flag (AFAIK we really need just a single bit to signal whether the tail can be used or not) in the tail (or head) of the allocation, and accessing it atomically during modifications to the slice capacity (which, assuming uncontended access, shouldn't be horribly slow). Definitely, it wouldn't be trivial... but OTOH it would have pretty important upsides (it works transparently for existing code, with no language additions/redefinitions and no special syntax to get the compiler to do the right thing, ...).
Following up on the comment from @dsnet from last fall, one update is that `slices.Grow` is now available (in golang.org/x/exp/slices).

A similar optimization might be possible in the compiler or runtime for that pattern. And of course, people could have their own generic helpers.
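For instance, a generic helper along those lines can be built on `Grow` today (a sketch; `Make` is our own name, not a standard API):

```go
import "golang.org/x/exp/slices"

// Make returns a slice of n zeroed elements of type T whose capacity
// is whatever Grow chose to allocate, which may exceed n.
func Make[T any](n int) []T {
	return slices.Grow([]T(nil), n)[:n]
}
```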
Have you considered making this kind of change with the use of the `go` directive in go.mod (similarly to the loop variable semantics change)?
@mateusz834 The loop variable semantics change was a one-off, done because we felt it was important enough to break our own rules. We aren't going to do it again.
Currently, the Go spec requires that `cap(make([]T, n)) == n` and `cap(make([]T, m, n)) == n`. I propose relaxing this constraint from `==` to `>=`.

Rationale: the Go runtime already pads allocations up to the next malloc bucket size. By treating the user-supplied capacity argument as a lower bound rather than an exact bound, there's a possibility to make use of this padding space.

Also, if users really need an exact capacity (which seems like it would be very rare), they can write `make([]T, n)[:n:n]` or `make([]T, m, n)[:m:n]`.

(See also discussion in #24163.)
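For concreteness, a sketch of how that escape hatch would behave under the relaxed rule (the comment on the first line describes the proposed semantics, not current Go):

```go
s := make([]int, 8)         // under this proposal: cap(s) >= 8, maybe more
t := make([]int, 8)[:8:8]   // full slice expression pins cap(t) to exactly 8
fmt.Println(len(t), cap(t)) // 8 8, guaranteed
```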