-
Notifications
You must be signed in to change notification settings - Fork 17.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
proposal: Go 2: allow 3-index slices to elide the second (length) term #38081
Comments
I think question here is: what do people intuitively expect when the second operand is omitted? I think a decent argument could be made that people intuitively expect that if the second operand is omitted that the length remains unchanged. I think it's a little harder to argue for the min expression. But maybe that is OK. (Another possible default would be that omitting the second expression means that the length is unchanged, and if the capacity becomes smaller than the length then the expression panics at run time. The most important goal is not "omit the second expression as often as possible." I think the most important goal is "don't surprise the reader.") |
Indeed. You can think of the “min” part as meaning: “leave the length unchanged unless it would exceed the new capacity, in which case keep it as close to the original as possible”. The alternative is to panic if the original length exceeds the new capacity, but panicking is arguably never the expected behavior — that's why it is a panic instead of an error return or a |
A similar proposal by defaulting the |
In a cursory search of the One such example supports the “min” behavior over “default to the new cap”: func (s *asmState) clone() *asmState {
c := *s
c.buf = c.storage[:len(s.buf):cap(s.buf)]
return &c
} The other two use values that are not plausible defaults: {a[:10], a[:10:20], true, false},
{a[:10], a[5:10:20], true, true}, // resetBuf points buf at storage, sets the length to 0 and sets cap to be a
// multiple of the rate.
func (s *asmState) resetBuf() {
max := (cap(s.storage) / s.rate) * s.rate
s.buf = s.storage[:0:max]
} The vast majority of usage sites don't distinguish between “min” and “new cap” at all, because they use the same expression for both terms. However, they clearly support that “old length” alone is not a broadly-useful default. There are a few that slice down to a capacity (and equal size) obtained elsewhere: n, err := req.Body.Read(buf)
if n > 0 {
s.buf.put(&recvMsg{data: buf[:n:n]})
buf = buf[n:]
} // Allocate only once. Note that both dst and src escape when passed to
// Transform.
buf := [2 * initialBufSize]byte{}
dst := buf[:initialBufSize:initialBufSize]
src := buf[initialBufSize : 2*initialBufSize] args := (*[(math.MaxInt32 - 1) / unsafe.Sizeof((*C.sqlite3_value)(nil))]*C.sqlite3_value)(unsafe.Pointer(argv))[:int(argc):int(argc)] A few slice both the size and cap down to the original length, in which case the distinction between “new cap” and “original length” is irrelevant: // pad pads the code sequence with pops.
func pad(enc []byte) []byte {
if len(enc) < 4 {
enc = append(enc[:len(enc):len(enc)], zeros[:4-len(enc)]...)
}
return enc
} ckey := append(key[:len(key):len(key)], 0) And many slice an arbitrarily large // AllGroups returns a slice that can be used to iterate over the groups in g.
func (g *Tokengroups) AllGroups() []SIDAndAttributes {
return (*[(1 << 28) - 1]SIDAndAttributes)(unsafe.Pointer(&g.Groups[0]))[:g.GroupCount:g.GroupCount]
} u := (*[1 << 29]uint16)(unsafe.Pointer(&data[0]))[: len(data)/2 : len(data)/2] buf := (*[1 << 29]byte)(unsafe.Pointer(&v[0]))[: len(v)*2 : len(v)*2] n := len(buf)
bw := (*[maxRate / 8]uint64)(unsafe.Pointer(&buf[0]))[: n/8 : n/8] |
To me the issue here is not what is the most useful or most commonly used default. It is: what will people intuitively assume if the length is missing? |
I think those two judgments are closely connected: people will tend to intuitively assume whatever behavior is most convenient to their use-case, because they are primed to think about that use-case and not other possible behaviors. |
I think that to some extent you may be talking about the writer of the code, but I'm trying to talk about the reader of the code. |
I'm talking about both: neither the reader nor the writer of the code will expect a slice expression to panic when all of the non-elided indices are valid. They will not expect a panic in part because the corresponding 2-index slice does not panic when either index is elided, provided that the other index is valid. And, they will not expect a panic in part because they will observe that the code does not panic when trivial tests are run. (For the “cut cap down to length” use-cases, the distinction between the old length and new cap is meaningless, and for the “cut a buffer down to a computed size” use-cases, the code would panic under even the most trivial usage if the “min” operation were not applied.) |
There's some, right? We can't always statically prove which is min so that would need to happen at runtime? But perhaps that no different than the panicindex checks today. |
If in most examples the second operand of the slice expression is identical to the third operand of the slice expression, then perhaps the appropriate default for a missing second operand is simply the third operand. That is slightly simpler than the suggested min expression. A separate consideration is how easy it would be to miss the extra colon. // These mean different things:
a = s[:10]
s = s[::10] |
I agree with @ianlancetaylor that, if the second operand of the 3-index slice were omitted, then the most intuitive default would be the third operand rather than what is being proposed here. However, I have my doubts about whether the proposal is worth doing at all since in my experience 3-index slices are not very common and, when encountering them, people may have to look up what they mean anyway. |
Instead of defining |
That is indeed slightly simpler if we consider eliding only the second index but requiring the third index in perpetuity. However, if we also consider the potential future meanings for an elided third index, that default becomes more confusing. Arguably a = s[:] and a = s[::] should mean the same thing (“ @jimmyfrasche, defaulting the third argument to the second would have a similar problem: if the third index defaults to the second, then |
The static analysis is trivial for:
Those three cases cover all of the examples I have observed that use any plausible default for the second index. So, I suppose there is a theoretical run-time cost, but it is nearly always zero in practice. |
I don't understand why anyone would ever use a three index slice expression and leave the capacity unspecified. The whole point of a three index slice expression is to change the capacity. In a two index slice expressions there are good reasons to omit the first operand, and good reasons to omit the second operands, and so |
@bcmills Yes, |
Ok, I think I buy that argument:
That would indeed make my motivating example even more concise. With only the trailing index elided: cmd.Env = append(cfg.OrigEnv[:len(cfg.OrigEnv):], g.env...) and with both elided: cmd.Env = append(cfg.OrigEnv[::], g.env...) So, should we have @jimmyfrasche file a new proposal, or repurpose this one? |
The currently closed #25638 was posted earlier and proposes the same elision. We could re-open that thread. |
I'm still concerned by the comment that closed #25638: I don't think that it's obvious what it means when the third operand of a three-index slice expression is omitted. We can reason our way to a couple of different possibilities. We can say that an omitted capacity gives us an unchanged capacity. Or we can say that an omitted capacity gives us the length (if specified). We can say that an omitted length, with a specified capacity, gives us the maximal value, which is the new capacity. Or we can say that an omitted length gives us the current length (which might panic if larger than the specified capacity). With the two index slice expression there really isn't any confusion: omitting the value leaves it unchanged. We could use that rule for a three index slice expression but it isn't that useful, since it's not what people actually want from a three index slice expression. So it's not clear that there is any answer here that is both intuitive and useful. Which is why we didn't specify a default in the first place. If we have generics, we could write a function that does what you want. func SetCap(type Elem)(s []Elem, newcap int) []Elem {
return s[:newcap:newcap]
} |
I think
Both would require explanation because the third index in Go is distinct from other languages, but both of the length-dependent definitions are reasonable. All of the capacity-dependent definitions, including the Maybe there's nothing to do here, but, if not and there were generics, what I'd write is func Decap(type Elem)(s []Elem) []Elem {
return s[:len(s):len(s)]
} |
If I were reading a three-index slice in other people's code (especially in a code review) I would much rather see both the second and third indices as explicit. It draws attention to the fact that an important memory-aliasing-related operation is taking place, and it makes it clear what the exact intention is.
To me it's clear that if I'm debugging an aliasing bug, I'd rather have I see nothing wrong with a generic |
@IanTayler, when you take human factors into account, the choice is not just between explicit indices and elided indices. The choice today is among all of:
The point of this proposal is to reduce the rate of incidence of
The rate of incidence of correct explicit 3-index slices vs. correct elided 3-index slices seems like — at best — a secondary concern. (The verbosity of 3-index slices today provides a strong incentive for incorrect 2-index slices over correct-but-explicit 3-index slices, and 2-index slices can already elide both of the indices.) |
Per the discussion above, we just can't convince ourselves that there is any intuitive meaning for an omitted second operand in a three-element slice expression. It seems reasonable that it means "the length is unchanged," and it seems reasonable that it means "set the length to the (specified) capacity." Let's see if generic functions like Therefore, this is a likely decline. Leaving open for four weeks for final comments. -- for @golang/proposal-review |
No further comments. |
Just FYI, I have made two surveys on this topic:
The results are in line with anticipations. |
I propose to allow the second index of 3-index slice expressions to be elided, with the default value “min(original length, new capacity)”.
This is based on the observation that the 3-index slice form is almost always used with the new length set to either
0
or the same expression as the new capacity, and the new capacity set to either exactly the original length or value known not to exceed it (such as in numerous fixes for #34972).The specific default is chosen for the following properties:
a[::k]
for anyk
≤cap(a)
.x[::] ≡ x
.Template answers
Experienced.
C, C++, Standard ML, OCaml, Rust, JavaScript, Python, Bash, π-calculus, probably a few others I've missed.
Could go either way: 3-index slices are already relatively obscure, but this proposal arguably makes them more closely match the intuitive behavior.
This form was discussed in the original design doc, but explicitly deferred for later discussion. Now that we have released 12 intervening Go versions, I believe we have enough experience to continue that discussion.
It is not clear to me whether this point of the design has been revisited since Go 1.2. (If so, the original issue #1642 has no cross-references to that discussion.)
Anyone who uses
append
and doesn't enjoy debugging aliasing bugs.This proposal makes it easier (less verbose) to
append
to a slice that is never otherwise mutated, without risking overlapping mutations to the portion of the slice beyond its length.Allow the second (
high
) index of a 3-index “full slice expression” to be elided.Specifically, for an expression of the form
a[low : high : max]
, if thehigh
term is elided, it should default to min(len(a)
,max
).This corresponds to cases 3 and 6 in the original design doc. However, this proposal does not allow the third element to be elided: we do not need to choose between those two cases at this time.
If the middle index of a 3-index slice expression is omitted, it defaults to either the length of the original slice or the new capacity, whichever is smaller.
Yes. It is explicitly called out as a possibility in the Go 1.2 design doc, and I don't believe any change in the interim has rendered it impossible.
This example is from real code (in #38077), fixing an aliasing bug.
Unfixed:
go/src/cmd/go/internal/generate/generate.go
Line 441 in 2975b27
Fixed with the language as defined in Go 1.14:
Under this proposal:
The compiler and static analysis tools would need to be upgraded to accept the new form.
I believe the change to the compiler would be minimal.
go/ast.SliceExpr
already allows a nilExpr
in theHigh
position, so other syntax-based tools would be unaffected unless they assumeSliceExpr
invariants beyond what is promised in the documentation.Negligible.
None: an equivalent 3-index slice can already be written today.
Yes: assign the 3rd index to a temporary, then use that temporary (and a trivial
min
calculation) for both the 2nd and 3rd indices.No.
Objection: asked and answered.
It makes 3-index slices more consistent with 2-index slices, for which the
high
term may already be elided.No.
No.
No.
The text was updated successfully, but these errors were encountered: