proposal: slices: add Divide (split into equally sized chunks) #65523
Comments
Do you have any use cases in mind for |
The veggiemonk version of Divide does Divide(6, 4) → 1, 2, 1, 2, which I find sort of weird. I like the avamsi version best, where it distributes the remainder starting from the first in the batch until there aren't any remainders left. An alternative would be to just pile all of the remainder onto the first or last slice, but I agree that you usually don't want that. |
Here's a working version with a test.
// Divide returns an iterator that divides a slice into n sub-slices of roughly equal length.
// If there is a remainder from dividing the slice,
// it will be distributed evenly starting from the first sub-slice.
// Sub-slices may be empty if n is greater than the length of the slice.
// Divide panics if n is less than one.
// All sub-slices are clipped to have no capacity beyond their length.
func Divide[S ~[]E, E any](slice S, n int) iter.Seq[S] {
	if n < 1 {
		panic(fmt.Sprintf("attempted to divide a slice by n < 1: n = %d", n))
	}
	return func(yield func(S) bool) {
		size := len(slice) / n
		remainder := len(slice) % n
		var start, end int
		for range n {
			start, end = end, end+size
			if remainder > 0 {
				remainder--
				end++
			}
			if !yield(slice[start:end:end]) {
				return
			}
		}
	}
} |
At least for those of us with a Unix background, Split might be the better name. But also: Is this worth putting in the standard library? It's not a common operation in my experience. And as you demonstrate, it's not hard. |
I think it's tricky enough to warrant being in the standard library. I forget where the last time I needed it was, but I've definitely needed it in the last couple of years and just wrote a worse version that used a slice of slices and didn't handle the edge cases correctly. |
Should this live in
func Chunk[E any](s Seq[E], n int) Seq[Seq[E]] {
	// implementation
} |
@leaxoy there's a different proposal (#53987) for slices.Chunk (the int argument it takes is the chunk size) and there might be an xiter version if the slices version is accepted. slices.Divide (the int argument it takes is the number of chunks) OTOH needs len upfront to calculate the chunk size, so I don't think an iter / xiter version is possible. |
I think it would be very confusing to have an iterator chunker that takes the max size of the chunk, and a slice chunker that takes the number of chunks desired. Then |
On the topic of naming, it might be of interest that when I first read this proposal I misunderstood (based on the name and the short summary) that this was "divide a slice into chunks of size N (with a possible remainder)" rather than "divide a slice into N chunks of (roughly) the same length". I've seen the former more often in other languages and so I suppose that's why my mind went there first.
Unless I'm the only one with that confusion, it seems to me like the name should try to communicate whether the
Sadly, my only idea for that so far was to put an
NChunks(slice, chunkCount)
ChunksOfN(slice, chunkLength)
Hopefully either I'm the only one with this confusion (and thus it isn't a problem that needs solving at all) or someone else has a better imagination than me and suggests something more normal-looking.
Edit: Sorry, I had this comment written in my browser and didn't submit it immediately due to a distraction, and then when I submitted it I found it was now a little redundant with earlier comments. I hope it's still at least a little useful. |
I agree that there's a question of which is Chunk and which is Divide and it's probable that you could get confused about which does what, but I'm not sure if there's a better solution than just having both under those names. |
If we accept Chunk, then Divide is:
Do I have that right? |
Divide(slice len 10, n 4) should ideally yield sub-slices of lens 3, 3, 2, 2. |
Is this facility important enough to commit to the standard library? The discussion shows the problem itself isn't even well understood. Beware the endless addition of "things one can do with a slice" (or map) to the library. Not every operation needs formal support, and the more things you put in the library, the more things need support from an already busy group. |
Indeed, I think it would lead to more confusion to know which one does what. I didn't think of all the possible ramifications when I restarted the discussion in #53987. (my first time participating on this repo). I appreciate the discussion but I would leave Thank you all for your time and attention. |
This proposal has been added to the active column of the proposals project |
To me, most of the trickiness seems to be not in the implementation, but in deciding what the implementation should do. ISTM that mainly, there are cases (I wouldn't even call them "edge" cases - numbers not being divisible is the norm) where there are multiple reasonable answers and it's ambiguous what a user might want. That, to me, speaks against inclusion in the standard library. Inclusion in the standard library would be a canonical policy decision, yet there doesn't seem to be a canonical policy on what users want. |
I have implemented splitting of arrays several times, but I think only once or twice have I explicitly re-used the same method to do the splitting, almost always re-writing it based on the surrounding context. I am with Merovius, that this proposal is going to be down to what the implementation should do. Expanding their comment with examples, Even if we decide to go with one particular "packing" over another (IE
Or perhaps we do none of these, explicitly shuffling the items in the same vein as how maps are "shuffled" to prevent predicting and packing the map? I feel that this is a situation where we need to examine what code publicly exists already. That would show us Unless/Until we get that examination, I can't see this proposal being accepted. |
Based on the discussion above, this proposal seems like a likely decline. |
No change in consensus, so declined. |
Proposal Details
Follow up to #53987. This is a proposal to add an iterator to the slices package that divides a slice into N equally sized subslices.
Bikeshed topics: What should the name of this function be? How does it handle special cases?
I think it should always iterate over N items, but if the slice is too short, it may return subslices of length 0. So
Should it be: