Better indexing and iterators for structs to prevent copies #22567
peppergrayxyz started this conversation in Ideas
I'm currently working with Unicode text segmentation and have noticed a few things that could be improved in general.
Examples:
The current way...
...will copy the whole string, although it is not needed:
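A minimal sketch of the difference, assuming a plain `string` and two hypothetical helper functions (`sum_via_copy` and `sum_in_place` are illustrative names only):

```v
// sum_via_copy walks the bytes through s.bytes(), which allocates a new
// []u8 and copies every byte of s into it before the loop starts.
fn sum_via_copy(s string) int {
	mut total := 0
	for b in s.bytes() {
		total += int(b)
	}
	return total
}

// sum_in_place reads the same bytes by indexing the string directly,
// so no intermediate array is created.
fn sum_in_place(s string) int {
	mut total := 0
	for i in 0 .. s.len {
		total += int(s[i])
	}
	return total
}

fn main() {
	s := 'hello'
	println(sum_via_copy(s))
	println(sum_in_place(s))
}
```

Both functions compute the same value; only the first one pays for a full copy of the string.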
I use `struct string` as an example. For `string` you could just do `s[i]` instead of `s.bytes()[i]`, but this only works with `string`. `struct` types cannot be indexed with `[]`.
indexing
The array syntax could be allowed for structs and mapped to different functions:
`s.bytes[1]` --> `s.bytes.get(i int) u8`
`s.bytes[7..11]` --> `s.bytes.slice(start int, end int) []u8`
--> `for i /* ... */ get(i)`
`s.bytes` --> `slice(0, s.len)`
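Until something like this exists in the language, the mapping can be approximated by convention. The following is only a sketch: `Bytes` is a hypothetical view type, and `get` and `slice` are ordinary methods that the proposed `[]` syntax would call. Both return an option here because the sketch does not assume the compiler knows about a `len` field.

```v
// Hypothetical view type; the names Bytes, get and slice are illustrative.
struct Bytes {
	data []u8
}

// get returns the byte at index i, or none if i is out of range.
fn (b Bytes) get(i int) ?u8 {
	if i < 0 || i >= b.data.len {
		return none
	}
	return b.data[i]
}

// slice returns the bytes in [start, end), or none if the range is invalid.
// Slicing a V array yields a view onto the same underlying buffer.
fn (b Bytes) slice(start int, end int) ?[]u8 {
	if start < 0 || end > b.data.len || start > end {
		return none
	}
	return b.data[start..end]
}

fn main() {
	view := Bytes{
		data: 'hello world'.bytes()
	}
	// Roughly what view[1] would desugar to under the proposal.
	first := view.get(1) or { u8(0) }
	println(first)
	// Roughly what view[6..11] would desugar to.
	part := view.slice(6, 11) or { []u8{} }
	println(part.bytestr())
}
```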
Some types may not support direct indexing, but only iterating. There should be a warning if they are accessed using an index or a range other than `[0]` or `[..index]`. If only an index is needed, the sequence must be generated nevertheless, hence the syntax should be ..... and `s.runes[..3]` is translated using the iterator `for i /* ... */ s.runes.next(i)`. This clearly indicates whether an index access is just a cast or whether an iterator is involved. This example looks strange, but using an iterator like this is probably an antipattern anyway.
It's hard to spot what is going wrong here:
but way easier if the syntax already hints at the issue:
General case:
There should be at least one of `get`, `slice`, or `next`. Their return value should be optional except when there is a `len` field.
iterating
We have `.next` to iterate structs, but implementations that convert all data in one go to a target type (e.g. `s.runes()`) are not required to provide an iterator. The `range` part of the loop should not accept function calls, only the name of the target, and the loop finds the proper iterator:
`for b in s.bytes {` --> `s.bytes.get(i)`
`for r in s.runes {` --> `s.runes.next(i)`
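For comparison, this is roughly how a copy-free loop can be written with today's `next` convention, where a `for ... in` loop accepts any struct that has a `next()` method returning an option. `ByteIter` is a hypothetical type made up for this sketch; note that it has to carry its own position, which is the state the proposal would like the loop to provide instead.

```v
// Hypothetical iterator: it holds the string (a reference to its buffer,
// not a byte-by-byte copy) and the current position.
struct ByteIter {
	s string
mut:
	idx int
}

// next returns the next byte, or none once the end is reached.
fn (mut iter ByteIter) next() ?u8 {
	if iter.idx >= iter.s.len {
		return none
	}
	b := iter.s[iter.idx]
	iter.idx++
	return b
}

fn main() {
	iter := ByteIter{
		s: 'abc'
	}
	// The loop calls next() until it returns none.
	for b in iter {
		println(b)
	}
}
```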
Not having these function calls would prevent full copies. If you really want the full copy, you'd have to make it explicit:
Currently, `.next` needs at least an index variable. If we had the possibility to read and modify the iterator variable of a loop, we could do:
and convert it to
`.next` would not require storing state, i.e. the iterator could be attached to a struct without increasing the size of the struct. The iterator variable will be provided by the loop.
General case:
putting it together
We need a way to attach iterators/index functions
Probably not a good idea to stick anything to everything like in JavaScript, but there probably is an elegant way to handle this setup. Otherwise, this could just be done by convention:
the conversion functions should be there as well:
reuse
Most of these things already exist for arrays, but they are not generalized so that they can be used with structs. The main motivation for having these possibilities for structs is to access an array in different ways without creating (full) copies.
summary