Hm, let me try making a list of all the relevant aspects and assumptions that influence this decision. This is going to be a bit train-of-thought-y, and not based on first-hand experience, so I might be talking bullshit or considering unimportant details. ;)

I'm assuming ~[] to be a non-growable owned slice under DST, and also that strings will be handled identically, so Vec/StrBuf and ~[]/~str are interchangeable here. Furthermore, I'm assuming that the two valid options are "Vec everywhere as recommended default" and "Both Vec and ~[] as recommended default".

So, as far as I can see at least these aspects are important:

1. Requesting and resizing memory allocations is not zero-cost.
   - Depending on the allocator some operations might be zero-cost, but you can't assume it in general.
2. Forgetting about a suffix of an allocation does not work with all allocators.
   - Dropping excess capacity needs some kind of shrink_allocation operation in the generic case.
3. Memory usage and time complexity in general should be as small as possible.
4. An allocator might provide you with more memory than asked for, at no extra cost.
   - Meaning you can sometimes gain capacity for free.
5. Users might want a way to encode "this will never grow" into the type of a value.
   - The point is to make code easier to reason about by knowing that a type is inherently ungrowable.
6. The convention about which type to use should be as simple as possible.
7. It should be easy for the user to decide which type to use.
8. Internal implementation details should not leak because of this convention.
9. People will make mistakes, or not read the docs, so the convention should not lead to errors spreading through a code base.
10. Some people think that because ~[T] is a more built-in/vectory syntax than Vec<T>, it should be used more.

If it were just about 1-3, then the logical choice would be that everything that returns a vector that could conceivably have excess capacity (due to incremental build-up or similar) returns a Vec, and everything that returns a vector that just requires a single allocation returns a ~[]. If you want to grow it further, the latter can be turned into a Vec for zero cost, and if you don't want to grow it you can just store either one as is, with the only wasted space being the unused capacity field of the growable vector, which is a constant cost as opposed to the potential O(n) cost of shrink_allocation.
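To make that concrete, here is a rough sketch in today's Rust. It assumes `Box<[T]>` as a stand-in for the DST-style `~[T]`, and the two builder functions are made up for illustration:

```rust
use std::mem::size_of;

/// Builds a vector incrementally, so it may end up with spare capacity.
fn build_incrementally(n: u32) -> Vec<u32> {
    let mut v = Vec::new();
    for i in 0..n {
        v.push(i);
    }
    v
}

/// Builds a fixed-size owned slice; `Box<[u32]>` stands in for `~[u32]`.
fn build_exact(n: u32) -> Box<[u32]> {
    (0..n).collect::<Vec<u32>>().into_boxed_slice()
}

fn main() {
    // Vec is a (pointer, capacity, length) triple; the boxed slice is a
    // (pointer, length) pair, so it is one word smaller.
    println!("Vec<u32>:   {} bytes", size_of::<Vec<u32>>());
    println!("Box<[u32]>: {} bytes", size_of::<Box<[u32]>>());

    let grown = build_incrementally(5);
    println!("len {}, capacity {}", grown.len(), grown.capacity());

    // Turning the fixed-size slice into a Vec reuses the allocation.
    let mut v = build_exact(5).into_vec();
    v.push(5); // growable again
    println!("{:?}", v);
}
```

The one-word difference is the "constant cost" from above; going the other way (Vec to fixed-size slice) is where a shrink may be needed.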

If you also consider 4, then the choice of using ~[] at all becomes harder, as it's now a choice between allowing the user to save one word of memory for zero cost, or potentially allowing the user to grow the allocation more cheaply than expected for zero cost, both being relatively weak optimizations.

5 might be an argument in favor of ~[], but we already offer better support than other languages for encoding this without using ~[]: A vector in an immutable location is already impossible to change, and both &[] and &mut [] offer a very cheap way to arrive at a type that is inherently not growable. However, neither option is exactly free of cognitive overhead, as they both restrict the base type Vec to behave like ~[], rather than having a restricted type to begin with. (In other words, it's slightly more complex.)
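A small sketch of those two ways to get "not growable" out of a plain Vec, in today's Rust (the function names are made up for illustration):

```rust
/// Can overwrite elements but cannot change the length:
/// slices have no `push`.
fn double_in_place(xs: &mut [i32]) {
    for x in xs.iter_mut() {
        *x *= 2;
    }
}

/// Read-only view of the same data.
fn sum(xs: &[i32]) -> i32 {
    xs.iter().sum()
}

fn main() {
    let mut v = vec![1, 2, 3];
    double_in_place(&mut v); // &mut Vec<i32> coerces to &mut [i32]
    println!("{:?}", v);

    let frozen = v; // immutable binding: no growing from here on
    // frozen.push(7); // would not compile: `frozen` is not declared mutable
    println!("{}", sum(&frozen));
}
```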

But, there is also 6 and 7 to think about: Rust is already a complex language, so any convention a new user of the language has to learn up front should be as simple as is practical. And arrays and vectors, while being as basic as it gets, already are more complex than in other languages due to unsized [T], ownership, and unboxed values. And, as trivial as it sounds, always using the same type is easier than needing to decide or convert between two different types all the time.

For 8, if the convention becomes to choose between ~[] and Vec<T>, then this can leak implementation details, as the decision on what to return is partly based on how it's constructed. This can also lead to API instability or performance problems, as a change in the implementation either forces the API to change, or requires calling shrink_allocation to keep the API stable. Again, using Vec consistently would solve this simply by adding the constant cost of a capacity field.
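As a hypothetical illustration in today's Rust (both functions are invented for this example): the first version knows its length up front, the second builds incrementally. With "Vec everywhere" the signature stays the same across the rewrite, whereas under a two-type convention the first would arguably return the fixed-size type and the second a Vec.

```rust
/// First implementation: the length is known up front, so a single
/// allocation is enough.
fn evens_v1(n: usize) -> Vec<usize> {
    (0..n).map(|i| 2 * i).collect()
}

/// Later implementation: filters while building, so the vector grows
/// incrementally and may carry spare capacity.
fn evens_v2(n: usize) -> Vec<usize> {
    let mut out = Vec::new();
    let mut candidate = 0;
    while out.len() < n {
        if candidate % 2 == 0 {
            out.push(candidate);
        }
        candidate += 1;
    }
    out
}

fn main() {
    // Same signature, same result; callers never see the difference.
    println!("{:?}", evens_v1(4));
    println!("{:?}", evens_v2(4));
}
```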

The problem with 9 is that a wrongly used ~[] can bubble up through API layers, as other users assume it's being chosen for the correct reasons, or worse, convert it to a Vec along the way, with a potentially useful capacity getting lost without anyone noticing.
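A sketch of how that silent capacity loss could look in today's Rust, again assuming `Box<[T]>` as a stand-in for `~[T]`; the layer functions are hypothetical:

```rust
/// Hypothetical lower layer: reserves room because it expects callers
/// to append more data.
fn read_header() -> Vec<u8> {
    let mut buf = Vec::with_capacity(64);
    buf.extend_from_slice(b"HDR");
    buf
}

/// Hypothetical middle layer that passes the data along as a
/// fixed-size owned slice.
fn middle_layer() -> Box<[u8]> {
    read_header().into_boxed_slice() // spare capacity is dropped here
}

fn main() {
    let direct = read_header();
    // A caller further up converts back to a Vec to keep appending.
    let round_trip = middle_layer().into_vec();

    println!("direct capacity:     {}", direct.capacity());
    println!("round-trip capacity: {}", round_trip.capacity());
}
```

Nothing in the types warns the final caller that the reserved space is gone; the next push simply reallocates.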

Lastly, with 10 I think this is the same "provide short, intuitive built-in syntax for common structures" desire that gets expressed every time language features move into the library or get more generic. Certain things simply look way more verbose in today's Rust than in that of one or two years ago, so every time a new verbosity/complexity develops, people try to find ways to minimize its impact. (I remember falling into the same trap at the time removing @T was first talked about.)
In this case, this is about losing ~[T] completely as a short form (at least as a recommended default type). Seeing how this argument is really just about syntax and DST in general, I don't think it should be considered for this decision, as the semantic and performance implications are way more important, and the ergonomic ones more far-reaching.

So in conclusion, I think using Vec and StrBuf everywhere by default, instead of also ~[] and ~str, is the better choice, as the advantages far outweigh the disadvantages.

Settle conventions for ~str/StrBuf, ~[T]/Vec #13717
