Description
Currently the documentation (or lack of it) suggests using Unboxed if you want unboxed, low-overhead vectors. In fact, the wiki linked in the description says:
End users should use Data.Vector.Unboxed for most cases
However, in my experience, it's really not the better default, as opposed to Storable (or perhaps Primitive). Here I propose to update the documentation to make Unboxed less prominent, and even explicitly discourage its use (to counter the natural appeal of its name).
The Unbox class is hard to understand
It involves
- Multi-param type classes, and the class synonym trick (Unbox)
- Data type families (which I think are even more niche than plain type families)
- The names MVector and Vector are used both for classes and data type families, adding to the confusion
- PrimMonad (should monads even be a prerequisite to use efficient vectors for user-defined types?)
Hence, most new users would probably be unable to complete the sketched implementation of Unbox for Complex in the documentation of Data.Vector.Unboxed.
Furthermore, because of the methods parameterized by PrimMonad instances, it's not possible to use DerivingVia to abstract any of those details. So you either pay 30 lines of boilerplate, or use CPP/Template Haskell. Even assuming users are capable of adapting that boilerplate to their own types, that doesn't look "very easy" (quoted from the documentation).
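To give a sense of the boilerplate in question, here is a rough sketch of a hand-written Unbox instance, adapted from the pattern sketched for Complex in the Data.Vector.Unboxed documentation. The Point type and the MV_Point/V_Point constructor names are hypothetical, made up for this example, and the exact set of class methods is an assumption that may differ between vector releases.

```haskell
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeFamilies #-}

import qualified Data.Vector.Generic as G
import qualified Data.Vector.Generic.Mutable as M
import qualified Data.Vector.Unboxed as U

-- Hypothetical user-defined type, used only for illustration.
data Point = Point !Double !Double
  deriving (Eq, Show)

-- Data family instances: a vector of Point is represented by an
-- unboxed vector of pairs (which is itself a pair of vectors).
newtype instance U.MVector s Point = MV_Point (U.MVector s (Double, Double))
newtype instance U.Vector Point = V_Point (U.Vector (Double, Double))

-- Mutable vector class: every method just unwraps and rewraps the newtype.
instance M.MVector U.MVector Point where
  basicLength (MV_Point v) = M.basicLength v
  basicUnsafeSlice i n (MV_Point v) = MV_Point (M.basicUnsafeSlice i n v)
  basicOverlaps (MV_Point v1) (MV_Point v2) = M.basicOverlaps v1 v2
  basicUnsafeNew n = MV_Point <$> M.basicUnsafeNew n
  basicInitialize (MV_Point v) = M.basicInitialize v
  basicUnsafeRead (MV_Point v) i = uncurry Point <$> M.basicUnsafeRead v i
  basicUnsafeWrite (MV_Point v) i (Point x y) = M.basicUnsafeWrite v i (x, y)

-- Immutable vector class: same story.
instance G.Vector U.Vector Point where
  basicUnsafeFreeze (MV_Point v) = V_Point <$> G.basicUnsafeFreeze v
  basicUnsafeThaw (V_Point v) = MV_Point <$> G.basicUnsafeThaw v
  basicLength (V_Point v) = G.basicLength v
  basicUnsafeSlice i n (V_Point v) = V_Point (G.basicUnsafeSlice i n v)
  basicUnsafeIndexM (V_Point v) i = uncurry Point <$> G.basicUnsafeIndexM v i

-- The "class synonym": no methods of its own.
instance U.Unbox Point

-- Example value, only to show that the instance is usable.
points :: U.Vector Point
points = U.fromList [Point 0 1, Point 2 3]
```

None of these lines says anything specific about Point beyond "it is a pair of Doubles", yet all of them have to be written out (or generated) for each new type.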
Unboxed is not the most efficient
In addition, although Unboxed is touted as efficient because vectors of tuples are merely tuples of vectors, a flat Storable vector is actually more efficient in almost all cases: it has far fewer indirections, and having every entry in a contiguous piece of memory makes it more cache-friendly. The Storable class is also so much easier to understand and use than Unbox. Access patterns taking advantage of Unboxed's layout are arguably rare.
That means this sentence at the top of Data.Vector.Primitive (which is similar to Storable in those respects) is incorrect:
Adaptive unboxed vectors defined in Data.Vector.Unboxed are significantly more flexible at no performance cost.
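For comparison, here is a minimal sketch of what the Storable route can look like for a user-defined type. The Point type, the points vector and the sumX helper are hypothetical names made up for this example, and the layout (two consecutive Doubles, aligned like a Double) is an assumption of the sketch rather than anything the library prescribes.

```haskell
import Foreign.Ptr (castPtr)
import Foreign.Storable (Storable (..))
import qualified Data.Vector.Storable as VS

-- Hypothetical user-defined type, used only for illustration.
data Point = Point !Double !Double
  deriving (Eq, Show)

-- Assumed layout: the two Doubles sit next to each other in memory.
instance Storable Point where
  sizeOf _ = 2 * sizeOf (undefined :: Double)
  alignment _ = alignment (undefined :: Double)
  peek p = do
    -- View the Point pointer as a pointer to its Double fields.
    x <- peekElemOff (castPtr p) 0
    y <- peekElemOff (castPtr p) 1
    pure (Point x y)
  poke p (Point x y) = do
    pokeElemOff (castPtr p) 0 x
    pokeElemOff (castPtr p) 1 y

-- Each entry occupies one contiguous slot of the underlying buffer.
points :: VS.Vector Point
points = VS.fromList [Point 0 1, Point 2 3, Point 4 5]

-- Sum of the first coordinates, as a small usage example.
sumX :: VS.Vector Point -> Double
sumX = VS.foldl' (\acc (Point x _) -> acc + x) 0
```

The whole instance is four ordinary methods in IO, with no type families or multi-parameter classes involved, and both fields of an entry are stored side by side.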
Storable vs Primitive
Instead of Unboxed, either Storable or Primitive should be promoted.
I'm still uncertain about the trade-offs of who should manage a vector's memory.
- Although Storable is originally an FFI thing, it still seems to do a fine job for general-purpose programming, and it doesn't have the 2x factor of GC-controlled memory, as opposed to Primitive.
- MagicHash is a recurrent point of confusion, and that also plays against Prim from a documentation/usability point of view.
I'm sure there are advantages to Primitive, but I'm not sure they are sufficiently strong to make it a better default than Storable.