From f4136a17956fe70560de5fd0775bed71a8ae4ff0 Mon Sep 17 00:00:00 2001
From: Aleksandra Wasilewska
Date: Thu, 1 Aug 2024 18:07:08 +0200
Subject: [PATCH 1/5] Improve README

---
 vector/README.md | 87 ++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 85 insertions(+), 2 deletions(-)

diff --git a/vector/README.md b/vector/README.md
index 2524fd04..9e6818a6 100644
--- a/vector/README.md
+++ b/vector/README.md
@@ -1,6 +1,89 @@
 The `vector` package [![Build Status](https://github.com/haskell/vector/workflows/CI/badge.svg)](https://github.com/haskell/vector/actions?query=branch%3Amaster)
 ====================
 
-An efficient implementation of `Int`-indexed arrays (both mutable and immutable), with a powerful loop optimisation framework.
+This package includes various modules that will allow you to
+work with vectors and use an optimisation framework called [*fusion*](#fusion).
+In this context, vector is an `Int`-indexed array-like data structure with a simpler
+API that can contain any Haskell value. Additionally, its equivalence
+to C-style arrays and optimisation via fusion accelerates vector’s
+performance and makes it a great alternative to list. By installing this
+package, you’ll be able to work with [boxed, unboxed, storable, and primitive
+vectors](#vectors-available-in-the-package) as well as their generic interface.
 
-See [`vector` on Hackage](http://hackage.haskell.org/package/vector) for more information.
+
+## Table of Contents
+
+
+- [Installation](#installation)
+- [Tutorial](#tutorial)
+- [Vector vs Array](#vector-vs-array)
+- [Vectors Available in the Package](#vectors-available-in-the-package)
+- [Fusion](#fusion)
+
+## Installation
+
+If you use **cabal**, modify `package.cabal` so that its `build-depends`
+section includes vector package:
+```
+build-depends: base ^>=4.17.2.1
+ , vector ==0.13.1.0
+```
+If you use **stack**, modify `package.yaml` so that its `depends`
+section includes vector package:
+```
+dependencies:
+- base >= 4.7 && < 5
+- vector == 0.13.1.0
+```
+
+
+## Tutorial
+
+A beginner-friendly tutorial for vectors can be found on
+[MMHaskell](https://mmhaskell.com/data-structures/vector).
+
+
+If you have already started your adventure with vectors,
+the tutorial on [Haskell Wiki](https://wiki.haskell.org/Numeric_Haskell:_A_Vector_Tutorial)
+covers more ground.
+
+## Vector vs Array
+
+Arrays are data structures that can store a multitude of elements
+and allow immediate access to every one of them. Even though Haskell
+has a built-in [Data.Array module](https://hackage.haskell.org/package/array-0.5.7.0),
+arrays might be a bit overwhelming to use due to their complex API.
+Conversely, vectors incorporate the array’s *O(1)* access to elements
+with a much friendlier API of lists. Since they allow for framework
+optimisation via loop fusion, vectors emphasise efficiency and keep
+a rich interface. Unless you’re confident with arrays, it’s
+well-advised to use vectors when looking for a similar functionality.
+
+## Vectors Available in the Package
+
+**Boxed vectors** store each of its elements as a pointer to its value.
+Because we cannot directly access the contents of a boxed vector, they
+are slower in comparison to unboxed vectors.
+
+
+**Unboxed vectors** store solely their elements’ values instead of pointers.
+To be unboxed, the elements need to be constant in size. Since we can directly
+access the contents of the unboxed vector, working with them is quite efficient.
+
+
+**Storable vectors** are pinned, convertible to and from pointers, and
+usable in C functions.
+
+
+**Primitive vectors** contain elements of primitive type.
+Primitive types can be recognised by the hash sign attached at
+the end of value and/or type’s name, e.g. `3#` or `Int#`. You can read
+more about them [here](https://downloads.haskell.org/~ghc/5.00/docs/set/primitives.html).
+
+## Fusion
+
+An optimisation framework provided in this package, fusion
+is a technique that merges several functions into one and forces
+it to produce only one outcome. Without fusion, your program might
+generate intermediate results for each function separately and
+stall its performance.
\ No newline at end of file

From f2466743b5caf2a06ed2e25beed77a60bb776317 Mon Sep 17 00:00:00 2001
From: Aleks
Date: Sat, 24 Aug 2024 14:06:49 +0200
Subject: [PATCH 2/5] Apply Suggestions

---
 vector/README.md | 69 ++++++++++++++++++------------------------------
 1 file changed, 26 insertions(+), 43 deletions(-)

diff --git a/vector/README.md b/vector/README.md
index 9e6818a6..267859da 100644
--- a/vector/README.md
+++ b/vector/README.md
@@ -14,28 +14,10 @@ vectors](#vectors-available-in-the-package) as well as their generic interface.
 ## Table of Contents
 
 
-- [Installation](#installation)
 - [Tutorial](#tutorial)
 - [Vector vs Array](#vector-vs-array)
 - [Vectors Available in the Package](#vectors-available-in-the-package)
-- [Fusion](#fusion)
-
-## Installation
-
-If you use **cabal**, modify `package.cabal` so that its `build-depends`
-section includes vector package:
-```
-build-depends: base ^>=4.17.2.1
- , vector ==0.13.1.0
-```
-If you use **stack**, modify `package.yaml` so that its `depends`
-section includes vector package:
-```
-dependencies:
-- base >= 4.7 && < 5
-- vector == 0.13.1.0
-```
-
+- [Stream Fusion](#stream-fusion)
 
 ## Tutorial
 
@@ -61,29 +43,30 @@ well-advised to use vectors when looking for a similar functionality.
 
 ## Vectors Available in the Package
 
-**Boxed vectors** store each of its elements as a pointer to its value.
-Because we cannot directly access the contents of a boxed vector, they
+**Lazy boxed vectors** (`Data.Vector`) store each of their elements as a
+pointer to a heap-allocated value. Because of indirection, lazy boxed vectors
 are slower in comparison to unboxed vectors.
 
-
-**Unboxed vectors** store solely their elements’ values instead of pointers.
-To be unboxed, the elements need to be constant in size. Since we can directly
-access the contents of the unboxed vector, working with them is quite efficient.
-
-
-**Storable vectors** are pinned, convertible to and from pointers, and
-usable in C functions.
-
-
-**Primitive vectors** contain elements of primitive type.
-Primitive types can be recognised by the hash sign attached at
-the end of value and/or type’s name, e.g. `3#` or `Int#`. You can read
-more about them [here](https://downloads.haskell.org/~ghc/5.00/docs/set/primitives.html).
-
-## Fusion
-
-An optimisation framework provided in this package, fusion
-is a technique that merges several functions into one and forces
-it to produce only one outcome. Without fusion, your program might
-generate intermediate results for each function separately and
-stall its performance.
\ No newline at end of file
+**Strict boxed vectors** (`Data.Vector.Strict`) contain elements that are
+[strictly evaluated](https://tech.fpcomplete.com/haskell/tutorial/all-about-strictness/).
+
+**Unboxed vectors** (`Data.Vector.Unboxed`) determine an array's representation
+from its elements' type. For example, vector of primitive types (e.g. `Int`) will be
+backed by primitive array while vector of product types by structure of arrays.
+They are quite efficient due to the unboxed representation they use.
+
+**Storable vectors** (`Data.Vector.Storable`) are backed by pinned memory, i.e.,
+they cannot be moved by the garbage collector. Their primary use case is C FFI.
+
+**Primitive vectors** (`Data.Vector.Primitive`) are backed by simple byte array and
+can store only data types that are represented in memory as a sequence of bytes without
+a pointer, i.e., they belong to the `Prim` type class, e.g., `Int`, `Double`, etc.
+It's advised to use unboxed vectors if you're looking for the performance of primitive vectors,
+but more versality.
+
+## Stream Fusion
+
+An optimisation framework used by vectors, stream fusion is a technique that merges
+several functions into one and prevents creation of intermediate data structures. For example,
+the expression `sum . filter g . map f` won't allocate temporary vectors if
+compiled with optimisations.
\ No newline at end of file

From ec9466b400b9d963beec4add4ef0037024b74760 Mon Sep 17 00:00:00 2001
From: Aleks
Date: Sat, 28 Sep 2024 14:38:45 +0200
Subject: [PATCH 3/5] Shorter Intro

---
 vector/README.md | 17 +++++++----------
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/vector/README.md b/vector/README.md
index 267859da..b9949235 100644
--- a/vector/README.md
+++ b/vector/README.md
@@ -1,15 +1,12 @@
 The `vector` package [![Build Status](https://github.com/haskell/vector/workflows/CI/badge.svg)](https://github.com/haskell/vector/actions?query=branch%3Amaster)
 ====================
 
-This package includes various modules that will allow you to
-work with vectors and use an optimisation framework called [*fusion*](#fusion).
-In this context, vector is an `Int`-indexed array-like data structure with a simpler
-API that can contain any Haskell value. Additionally, its equivalence
-to C-style arrays and optimisation via fusion accelerates vector’s
-performance and makes it a great alternative to list. By installing this
-package, you’ll be able to work with [boxed, unboxed, storable, and primitive
-vectors](#vectors-available-in-the-package) as well as their generic interface.
-
+Vector is a collection of efficient `Int`-indexed array implementations:
+[boxed, unboxed, storable, and primitive vectors](#vectors-available-in-the-package)
+(all can be mutable or immutable). The advantages of vectors include equivalence to
+C-style arrays and simple interface. The package also features generic API that is
+polymorphic in vector type. Additionally, vector implements [*stream fusion*](#stream-fusion),
+a powerful optimisation framework that can help eliminate intermediate data structures.
 
 ## Table of Contents
 
@@ -66,7 +63,7 @@ but more versality.
 
 ## Stream Fusion
 
-An optimisation framework used by vectors, stream fusion is a technique that merges
+An optimisation framework used by vectors, stream fusion is a technique that merges
 several functions into one and prevents creation of intermediate data structures. For example,
 the expression `sum . filter g . map f` won't allocate temporary vectors if
 compiled with optimisations.
\ No newline at end of file

From fb08c1fa50314544173cb063d87216e308139b51 Mon Sep 17 00:00:00 2001
From: Aleks
Date: Sat, 28 Sep 2024 14:40:28 +0200
Subject: [PATCH 4/5] Mention that array is legacy

---
 vector/README.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/vector/README.md b/vector/README.md
index b9949235..7bb88214 100644
--- a/vector/README.md
+++ b/vector/README.md
@@ -29,8 +29,9 @@ covers more ground.
 ## Vector vs Array
 
 Arrays are data structures that can store a multitude of elements
-and allow immediate access to every one of them. Even though Haskell
-has a built-in [Data.Array module](https://hackage.haskell.org/package/array-0.5.7.0),
+and allow immediate access to every one of them. However, they are
+often seen as legacy constructs that are rarely used in modern Haskell.
+Even though Haskell has a built-in [Data.Array module](https://hackage.haskell.org/package/array-0.5.7.0),
 arrays might be a bit overwhelming to use due to their complex API.
 Conversely, vectors incorporate the array’s *O(1)* access to elements
 with a much friendlier API of lists. Since they allow for framework

From ad6f5086dbf8ca4536fe8ee922092b833d6ed58d Mon Sep 17 00:00:00 2001
From: Aleks
Date: Sun, 29 Sep 2024 17:33:56 +0200
Subject: [PATCH 5/5] Shorter intro

---
 vector/README.md | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/vector/README.md b/vector/README.md
index 7bb88214..eca52fa2 100644
--- a/vector/README.md
+++ b/vector/README.md
@@ -3,9 +3,8 @@ The `vector` package [![Build Status](https://github.com/haskell/vector/workflow
 
 Vector is a collection of efficient `Int`-indexed array implementations:
 [boxed, unboxed, storable, and primitive vectors](#vectors-available-in-the-package)
-(all can be mutable or immutable). The advantages of vectors include equivalence to
-C-style arrays and simple interface. The package also features generic API that is
-polymorphic in vector type. Additionally, vector implements [*stream fusion*](#stream-fusion),
+(all can be mutable or immutable). The package features a generic API,
+polymorphic in vector type, and implements [*stream fusion*](#stream-fusion),
 a powerful optimisation framework that can help eliminate intermediate data structures.
 
 ## Table of Contents