Improve README #501

Open
wants to merge 2 commits into
base: master
87 changes: 85 additions & 2 deletions vector/README.md
@@ -1,6 +1,89 @@
The `vector` package [![Build Status](https://github.com/haskell/vector/workflows/CI/badge.svg)](https://github.com/haskell/vector/actions?query=branch%3Amaster)
====================

An efficient implementation of `Int`-indexed arrays (both mutable and immutable), with a powerful loop optimisation framework.
This package includes various modules that will allow you to
work with vectors and use an optimisation framework called [*fusion*](#fusion).
In this context, vector is an `Int`-indexed array-like data structure with a simpler
API that can contain any Haskell value. Additionally, its equivalence
to C-style arrays and optimisation via fusion accelerates vector’s
performance and makes it a great alternative to list. By installing this
package, you’ll be able to work with [boxed, unboxed, storable, and primitive
vectors](#vectors-available-in-the-package) as well as their generic interface.
Comment on lines +4 to +11
Contributor

Suggested change
An efficient implementation of `Int`-indexed arrays (both mutable and immutable).
This package provides multiple representations: [boxed, unboxed, storable, and primitive
vectors](#vectors-available-in-the-package), and a generic API that is polymorphic in the
vector type. It also has a powerful optimization framework based on [stream fusion](#fusion),
which can eliminate intermediate data structures.

I think it would be better to cut down this paragraph a bit.


See [`vector` on Hackage](http://hackage.haskell.org/package/vector) for more information.

## Table of Contents

<!-- no toc -->
- [Installation](#installation)
- [Tutorial](#tutorial)
- [Vector vs Array](#vector-vs-array)
- [Vectors Available in the Package](#vectors-available-in-the-package)
- [Fusion](#fusion)

## Installation
Contributor

Do we need an Installation section? There's nothing nonstandard; it works like any other package.


If you use **cabal**, modify `package.cabal` so that its `build-depends`
section includes the `vector` package:
```
build-depends: base ^>=4.17.2.1
, vector ==0.13.1.0
```
If you use **stack**, modify `package.yaml` so that its `dependencies`
section includes the `vector` package:
```
dependencies:
- base >= 4.7 && < 5
- vector == 0.13.1.0
```


## Tutorial

A beginner-friendly tutorial for vectors can be found on
[MMHaskell](https://mmhaskell.com/data-structures/vector).


If you have already started your adventure with vectors,
the tutorial on [Haskell Wiki](https://wiki.haskell.org/Numeric_Haskell:_A_Vector_Tutorial)
covers more ground.

## Vector vs Array
Contributor

Do we even need to compare with array? I think at this point it's mostly legacy and barely used.


Arrays are data structures that store many elements and allow
constant-time access to each of them. Even though GHC ships with the
[Data.Array module](https://hackage.haskell.org/package/array-0.5.7.0),
arrays might be a bit overwhelming to use due to their complex API.
Vectors, by contrast, combine the array’s *O(1)* access to elements
with a much friendlier, list-like API. Since they also support
optimisation via fusion, vectors stay efficient while keeping
a rich interface. Unless you’re confident with arrays, vectors are
usually the better choice when you need similar functionality.
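
To make the "list-like API with *O(1)* indexing" point concrete, a minimal sketch using the boxed `Data.Vector` module might look like this. The names `squares` and `thirdSquare` are illustrative only and do not come from the README.

```haskell
import qualified Data.Vector as V

-- Build a boxed vector from a list, then transform it with a list-like map.
squares :: V.Vector Int
squares = V.map (^ 2) (V.fromList [1 .. 10])

-- Constant-time indexing; (V.!?) returns Nothing when the index is out of bounds.
thirdSquare :: Maybe Int
thirdSquare = squares V.!? 2

main :: IO ()
main = do
  print thirdSquare                      -- Just 9
  print (V.sum (V.filter even squares))  -- 220
```

The combinators (`map`, `filter`, `sum`) mirror their list counterparts, while indexing does not need to walk the structure.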

## Vectors Available in the Package

**Boxed vectors** store each of its elements as a pointer to its value.
Because we cannot directly access the contents of a boxed vector, they
are slower in comparison to unboxed vectors.
Contributor

Suggested change
**Lazy boxed vectors** (`Data.Vector`) can store any Haskell value. Their elements are
stored as pointers to heap-allocated values, and that extra indirection results in worse
performance than unboxed vectors.

**Strict boxed vectors** (`Data.Vector.Strict`) are boxed vectors that are strict in their elements.



**Unboxed vectors** store solely their elements’ values instead of pointers.
To be unboxed, the elements need to be constant in size. Since we can directly
access the contents of the unboxed vector, working with them is quite efficient.
Contributor

In this case the name lies.

Suggested change
**Unboxed vectors** (`Data.Vector.Unboxed`) determine the array representation
from the type of their elements. For primitives (`Int`, `Double`, etc.) they are backed by primitive
arrays, and for tuples and product types by a structure of arrays. They generally use an unboxed
representation, so they are quite efficient.



**Storable vectors** are pinned, convertible to and from pointers, and
usable in C functions.
Contributor

Suggested change
**Storable vectors** (`Data.Vector.Storable`) are backed by pinned memory. Their primary
use case is C FFI.



**Primitive vectors** contain elements of primitive type.
Primitive types can be recognised by the hash sign attached at
the end of value and/or type’s name, e.g. `3#` or `Int#`. You can read
more about them [here](https://downloads.haskell.org/~ghc/5.00/docs/set/primitives.html).
Contributor

It's more precise to say that they can hold values that are instances of Prim. In fact it can't hold Int# — wrong kind.

Suggested change
**Primitive vectors** (`Data.Vector.Primitive`) are backed by a simple
byte array and can hold data types that are instances of the `Prim` type class.
These are data types represented in memory as a sequence of bytes without
pointers, such as `Int` and `Double`. It's usually better to use unboxed vectors, since
they have the same performance and are more general.
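
As a rough illustration of how the four representations discussed above differ at the type level, here is a minimal sketch. The value names (`names`, `points`, `samples`, `counts`) are made up for this example; only the module names come from the discussion above.

```haskell
import qualified Data.Vector as V
import qualified Data.Vector.Primitive as VP
import qualified Data.Vector.Storable as VS
import qualified Data.Vector.Unboxed as VU

-- Boxed: can hold any Haskell value, here Strings.
names :: V.Vector String
names = V.fromList ["alpha", "beta", "gamma"]

-- Unboxed: the element type needs an Unbox instance; tuples are supported.
points :: VU.Vector (Int, Double)
points = VU.fromList [(1, 0.5), (2, 1.5), (3, 2.5)]

-- Storable: the element type needs a Storable instance.
samples :: VS.Vector Double
samples = VS.fromList [0.1, 0.2, 0.3]

-- Primitive: the element type needs a Prim instance.
counts :: VP.Vector Int
counts = VP.fromList [10, 20, 30]

main :: IO ()
main = do
  print (V.length names)
  print (VU.sum (VU.map snd points))
  print (VS.maximum samples)
  print (VP.sum counts)
```

Each module exposes largely the same function names, so switching representation is mostly a matter of changing the import and the element-type constraint.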


## Fusion

An optimisation framework provided in this package, fusion
is a technique that merges several functions into one and forces
it to produce only one outcome. Without fusion, your program might
generate intermediate results for each function separately and
stall its performance.
Contributor

Specifically stream fusion as opposed to build/foldr fusion used for lists.

Suggested change
Vector uses stream fusion for optimization. This technique
avoids the creation of intermediate data structures. For example, the
expression `sum . filter g . map f` will not allocate temporary vectors
when compiled with optimizations.
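
A minimal sketch of such a pipeline, written against `Data.Vector.Unboxed`, could look like the following. The concrete definitions of `f` and `g` are placeholders chosen for illustration, and whether fusion actually fires depends on compiling with optimisations enabled (for example `-O2`).

```haskell
import qualified Data.Vector.Unboxed as VU

-- Placeholder element-wise function standing in for `f`.
f :: Int -> Int
f x = x * 3

-- Placeholder predicate standing in for `g`.
g :: Int -> Bool
g = even

-- The composed pipeline; with optimisations on, the intermediate
-- vectors produced by map and filter are expected to be fused away.
pipeline :: VU.Vector Int -> Int
pipeline = VU.sum . VU.filter g . VU.map f

main :: IO ()
main = print (pipeline (VU.enumFromN 1 10))  -- 90
```

The result is the same with or without fusion; only the allocation behaviour is expected to differ.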
