-
Notifications
You must be signed in to change notification settings - Fork 69
A Gentle Introduction to mdspan
ISO-C++ standard proposals are not the best tutorials for how to use their proposed features. This is intentional—there's enough content to be written in exploring the technical design issues that a tutorial-style introduction is out of place in such documents. However, it also means that proposals such as P0009 for mdspan
reach a level of maturity where they are ready for production implementation and use, but without a single tutorial-style introduction. Here's an attempt to do so for mdspan
.
In its simplest form, std::mdspan
1 is a straightforward extension of std::span
to enable multidimensional arrays.
(The std::span
class was added in C++20, and introductions to it can be found all over the internet.
Thus, this tutorial assumes basic familiarity with std::span
.)
You can create them from a pointer to the underlying data,
and its extents (the dimensions of the multidimensional array).
int* data = /* ... */
// View data as contiguous memory representing 4 ints
auto s = std::span(data,4);
// View data as contiguous memory representing 2 rows
// of 2 ints each
auto ms = std::mdspan(data,2,2);
In the example above, we use CTAD (constructor template argument deduction, a language feature introduced in C++17)
so that we don't have to name the template arguments of the mdspan
class. The compiler figures them out for us.
Accessing the data then happens via its []
operator, just as with span
or vector
.
for(size_t i=0; i<ms.extent(0); i++)
for(size_t j=0; j<ms.extent(1); j++)
ms[i, j] = i*1000 + j;
(Note that the mdspan
implementation in this repository uses ()
if the compiler doesn't
support multi-argument []
- a feature which was voted into C++23.)
mdspan
s extent
function takes an index to indicate which extent
the caller wants.
Just like span
, you can use both dynamic extents and static extents with mdspan
.
Unlike span
, mdspan
is not directly templated on individual extents, but packs all of them into an explicit extents
object.
An extents
object represents the "shape" of a multidimensional array.
It can encode any number and combination of extents as compile-time values
(that are part of the type of the extents
object),
and encodes the remaining extents as run-time values
(that must be given as arguments to the extents
constructor).
// Create an extents object with one run-time dimension 3,
// and one static (compile-time) extent 4.
// It represents the shape of a 3 x 4 array.
std::extents<int, std::dynamic_extent, 4> exts(3);
The extents
class does not only take the actual sizes (or extents) of each dimension, it also takes as an argument the index_type
,
the extents
class - and through it mdspan
- will use to store the extents, and do index computation.
The total number of arguments after the index_type
is also the rank
of the extents
object, and later the rank
of the mdspan
.
You can construct an mdspan
by providing the individual extents to the constructor (either all of them, or just the dynamic ones),
or handing in an extents
object.
int* data = /* ... */
int size = /* ... */
int rows = /* ... */
constexpr int cols = 4;
// An extents type using uint32_t to represent each extent,
// with a run-time number of rows, and 4 columns.
using ext_t = std::extents<uint32_t, std::dynamic_extent, cols>;
// When creating the mdspan, we only need to give run-time extents.
// The compile-time extents are already baked into the extents type.
auto ms = std::mdspan<int, ext_t>(data, rows);
assert(ms.extents(0) == rows);
assert(ms.extents(1) == cols);
static_assert(ms.static_extent(1) == cols);
static_assert(decltype(ms)::rank() == 2);
// If we don't specify the mdspan's template parameters,
// then we need to pass in all the extents.
// It will then store all extents as run-time values.
auto msd = std::mdspan(data, rows, cols);
assert(ms.extents(0) == rows);
assert(ms.extents(1) == cols);
static_assert(decltype(msd)::rank() == 2);
Note that dextents
is a convenient alias for an extents
type with all dynamic_extent
.
static_assert<std::is_same_v<std::dextents<uint32_t, 2>,
std::extents<uint32_t, std::dynamic_extent, std::dynamic_extent>>;
The design of mdspan
addresses a much broader variety of needs than span
. In many scenarios, it even makes more sense to use a one-dimensional mdspan
instead of a span
.
This broader scope of use cases is addressed through two customization points: LayoutPolicy and Accessor.
template <
class T,
class Extents,
class LayoutPolicy = std::layout_right,
class Accessor = std::default_accessor
>
class mdspan;
The LayoutPolicy
defines the data layout, that is, the function that turns a multi-dimensional index i,j,k,... into a one-dimensional offset.
The Accessor
defines how the array stores its elements,
and how to use the offset from the LayoutPolicy
to get a reference to an element of the array.
(For example, the default Accessor
takes a pointer p
and an offset i
, and returns p[i]
.)
You can think of these customizations like the same kind of thing as the Hash
template parameter on std::unordered_map
or the Allocator
template parameter on std::vector
. These are things that most of the time you don't have to touch, and that most algorithms shouldn't have to care about. (How many times have you written a function that takes a std::vector
and had to write a different version of the function for different allocators? Probably not that often, if ever.)
As stated above the LayoutPolicy
controls the mapping from a multi-dimensional index into an offset.
One of the reasons why controlling a layout mapping can be useful,
is to control an algorithm's data access pattern (how it strides through memory)
without changing the algorithm's loop structure.
For example, it's natural to think of a matrix-vector multiply algorithm as a loop over rows, with an inner loop which performs the inner reduction:
for(int i = 0; i < A.extent(0); i++) {
double sum = 0.;
for(int j = 0; j < A.extent(1); j++)
sum += A[i, j] * x[j];
y[i] = sum;
}
In most cases an optimal data access pattern is achieved if the stride-one access is on j
.
This also allows for vectorization of the inner loop.
However, consider the case where A.extent(1)
is small and known at compile time.
In that case it might actually be better to have the inner loop fully unrolled, and have a stride-one access on i
instead, so that the outer loop
can be vectorized.
for(int i = 0; i < A.extent(0); i++) {
y[i] = A[i, 0] * x[0] +
A[i, 1] * x[1] +
A[i, 2] * x[2] +
A[i, 3] * x[3];
}
With mdspan
the first use case can be achieved with layout_right
(which is the default) and the second use case might perform better with layout_left
:
mdspan<double, extents<uint32_t, dynamic_extent, 4>, layout_left> A(A_ptr, N);
for(int i = 0; i < A.extent(0); i++) {
y[i] = A[i, 0] * x[0] +
A[i, 1] * x[1] +
A[i, 2] * x[2] +
A[i, 3] * x[3];
}
Another important use case where layouts are needed is for reusing functionality for a subset of data.
For example one might want to perform an operation on a single column of some multi-dimensional array.
That column, however, may not be a contiguous array. layout_stride
can deal with that in many common cases:
mdspan A(A_ptr, N, M);
for(int c = 0; c < A.extent(1); c++) {
extents sub_ext(A.extents(0));
array strides{A.stride(0)};
layout_right::mapping sub_map(sub_ext, strides);
mdspan col_c(&A(0,c), sub_map);
foo_1d_mdspan(col_c);
}
A more advanced tutorial will cover the topics of layout and access customization, but for now, it's enough to know that two simple ones are provided: std::layout_right
and std::layout_left
, where the right-most and left-most indices are fast-running, respectively. When discussing matrices, these are often described as C-style layout (or row-major) and Fortran-style (or column-major), respectively.
We say that mdspan is a "view." What does this mean, and how does it relate to ownership?
"Ownership" refers to who frees the memory. "Nonowning" means "it doesn't free the memory."
By convention, const T*
and T*
are nonowning. span<T>
and span<const T>
are nonowning. However, vector<T>
is owning. When the vector goes away, it frees memory. Standard containers are owning.
"View" means that if you have a view X, and you make a copy of X called Y, then X and Y refer to the same elements. Views don't copy elements. Copying views should also be cheap; it shouldn't depend on the number of elements being viewed. (Ranges has some subtle definition of "view" that permits some exceptions, but this gets at the essence.)
Pointers are views. span is a view. mdspan is a view. vector is not a view; it's a container. If you make a copy of the vector, it will copy all the elements. The elements in the new vector are different elements ("different" in a similar sense as meant by "regular type") than the old vector's elements.
std::shared_ptr
is a view, EVEN THOUGH it expresses shared ownership. This matters because mdspan is a view, BUT whether it is owning depends on the accessor. With default_accessor, the mdspan is non-owning. However, you could define an accessor whose data_handle_type
is the following.
struct shared_ownership_data_handle {
shared_ptr<element_type[]> base;
// This mdspan's data start at base[offset].
// This exists because shared_ptr<T[]> has no "offset constructor"
// that expresses shared ownership of an array through a subset.
size_t offset;
};
That would make different instances of mdspan share ownership of the allocation. When the last "shared owner" falls out of scope, the memory would be deallocated.