-
Notifications
You must be signed in to change notification settings - Fork 70
A Gentle Introduction to mdspan
ISO-C++ standard proposals are not the best tutorials for how to use their proposed features. This is intentional—there's enough content to be written in exploring the technical design issues that a tutorial-style introduction is out of place in such documents. However, it also means that proposals such as P0009 for mdspan
reach a level of maturity where they are ready for production implementation and use, but without a single tutorial-style introduction. Here's an attempt to do so for mdspan
.
In its simplest form, std::mdspan
1 is a straightforward extension of std::span
to enable multidimensional arrays.
Using CTAD you can create them from a pointer to the underlying data, and its extents:
int* data = /* ... */
// View data as contiguous memory representing 4 ints
auto s = std::span(data,4);
// View data as contiguous memory representing 2 rows
// of 2 ints each
auto ms = std::mdspan(data,2,2);
Accessing the data then happens via its []
operator (note the mdspan
implementation in this repository uses the ()
if the compiler doesn't
support multi argument []
- a feature which was voted into C++23).
for(size_t i=0; i<ms.extent(0); i++)
for(size_t j=0; j<ms.extent(1); j++)
ms[i, j] = i*1000 + j;
std::span
is in C++20, and introductions to it can be found all over the internet. Thus, this tutorial assumes basic familiarity with std::span
.
Just like span
, you can use both dynamic extents and static extents with mdspan
.
Unlike span
, mdspan
is not directly templated on individual extents, but packs all of them into an explicit extents
object.
// Create an extents objects with one runtime dimension and one static extent.
std::extents<std::dynamic_extent,4> exts(3);
You can construct an mdspan
by providing the individual extents to the constructor (either all of them, or just the dynamic ones),
or handing in an extents
object.
int* data = /* ... */
int size = /* ... */
auto s = std::span<int, std::dynamic_extent>(data, size);
int rows = /* ... */
constexpr int cols = 4;
using ext_t = std::extents<std::dynamic_extent, cols>;
auto ms =
std::mdspan<int, ext_t>(data, rows);
Note that there is a convenient alias for extents
with all dynamic_extent
.
static_assert<std::is_same_v<std::dextents<2>,
std::extents<std::dynamic_extent, std::dynamic_extent>>;
The design of mdspan
addresses a much broader variety of needs than span
. In many scenarios, it even makes more sense to use a one-dimensional mdspan
instead of a span
.
This broader scope of use cases is addressed through two customization points: LayoutPolicy and Accessor.
template <
class T,
class Extents,
class LayoutPolicy = std::layout_right,
class Accessor = std::accessor_basic
>
class mdspan;
These two customization points define the data layout (i.e., turning a bunch of indices into an offset) and data access (e.g., turning a pointer and an offset into a reference) respectively. You can think of these customizations like the same kind of thing as the Hash
template parameter on std::unordered_map
or the Allocator
template parameter on std::vector
. These are things that most of the time you don't have to touch, and that most algorithms shouldn't have to care about—how many times have you written a function that takes a std::vector
and had to write a different version of the function for different allocators? (Probably not that often, if ever.)
As stated above the LayoutPolicy
controls the mapping from a multi-dimensional index into an offset.
One of the reasons why controlling a layout mapping can be useful, is to control the data access pattern not via the algorithmic loops structure, but rather through the way the data is layed out.
For example, its natural to think of a matrix-vector multiply algorithm as a loop over rows, with an inner loop which performs the inner reduction:
for(int i = 0; i < A.extent(0); i++) {
double sum = 0.;
for(int j = 0; j < A.extent(1); j++)
sum += A[i, j] * x[j];
y[i] = sum;
}
In most cases an optimal data access pattern is achieved if the stride-one access is on j
, which also allows for vectorization of the inner loop.
However, consider the case, where A.extent(1) is small and known and compile time.
In that case it might actually be better to have the inner loop fully unrolled, and have a stride-one access on i
instead, so that the outer loop
can be vectorized.
for(int i = 0; i < A.extent(0); i++) {
y[i] = A[i, 0] * x[0] +
A[i, 1] * x[1] +
A[i, 2] * x[2] +
A[i, 3] * x[3];
}
With mdspan
the first use case can be achieved with layout_right
(which is the default) and the second use case might perform better with layout_left
:
mdspan<double, extents<dynamic_extent, 4>, layout_left> A(A_ptr, N);
for(int i = 0; i < A.extent(0); i++) {
y[i] = A[i, 0] * x[0] +
A[i, 1] * x[1] +
A[i, 2] * x[2] +
A[i, 3] * x[3];
}
Another important use-case, where layouts are needed is for reusing functionality for a subset of data.
For example one might want to perform an operation on a single column of some multi-dimensional array.
That column, however may not be a contiguous array. layout_stride
can deal with that in many common cases:
mdspan A(A_ptr, N, M);
for(int c = 0; c < A.extent(1); c++) {
extents sub_ext(A.extents(0));
array strides{A.stride(0)};
layout_right::mapping sub_map(sub_ext, strides);
mdspan col_c(&A(0,c), sub_map);
foo_1d_mdspan(col_c);
}
A more advanced tutorial will cover the topics of layout and access customization, but for now, it's sufficient to know that two simple ones are provided: std::layout_right
and std::layout_left
, with the right-most and left-most indices are fast-running, respectively. When discussing matrices, these are often described as C-style layout (or row-major) and Fortran-style (or column-major), respectively.