diff --git a/proposals/p1839.md b/proposals/p1839.md
new file mode 100644
index 0000000000000..ccf2471543edf
--- /dev/null
+++ b/proposals/p1839.md
@@ -0,0 +1,287 @@
+# Multidimensional arrays
+
+[Pull request](https://github.com/carbon-language/carbon-lang/pull/1839)
+
+## Table of contents
+
+- [Problem](#problem)
+- [Background](#background)
+- [Proposal](#proposal)
+- [Details](#details)
+- [Rationale](#rationale)
+- [Alternatives considered](#alternatives-considered)
+
+## Problem
+
+Multidimensional arrays are widely used in numerical methods, machine
+intelligence, and data science. They are one feature that makes modern Fortran
+more attractive than C++ when choosing a compiled language: C++ currently lacks
+built-in support for dynamically sized multidimensional arrays. Supporting them
+in Carbon would give it a major boost in the eyes of the scientific community.
+
+Nested arrays may look like a good alternative to multidimensional arrays, but
+they can perform poorly because their rows may be scattered across memory.
+
+## Background
+
+A multidimensional array is an array with two or more dimensions whose elements
+are stored contiguously in memory.
+
+A multidimensional array may be stored in memory in
+[row- or column-major order](https://en.wikipedia.org/wiki/Row-_and_column-major_order).
+
+## Proposal
+
+We should add support for multidimensional arrays to Carbon through a syntax
+extension that makes code cleaner and simpler to read and write.
+
+```carbon
+var a: [f64; 3, 4];
+var values: i64 = 0;
+for(a_i: auto in a[:,...]) {
+  for(a_ij: auto in a_i[:,...]) {
+    a_ij = values++;
+  }
+}
+```
+or
+```carbon
+var a: [f64; 3, 4];
+var values: i64 = 0;
+for(i: auto in (0:2)) {
+  for(j: auto in (0:3)) {
+    a[i,j] = values++;
+  }
+}
+```
+
+## Details
+
+### Definition
+
+#### Automatic allocation
+
+Arrays can be automatically allocated:
+```carbon
+var x: [i32; :, :];
+```
+To avoid undefined behavior, `x` initially has shape `(0, 0)`.
+
+Such an array can then be given a shape and contents:
+1. via **assignment**:
+   ```carbon
+   var x: [i32; :, :];
+   var y: [i32; :, :] = ((0, 1, 2), (3, 4, 5));
+   x = y;
+   ```
+2. via **memory allocation**:
+   ```carbon
+   var x: [i32; :, :];
+   allocate(x, /*shape=*/(3, 2));
+   ```
+
+Calling `allocate` on an array that is already allocated is a runtime error.
+
+#### Automatic deallocation
+
+Automatically allocated arrays are destroyed at the end of their scope. For
+example, if such arrays belong to a class object, they are destroyed together
+with that object.
+
+An array can also be deallocated manually:
+```carbon
+var x: [i32; :, :];
+allocate(x, /*shape=*/(3, 2));
+deallocate(x);
+```
+Calling `deallocate` on an array that is not allocated is a runtime error.
+
+### Operators
+
+Arrays support element-wise operations with other arrays and with scalars.
+
+1. Array with array:
+   ```carbon
+   var x: [i32; 3, 2] = ((0, 1, 2), (3, 4, 5));
+   var y: auto = -x;
+   // corresponding elements are added pairwise
+   var z: auto = x + y; // z = ((0,0,0), (0,0,0))
+   ```
+   If the shapes do not match, a runtime error occurs.
+
+2. Array with scalar:
+   ```carbon
+   var x: [i32; 3, 2] = ((0, 1, 2), (3, 4, 5));
+   // multiply each element by 2
+   x *= 2; // x = ((0, 2, 4), (6, 8, 10));
+   ```
+
+### Iterators (?)
+
+For multidimensional arrays, it may be useful to have _iterators_. An iterator
+can run over the first dimension (row-major order):
+```carbon
+var a: [f64; 2, 3, 4];
+var it: auto = a[:, ...];
+for(i: auto in it) { ... }
+```
+In this example, `i` takes the values `a[0,:,:]` and `a[1,:,:]` in turn.
+An iterator can also run over the last dimension (column-major order):
+```carbon
+var a: [f64; 4, 3, 2];
+var it: auto = a[..., :];
+for(i: auto in it) { ... }
+```
+In this example, `i` takes the values `a[:,:,0]` and `a[:,:,1]` in turn.
+In both cases, `i` is a two-dimensional array.
+
+`...` stands for all remaining dimensions.
+
+### Functions
+
+Arrays are usually passed and returned as is. The following function adds two
+arrays:
+```carbon
+fn sum[T:! Type](x: T, y: T) -> T {
+  return x + y;
+}
+```
+
+A function returning a 1D array:
+```carbon
+fn arr1D[T:! Type](x: T, y: T) -> [T; :] {
+  return (x, y);
+}
+```
+or a 2D array:
+```carbon
+fn arr2D[T:! Type](x: T, y: T) -> [T; :,:] {
+  return ((x, x), (y, y));
+}
+```
+
+Dimensions may be specified explicitly:
+```carbon
+fn arr1D[T:! Type](x: T, y: T) -> [T; 2] {
+  return (x, y);
+}
+```
+
+Reducing the number of dimensions:
+```carbon
+fn unarr[T:! Type](x: [T; :], y: [T; :]) -> T {
+  return sum(x + y);
+}
+```
+
+#### Elemental functions
+
+An elemental function is applied to each element of an array in turn.
+
+```carbon
+el fn inc[T:! Type](x: T) -> T {
+  return x + 1;
+}
+fn Main() -> i32 {
+  var x: [i32; 3] = (0:2);
+  var y: i32 = 3;
+  x = inc(x); // similar to x = x + 1;
+  y = inc(y);
+  return 0;
+}
+```
+This is useful when the function is more complicated than an increment.
+
+Combined with _iterators_, an elemental function can be applied to sub-arrays:
+```carbon
+el fn conv[T:! Type](x: T) -> T {
+  return sum(x);
+}
+fn Main() -> i32 {
+  var x: [i32; 3, 4] = reshape((0:11), /*shape=*/(3, 4));
+  var y: auto = conv(x[:,...]); // y = (6, 22, 38)
+  var z: auto = conv(x[...,:]); // z = (12, 15, 18, 21)
+  return 0;
+}
+```
+
+### Standard library
+
+#### allocate
+Allocates an array:
+```carbon
+var x: [i32; :, :];
+allocate(x, /*shape=*/(3, 2));
+```
+#### deallocate
+Deallocates an array:
+```carbon
+var x: [i32; :, :];
+allocate(x, /*shape=*/(3, 2));
+deallocate(x);
+```
+#### allocated
+Returns whether an array is allocated:
+```carbon
+var x: [i32; :, :];
+allocated(x); // False
+allocate(x, /*shape=*/(3, 2));
+allocated(x); // True
+deallocate(x);
+allocated(x); // False
+```
+#### shape
+Returns the shape of an array:
+```carbon
+var s: [i32; 2] = shape(x); // s = (3, 2)
+```
+#### size
+Returns the total number of elements in an array:
+```carbon
+var l: i32 = size(x); // l = 6
+```
+With the optional argument `dim`, returns the size along the given dimension
+(dimensions are numbered from 1):
+```carbon
+var l1: i32 = size(x, /*dim=*/1); // l1 = 3
+var l2: i32 = size(x, /*dim=*/2); // l2 = 2
+```
+#### reshape
+Reshapes an array:
+```carbon
+var x: [i32; 3, 2] = reshape(/*array=*/(0, 1, 2, 3, 4, 5), /*shape=*/(3, 2));
+```
+#### transpose
+Transposes an array (without an additional argument, only 2D arrays are
+supported):
+```carbon
+var x: [i32; 3, 2] = ((0, 1, 2), (3, 4, 5));
+var y: auto = transpose(x); // y = ((0, 3), (1, 4), (2, 5))
+var z: auto = shape(y); // z = (2, 3)
+```
+The additional argument `dims` names the two dimensions to swap:
+```carbon
+var x: [i32; 2, 2, 2] = ( ((0, 1), (2, 3)), ((4, 5), (6, 7)) );
+var y: auto = transpose(x, /*dims=*/(1, 3));
+// y = ( ((0, 4), (2, 6)), ((1, 5), (3, 7)) )
+```
+#### sum
+Sums all values in an array:
+```carbon
+var x: [i32; 2, 2, 2] = ( ((0, 1), (2, 3)), ((4, 5), (6, 7)) );
+var y: auto = sum(x); // y = 28
+```
+
+## Rationale
+
+This proposal should make it simpler to write high-performance computing (HPC)
+code, most of which is performance-critical software. Unfortunately, C++ code
+is not affected.
+
+## Alternatives considered
+
+I am strongly influenced by Fortran.