Shape<D> and StrideShape<D> should be merged #367

termoshtt · 2017-10-10T09:17:10Z

I am confusing around Shape<D>, StrideShape<D>, Dim, and &[Ix]. In my understanding, these can be grouped as follow:

Has stride information, i.e. these know how the values are aligned in the memory
- Shape<D>
- StrideShape<D>
Only the size of each dimensions
- Dim
- &[Ix]

And each of types in a group should be merged. I think StrideShape<D> and Dim are enough.
Since ArrayBase<S, D> already has stride: D, StrideShape<D> is appropriate, and it should be re-named Shape<D> instead. Associated to this, ShapeBuilder should be also simplified to generate StrideShape<D> (and renamed to IntoShape similar to IntoDimension).

What do you think?
If you agree, I will create PR :)

Thanks!

The text was updated successfully, but these errors were encountered:

bluss · 2017-10-10T17:08:32Z

Note that Shape and StrideShape are different in this way:

Shape: either c- or f-contiguous
StrideShape: custom strides

I will look closer at this later -- but merging them means either adding or removing features.

bluss · 2017-10-10T17:12:30Z

Refreshing my memory: We allow custom strides (StrideShape) when the user provides the elements for us (Array::from_shape_vec).

Consider the question: How does Array::from_elem work with a Stride Shape. How do we handle "holes" in the storage, assuming we have a non-Copy element type (that must have its destructor run correctly for each element when the array drops)?

termoshtt · 2017-10-11T03:44:47Z

Note that Shape and StrideShape are different in this way:

I think the what Shape can represent is also able to be represented using StrideShape.
For example, Shape { dim: [3, 4], is_c = True} corresponds to StrideShape { dim: [3, 4], stride: [1, 3], custom = False }. All shape: Shape<D> can be converted to StrideShape<D> like this, and a check of is_c is enough lightweight, stride[0] == 1 && stride[i] == dim[i-1] * ... * dim[0]. Hence we can replace Shape<D> by StrideShape<D> by introducing StrideShape::is_c(&self).

While Shape<D> has smaller memory than StrideShape<D>, but it cannot be a reason for splitting type for the shape of array since StrideShape<D> is also enough small. Single shape type makes the API design much simpler.

bluss · 2017-10-11T14:37:20Z

That's the simple part. See the question about custom strides in particular arrays that don't cover all elements in the base storage.

bluss · 2017-10-11T14:42:55Z

I think it seems good to force from_elem and zeros constructors to always make contiguous arrays. We can still improve the interface.

termoshtt · 2017-10-12T06:18:28Z

How does Array::from_elem work with a Stride Shape.

We can simply raise error that the given StrideShape is not compatible like following:

fn from_elem<Sh: IntoShape<D>>(shape: Sh, elem: A) -> Result<Self, NonContiguousStrideError> {
  if shape.is_custom() {
    Err(NonContiguousStrideError::new())
  }
  ... (current implementation)
}

My concern is that the interface about shape is very complicated, e.g. there are Sh: ShapeBuilder<D> and Sh: Into<StrideShape<D>> or ArrayBase::shape() returns &[Ix] instead of Shape<D>, and so on. I understand there are some algorithms which assumes contiguous memory alignment, but these should be managed by an error handling like above.

bluss · 2017-10-12T19:59:51Z

Sorry that I don't have so much time for this question during the week.

I share your concern for the complexity of the API here, and it's a concern I keep in mind with ndarray all the time. We can do so much with the type system, but we need find solutions that don't make the library too hard to understand or use. We have to find a balance, and you can find traces of this balance conversation all over the issues in this repo!

I will not lose this issue and I would want to discuss many other solutions than the proposed one, for example other ways of unifying the shape API or better documentation. But first let's try to finish the from_elem or equivalently zeros discussion (they use the shape the same way).

Array::zeros

(Equivalently from_elem, default etc use the shape in the same way)

The current shape API for Array::zeros is very nice. Let's look at its positive sides:

It only accepts contiguous shapes
No shape it accepts is an error
- Except for shapes whose element count wraps around type limits. That's a panic.

Ok, so far so good. A bit of infidelity there, a panic case, but it's supposedly in an uncommon case. It's pretty hard to run into.

Positive points:

The API guides the user. We can only enter c- and f-contiguous shapes to Array::zeros. Alternatives for the user are clear.
Ruling out error conditions at compile time. Can't use a custom stride shape with zeros, and it is enforced by the type system.

I think this question answers itself:

Should we change Array::zeros so that it can return an error? We know that the error it can return is one that the construction of the API rules out statically today.

termoshtt · 2017-10-13T02:46:20Z

The current shape API for Array::zeros is very nice.

I understand that you want to assure the contiguous memory layout by a type system using Shape<D>, and I agree that it is better.

Then, my concern changes: Does StrideShape<D> include Shape<D>?
As I said above, shape: Shape<D> can be always convertible to StrideShape<D>, and we can say that the StrideShape<D> is general shape type, and Shape<D> is special, contiguous shape type.
But your above comment "StrideShape: custom strides" means these are distinct types, i.e. a shape is Shape<D> or StrideShape<D>. Their naming also supports this view, but the implementation does not support it.

Sorry that I don't have so much time for this question during the week.

No problem, it is not a hurried issue. I thank your conscientious comments a lot.

bluss · 2017-10-22T13:13:31Z

Sure, StrideShape includes the shapes + strides that Shape can describe.

I've been wondering if we can make the API more clear without changing how this actually works. Maybe by for example using a more specific trait than something as indirect as the current Into<StrideShape<D>> as we use for Array::from_shape_vec.

termoshtt · 2017-10-25T10:08:51Z

My original concern is ArrayBase::shape, which returns &[Ix] instead of Shape<D>. This is due to the Shape<D> is not a static property of ArrayBase, i.e. we cannot obtain Shape<D> without checking its strides. In this sense the StrideShape<D> is a property of ArrayBase and we can get it without error check.

After discussion in this thread, I understood the roles of Shape<D> and StrideShape<D> are appropriate, but their names are misleading. Shape<D> is only used for generating a continuous array, and it describes the shape of ArrayBase only in a limited case. This is a special structure than StrideShape<D>, but its name is rather general.

My plan is here:

Rename structures
- Shape<D> -> ContShape<D>
- StrideShape<D> -> Shape<D>
- impl Into<Shape<D>> for ContShape<D> { /* use C-oder */ } (fixed)
impl PartialEq<Shape<D>> for ContShape<D>
Change APIs of ArrayBase
- ArrayBase::cont_shape(&self) -> Option<ContShape<D>>
- ArrayBase::shape_as_cont(&self) -> ContShape<D> // use C-order if stride of array is custom
- ArrayBase::shape(&self) -> Shape<D>
  - then a.shape() == a.shape_as_cont() is false if a has custom stride

For example, these three shape API is better in a following case:

Ruling out error conditions at compile time. Can't use a custom stride shape with zeros, and it is enforced by the type system.

In current API, ArrayBase::shape returns &[Ix], which drops the stride of the array.

fn some_func<A, S: Data<Elem=A>, D: Dimension>(x: &ArrayBase<S, D>) -> Array<A, D> {
  let mut a = Array::zeros(x.shape());
  // modify a ...
  a
}

In this case x and a may have different strides, but we cannot see it from this source.
Using my shape API:

fn some_func<A, S: Data<Elem=A>, D: Dimension>(x: &ArrayBase<S, D>) -> Array<A, D> {
  let mut a = Array::zeros(x.shape_as_cont());
  // modify a ...
  a
}

the stride is explicitly dropped with shape_as_cont.

bluss · 2017-10-25T10:10:42Z

OK, so the naming of methods dim and shape seem to be part of the problem.

bluss · 2017-10-25T10:12:19Z

Add raw_dim to the mix, if you also take these into account, what's the best solution?

termoshtt · 2017-10-25T10:30:35Z

Add raw_dim to the mix

Can we remove dim()? At a glance, the Dim::Pattern can be replaced with impl Deref<(Ix, Ix)> for Ix2 { ... } and so on. If it is true, we should use ArrayBase::dim(&self) -> &D, and remove raw_dim

bluss · 2017-10-25T12:48:33Z

Deref can not do custom conversions like that. In general it can only give references to interior fields, there is no tuple inside an ix2.

bluss · 2017-10-25T12:50:18Z

I'll try to propose something later today.

termoshtt · 2017-10-25T16:27:41Z

Deref can not do custom conversions like that

Sure, you are correct.

A remaining issue is the type of stride as discussed in #283
ArrayBase::stride returns &[Ixs], but which elements are dynamically casted from Ix, and stride in ArrayBase has type D which is usually Dim<[Ix; n]>.
Though there is ShapeBuilder::Strides associated type, to couple Dim and Stride will be better using Stride trait corresponding Dimension trait:

trait Dimension {
   type Strd: Stride<Dim = Self>;
   // current impl
}

trait Stride {
  type Dim: Dimension<Strd = Self>;
  // and other common methods
}

type Ixs2 = Dim<[Ixs; 2]>; // or define Strd<I> instead of Dim

impl Stride for Ixs2 {
  type Dim = Ix2;
  ...
}

Then we can implement ArrayBase::stride(&self) -> D::Strd, and make the stride in ArrayBase signed like: ArrayBase { ..., stride: D::Strd }.

bluss · 2017-10-26T18:11:06Z

Let's leave the strongly typed strides out of this if we can, so that we have a problem small enough to solve.

bluss · 2017-11-01T20:43:04Z

I simply haven't had time to come back to this yet.

zroug · 2018-09-01T22:24:38Z

I got confused by Dimension, Shape and StrideShape too and I also think that part of the problem is that the naming is not ideal. Especially I think that Shape sounds like it does mean what Dimension does. Thats because other popular libraries that I use, like Numpy or TensorFlow, use the name shape for what ndarray calls dimension.

My naming suggestions for StrideShape are either Layout as in memory layout or simply Strides. Shape could then be renamed to for example SimpleLayout/Strides or ContinuousLayout/Strides.

termoshtt mentioned this issue Oct 20, 2017

General array interface rust-math/fftw#41

Merged

jturner314 mentioned this issue Oct 27, 2018

Plan for dimension types #519

Open

LukeMathWalker mentioned this issue Mar 17, 2019

Are we ndarray yet? #597

Open

31 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Shape<D> and StrideShape<D> should be merged #367

Shape<D> and StrideShape<D> should be merged #367

termoshtt commented Oct 10, 2017

bluss commented Oct 10, 2017

bluss commented Oct 10, 2017 •

edited

Loading

termoshtt commented Oct 11, 2017

bluss commented Oct 11, 2017

bluss commented Oct 11, 2017

termoshtt commented Oct 12, 2017

bluss commented Oct 12, 2017

termoshtt commented Oct 13, 2017 •

edited

Loading

bluss commented Oct 22, 2017

termoshtt commented Oct 25, 2017 •

edited

Loading

bluss commented Oct 25, 2017 •

edited

Loading

bluss commented Oct 25, 2017

termoshtt commented Oct 25, 2017

bluss commented Oct 25, 2017

bluss commented Oct 25, 2017

termoshtt commented Oct 25, 2017 •

edited

Loading

bluss commented Oct 26, 2017

bluss commented Nov 1, 2017

zroug commented Sep 1, 2018

Shape<D> and StrideShape<D> should be merged #367

Shape<D> and StrideShape<D> should be merged #367

Comments

termoshtt commented Oct 10, 2017

bluss commented Oct 10, 2017

bluss commented Oct 10, 2017 • edited Loading

termoshtt commented Oct 11, 2017

bluss commented Oct 11, 2017

bluss commented Oct 11, 2017

termoshtt commented Oct 12, 2017

bluss commented Oct 12, 2017

Array::zeros

termoshtt commented Oct 13, 2017 • edited Loading

bluss commented Oct 22, 2017

termoshtt commented Oct 25, 2017 • edited Loading

bluss commented Oct 25, 2017 • edited Loading

bluss commented Oct 25, 2017

termoshtt commented Oct 25, 2017

bluss commented Oct 25, 2017

bluss commented Oct 25, 2017

termoshtt commented Oct 25, 2017 • edited Loading

bluss commented Oct 26, 2017

bluss commented Nov 1, 2017

zroug commented Sep 1, 2018

bluss commented Oct 10, 2017 •

edited

Loading

termoshtt commented Oct 13, 2017 •

edited

Loading

termoshtt commented Oct 25, 2017 •

edited

Loading

bluss commented Oct 25, 2017 •

edited

Loading

termoshtt commented Oct 25, 2017 •

edited

Loading