Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC (C=collaborator): multidimensional comprehensions that use CartesianRange iterators #10523

Closed
timholy opened this issue Mar 15, 2015 · 17 comments
Labels
help wanted Indicates that a maintainer wants help on an issue or pull request parser Language parsing and surface syntax

Comments

@timholy
Copy link
Member

timholy commented Mar 15, 2015

It would be sweet to be able to create a 3-dimensional array like this:

A = rand(5, 10, 5)
B = [A[I]+7 for I in eachindex(A)]

Looks like comprehensions are performed in the scheme code, and I've never gotten around to (and probably won't get around to) teaching myself lisp (too many other things to do). Anyone interested in collaborating on implementing this? I don't know how much I can do on the julia side, but happy to help if I can.

Reference:
http://stackoverflow.com/questions/29050763/slicing-and-broadcasting-multidimensional-arrays-in-julia-meshgrid-example/29060865#29060865

@ihnorton ihnorton added help wanted Indicates that a maintainer wants help on an issue or pull request parser Language parsing and surface syntax labels Mar 19, 2015
@JeffBezanson
Copy link
Member

Linking to #14848 and #14782

With the upcoming changes, this will be easy to implement, perhaps by defining methods for Generator{CartesianRange}. However I think the big controversy is whether [x for i in y] should ever return a >1-d array. Especially difficult when eachindex returns a UnitRange. We need to figure out how to express the shape of the result of a comprehension.

@timholy
Copy link
Member Author

timholy commented Feb 5, 2016

Historically, eachindex has vacillated between always returning a CartesianRange or returning the fastest iterator for the type. These days it returns the fastest iterator (which I think is the right choice), but I might have filed this issue at a time when it always returned a CartesianRange.

Now, you could write the above as

B = [A[I]+7 for I in CartesianRange(size(A))]

which I think solves some of your concerns.

@timholy
Copy link
Member Author

timholy commented Mar 10, 2016

How are we doing on this? Would be nice to be able to include comprehensions in #15434.

@mschauer
Copy link
Contributor

Ideally, #14770 and

We need to figure out how to express the shape of the result of a comprehension

could be solved jointly.

@timholy
Copy link
Member Author

timholy commented Mar 10, 2016

I'm not sure I see why they are joint?

We also have a very good mechanism to express the shape of the comprehension (size(iter)), so I think this is really just a case where we have to decide whether to do it and, if so, implement it.

@mschauer
Copy link
Contributor

If there is a syntax for linear indexing, say for the sake of argument A[[i]] , then [[x for i in y]] is a natural notation for flat comprehensions and [x for i in y] for shaped comprehensions.

@timholy
Copy link
Member Author

timholy commented Mar 10, 2016

I thought the closest thing to consensus in that discussion was to keep linear indexing, just not "partial linear indexing." (i.e., A[i,j] when A is three-dimensional). And to not require any new syntax for linear indexing.

@eschnett
Copy link
Contributor

This doesn't work for arrays that are not one-based, or which use strided indexing. They may not be in Base, but it would be nice to support them anyway.

@JeffBezanson
Copy link
Member

I believe the shape of a comprehension should depend entirely on the iterable. Simplest is size([x for i in y]) == size(y), and n-d comprehensions use a product iterator, and the shape of a product iterator is the concatenation of the shapes of its component iterators. Then all we need is an easy way to reshape or linearize an iterator. Are reshape and vec acceptable for this?

@timholy
Copy link
Member Author

timholy commented Mar 10, 2016

In some situations it would be easy to implement reshape(iter). In general, though, I suspect we'd want to construct the iterator after reshaping the array itself. In other words,

[f(a) for a in reshape(A, sz)]
[f(A[I]) for I in eachindex(reshape(A, sz))]

can be expected to work in general. I'm less sure that

iter = some_iter_constructor(A)  # e.g., eachindex
[f(A[I]) for I in reshape(iter)]

will be something that will give good results in general.

I don't have access to the right machine now, but tonight I can push my WIP ReshapedArray branch so you can see what I'm planning. Briefly, reshape(A, sz) will return a view. It's pretty simple: if R = reshape(A, sz), it's a combination of supporting (1) the "slow" access method, leveraging #15357 to handle R[i,j,k], and (when possible) the (2) "fast" access method, using an iterator that accesses the parent directly:

eachindex(R::ReshapedArray) = ReshapedIterator(R)
getindex(R::ReshapedArray, I::ReshapedIndex) = R.parent[I.Iparent]  # I.Iparent is an index instance of the type favored for the parent

effectively turning iteration over a reshaped array into iteration over the parent.

@JeffBezanson
Copy link
Member

I agree there is something slightly off about wrapping an iterator just to change its shape. Maybe I should question whether anybody is actually bothered by, say, a product iterator being "N-d".

@mschauer
Copy link
Contributor

The solution might be to be very careful when giving an iterator a shape and rather introduce a new name for a shaped iterator variant. Then the principle size([x for i in y]) == size(y) can then be applied, but does not make as many difficulties. For example the product you mention cannot have a shape in general: e.g. product([1,2], countfrom(1)) cannot have a shape. But an additional product iterator with shape is of course a very meaningful thing and [x*y for x, y in tabulate(1:10, 1:10)] can be a table indeed.

@timholy
Copy link
Member Author

timholy commented Mar 11, 2016

I like N-d product iterators. Before talking about fancy reshape stuff, I should have also said what was surely obvious to you, that [f(a) for a in A] should yield an output with the same size as A.

@JeffBezanson
Copy link
Member

@mschauer Couldn't we have product([1,2], countfrom(1)) be IsInfinite and product(1:m,1:n) be HasShape?

@mschauer
Copy link
Contributor

Thinking about it again... yes, we could just do that.

@JeffBezanson
Copy link
Member

This is fixed now if you iterate over CartesianRange(size(A)).

@timholy
Copy link
Member Author

timholy commented Aug 15, 2016

Loving it already.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Indicates that a maintainer wants help on an issue or pull request parser Language parsing and surface syntax
Projects
None yet
Development

No branches or pull requests

5 participants