Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for zero(::Array{Array}) #8759

Closed
wants to merge 1 commit into from
Closed

Add support for zero(::Array{Array}) #8759

wants to merge 1 commit into from

Conversation

jiahao
Copy link
Member

@jiahao jiahao commented Oct 22, 2014

The problem with the existing definition is that it tries to call zero(::Type{Array}), which fails because it doesn't know what size the eltype should be.

The solution here is simple - if the outer Array is not empty, call zero on its first element.

A similar problem occurs with one() but that is harder to fix.

The problem with the existing definition is that it tries to call zero(::Type{Array}), which fails because it doesn't know what size the eltype should be.

The solution here is simple - if the outer Array is not empty, call zero on its first element.

A similar problem occurs with one() but that is harder to fix.
@ivarne
Copy link
Member

ivarne commented Oct 22, 2014

-1

It's not obvious to me that zero(a::Array{Array{Float64,2},2}) should return an array with 4 references to the same array.

julia> A = [zeros(2,2) for i=1:2, j=1:2];
julia> a = zero(A);
julia> a[1][1] = 1;
julia> a
2x2 Array{Array{Float64,2},2}:
 2x2 Array{Float64,2}:
 1.0  0.0
 0.0  0.0  2x2 Array{Float64,2}:
 1.0  0.0
 0.0  0.0
 2x2 Array{Float64,2}:
 1.0  0.0
 0.0  0.0  2x2 Array{Float64,2}:
 1.0  0.0
 0.0  0.0

@jiahao
Copy link
Member Author

jiahao commented Oct 22, 2014

do-not-want-dog

Thanks for catching that.

@jiahao
Copy link
Member Author

jiahao commented Oct 22, 2014

Could this be considered a bug in similar or fill!?

julia> A = [zeros(2,2) for i=1:2, j=1:2]
2x2 Array{Array{Float64,2},2}:
 2x2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0  2x2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0
 2x2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0  2x2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0

julia> A[1,1][1,2]=1
1

julia> A
2x2 Array{Array{Float64,2},2}:
 2x2 Array{Float64,2}:
 0.0  1.0
 0.0  0.0  2x2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0
 2x2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0  2x2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0

julia> B=similar(A)
2x2 Array{Array{Float64,2},2}:
 #undef  #undef
 #undef  #undef

julia> fill!(B, zeros(2,2))
2x2 Array{Array{Float64,2},2}:
 2x2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0  2x2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0
 2x2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0  2x2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0

julia> [pointer(a) for a in A]
4-element Array{Any,1}:
 Ptr{Float64} @0x00007fb3b2f440f0
 Ptr{Float64} @0x00007fb3b2f44140
 Ptr{Float64} @0x00007fb3b2f441e0
 Ptr{Float64} @0x00007fb3b2f44280

julia> [pointer(a) for a in B]
4-element Array{Any,1}:
 Ptr{Float64} @0x00007fb3b6b91480
 Ptr{Float64} @0x00007fb3b6b91480
 Ptr{Float64} @0x00007fb3b6b91480
 Ptr{Float64} @0x00007fb3b6b91480

@jiahao
Copy link
Member Author

jiahao commented Oct 22, 2014

@timholy, would this be a horrible change?

#multidimensional.jl:174-9
@ngenerate N typeof(A) function fill!{T,N}(A::AbstractArray{T,N}, x)
    @nloops N i A begin
        @inbounds (@nref N A i) = x #should this be deepcopy(x)?
    end
    A
end

@ivarne
Copy link
Member

ivarne commented Oct 22, 2014

To quote @StefanKarpinski it would be "bizarre", (if I understand correctly what he is talking about).

I'm not sure I agree with that statement though.

@jiahao
Copy link
Member Author

jiahao commented Oct 22, 2014

I'm puzzled. I don't really see a use case for the current behavior of fill! to create an entire array with multiple references to the same object. It's like creating an entire suburban subdivision but each mailbox goes to the same house.

@andreasnoack
Copy link
Member

Or maybe even deepcopy in case x is a Matrix{Matrix{Float64}}​

@jiahao
Copy link
Member Author

jiahao commented Oct 22, 2014

To continue the second-guessing of @StefanKarpinski,

When you write fill!(arr, ChannVals()) you are asking to fill arr with the one value that is the result of evaluating ChannVals() once.

sounds to me like fill! should make multiple copies of the value, not the reference (which it currently does). But if you had (say) an iterator x, you don't necessarily want fill!(A, next(x)) to generate multiple values from different invocations of next(x), you just want to evaluate it once, then make multiple copies of the value you get.

@jiahao
Copy link
Member Author

jiahao commented Oct 22, 2014

@andreasnoack that is a good point, the proposed change should be deepcopy(x) not copy(x). I'll edit my comment above.

@timholy
Copy link
Member

timholy commented Oct 22, 2014

I think this is a question for others to answer; I just "Cartesian-ified" an existing implementation of fill!.

This seems to be one of those "what do you mean by x?" questions.

@kmsquire
Copy link
Member

The same issue exists when providing a default value for a DefaultDict (in DataStructures.jl). The solution there was to allow a function to be passed in which would create the new object (mimicking Python). So here, for example, one way to do this would be to allow

fill!(B, ()->zeros(2,2))

The downside (here and in DefaultDict) is that it becomes challenging to fill something with an actual function. But it is doable:

fill!(B, ()->(()->zeros(2,2))))

@andreasnoack
Copy link
Member

Is it ever useful to fill an Array with the same reference in all elements?

Med venlig hilsen

Andreas Noack

2014-10-22 9:56 GMT-04:00 Kevin Squire notifications@github.com:

The same issue exists when providing a default value for a DefaultDict
(in DataStructures.jl). The solution there was to allow a function to be
passed in which would create the new object (mimicking Python). So here,
for example, one way to do this would be to allow

fill!(B, ()->zeros(2,2))

The downside (here and in DefaultDict) is that it becomes challenging to
fill something with an actual function. But it is doable:

fill!(B, ()->(()->zeros(2,2))))


Reply to this email directly or view it on GitHub
#8759 (comment).

@timholy
Copy link
Member

timholy commented Oct 22, 2014

Sure:

black = fill(RGB(0,0,0), 5, 5)
red = fill(RGB(1,0,0), 5, 5)
checkerboard = Array(Matrix{RGB}, 9, 9)
fill!(checkerboard, black)
for i = 1:2:length(checkerboard)
    checkerboard[i] = red
end

@jiahao
Copy link
Member Author

jiahao commented Oct 22, 2014

It doesn't look like the checkerboard example would break if we changed fill!(A, x) to do an implicit deepcopy(x) rather than direct assignment of x to each element. What would be interesting is if there were somehow a use case that took advantage of the assignment-of-reference semantics and would break if we changed it to assignment-of-value.

@timholy
Copy link
Member

timholy commented Oct 22, 2014

Throwing down the gauntlet, eh? (I'm not saying changing it would be wrong on average, but I do feel that whatever we choose will be wrong sometimes.)

component1 = [0 0 0; 1 1 1; 0 0 0]
component2 = [0 1 0; 0 1 0; 0 1 0]
img = rand(27, 27)
tiles = fill(component1, 9, 9)
for i = 1:2:27 tiles[i] = component2; end
for i = 1:iter
    # improve component1, component2
    nmf(...)
    # choose the best component to describe the ith tile
    tiles[i] = err1 < err2 ? component1 : component2
end

@StefanKarpinski
Copy link
Member

Adding implicit deepcopy anywhere seems like a bad idea – I'd prefer to know when I'm about to make a complete copy of everything in the entire world. Since deepcopy is a no-op on immutable that don't reference mutables, I'm not super concerned about that issue (ints, floats, RGB, etc.). The thing is that the behavior people really want here is probably for the expression that generates the value to be evaluated many times, not for the value to be copied many times. For example, if someone writes

fill!(Array(10), rand())

they probably wanted to call rand ten times. Of course, now it doesn't do what they wanted either, but it seems like the current behavior is simpler, even if it's sometimes unintuitive.

@andreasnoack
Copy link
Member

Since deepcopy is a no-op on immutable

It doesn't seem to be the case

julia> @time for i = 1:10^7;copy(1);end
elapsed time: 5.74e-7 seconds (0 bytes allocated)

julia> @time for i = 1:10^7;deepcopy(1);end
elapsed time: 2.670992386 seconds (3520000000 bytes allocated, 53.65% gc
time)

Med venlig hilsen

Andreas Noack

2014-10-22 10:58 GMT-04:00 Stefan Karpinski notifications@github.com:

Adding implicit deepcopy anywhere seems like a bad idea – I'd prefer to
know when I'm about to make a complete copy of everything in the entire
world. Since deepcopy is a no-op on immutable that don't reference
mutables, I'm not super concerned about that issue (ints, floats, RGB,
etc.). The thing is that the behavior people really want here is probably
for the expression that generates the value to be evaluated many times,
not for the value to be copied many times. For example, if someone
writes

fill!(Array(10), rand())

they probably wanted to call rand ten times. Of course, now it doesn't do
what they wanted either, but it seems like the current behavior is simpler,
even if it's sometimes unintuitive.


Reply to this email directly or view it on GitHub
#8759 (comment).

@StefanKarpinski
Copy link
Member

Ah, yes. Because deepcopy creates a dict to keep track of circularities. It's still a no-op, just a very expensive one. Good ol' deepcopy.

@jiahao
Copy link
Member Author

jiahao commented Oct 22, 2014

@timholy just to clarify, the very issue I am wrestling with is in loops of the form

fill!(tiles, startingtile)
for i = 1:iter
    dostuff()
    tiles[i][j] = newval #I want to update the jth entry of the ith element but this doesn't do it
end

If the inner loop were to do tiles[i] = newval then the code would assign as intended. However if tiles were an Array of Arrays this code simply would not work as intended, because fill! creates a tiles array such that every single element is a reference to the same memory location, which is one initialized instance of startingtile. The assignment tiles[i][j]=newval then has the same semantics as tiles[:][j]=newval. I find it very hard to believe that users would ever want this behavior.

The thing is that the behavior people really want here is probably for the expression that generates the value to be evaluated many times, not for the value to be copied many times.

I see the issue; it happens to make no difference for my particular use case. However, I still see no justification for the current semantics of fill!: why would anyone ever want to create a Pleasantville-array with multiple copies of pointers to the same exact memory location?

@JeffBezanson
Copy link
Member

The system should not try to guess what needs to be copied, or how you're going to mutate things. Generic library functions can do what they want, and mutation has to be managed manually.

@JeffBezanson
Copy link
Member

The current fill! is basically optimized for functional-style programming. If you don't mutate things, it's a more efficient representation of the correct answer.

@timholy
Copy link
Member

timholy commented Oct 22, 2014

@jiahao, the point with my example was intended to be that you want to maintain just 2 components to describe all the tiles, so each element of the array must maintain a pointer to one or the other. Sorry if that wasn't clear.

I'm not saying there aren't good reasons to want the other thing sometimes, too. Just that there are use cases for the current behavior.

@jiahao
Copy link
Member Author

jiahao commented Oct 22, 2014

If you don't mutate things, it's a more efficient representation of the correct answer.

So my question is, is this a premature optimization? Should fill!(A, mutablething) support the semantics of a mutable container of mutable objects? Or is this a "don't do that" gotcha?

It would be very sad if we go for the latter. It will mean that practically all the generic linear algebra code will get uglier because we will have to special case whether or not the element is mutable.

@jiahao
Copy link
Member Author

jiahao commented Oct 22, 2014

@timholy

the point with my example was intended to be that you want to maintain just 2 components to describe all the tiles, so each element of the array must maintain a pointer to one or the other. Sorry if that wasn't clear.

IIUC then I should see that there are only two unique pointer addresses in the checkerboard code and they should be the address of black and red respectively, but there are 81:

black = fill(RGB(0,0,0), 5, 5)
bptr = pointer(black)
red = fill(RGB(1,0,0), 5, 5)
rptr = pointer(red)
checkerboard = Array(Matrix{RGB}, 9, 9)
fill!(checkerboard, black)
map(pointer, checkerboard[:]) |> unique |> length #81
sum(map(pointer, checkerboard[:]) .== bptr) #0
sum(map(pointer, checkerboard[:]) .== rptr) #0
for i = 1:2:length(checkerboard)
    checkerboard[i] = red
end
map(pointer, checkerboard[:]) |> unique |> length #81
sum(map(pointer, checkerboard[:]) .== bptr) #0
sum(map(pointer, checkerboard[:]) .== rptr) #0

or am I misunderstanding what pointer does?

@JeffBezanson
Copy link
Member

Why would linear algebra code mutate the elements of a matrix? Seems odd to me.

The problem is that once you get into stuffing implicit copys here and there, you're never done debating where they should and shouldn't go. For example when copy! moves elements from one array to another, should it copy the elements? I doubt there is an intuitive rule for knowing when things get copied behind your back and when they don't. "Never" is the simplest and most general option. It's easier to do extra copies yourself than to un-copy things that were copied for you.

@jiahao
Copy link
Member Author

jiahao commented Oct 22, 2014

Why would linear algebra code mutate the elements of a matrix? Seems odd to me.

(I assume you meant elements of elements of a matrix.) This comes up every time you want to write in-place updating operations on tiled matrices (i.e. a matrix of matrices) which would overwrite the elements of the outer matrix with new elements (inner matrices). Usually we don't want out-to-place updating operations where you allocate new inner matrices, populate them and then reassign then as elements of the outer matrices, which would be the only thing you can do with immutable semantics.

@timholy
Copy link
Member

timholy commented Oct 22, 2014

@jiahao, bug in my example (non-concrete element declaration). Change to checkerboard = Array(Matrix{RGB{Float64}}, 9, 9) and the World Will Make Sense Again.

@jiahao
Copy link
Member Author

jiahao commented Oct 22, 2014

@timholy Ah yes much better. I was beginning to go crazy for Not Getting The Point.

@JeffBezanson
Copy link
Member

Another way to say it: you can't expect all issues surrounding mutation to be solved for you by adding a few copys to the standard library. One such change might work today, but then later somebody will want it to be a deepcopy instead.

You're describing a function whose preconditions include "all inner matrices must be distinct objects". I'd argue that this isn't a reasonable precondition, because it's very hard for a caller to ensure. That use of mutation is an implementation detail of the function, so the function needs to bear full responsibility, not force it on callers.

@jiahao
Copy link
Member Author

jiahao commented Oct 22, 2014

@JeffBezanson

The current fill! is basically optimized for functional-style programming.

Isn't functional programming all about purity of output function mapping? And isn't the current behavior antithetical to that desire because because its output is riddled with side effects associated with conflated memory addresses? Or am I misunderstanding something very fundamental about what "functional programming" is?

cc: @jakebolewski

@JeffBezanson
Copy link
Member

Yes, you're misunderstanding. Shared references are not "impure". Mutation is what's impure.

@StefanKarpinski
Copy link
Member

Predictably, I'm 100% with Jeff on this one (or is he 100% with me – after all, I called it bizarre first ;-) – it's easier to ask for a copy when you want one than to prevent a copy when you don't need one.

@jiahao
Copy link
Member Author

jiahao commented Oct 22, 2014

I will agree that copying behind your back is generally not a good thing.

I think I've finally zeroed in (as it were) on what is so confusing for me. It's the fact that the term 'value' in the description of the function

Base.fill!(A, x)

Fill the array "A" with the value "x"

means "compiler value", which if boxed is a pointer value, but if unboxed is a value much closer to what I think about as "numeric value".

@jiahao
Copy link
Member Author

jiahao commented Oct 22, 2014

Thanks everyone for humoring me as I stumble through yet another obvious-in-retrospect trajectory.

@jiahao jiahao closed this Oct 22, 2014
@StefanKarpinski
Copy link
Member

These conversations are well worth having since a non-negligible amount of the time they lead to a positive change in the language and usually lead to more clarity for the people having the discussion.

@timholy
Copy link
Member

timholy commented Oct 22, 2014

Real non-allocating linear algebra routines have two exclamation points. Much more impressive that way.

But more seriously, if one thinks about writing A_mul_B! in pure Julia code, and you want to handle the case you're considering now, you'd literally have to replace

for j = 1:J, i = 1:I
    s = 0.0
    for k = 1:K
        s += A[i,k]*B[k,j]
    end
    C[i,j] = s
end

with

tmp = similar(C[1,1])
for j = 1:J, i = 1:I
    c = C[i,j]  # C is pre-filled with zeros-matrices
    for k = 1:K
        A_mul_B!(tmp, A[i,k], B[k,j])
        add!(c, tmp)
    end
end

If you tried to make this generic so it works for either immutable c or mutable c, I'd worry about losing the ability to use CPU registers or incurring extra write cache-misses. It might make the generic algorithm unattractive from a performance perspective.

@timholy
Copy link
Member

timholy commented Oct 22, 2014

(Didn't hit refresh on the browser and so didn't realize the conversation had taken a different turn.)

@jiahao jiahao deleted the cjh/zero branch October 25, 2014 17:59
@ivarne
Copy link
Member

ivarne commented Nov 14, 2014

Just saw @amitmurthy publish code to julia-users with this exact mistake. Do we really expect our average user to have such a different intuition of what fill means than me, Jiaho and Amit?

@amitmurthy
Copy link
Contributor

Thanks @ivarne , corrected the mistake on julia-users.

@JeffBezanson
Copy link
Member

But that is not a mistake. Any objection to this is an objection to mutation. It would be better to get rid of mutation than to have to randomly copy things everywhere for no clear reason.

@ivarne
Copy link
Member

ivarne commented Nov 15, 2014

How can that use of fill not be a mistake? The question was about a Design object that is way too big to create arbitrary copies, and contains mutable arrays. Do you really think Marcel wasn't planning on mutating the arrays, and wouldn't be up for a big surprise when his simulation gives wrong results?

@amitmurthy
Copy link
Contributor

That use of fill was a mistake. But I am also in agreement with the current behavior, and just better documentation would have helped me avoid it.

@matthieugomez
Copy link
Contributor

I stumbled on the same behavior when creating an array of set

a = fill(Set{Int}(), 2)
push!(a[1], 1)
a
#2-element Array{Set{Int64},1}:
# Set{Int64}({1})
# Set{Int64}({1})

I think the current behavior will lead to unpleasant experiences / silent errors in user codes.

@matthieugomez
Copy link
Contributor

I used fill in my code only because Array returns undefined refs rather than empty sets. A solution may be for Array to return the outcome of the empty constructor if available, so that one can do

a = Array(Set{Int}, 2)
push!(a[1], 1)

This may be linked to #9147

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants