Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

check lengths in covector-vector products #36679

Merged
merged 12 commits into from
Jul 22, 2020

Conversation

MasonProtter
Copy link
Contributor

closes #36678

@MasonProtter
Copy link
Contributor Author

MasonProtter commented Jul 15, 2020

From a Slack discussion @pablosanjose on this, in case anyone wants to chime in while we're fixing things with indexing:

Pablo says

Currently we have

julia> o = OffsetArray(collect(1:10), 11:20); v = collect(1:10); o' * v
385

but

julia> o = OffsetArray(rand(3,3), 2:4, 1:3); v = rand(3); o * v
ERROR: ArgumentError: offset arrays are not supported but got an array with index other than 1

Should we be checking to see if the two objects have different index styles as well as whether they have different lengths?

I say

I'm not really sure if that's desirable or not. I'd be tempted to say yes, the out of the box behaviour is the most sensible one and if OffsetArrays wants to disallow it, they can do so within the package

I suspect Base does that for mat-vec products because it makes the implementation simpler, not because it'd be nonsense to do so. In this case, handling an offset array doesn't really add any extra complication imo.

It does feel to me that contracting an offset array with a regular array is an indication that you're doing something wrong on a conceptual level, but I'm not really sure Base should enforce that or not if it doesn't make the implementation any easier.

Comment on lines +275 to +279
if lu == 0
zero(eltype(u)) * zero(eltype(v))
else
sum(uu*vv for (uu, vv) in zip(u, v))
end
Copy link
Member

@tkf tkf Jul 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are manually computing the zero length case anyway, sum(...; init = zero(eltype(u)) * zero(eltype(v))) might be easier on the compiler and (potentially) improve the performance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch.

Copy link
Contributor Author

@MasonProtter MasonProtter Jul 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, so this is actually problematic because it makes u' * v not work for any u or v for which zero(eltype(_)) isn't defined (i.e. things like arrays of arrays.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to undo the suggested change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now, this reads basically exactly (up to the * vs dot) as:

function dot(x::AbstractArray, y::AbstractArray)
lx = length(x)
if lx != length(y)
throw(DimensionMismatch("first array has length $(lx) which does not match the length of the second, $(length(y))."))
end
if lx == 0
return dot(zero(eltype(x)), zero(eltype(y)))
end
s = zero(dot(first(x), first(y)))
for (Ix, Iy) in zip(eachindex(x), eachindex(y))
@inbounds s += dot(x[Ix], y[Iy])
end
s
end

which I think is good.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

zero(eltype(_)) isn't defined (i.e. things like arrays of arrays.

Actually I feel like I did a similar suggestion before here that was ended up useless exactly due to this... I should have learned from a mistake.

stdlib/LinearAlgebra/src/adjtrans.jl Outdated Show resolved Hide resolved
@dkarrasch
Copy link
Member

@MasonProtter You'll need to rename the nonrecursive functions also at their call sites. 😜

@marius311
Copy link
Contributor

Thanks for looking at this so quickly! One super minor thing, but you might consider using something other than nonrecursive in that function name, b/c while I know what you mean in terms of the function not recursing into dot immedaitely, the mathematical operation is still "recursive", i.e. [[1.,2]]' * [[3,4]] still gives 11. Seeing the word "non-recursive" there (and in the title of the original PR fwiw) really threw me off intially on if something fundamental about how Julia handles dot products had changed (which it hasnt).

I also wonder whether you might throw in to this PR my other suggestion about using instead:

*(u::AdjointAbsVec{<:Number}, v::AbstractVector{<:Number})

@dkarrasch
Copy link
Member

the mathematical operation is still "recursive", i.e. [[1.,2]]' * [[3,4]] still gives 11

It isn't:

julia> v = [rand(2,2) for _ in 1:2]
2-element Array{Array{Float64,2},1}:
 [0.4274917728716121 0.8685061037888708; 0.8947672491011569 0.4527571728561508]
 [0.20207963479107605 0.8146694999967596; 0.07323308037672915 0.5415494652667667]

julia> v'*v
2×2 Array{Float64,2}:
 1.02956   0.980679
 0.980679  1.91625

julia> dot(v,v)
2.9458110362426355

And that is a fundamental change, because we do change from recursive reduction to a non-recursive one, since the recursive one was agreed to be inconsistent behavior. We have dot for the recursive reduction.

@marius311
Copy link
Contributor

marius311 commented Jul 16, 2020

Right, good point, thanks. I took a stab at writing a News.md item that would have helped me understand all of this more quickly at the beginning (I have not been following any of this discussion). Feel free to use it if its right:

  • x'*y where x and y are AbstractVectors is now semantically equivalent to sum(vx'*vy for (vx,vy) in zip(x,y)). dot(x,y) remains semantically equivalent to sum(dot(vx,vy) for (vx,vy) in zip(x,y)) (previously, both cases used dot to recurse, so only the behavior of x'*y is changed to be more consistent).

Note the "is semantically equivalent" is taken straight out of the docstring for dot. It might not hurt to mention this there too.

@MasonProtter
Copy link
Contributor Author

@MasonProtter You'll need to rename the nonrecursive functions also at their call sites. 😜

🤦 I should stop trying to quickly commit things right before bed, sorry about that.

Right, good point, thanks. I took a stab at writing a News.md item that would have helped me understand all of this more quickly at the beginning (I have not been following any of this discussion).

Honestly, I think this is more of a bug-fix than a change in intended semantics. Note that this makes u' * v behave like transpose(conj(u)) * v. From that perspective, I'm not sure it really needs a news item, but I don't know.

@StefanKarpinski StefanKarpinski added this to the 1.5 milestone Jul 16, 2020
@marius311
Copy link
Contributor

The main motivation for me for a News item is that as it stands, anyone that has defined a custom vector type and implemented a custom dot(x::MyVector, y::MyVector) with the expectation that for scalar eltype's this would also make x' * y work, will now need to explicitly implement x' * y, since the default x' * y even for scalar eltypes no longer always uses dot. I just wanted to highlight this for others so they don't have to dig through the code these PRs like I did to understand why.

An alternative to a news item might be just to make scalar eltype's always use dot, which you could do I think with the thing I was suggesting, *(u::AdjointAbsVec{<:Number}, v::AbstractVector{<:Number}).

@MasonProtter
Copy link
Contributor Author

CI failures appear to be unrelated. @dkarrasch @tkf could you re-review? This is important to get in for 1.5.

Co-authored-by: Daniel Karrasch <daniel.karrasch@posteo.de>
@dkarrasch dkarrasch added the merge me PR is reviewed. Merge when all tests are passing label Jul 17, 2020
Comment on lines +283 to 284
*(u::AdjointAbsVec{<:Number}, v::AbstractVector{<:Number}) = dot(u.parent, v)
*(u::TransposeAbsVec{T}, v::AbstractVector{T}) where {T<:Real} = dot(u.parent, v)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding that change in. Any reason not to do the equivalent thing in 284 as well?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I merged them, and now it looks very clean, actually. 😄

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's causing method ambiguities that I'm not in the mood now of resolving. I've reverted that. I think the *(::TransVector,::AbstractVector) is handled somewhere in matmul.jl.

@codecov-commenter
Copy link

Codecov Report

Merging #36679 into master will decrease coverage by 0.00%.
The diff coverage is 68.42%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #36679      +/-   ##
==========================================
- Coverage   86.12%   86.12%   -0.01%     
==========================================
  Files         350      350              
  Lines       65894    65917      +23     
==========================================
+ Hits        56752    56768      +16     
- Misses       9142     9149       +7     
Impacted Files Coverage Δ
base/abstractarray.jl 86.94% <ø> (ø)
base/reducedim.jl 97.19% <ø> (ø)
base/reflection.jl 87.66% <ø> (ø)
base/compiler/tfuncs.jl 85.48% <52.00%> (-0.88%) ⬇️
base/channels.jl 95.83% <100.00%> (+0.02%) ⬆️
base/file.jl 79.00% <100.00%> (ø)
base/loading.jl 82.59% <100.00%> (ø)
base/multidimensional.jl 86.57% <100.00%> (+0.01%) ⬆️
base/stream.jl 79.55% <100.00%> (+0.13%) ⬆️
stdlib/LinearAlgebra/src/lu.jl 97.97% <100.00%> (ø)
... and 13 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6f62363...f95c89b. Read the comment docs.

Comment on lines +270 to +275
function _dot_nonrecursive(u, v)
lu = length(u)
if lu != length(v)
throw(DimensionMismatch("first array has length $(lu) which does not match the length of the second, $(length(v))."))
end
if lu == 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re OffsetArrays support, I think it's enough to check axes like this?

Suggested change
function _dot_nonrecursive(u, v)
lu = length(u)
if lu != length(v)
throw(DimensionMismatch("first array has length $(lu) which does not match the length of the second, $(length(v))."))
end
if lu == 0
function _dot_nonrecursive(u::AbstractVecOrMat, v::AbstractVecOrMat)
if !(axes(u, 1) == axes(v, 2) && axes(u, 2) == axes(v, 1))
throw(DimensionMismatch("dimensions of the first array $(axes(u)) and the second array $(axes(v)) are incompatible for a dot product."))
end
if isempty(u)

I think this is a kind of error that would have thrown if we use here promote_shape(u', v) (with a better error message). Since something like eachindex(a, b) already throws an error with something like eachindex(1:3, OffsetArray(1:3, -2:0)) simply based on axes, maybe it makes sense to check axes here?

Copy link
Contributor Author

@MasonProtter MasonProtter Jul 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, we could just do

@inbounds sum(u[i]*v[i] for i in eachindex(u, v))

However, I'm still not actually convinced it's even desirable to throw an error here. I'm also not convinced we shouldn't throw an error, but I will point out that that in the current state of affairs transpose(u::OffsetVector) * v::Vector currently has the same behaviour as the implementation in this PR (i.e. no error).

I feel like it'd be good at least for 1.5 since it's so near, to just follow the lead of transpose(u) * v rather than being over eager about throwing errors.

julia> using OffsetArrays

julia> let u = OffsetArray([1,2,3], -1:1), v = [1,2,3]
           transpose(u) * v
       end
14

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm rather inclined towards throwing an error, as in the matrix-vector case. If the user really wants to contract an o::OffsetArray with a v::Vector, it is not complicated to do OffsetArrays.no_offset_view(o)' * v. Perhaps no_offset_view should be exported.

@dkarrasch dkarrasch removed the merge me PR is reviewed. Merge when all tests are passing label Jul 21, 2020
@JeffBezanson
Copy link
Member

What's the status here? Would be good to get this resolved very soon.

@dkarrasch
Copy link
Member

What's the status here?

The remaining question is whether we should require that the two vectors have equal axes. Most functionality in LinearAlgebra does require one-based indexing and equal lengths, which seems to imply equal axes, but currently there is no such check here. So, adding that requirement (as suggested by consistency) means making something throw that currently doesn't.

@MasonProtter
Copy link
Contributor Author

Im pretty wary of the idea of throwing an error here since this change will have so little time for someone to try it out and complain if it causes a problem. I think being as conservative as possible, just fixing the bug caused from unequal length vectors is the right way to go.

@timholy
Copy link
Member

timholy commented Jul 22, 2020

To me it seems clear this should be an error; the only other alternative I can see would be that

julia> a = [1, 3]
2-element Array{Int64,1}:
 1
 3

julia> b = OffsetArray([3, 7], 2:3)
2-element OffsetArray(::Array{Int64,1}, 2:3) with eltype Int64 with indices 2:3:
 3
 7

should yield a'*b == 9. I do not favor this solution, however.

Right before releasing Juila 1 I tried to go through the entire code base and at least make stuff fragile (meaning, throw an error eagerly) with respect to non-1 indices if no one had yet gotten around to generalizing the operation. Looks like this is a case that got missed?

@JeffBezanson
Copy link
Member

Right now, the most important thing is to get something backportable, so I agree with Mason we should just fix the thing that changed in 1.5 and add further checks separately.

@timholy
Copy link
Member

timholy commented Jul 22, 2020

Agreed, what we do immediately does not have to be where we are heading.

@MasonProtter
Copy link
Contributor Author

MasonProtter commented Jul 22, 2020

Then I think this PR is ready to merge unless there are any last minute objections.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Dimension checking for vector' * vector is broken