Define dot product between vectors of arrays/numbers #22220

Closed
Conversation

@dalum (Contributor) commented Jun 5, 2017

This is useful in quantum physics and potentially other fields, where equations containing sums of operators (matrices) can often be simplified by writing them as vector dot products instead. By allowing dot to work for vectors of matrices and numbers, it is possible to achieve a more direct correspondence between mathematical notation and the corresponding program. The application which motivated this PR was the Pauli vector, which is often dotted with a magnetic field (B, a "normal" vector) to produce a sum of matrices:

σ = [[0 1; 1 0], [0 -im; im 0], [1 0; 0 -1]]
B = [1, 1, 1]
σ⋅B == σ[1] * B[1] + σ[2] * B[2] + σ[3] * B[3]

The reason for adding dot(::AbstractArray, ::AbstractArray) with conjugate transposition of the first argument is to preserve the correspondence between complex conjugation and conjugate transposition of numbers and matrices, where hermitian matrices are analogous to real numbers and antihermitian are analogous to imaginary numbers.
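A minimal sketch of the proposed fallback (the name `mydot` is illustrative; the PR defines this behaviour for `dot` itself):

```julia
# Sum of element products, with the first argument's elements adjointed.
# Works when the elements are matrices, numbers, or a mix of both.
mydot(x::AbstractVector, y::AbstractVector) =
    sum(x[i]' * y[i] for i in eachindex(x))

σ = [[0 1; 1 0], [0 -im; im 0], [1 0; 0 -1]]   # Pauli matrices
B = [1, 1, 1]

# Because the Pauli matrices are Hermitian, the adjoint in mydot is a no-op
# here and the result equals the plain sum of products:
mydot(σ, B) == σ[1]*B[1] + σ[2]*B[2] + σ[3]*B[3]   # true
```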

@dalum (Contributor, Author) commented Jun 6, 2017

@TotalVerb But isn't it the case that the direct product of a 3 by 1 matrix with a 2 by 2 matrix is a 6 by 2 matrix? Multiplying this with a 2 by 6 matrix from the left should give a 2 by 2 matrix, not a number.

I am not arguing that this should be the behaviour of dot now, but of *.

@TotalVerb (Contributor):

I don't understand what you mean about *. Could you give an example?

@dalum (Contributor, Author) commented Jun 6, 2017

julia> A = [rand(2, 2) for _ in 1:3]
3-element Array{Array{Float64,2},1}:
 [0.684755 0.372397; 0.165985 0.0307143]
 [0.60091 0.894754; 0.0742568 0.532859] 
 [0.76396 0.898669; 0.159446 0.626533]  

julia> A' * A
3.896401569691669

julia> A' * A == sum(dot(Ai, Ai) for Ai in A) == dot(A, A)
true

julia> A' * A == sum(Ai' * Ai for Ai in A)
false

@TotalVerb (Contributor):

I have to agree with you that A' * A returning a scalar is strange because it does recursively call dot, despite there being no indication of it from the syntax. Hmm. Maybe it should be changed to return a matrix.

@stevengj (Member) commented Jun 6, 2017

x'*x returns a scalar if x is a vector (this was changed after long discussion in JuliaLang/LinearAlgebra.jl#42) since x' is treated as an element of the dual space. If x is a matrix, in contrast, x' is treated as the adjoint of a linear operator and x'*x gives a matrix. If x is a vector of matrices, it makes sense to me that it be treated as a vector, since it can't act as a linear operator.

In any case, that seems like a somewhat separate issue?
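The distinction being described can be seen directly at the REPL (standard Julia behaviour, nothing new):

```julia
# For a plain vector, x' is a dual vector, so x'*x pairs down to a scalar:
x = rand(3)
x' * x            # a scalar (the inner product)

# For a matrix, A' is the adjoint of a linear operator, so A'*A composes
# two operators and yields a matrix:
A = rand(3, 3)
A' * A            # a 3×3 matrix
```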

@dalum (Contributor, Author) commented Jun 6, 2017

I agree that x' * x should return a number, when x is a vector of numbers. But not when x is a vector of operators.

The notion of a vector of linear operators is, I would argue, a well-established concept in physics, and probably in applied mathematics too. The divergence is exactly an operator with the structure of a row vector, and the dot notation is used for it in mathematical contexts precisely because of this similarity. The fact that it returns a real number when acting on a real vector field is coincidental: it would in general return a complex number for a complex vector field, a quaternion for a quaternion vector field, or a matrix for a matrix vector field.

Now, as pointed out by @andreasnoack in #22240, the fact that dot(::Matrix, ::Matrix) works at all right now is a bug, and thus so is the fact that x' * x returns a scalar when x is a Vector{Matrix{Number}}.

@stevengj (Member) commented Jun 6, 2017

@eveydee, the problem with interpreting a vector of matrices as a linear operator is that this is inconsistent with how Julia's * works:

julia> A = [rand(2,2) for i = 1:4]
4-element Array{Array{Float64,2},1}:
 [0.366133 0.289976; 0.123663 0.41303] 
 [0.737389 0.398943; 0.102958 0.326483]
 [0.737347 0.742327; 0.868316 0.846149]
 [0.410302 0.429769; 0.474719 0.750022]

julia> A * [3,4]
ERROR: DimensionMismatch("Cannot multiply two vectors")
Stacktrace:
 [1] *(::Array{Array{Float64,2},1}, ::Array{Int64,1}) at ./linalg/rowvector.jl:184

If A were interpreted as a vector of linear operators (and I totally agree that this notation is possible), then A*x would give [a*x for a in A] here.

How the adjoint works really has to be consistent with how * and dot work. The key (defining!) property is that we must have dot(x,A*y) == dot(A'*x, y) for any cases where A' is treated as the adjoint of a linear operator.
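For ordinary numeric arrays the defining property can be checked numerically (an illustrative sketch; `≈` rather than `==` because of floating-point roundoff):

```julia
# Verify dot(x, A*y) == dot(A'*x, y) for a random real matrix and vectors.
A = rand(3, 3)
x = rand(3)
y = rand(3)

dot(x, A * y) ≈ dot(A' * x, y)   # holds up to roundoff
```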

@andreasnoack (Member):

@eveydee, the problem with interpreting a vector of matrices as a linear operator is that this is inconsistent with how Julia's * works

That is because we have been sloppy when implementing * and it should most likely be considered a bug.

This is Julia 0.3, by the way

julia> x = [randn(2,2) for i in 1:3]
3-element Array{Array{Float64,2},1}:
 2x2 Array{Float64,2}:
 0.491661    -0.697108
 0.00846169  -0.195587
 2x2 Array{Float64,2}:
  0.242481   0.648218
 -0.567836  -0.224404  
 2x2 Array{Float64,2}:
 0.729065  -0.90256
 0.799737   2.11797      

julia> dot(x,x)
2x2 Array{Float64,2}:
 -0.263727  -2.7643
  2.26912    3.4786

and it got changed accidentally as mentioned in #22240. As dot works now, it is effectively vecdot. I'm proposing that dot and vecdot do different things and then you can pick the one that works for your problem. I don't really see what is non-generic about vecdot.

It is not that returning a matrix doesn't work. I already mentioned the limitation in #22220 (comment), but in most cases it just works. E.g., here is Gram–Schmidt:

julia> Base.:(*)(x::Vector{<:Array}, α::Array) = [xx*α for xx in x]

julia> mydot(x,y) = sum(xx'yy for (xx,yy) in zip(x,y))
mydot (generic function with 1 method)

julia> function myGS(V::Vector{Vector{Matrix{Float64}}})
                U = similar(V)
                U[1] = V[1]
                for j in 2:length(V)
                  U[j] = V[j] - sum(U[i]*inv(mydot(U[i], U[i]))*mydot(U[i], V[j]) for i in 1:j-1)
                end
                return U
              end
myGS (generic function with 1 method)

julia> V = [[randn(2,2) for i in 1:3] for i in 1:3];

julia> U = myGS(V);

julia> mydot(U[1], U[2])
2×2 Array{Float64,2}:
 3.60822e-16   0.0
 1.11022e-15  -1.77636e-15

julia> mydot(U[1], U[3])
2×2 Array{Float64,2}:
 -2.22045e-16  4.71845e-16
  8.88178e-16  7.77156e-16

julia> mydot(U[2], U[3])
2×2 Array{Float64,2}:
  1.11022e-15  -2.15106e-16
 -8.88178e-16  -7.63278e-17

@dalum (Contributor, Author) commented Jun 6, 2017

@stevengj I understand what you are getting at, but your example highlights a completely different use-case, which I don't think has a good solution. This is not at all what was intended to be achieved, and I wouldn't expect people to assume that to work in any instance.

This PR/issue is about dotting vectors in spaces of equal dimensions, i.e., achieving compact notation by writing sums of (matrix) products as dot products of vectors. And it looks like things worked that way in Julia 0.3, based on @andreasnoack's comment.

@stevengj (Member) commented Jun 6, 2017

The basic question is whether one sees ' as the purely mechanical operation of "swapping rows and columns and conjugating" or as an adjoint operation reflecting a deeper notion of algebra. With RowVector I thought we were moving in the latter direction, which requires *, dot, and ' to have certain relationships.

@andreasnoack, the problem with myGS is that inv(mydot(U[i], U[i])) may fail because this is not an inner product and may not return something invertible for nonzero U[i].
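A concrete instance of this failure mode, using the mydot defined earlier in the thread: the vector U below is nonzero, yet its Gram block is singular.

```julia
mydot(x, y) = sum(xx'yy for (xx, yy) in zip(x, y))

# A nonzero vector of matrices whose Gram block is singular: both elements
# annihilate the second basis vector, so their A'A contributions do too.
U = [[1.0 0.0; 0.0 0.0], [1.0 0.0; 0.0 0.0]]
G = mydot(U, U)    # [2.0 0.0; 0.0 0.0] — rank deficient

# inv(G) throws SingularException, so the Gram–Schmidt step breaks down
# even though U is not the zero vector.
```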

@TotalVerb (Contributor) commented Jun 6, 2017

@stevengj To me, at least one of these behaviors ought to be wrong:

julia> A = [rand(2, 2) for _ in 1:3]
3-element Array{Array{Float64,2},1}:
 [0.843197 0.790402; 0.293537 0.717]   
 [0.842407 0.207064; 0.685534 0.616951]
 [0.679417 0.632756; 0.462531 0.390315]

julia> B = [rand(2, 2) for _ = 1:3, __ = 1:3]
3×3 Array{Array{Float64,2},2}:
 [0.77686 0.873093; 0.518896 0.314283]     [0.997188 0.951079; 0.00939076 0.37915]
 [0.418401 0.300341; 0.892219 0.776044]     [0.64111 0.536536; 0.0546541 0.607846] 
 [0.352997 0.667995; 0.242118 0.931147]     [0.352873 0.416895; 0.448481 0.515657] 

julia> A' * B
1×3 RowVector{Array{Float64,2},ConjArray{Array{Float64,2},1,Array{Array{Float64,2},1}}}:
 [2.12329 2.49799; 1.94104 2.24253]    [1.86831 2.30368; 1.35971 1.97476]

julia> A' * B[:,1]
4.365816195613023

If we want A' to be the adjoint in this case, then we need to fix matrix multiplication using it. Furthermore, collect(A') returns something that is not the adjoint of this vector, but instead a 1×3 matrix of matrices, which cannot be the adjoint. From these it looks like both covector-matrix multiplication and collect are interpreting it as a vector of operators, which is inconsistent with dot and covector-vector multiplication.

@stevengj (Member) commented Jun 6, 2017

@TotalVerb, neither of those behaviors seem wrong if we interpret A as simply an element of a vector space (not a linear operator since A*vector isn't defined), in which case A' is an element of the dual space (not an adjoint operator).

This is no different from a = rand(3); B = rand(3,3) where a'*B gives a row vector and a'*B[1,:] gives a scalar.

@dalum (Contributor, Author) commented Jun 6, 2017

Why must the scalar be a number? If you have a matrix field, M and construct a vector in M^3, then your scalar is a matrix, not a number. Looking at Wikipedia, this seems to be referred to as a module, but I'm not familiar with the terminology here: https://en.wikipedia.org/wiki/Scalar_(mathematics)#Scalars_in_modules

@TotalVerb (Contributor) commented Jun 6, 2017

@stevengj Note that the result of A' * B is a RowVector of matrices. If A' is an element of the dual space, then it should be a RowVector of scalars.

I suppose that what may be wrong here is again the printing of A' * B; it and collect are certainly printed in a misleading way, because the behavior of this adjoint is far from a "row vector of matrices" and should not be printed or collected that way.

@TotalVerb (Contributor) commented Jun 6, 2017

In other words, perhaps more clearly, to me there is a semantics difference between

A :: Matrix{Float64}
B :: Matrix{Float64}

[A, B]' == [A', B'] <as RowVector>

and

A :: Matrix{Float64}
B :: Matrix{Float64}

[A, B]' == [Dual(A), Dual(B)] <as RowVector>

The Dual operation is recursive (by that I mean, upon collection the elements of the original vector are themselves dual-ed), but the thing that recurses is not the adjoint operator x', because the behavior of that operator is incorrect on matrices (it does not return the dual as if x were simply a vector).
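A hypothetical sketch of that distinction (the names `Dual` and `pair` are illustrative only, not Julia Base API): a wrapper marking an element of the dual space, whose pairing with an element of the original space is the full elementwise inner product rather than the matrix adjoint.

```julia
# Marks x as an element of the dual space, without transposing it.
struct Dual{T<:AbstractMatrix}
    x::T
end

# Pairing ⟨Dual(A), B⟩: iterate both matrices elementwise and reduce to a
# scalar, i.e. the Frobenius inner product — unlike A' * B, which composes
# A and B as operators and returns a matrix.
pair(d::Dual, B::AbstractMatrix) = sum(conj(a) * b for (a, b) in zip(d.x, B))
```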

@stevengj (Member) commented Jun 6, 2017

@eveydee, scalars don't have to be numbers, but arbitrary square matrices don't form a field (or skew field) because nonzero matrices aren't necessarily invertible. And we can't tell from the type Matrix{Number} whether the matrices are even square, much less invertible. So, given the type information alone, you cannot infer that Vector{Matrix{Number}} is a vector space over anything other than Number.

@stevengj (Member) commented Jun 6, 2017

@TotalVerb, the point is that x' is ambiguous. It can mean an adjoint operator or a dual vector. This is a fact of life in mathematics. We are left with how to resolve it. Saying that it is the adjoint operator for x::Matrix and the dual vector for x::Vector, regardless of the element type, is mostly what we do now, and is what I have been advocating based on how we define *.

@StefanKarpinski (Member):

Square matrices do form a ring, however, and matrices of matrices can be viewed as forming a module (c.f. vector space) with a "scalar" – i.e. ring – elements that are matrices themselves.

@TotalVerb (Contributor):

@stevengj I understand that. But when ' is resolved to be a dual, the elements of that dual vector (as from collect) should also be dual elements, not adjoints as they currently are, no?

@stevengj (Member) commented Jun 6, 2017

@StefanKarpinski, but you can't infer from the type that the matrices are square, or of the same size, so why would you assume this interpretation if you saw a Vector{Matrix{Number}}?

@StefanKarpinski (Member):

I wouldn't – it would have to be implied by an operation on them, like a dot product.

@andreasnoack (Member) commented Jun 6, 2017

@andreasnoack, the problem with myGS is that inv(mydot(U[i], U[i])) may fail because this is not an inner product and may not return something invertible for nonzero U[i].

Hey, I pointed out that first. I'm just saying that it's not a big deal. I don't think the consumers of this feature would have a problem with the potential singularity here. (Matrix consumers are used to dealing with potential singularity and most matrices are full rank anyway).

The main point here is that defining dot(x::Vector,y::Vector) = sum(xx'yy for (xx,yy) in zip(x,y)) wouldn't restrict you, or at least you haven't told us in what way, since you can use vecdot. If you enforce that dot is like vecdot then we don't have a choice between two different useful behaviors.

@dalum (Contributor, Author) commented Jun 7, 2017

@stevengj If the matrices aren't of compatible sizes for a matrix product, you shouldn't be able to vecdot them, when the matrix inner product is defined as trace(X'Y) for X, Y matrices. In other words, when taking the dot products of Vector{Matrix{Number}}, the only case that the matrix inner product covers, which a matrix product does not, is when the vectors contain compatible matrices but of different sizes in each entry. This is because summing matrices of different sizes does not make sense, but summing the traces does:

julia> A = [rand(2, 2), rand(3, 2), rand(2, 3)]
julia> B = [rand(2, 2), rand(2, 3), rand(3, 2)]
julia> C = [rand(2, 2), rand(4, 1), rand(2, 3)]

julia> A'A
3.199407765200927
julia> sum(trace(a'a) for a in A) == A'A
true
julia> sum(a'a for a in A)  # The error here is the sum, not the matrix product
ERROR: DimensionMismatch("dimensions must match")

julia> A'B
5.432131012504026
julia> sum(trace(a'b) for (a,b) in zip(A,B)) == A'B
ERROR: DimensionMismatch("A has dimensions (3,2) but B has dimensions (3,2)")

julia> A'C
ERROR: DimensionMismatch("dot product arguments have lengths 6 and 4")

The fact that A'B even works here, I think is a bug. vecdot(A[2], B[2]) should be an error since A[2] and B[2] do not have compatible dimensions to do trace(A[2]'B[2]) (A[2]'B[2] is an error). It does work, however, and it returns a different result if we take the conjugate transpose of the first argument, which I don't think is the desired behaviour. One of these should fail:

julia> vecdot(A[2], B[2])
2.5458122251380035
julia> vecdot(A[2]', B[2])
2.358516933607439

A whole different issue is in the not so unlikely case that you want to take the kronecker product of the matrices. In quantum physics, this is the case for the spin orbit coupling, where you have a dot product between vectors of operators in angular momentum and spin spaces respectively. In this particular case the matrices are of different sizes, but for spin-spin couplings in general, the matrices may have identical sizes, so mere size-incompatibility cannot be used to infer which is the desired behaviour. I guess the correct way to treat this is to introduce a new krondot (kronecker dot) for this particular use case?
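A possible sketch for that use case (the name `krondot` is hypothetical, not an existing Julia function):

```julia
# Sum of Kronecker products instead of matrix products, as needed for
# couplings like L⋅S, where the factors act on different spaces.
krondot(x::AbstractVector, y::AbstractVector) =
    sum(kron(x[i], y[i]) for i in eachindex(x))

# Works even when the element sizes differ between the two vectors,
# e.g. 2×2 spin operators kroneckered with 3×3 orbital operators:
krondot([rand(2, 2) for _ in 1:3], [rand(3, 3) for _ in 1:3])  # a 6×6 matrix
```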

@andreasnoack (Member) commented Jan 16, 2018

JuliaLang/LinearAlgebra.jl#496 made me revisit this issue and I realized that I recently read a formalized argument for my view on dot(Vector{Matrix},Vector{Matrix}) in #22220 (comment).

For the, admittedly, few people interested: the (semi-)inner product in https://arxiv.org/pdf/0905.3509.pdf is like our dot, since it maps into the C*-algebra and not into ℂ. The corresponding norm used in the paper would then be

norm(x::Vector{<:AbstractMatrix}) = norm(dot(x,x))

So the version of Cauchy-Schwarz in the paper can be expressed as

julia> norm(x::Vector{<:Matrix}) = norm(dot(x,x))
norm (generic function with 18 methods)

julia> ≼(A::Matrix, B::Matrix) = all(eigvals(B - A) .>= 0)
≼ (generic function with 1 method)

julia> x = [rand(2,2) for i in 1:2]
2-element Array{Array{Float64,2},1}:
 [0.158424 0.747836; 0.541273 0.677574]
 [0.136556 0.956337; 0.468317 0.302034]

julia> y = [rand(2,2) for i in 1:2]
2-element Array{Array{Float64,2},1}:
 [0.454078 0.173277; 0.628935 0.317456]
 [0.270871 0.828004; 0.0367388 0.96777]

julia> (x⋅y)*(y⋅x) ≼ norm(y)^2*(x⋅x)
true

Labels: linear algebra