-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Define dot product between vectors of arrays/numbers #22220
Conversation
Make CoreIO thread-safer
Add a missing return in doInitialization
Load module on remote workers from master (v2)
turn on deprecation warnings for `type` and `immutable`
give a syntax error for repeated keyword args. fixes #16937
fix #21848, bug in widening done by limit_type_depth
fix #20575, syntax error for juxtaposing a string literal
Build fixes for ASAN
Use where syntax in umfpack.jl
downloads of nightlies
fix #21906, `ccall` causes unnecessary variable `Box`
Create show_supertypes function
Fix pool lookup function
This is undefined behavior in LLVM and can actually segfault on real hardware including x86 in certain cases. Also add missing endianess test since it is changed in this commit.
base/digits: allow negative bases
@TotalVerb But isn't it the case that the direct product of a 3 by 1 matrix with a 2 by 2 matrix is a 6 by 2 matrix? Multiplying this with a 2 by 6 matrix from the left should give a 2 by 2 matrix, not a number. I am not arguing that this should be the behaviour of |
I don't understand what you mean about |
julia> A = [rand(2, 2) for _ in 1:3]
3-element Array{Array{Float64,2},1}:
[0.684755 0.372397; 0.165985 0.0307143]
[0.60091 0.894754; 0.0742568 0.532859]
[0.76396 0.898669; 0.159446 0.626533]
julia> A' * A
3.896401569691669
julia> A' * A == sum(dot(Ai, Ai) for Ai in A) == dot(A, A)
true
julia> A' * A == sum(Ai' * Ai for Ai in A)
false |
I have to agree with you that |
In any case, that seems like a somewhat separate issue? |
I agree that The notion of a vector of linear operators is, I would argue, a commonly established concept in physics, and probably applied mathematics too. The divergence of a vector field is exactly an operator which has the property of being a row vector and where the dot notation is used in a mathematical context exactly because of its similarities. The fact that it returns a real number when acting on a real vector field is coincidental, as it would in general return a complex number for a complex vector field. Or a quarternion for a quarternion vector field. Or a matrix for a matrix vector field. Now, as pointed out by @andreasnoack in #22240, the fact that |
@eveydee, the problem with interpreting a vector of matrices as a linear operator is that this is inconsistent with how Julia's julia> A = [rand(2,2) for i = 1:4]
4-element Array{Array{Float64,2},1}:
[0.366133 0.289976; 0.123663 0.41303]
[0.737389 0.398943; 0.102958 0.326483]
[0.737347 0.742327; 0.868316 0.846149]
[0.410302 0.429769; 0.474719 0.750022]
julia> A * [3,4]
ERROR: DimensionMismatch("Cannot multiply two vectors")
Stacktrace:
[1] *(::Array{Array{Float64,2},1}, ::Array{Int64,1}) at ./linalg/rowvector.jl:184 If How the adjoint works really has to be consistent with how |
That is because we have been sloppy when implementing This is Julia 0.3, by the way julia> x = [randn(2,2) for i in 1:3]
3-element Array{Array{Float64,2},1}:
2x2 Array{Float64,2}:
0.491661 -0.697108
0.00846169 -0.195587
2x2 Array{Float64,2}:
0.242481 0.648218
-0.567836 -0.224404
2x2 Array{Float64,2}:
0.729065 -0.90256
0.799737 2.11797
julia> dot(x,x)
2x2 Array{Float64,2}:
-0.263727 -2.7643
2.26912 3.4786 and it got changed accidentally as mentioned in #22240. As It is not that returning a matrix doesn't work. I already mentioned the limitation in #22220 (comment) but in most cases, it just works. E.g. here is Gram-Schmidt julia> Base.:(*)(x::Vector{<:Array}, α::Array) = [xx*α for xx in x]
julia> mydot(x,y) = sum(xx'yy for (xx,yy) in zip(x,y))
mydot (generic function with 1 method)
julia> function myGS(V::Vector{Vector{Matrix{Float64}}})
U = similar(V)
U[1] = V[1]
for j in 2:length(V)
U[j] = V[j] - sum(U[i]*inv(mydot(U[i], U[i]))*mydot(U[i], V[j]) for i in 1:j-1)
end
return U
end
myGS (generic function with 1 method)
julia> V = [[randn(2,2) for i in 1:3] for i in 1:3];
julia> U = myGS(V);
julia> mydot(U[1], U[2])
2×2 Array{Float64,2}:
3.60822e-16 0.0
1.11022e-15 -1.77636e-15
julia> mydot(U[1], U[3])
2×2 Array{Float64,2}:
-2.22045e-16 4.71845e-16
8.88178e-16 7.77156e-16
julia> mydot(U[2], U[3])
2×2 Array{Float64,2}:
1.11022e-15 -2.15106e-16
-8.88178e-16 -7.63278e-17 |
@stevengj I understand what you are getting at, but your example highlights a completely different use-case, which I don't think has a good solution. This is not at all what was intended to be achieved, and I wouldn't expect people to assume that to work in any instance. This PR/issue is about dotting vectors in spaces of equal dimensions, i. e., achieving compact notation by writing sums of (matrix) products as dot products of vectors. And it looks like things worked that way in Julia 0.3, based on @andreasnoack's comment. |
The basic question is whether one sees @andreasnoack, the problem with |
@stevengj To me, at least one of these behaviors ought to be wrong: julia> A = [rand(2, 2) for _ in 1:3]
3-element Array{Array{Float64,2},1}:
[0.843197 0.790402; 0.293537 0.717]
[0.842407 0.207064; 0.685534 0.616951]
[0.679417 0.632756; 0.462531 0.390315]
julia> B = [rand(2, 2) for _ = 1:3, __ = 1:3]
3×3 Array{Array{Float64,2},2}:
[0.77686 0.873093; 0.518896 0.314283] … [0.997188 0.951079; 0.00939076 0.37915]
[0.418401 0.300341; 0.892219 0.776044] [0.64111 0.536536; 0.0546541 0.607846]
[0.352997 0.667995; 0.242118 0.931147] [0.352873 0.416895; 0.448481 0.515657]
julia> A' * B
1×3 RowVector{Array{Float64,2},ConjArray{Array{Float64,2},1,Array{Array{Float64,2},1}}}:
[2.12329 2.49799; 1.94104 2.24253] … [1.86831 2.30368; 1.35971 1.97476]
julia> A' * B[:,1]
4.365816195613023 If we want |
@TotalVerb, neither of those behaviors seem wrong if we interpret This is no different from |
Why must the scalar be a number? If you have a matrix field, |
@stevengj I suppose that what may be wrong here is again the printing of |
In other words, perhaps more clearly, to me there is a semantics difference between
and
The |
@eveydee, scalars don't have to be numbers, but arbitrary square matrices don't form a field (or skew field) because nonzero matrices aren't invertible. And we can't tell from the type |
@TotalVerb, the point is that |
Square matrices do form a ring, however, and matrices of matrices can be viewed as forming a module (c.f. vector space) with a "scalar" – i.e. ring – elements that are matrices themselves. |
@stevengj I understand that. But when ' is resolved to be a dual, the elements of that dual vector (as from |
@StefanKarpinski, but you can't infer from the type that the matrices are square, or of the same size, so why would you assume this interpretation if you saw a |
I wouldn't – it would have to be implied by an operation on them, like a dot product. |
Hey, I pointed out that first. I'm just saying that it's not a big deal. I don't think the consumers of this feature would have a problem with the potential singularity here. (Matrix consumers are used to dealing with potential singularity and most matrices are full rank anyway). The main point here is that defining |
@stevengj If the matrices aren't of compatible sizes for a matrix product, you shouldn't be able to julia> A = [rand(2, 2), rand(3, 2), rand(2, 3)]
julia> B = [rand(2, 2), rand(2, 3), rand(3, 2)]
julia> C = [rand(2, 2), rand(4, 1), rand(2, 3)]
julia> A'A
3.199407765200927
julia> sum(trace(a'a) for a in A) == A'A
true
julia> sum(a'a for a in A) # The error here is the sum, not the matrix product
ERROR: DimensionMismatch("dimensions must match")
julia> A'B
5.432131012504026
julia> sum(trace(a'b) for (a,b) in zip(A,B)) == A'B
ERROR: DimensionMismatch("A has dimensions (3,2) but B has dimensions (3,2)")
julia> A'C
ERROR: DimensionMismatch("dot product arguments have lengths 6 and 4") The fact that julia> vecdot(A[2], B[2])
2.5458122251380035
julia> vecdot(A[2]', B[2])
2.358516933607439 A whole different issue is in the not so unlikely case that you want to take the kronecker product of the matrices. In quantum physics, this is the case for the spin orbit coupling, where you have a dot product between vectors of operators in angular momentum and spin spaces respectively. In this particular case the matrices are of different sizes, but for spin-spin couplings in general, the matrices may have identical sizes, so mere size-incompatibility cannot be used to infer which is the desired behaviour. I guess the correct way to treat this is to introduce a new |
JuliaLang/LinearAlgebra.jl#496 made me revisit this issue and I realized that I recently read a formalized argument for my view on For the, admittedly, few people interested, the (semi) inner product in https://arxiv.org/pdf/0905.3509.pdf is like our norm(x::Vector{<:AbstractMatrix}) = norm(dot(x,x)) So the version of Cauchy-Schwarz in the paper can be expressed as julia> norm(x::Vector{<:Matrix}) = norm(dot(x,x))
norm (generic function with 18 methods)
julia> (⩽)(A::Matrix,B::Matrix)= all(eigvals(B - A) .>= 0)
⩽ (generic function with 1 method)
julia> x = [rand(2,2) for i in 1:2]
2-element Array{Array{Float64,2},1}:
[0.158424 0.747836; 0.541273 0.677574]
[0.136556 0.956337; 0.468317 0.302034]
julia> y = [rand(2,2) for i in 1:2]
2-element Array{Array{Float64,2},1}:
[0.454078 0.173277; 0.628935 0.317456]
[0.270871 0.828004; 0.0367388 0.96777]
julia> (x⋅y)*(y⋅x) ⩽ norm(y)^2*(x⋅x)
true |
This is useful in quantum physics and potentially other fields, where equations containing sums of operators (matrices) can often be simplified by writing them as vector dot products instead. By allowing
dot
to work for vectors of matrices and numbers, it is possible to achieve a more direct correspondence between mathematical notation and a corresponding program. The application which motivated this PR was the Pauli vector which is often dotted with a magnetic field (B
, a "normal" vector) to produce a sum of matrices:The reason for adding
dot(::AbstractArray, ::AbstractArray)
with conjugate transposition of the first argument is to preserve the correspondence between complex conjugation and conjugate transposition of numbers and matrices, where hermitian matrices are analogous to real numbers and antihermitian are analogous to imaginary numbers.