Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check that the matrix of (Sq)Mahalanobis is positive-definite? #174

Closed
devmotion opened this issue Aug 21, 2020 · 4 comments · Fixed by #221
Closed

Check that the matrix of (Sq)Mahalanobis is positive-definite? #174

devmotion opened this issue Aug 21, 2020 · 4 comments · Fixed by #221

Comments

@devmotion
Copy link
Member

Currently, upon creation of (Sq)Mahalanobis(Q) it is not checked if Q is actually positive (semi-)definite, and hence in particular symmetric, as shown in the following example:

julia> x = rand(2);

julia> sqmahalanobis(x, -x, Matrix(-I, 2, 2))
-1.0767684441054513

julia> mahalanobis(x, -x, Matrix(-I, (2, 2)))
ERROR: DomainError with -1.0767684441054513:
sqrt will only return a complex result if called with a complex argument. Try sqrt(Complex(x)).
Stacktrace:
 [1] throw_complex_domainerror(::Symbol, ::Float64) at ./math.jl:33
 [2] sqrt at ./math.jl:573 [inlined]
 [3] Mahalanobis at /home/david/.julia/dev/Distances/src/mahalanobis.jl:88 [inlined]
 [4] mahalanobis(::Array{Float64,1}, ::Array{Float64,1}, ::Array{Int64,2}) at /home/david/.julia/dev/Distances/src/mahalanobis.jl:91
 [5] top-level scope at REPL[33]:1

However, the implementation of pairwise!(::SqMahalanobis, X, Y) is based on the identity (x-y)'*Q*(x-y) = x'*Q*x + y'*Q*y - x'*Q*y - y'*Q*x = x'*Q*x + y'*Q*y - 2 * x'*Q*y which holds only if Q is symmetric.
This leads to the surprising behaviour

julia> X = rand(2, 3);

julia> Y = rand(2, 4);

julia> Q = rand(2, 2);

julia> pairwise(SqMahalanobis(Q), X, Y; dims=2)
3×4 Array{Float64,2}:
 -0.0807376   0.0288264  -0.0246195  0.00500487
 -0.00275356  0.0291969   0.088852   0.0368224
 -0.0131345   0.0303736   0.0619867  0.0329054

julia> pairwise(SqMahalanobis(Array(Q')), X, Y; dims=2)
3×4 Array{Float64,2}:
 0.0284595   -0.0490699  0.0981837   0.000274926
 0.00392172  -0.154261   0.0615561  -0.0740337
 0.0106349   -0.144796   0.0579041  -0.0660179

julia> map(Iterators.product(eachcol(X), eachcol(Y))) do (x, y)
           z = x - y
           dot(z, Q, z)
       end
3×4 Array{Float64,2}:
 -0.026139     -0.0101218  0.0367821   0.0026399
  0.000584081  -0.0625319  0.0752041  -0.0186056
 -0.0012498    -0.0572112  0.0599454  -0.0165563

julia> map(Iterators.product(eachcol(X), eachcol(Y))) do (x, y)
           z = x - y
           dot(z, Q', z)
       end
3×4 Array{Float64,2}:
 -0.026139     -0.0101218  0.0367821   0.0026399
  0.000584081  -0.0625319  0.0752041  -0.0186056
 -0.0012498    -0.0572112  0.0599454  -0.0165563

although

julia> sqmahalanobis(X[:, 1], Y[:, 1], Q)
-0.026139038969147838

julia> sqmahalanobis(X[:, 1], Y[:, 1], Array(Q'))
-0.026139038969147838

Hence, due to the missing check and the inconsistencies in pairwise!, I am a bit worried that users might not be aware of these assumptions and their impact on pairwise! and might silently obtain incorrect values (see also JuliaGaussianProcesses/KernelFunctions.jl#154 (comment)).

Maybe it would be best to check if the matrix of (Sq)Mahalanobis is positive-definite in the constructor? Or at least mention this assumption prominently in the README? Alternatively, one could neglect the mathematical motivation completely and make the results consistent by not assuming symmetry in the pairwise! computations, i.e., one could just calculate x'*Q*x + y'*Q*y - x'*Q*y - y'*Q*x.

@alanderos91
Copy link

I encountered this issue because I tried to take advantage of sqmahalanobis in evaluating several quadratic forms with Q symmetric but not necessarily positive definite. As demonstrated in the previous post, pairwise! will truncate negative entries resulting from x'*Q*x in that case. It's probably fair to say that's user error on my part.

A warning in the README or documentation would be much appreciated. For example,

For distances of which a major part of the computation is a quadratic form (e.g. Euclidean, CosineDist, Mahalanobis), the performance can be drastically improved by restructuring the computation and delegating the core part to GEMM in BLAS. The use of this strategy can easily lead to 100x performance gain over simple loops (see the highlighted part of the table above).
Note: The restructured computations rely on certain assumptions that may produce different behavior compared to the looped version.

  • Mahalanobis, SqMahalanobis: ...

I'm not sure about enforcing positive (semi)definiteness. Checks like isposdef may fail when Q is ill-conditioned and the caller is in a better position to evaluate that property based on context.

@devmotion
Copy link
Member Author

However, the implementation of pairwise!(::SqMahalanobis, X, Y) is based on the identity (x-y)'Q(x-y) = x'Qx + y'Qy - x'Qy - y'Qx = x'Qx + y'Qy - 2 * x'Qy which holds only if Q is symmetric.

As mentioned above, the computations in Distances are valid if Q is symmetric. In contrast to positive (semi-)definiteness, this should be easy to check.

@dkarrasch
Copy link
Member

As mentioned above, the computations in Distances are valid if Q is symmetric.

Not exactly: we do take the max with 0 in pairwise, so we do assume positive-definiteness (and in particular symmetry) silently. I think we should, at least, have a symmetry check during construction. Since we do have the truncation and the Metric supertype, we should actually also even check positive-definiteness. That reminds me of a construction we have in LinearMaps.jl: we could have fields issymmetric with default issymmetric(Q) (and similar for isposdef), which can be overwritten by the user to avoid the check. It remains to decide what to do with the type of the *Mahalanobis metrics, but I don't think these play an important role since they don't participate in the generic colwise and pairwise. So a comment in the code might be enough.

@devmotion
Copy link
Member Author

Not exactly: we do take the max with 0 in pairwise, so we do assume positive-definiteness (and in particular symmetry) silently.

Yeah, I changed this in #180 some time after I had opened the issue here, so initially only symmetry was assumed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants