[WIP] Evaluation distance to set #1023
Codecov Report
@@ Coverage Diff @@
## master #1023 +/- ##
========================================
- Coverage 95.19% 95% -0.2%
========================================
Files 100 100
Lines 11279 11544 +265
========================================
+ Hits 10737 10967 +230
- Misses 542 577 +35
Continue to review full report at Codecov.
|
IMO for this to be generally useful it needs to support tolerances in some form. |
Agreed, I wasn't sure yet how to propagate them, |
I would prefer a function:
    function distance(x::T, set::LessThan{T}) where {T}
        return max(zero(T), x - set.upper)
    end
    function distance(x::T, set::Integer) where {T}
        return abs(x - round(Int, x))
    end
    function distance(x::Vector{T}, set::SOS1{T}) where {T}
        _, index = findmax(abs.(x))
        return sqrt(sum(x[i]^2 for i = 1:length(x) if i != index))
    end
Feasibility checkers could then use these distances and some notion of tolerances. |
a notion of distance makes sense for convex sets, integer sets get more ambiguous |
Sure. But if we have a notion of belonging to a set with a tolerance, then we need some corresponding distance to check against. For some sets, e.g., SOSII, it isn't immediately obvious what a good distance is. L2 norm of the elements excluding the two largest adjacent absolute values? We might even consider returning a vector to feasibility, rather than the norm. So for a PSD constraint, this is the vector of eigenvalues. That would let the feasibility checker choose an appropriate norm as well. In the SOSII case, it isn't apparent if you want to use the L2 norm. |
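A minimal sketch of the "return a vector and let the checker pick the norm" idea for the PSD case. The names `psd_violations` and `is_feasible` are hypothetical and not part of MOI, the input is assumed to be a symmetric matrix, and the default tolerance is an arbitrary placeholder.

```julia
import LinearAlgebra

# Hypothetical helper: one violation per eigenvalue of a symmetric matrix
# with respect to the PSD cone (zero for every nonnegative eigenvalue).
function psd_violations(A::AbstractMatrix{<:Real})
    eigenvalues = LinearAlgebra.eigvals(LinearAlgebra.Symmetric(A))
    return max.(-eigenvalues, zero(eltype(eigenvalues)))
end

# Hypothetical checker: the caller chooses the norm order `p` and the tolerance.
function is_feasible(violations::AbstractVector; p::Real = 2, atol::Real = 1e-8)
    return LinearAlgebra.norm(violations, p) <= atol
end

A = [1.0 0.0; 0.0 -0.5]
v = psd_violations(A)          # one eigenvalue is negative
is_feasible(v)                 # infeasible under the default tolerance
is_feasible(v; p = Inf, atol = 1.0)  # feasible under a loose max-norm tolerance
```

Keeping the vector around means the same violations can be checked under the L1, L2, or max norm without recomputing the eigenvalues.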
That can be the distance from Then, as odow seems to suggest, which norm to use could be up to the user. |
One (probably obvious) point, maybe useful for the docs: for a constraint Thus, here we are really addressing the question of "how close is this constraint to being satisfied". (which is 100% fine by me) |
just to be sure: you mean we deal with distance in the image space, and not in the input space right? |
One more corner case, in the case of dimension mismatch, I simply returned false for set belonging. When using a distance, this would be +Inf I guess |
+1 |
Why not throw a |
Not a fan of proliferating thrown errors all over the place, but yeah seems appropriate with this change |
I've started a re-write up to |
note here that |
I would prefer that we define a distance_from_set(x, set) function and include them along-side the set definitions in set.jl. Then it's obvious which sets have distances and which don't.
If x is inside the set, the distance should be 0. We shouldn't measure distance from infeasibility.
@odow the _distance function remains internal and computes the unsigned distance to avoid checking if >= 0 in every single method. |
I understand the current split. I'm suggesting a public method for each set, with no re-direction to the zero-checker. Maybe a few more lines of code, but easier to understand for each set in isolation. |
Ok yes I see your point, will do that. I still think we should keep the non-checked version, which would be comparable to a |
The zero check is essentially free. No one should be using this in any performance-critical code, and even if they are, |
the bother with this is mostly that it means maintaining boilerplate across all sets instead of in just one place. That's why one function could do the dimension check (possibly throwing an error), call the set-specific method, and return the cleaned-up version; otherwise these three steps have to be copy-pasted all over the place |
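To make the "one place for the boilerplate" argument concrete, here is a rough sketch of the three steps in a single entry point. The set types below are hypothetical stand-ins defined locally so the snippet runs on its own; they are not the MOI definitions, and the violation measure used for `Nonnegatives` is only illustrative.

```julia
# Hypothetical stand-ins for MOI sets, defined here so the sketch is self-contained.
abstract type AbstractSet end
struct LessThan{T} <: AbstractSet
    upper::T
end
struct Nonnegatives <: AbstractSet
    dimension::Int
end

dimension(::LessThan) = 1
dimension(s::Nonnegatives) = s.dimension

# Set-specific, unchecked methods: they may return negative values for interior points.
_distance(x::Real, s::LessThan) = x - s.upper
_distance(x::AbstractVector, ::Nonnegatives) = -minimum(x)

# Single public entry point: dimension check, set-specific call, clean-up.
function distance_to_set(x, s::AbstractSet)
    if length(x) != dimension(s)
        throw(DimensionMismatch("expected dimension $(dimension(s)), got $(length(x))"))
    end
    d = _distance(x, s)
    return max(d, zero(d))  # points inside the set are at distance zero
end

distance_to_set(3.0, LessThan(2.0))
distance_to_set([1.0, -2.0], Nonnegatives(2))
```

The alternative discussed in the thread, one public method per set with no redirection, trades this shared wrapper for methods that can each be read in isolation.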
weird sets can define their distance in whatever "best" way for them (indicator constraints are an example) |
Open questions for distances to set: |
Yes, presumably we have a fallback like:
    function distance_to_set(::AbstractDistanceFunction, x, s)
        return distance_to_set(DefaultDistance(), x, s)
    end
Note that |
Another issue comes from complementarity constraints:
|
The main reason why complementary bounds are defined separately is to avoid issues like the following, where we declare bounds twice.
    model = Model()
    @variable(model, 1 <= x <= 2)
    @constraint(model, 2x - 1 ⟂ { x >= 0})
It seems reasonable that we wouldn't implement the |
Yes agreed, even though it makes Complements a bit of a corner case for many things outside of distance, because of this non-self-contained constraint |
I think we still need to nail down the theoretical concept of the distance we are measuring, particularly for some of the conic sets with strict inequalities.
For some sets you can get closed-form expressions for the minimum L2 distance (e.g., #1023 (comment)), and I think we should use these where possible.
src/sets.jl
Outdated
@@ -1,6 +1,8 @@
# Sets

# Note: When adding a new set, also add it to Utilities.Model.
import LinearAlgebra
using LinearAlgebra: dot
Why import dot but not norm2? Just prefix everything with LinearAlgebra.
src/sets.jl
Outdated
distance_to_set(v, s)

Compute the distance of a value to a set.
For some vector-valued sets, can return a vector of distances.
This is still an issue.
Distance function used to evaluate the distance from a point to a set.
New subtypes of `AbstractDistance` must implement fallbacks for sets they don't cover and implement
`distance_to_set(::Distance, v, s::S)` for sets they override the distance for.
I thought we were providing the AbstractDistance fallback?
This is still an issue.
Mistake on my part, this should be removed.
I thought we were providing the AbstractDistance fallback?
we can provide one d(::AbstractDistance, v, s) = d(::DefaultDistance, v, s) but not be more specific because of method ambiguity
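For reference, a small sketch of how such a generic fallback could sit next to set-specific overrides. The set structs and the `MyDistance` subtype are hypothetical, defined locally for illustration, and the doubled violation in the override is a placeholder rather than a meaningful distance.

```julia
abstract type AbstractDistance end
struct DefaultDistance <: AbstractDistance end
struct MyDistance <: AbstractDistance end  # hypothetical user-defined distance

# Hypothetical stand-ins for MOI sets.
struct GreaterThan{T}
    lower::T
end
struct LessThan{T}
    upper::T
end

# Default definitions for the sets that are covered.
distance_to_set(::DefaultDistance, v::Real, s::GreaterThan) = max(s.lower - v, zero(v))
distance_to_set(::DefaultDistance, v::Real, s::LessThan) = max(v - s.upper, zero(v))

# Explicit error for uncovered sets, so the fallback below cannot recurse forever.
distance_to_set(::DefaultDistance, v, s) = error("no default distance for $(typeof(s))")

# Generic fallback: `v` and `s` stay untyped so that any more specific method wins.
distance_to_set(::AbstractDistance, v, s) = distance_to_set(DefaultDistance(), v, s)

# A subtype overrides only the sets it cares about (placeholder formula).
distance_to_set(::MyDistance, v::Real, s::LessThan) = 2 * max(v - s.upper, zero(v))

distance_to_set(MyDistance(), 1.0, GreaterThan(3.0))  # goes through the fallback
distance_to_set(MyDistance(), 5.0, LessThan(3.0))     # uses the override
```

If the fallback instead constrained the set argument, for example `(::AbstractDistance, v, s::LessThan)`, a user method `(::MyDistance, v, s)` would be ambiguous with it, which is presumably the ambiguity mentioned above.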
src/sets.jl
Outdated
t = v[1]
xs = v[2:end]
result = LinearAlgebra.norm2(xs) - t
return max(result, zero(result)) # avoids sqrt
What is the comment # avoids sqrt for?
I have a general unresolved concern over our choice of distance. It seems like this is the "epigraph violation" distance, which is not the same as the Euclidean distance to the set. What is the precedent for how other solvers measure constraint feasibility? (@chriscoey / @lkapelevich how does Hypatia?)
I personally use the epigraph violation d = x1 - norm(x[2:end], 2), mainly because
- it's easy to compute
- it tells by how much the conic constraint should be "relaxed" for the point to be feasible (where "relaxed" means that you shift the cone's apex)
- if the SOC constraint is written as x2^2 + ... xn^2 <= x1^2, the epigraph violation is the square root of this quadratic constraint's absolute violation.
Looks like COSMO uses euclidean distance for computing projections.
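To illustrate how the two notions differ for the second-order cone, here is a small sketch comparing the clamped epigraph violation with the Euclidean distance obtained from the standard closed-form projection. The function names are hypothetical, and the violation is written as norm(x) - t, clamped at zero, so that it is nonnegative for infeasible points.

```julia
import LinearAlgebra

# Epigraph violation of (t, x) for the cone norm(x) <= t, clamped at zero.
function soc_epigraph_violation(v::AbstractVector{<:Real})
    t, x = v[1], v[2:end]
    return max(LinearAlgebra.norm(x) - t, zero(t))
end

# Euclidean distance from (t, x) to the cone, using the closed-form projection.
function soc_euclidean_distance(v::AbstractVector{<:Real})
    t, x = v[1], v[2:end]
    nx = LinearAlgebra.norm(x)
    nx <= t && return zero(nx)                 # already inside the cone
    nx <= -t && return LinearAlgebra.norm(v)   # projection is the origin
    return (nx - t) / sqrt(2)                  # projection lies on the boundary
end

v = [1.0, 3.0, 4.0]
soc_epigraph_violation(v)   # norm(x) - t
soc_euclidean_distance(v)   # smaller by a factor of sqrt(2) in this case
```

In the boundary case the two measures differ only by the constant factor sqrt(2), so a tolerance on one translates directly to the other; they diverge when t is very negative.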
What about sets like the exponential cone needing domain constraints like y > 0?
What is the comment # avoids sqrt for?
legacy comment, just removed. For domain constraints it can be an open question, because some parts of the distance may be non-computable / infinite or complex if the domain is not respected
Okay, so for conic sets "Epigraph violation or Inf if point fails feasibility bounds" seems like an easily computable and sensible definition.
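As a sketch of that definition for the exponential cone, assuming the convention y * exp(x / y) <= z with y > 0. The function name `exp_cone_violation` is hypothetical, and returning Inf on a domain failure is just the proposal above written out, not settled behavior.

```julia
# Epigraph violation of (x, y, z) for the exponential cone
# y * exp(x / y) <= z with y > 0; Inf when the domain constraint fails.
function exp_cone_violation(v::AbstractVector{<:Real})
    length(v) == 3 || throw(DimensionMismatch("expected 3 entries, got $(length(v))"))
    x, y, z = v
    y > 0 || return Inf  # outside the domain: the epigraph expression is not meaningful
    return max(y * exp(x / y) - z, 0.0)
end

exp_cone_violation([0.0, 1.0, 2.0])   # feasible point
exp_cone_violation([0.0, 1.0, 0.5])   # violated epigraph constraint
exp_cone_violation([1.0, -1.0, 1.0])  # domain failure
```

One consequence of this choice is that the measure is discontinuous at y = 0, so a tolerance-based checker cannot accept points that are only slightly outside the domain.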
And you call it EpigraphDistance or something like...
For norm two distances you can use what proximal solvers do:
https://web.stanford.edu/~boyd/papers/pdf/prox_algs.pdf page 183 (section 6.3)
So, SOC is easy, PSD you just need eigen decomposition, Power and Exponential are well posed even with the extra constraints, but need a Newton...
src/sets.jl
Outdated
u = v[2]
xs = v[3:end]
return LinearAlgebra.norm2(
(max(-t, zero(t)), max(-u, zero(u)), max(dot(xs,xs) - 2 * t * u))
I think we need to think this through a little. What space is this a distance in?
The core issue here is that the cone is characterized by three constraints.
With vector-distance, this would have been:
max(-u, 0)
max(-v, 0)
max(-2tu + norm_squared(x) , 0)
since we don't have that, the l2-norm of these three epigraph violations is the thing computed here.
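Written out as a self-contained sketch, with the rotated second-order cone taken as 2tu >= norm(x)^2, t >= 0, u >= 0 and the entries ordered (t, u, x). The function names are hypothetical; this only restates the computation described above and does not claim it is the right distance.

```julia
import LinearAlgebra

# Vector of the three constraint violations for (t, u, x).
function rsoc_violations(v::AbstractVector{T}) where {T<:Real}
    t, u, x = v[1], v[2], v[3:end]
    return [
        max(-t, zero(T)),                                   # t >= 0
        max(-u, zero(T)),                                   # u >= 0
        max(LinearAlgebra.dot(x, x) - 2 * t * u, zero(T)),  # 2tu >= norm(x)^2
    ]
end

# Scalar version: the l2-norm of the three violations.
rsoc_distance(v::AbstractVector{<:Real}) = LinearAlgebra.norm(rsoc_violations(v))

rsoc_violations([1.0, 1.0, 2.0])  # only the quadratic constraint is violated
rsoc_distance([1.0, 2.0, 2.0])    # feasible point
```

Note that the third entry is a violation of a quadratic expression while the first two are linear in the inputs, so the components are not on the same scale, which is one way to read the "what space is this a distance in?" question.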
return d
end
end
end
I don't think this function is correct. Something like:
    function distance_to_set(::DefaultDistance, v::AbstractVector{T}, s::SOS1{T}) where {T <: Real}
        _check_dimension(v, s)
        # Keep the entry with the largest magnitude; all others must be zero.
        _, i = findmax(abs.(v))
        return LinearAlgebra.norm2([v[j] for j = 1:length(v) if j != i])
    end
My comment here isn't actually very helpful but: Hypatia doesn’t use a notion of distance to set anywhere. It’s tough because there are many notions of distance and the right one depends on context. Also not all conic sets are typically defined by epigraph/hypographs, e.g. the PSD cone or the set of nonnegative polynomials or the copositive cone. |
Also for many sets the Euclidean distance can be hard to compute (eg for exponential cone in general, though some "cases" are just violation on variable nonnegativity) |
I will put it here, so it does not get lost in the inner conversation: For norm two distances you can use what proximal solvers do: So, SOC is easy, PSD you just need eigen decomposition, Power and Exponential are well posed even with the extra constraints, but need a Newton... |
Many of these projections are implemented here: https://github.com/kul-forbes/ProximalOperators.jl/tree/master/src/functions |
Considering this PR is approaching the longest PR in the history of MOI (which @matbesancon also holds first and second place in #709 and #877), maybe this is a sign it should be in a package first. Then @matbesancon can experiment with the different set definitions in peace, not worry about pulling in heavy dependencies for the proximal operators or the SVD, and see what works. Edit: I will note, beginning the PR with "Fairly straightforward," was an understatement 😆 |
do you mean the distances themselves or the advanced features? Dear 2-month-ago self, what have you done? |
I mean the distances, the feasibility checker, everything. Call it This approach worked for |
I should have a skeleton more or less there for Friday's meeting |
So, with the heart broken, we can close this PR which is replaced by https://github.com/matbesancon/MathOptSetDistances.jl |
Fairly straightforward, just like eval_variables evaluates the value of a function with a given value mapping of the variables, this PR adds the evaluation of values belonging to a set. I reused the Base.in function, which seemed appropriate here since it's not used elsewhere with this signature. One thing to work out, as pointed out by @mtanneau, is the propagation of absolute and relative tolerance