
Add ScalarNonlinearFunction support #2059

Merged: 35 commits into master from od/nlp-expr on May 29, 2023

Conversation

@odow (Member) commented Dec 8, 2022

Overview

This PR is an implementation of the idea in #846, and is based on our work at the JuMP-level: jump-dev/JuMP.jl#3106.

Preview links

There are two key bits of documentation to read. The actual implementation is very straightforward:

One complicating issue is how to deal with user-defined functions. I've added a new attribute, UserDefinedFunction.

I've also added ListOfSupportedNonlinearOperators <: AbstractModelAttribute and UnsupportedNonlinearOperator <: UnsupportedError.
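
For context, a minimal sketch (not from the PR itself) of how these could be used, assuming the solver implements the attribute and returns a Vector{Symbol}:

```julia
import MathOptInterface as MOI
import Ipopt

model = Ipopt.Optimizer()
# Query which operators the solver understands (e.g. :+, :sin, :exp, ...).
# This assumes the solver implements the new attribute.
ops = MOI.get(model, MOI.ListOfSupportedNonlinearOperators())
# A solver that encounters an operator it does not know about is expected to
# throw the new error type, for example:
#     throw(MOI.UnsupportedNonlinearOperator(:my_f))
:sin in ops
```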

Solver support

Ipopt adds support for ScalarNonlinearFunction in jump-dev/Ipopt.jl#346.

What to do with NLPBlock

There's an open question about what we do with the existing NLPBlock stuff. The easiest answer is just to leave it, and say that solvers should support both, although they can error if both appear in the same model.

It'll be a harder decision at the JuMP-level, but we can wait until this is merged to figure that out.

Blockers

There's nothing major blocking this from my end. It's actually pretty lightweight. The harder part is JuMP and documenting the two different nonlinear approaches. But this can be added without breaking anything because it is strictly a feature addition.

Needs:

  • Tests to improve code coverage

@odow (Member, Author) commented Dec 13, 2022

Here's a current demo with the Ipopt PR:

```julia
julia> import Ipopt

julia> const MOI = Ipopt.MOI
MathOptInterface

julia> function main()
           model = Ipopt.Optimizer()
           x = MOI.add_variable(model)
           con_f = MOI.ScalarNonlinearFunction{Float64}(:^, Any[x, 2])
           MOI.add_constraint(model, con_f, MOI.LessThan(1.0))
           MOI.set(model, MOI.ObjectiveSense(), MOI.MIN_SENSE)
           my_f(x) = (x - 0.5)^2
           MOI.set(model, MOI.UserDefinedFunction(:my_f, 1), (my_f,))
           obj_f = MOI.ScalarNonlinearFunction{Float64}(:my_f, Any[x])
           MOI.set(model, MOI.ObjectiveFunction{typeof(obj_f)}(), obj_f)
           MOI.optimize!(model)
           return MOI.get(model, MOI.VariablePrimal(), x)
       end
main (generic function with 1 method)

julia> main()
This is Ipopt version 3.14.4, running with linear solver MUMPS 5.4.1.

Number of nonzeros in equality constraint Jacobian...:        0
Number of nonzeros in inequality constraint Jacobian.:        1
Number of nonzeros in Lagrangian Hessian.............:        2

Total number of variables............................:        1
                     variables with only lower bounds:        0
                variables with lower and upper bounds:        0
                     variables with only upper bounds:        0
Total number of equality constraints.................:        0
Total number of inequality constraints...............:        1
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:        1

iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
   0  2.5000000e-01 0.00e+00 1.00e+00  -1.0 0.00e+00    -  0.00e+00 0.00e+00   0
   1  6.2500000e-02 0.00e+00 4.90e-01  -1.7 2.50e-01    -  1.00e+00 1.00e+00f  1
   2  1.1441994e-04 0.00e+00 1.74e-03  -1.7 2.39e-01    -  1.00e+00 1.00e+00h  1
   3  7.8304570e-06 0.00e+00 2.89e-04  -2.5 6.50e-02    -  1.00e+00 1.00e+00h  1
   4  1.5182438e-08 0.00e+00 2.74e-05  -3.8 2.72e-03    -  1.00e+00 1.00e+00h  1
   5  1.6255746e-12 0.00e+00 5.28e-08  -5.7 1.29e-04    -  1.00e+00 1.00e+00h  1
   6  2.8086636e-18 0.00e+00 6.35e-12  -8.6 1.29e-06    -  1.00e+00 1.00e+00h  1

Number of Iterations....: 6

                                   (scaled)                 (unscaled)
Objective...............:   2.8086636097379302e-18    2.8086636097379302e-18
Dual infeasibility......:   6.3507046478545526e-12    6.3507046478545526e-12
Constraint violation....:   0.0000000000000000e+00    0.0000000000000000e+00
Variable bound violation:   0.0000000000000000e+00    0.0000000000000000e+00
Complementarity.........:   2.5091160166026672e-09    2.5091160166026672e-09
Overall NLP error.......:   2.5091160166026672e-09    2.5091160166026672e-09


Number of objective function evaluations             = 7
Number of objective gradient evaluations             = 7
Number of equality constraint evaluations            = 0
Number of inequality constraint evaluations          = 7
Number of equality constraint Jacobian evaluations   = 0
Number of inequality constraint Jacobian evaluations = 7
Number of Lagrangian Hessian evaluations             = 6
Total seconds in IPOPT                               = 0.045

EXIT: Optimal Solution Found.
0.4999999983240932
```

Things are plumbed end-to-end, including user-defined functions.

I'm still undecided on how to implement nonlinear parameters and expressions.

@dourouc05 (Contributor) left a comment

Only minor comments in the code. I'm wondering what the API would look like to rewrite such functions using bridges, for instance for CP. (Or this issue has been discussed elsewhere and I'm not aware of it.)

Review threads on src/Nonlinear/parse.jl and src/attributes.jl (outdated, resolved).
@odow (Member, Author) commented Dec 14, 2022

I'm wondering what the API would look like to rewrite such functions using bridges, for instance for CP.

I think, instead of a bridge, it'd likely be a transformation layer, like Dualization.Optimizer.

We don't really have tools for bridging based on the content of a constraint, only on the type, so you'd likely need to be able to see the entire problem in order to construct a valid transformation.

@blegat (Member) commented Dec 15, 2022

We don't really have tools for bridging based on the content of a constraint, only on the type, so you'd likely need to be able to see the entire problem in order to construct a valid transformation.

I don't think it's about content vs type. If a transformation can be done independently for each constraint then it can be done by a bridge.

For the non-breaking path, I don't think we need to be non-breaking for the solvers. Once we have the NL API set up, we can drop support for NLPBlock in Ipopt and tag Ipopt v2.0. What we need is to continue supporting solvers requiring NLPBlock from MOI until MOI v2.0.
One way we could do that is to have an MOI layer that transforms ScalarNonlinearFunction into NLPBlock.
In MOI.instantiate, if supports(::Optimizer, ::NLPBlock) is true then we add this layer.
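
As a rough sketch of the check being suggested here (the conversion layer itself does not exist, and needs_nlpblock_layer is a hypothetical helper name; only the supports queries below are existing MOI API):

```julia
import MathOptInterface as MOI
import Ipopt

# Hypothetical helper: decide whether a ScalarNonlinearFunction -> NLPBlock
# conversion layer would be needed for a given optimizer. The layer itself is
# not part of this PR; this only illustrates the `supports` check.
function needs_nlpblock_layer(optimizer)
    F, S = MOI.ScalarNonlinearFunction, MOI.LessThan{Float64}
    return MOI.supports(optimizer, MOI.NLPBlock()) &&
           !MOI.supports_constraint(optimizer, F, S)
end

needs_nlpblock_layer(Ipopt.Optimizer())
```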

I still think transforming from NL expressions to gradient/Hessian oracles (dealing with AD, UserDefinedFunction, etc.) is best done by LazyBridgeOptimizer instead of by each NL solver, as detailed in #846. Do you still agree with this long-term goal?

@odow (Member, Author) commented Dec 15, 2022

Do you still agree with this long term goal ?

No. Computing derivatives really needs the entirety of the problem in one go, whereas the bridges work on a constraint-by-constraint basis. Dealing with the derivatives is not complicated (look how small the Ipopt PR jump-dev/Ipopt.jl#346 is), because we've abstracted all of the helpers into MOI.Nonlinear.Model.
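
For reference, a small sketch of the kind of helper machinery being referred to, using the MOI.Nonlinear API (the expression and evaluation point below are made up for illustration):

```julia
import MathOptInterface as MOI

x = MOI.VariableIndex(1)
nlp = MOI.Nonlinear.Model()
MOI.Nonlinear.set_objective(nlp, :(($x - 0.5)^2))
MOI.Nonlinear.add_constraint(nlp, :($x^2), MOI.LessThan(1.0))

# One AD-backed evaluator is built over the whole model at once, which is
# where sparsity, shared subexpressions, and user-defined functions are handled.
evaluator = MOI.Nonlinear.Evaluator(nlp, MOI.Nonlinear.SparseReverseMode(), [x])
MOI.initialize(evaluator, [:Grad, :Jac, :Hess])

g = zeros(1)
MOI.eval_objective_gradient(evaluator, g, [0.3])  # objective gradient at x = 0.3
MOI.jacobian_structure(evaluator)                 # sparsity pattern of the full Jacobian
```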

@blegat (Member) commented Dec 16, 2022

Computing derivatives really needs the entirety of the problem in one go, whereas the bridges work on a constraint-by-constraint basis.

Why can't computing derivatives be done on a constraint-by-constraint basis?

@odow (Member, Author) commented Dec 16, 2022

Why can't computing derivatives be done on a constraint-by-constraint basis?

Because that's a muuuuuch larger change to our AD engines. It also makes it a lot more complicated to deal with sparsity etc. We don't really want to have to deal with 10,000 constraints, each with 10,000 instances of an AD tape. JuMP's reverse diff amortizes a lot of allocations over the constraints. It also makes it hard to have shared common subexpressions.

@mlubin (Member) commented Dec 16, 2022

Shared subexpressions are a good reason. Sparsity of the Hessian of the Lagrangian is another one, although this could probably be refactored if we really wanted to.

@blegat (Member) commented Dec 17, 2022

Could an MOI layer that does the transformation from MOI.ScalarNonlinearFunction to MOI.NLPBlock be a good first step? It would be needed for backward compatibility anyway, and we can see if it can produce oracles as MOI.AbstractScalarFunction later, and whether it can be done as a bridge even later.

@odow marked this pull request as draft on January 11, 2023
@frapac (Contributor) commented Jan 11, 2023

we can see if it can produce oracles as MOI.AbstractScalarFunction later and if it can be done as bridged even later

To me, the question is: what would be the use case for this?

Usually, the nonlinear solver requires access only to the full Jacobian and to the full Hessian, discarding the individual constraints. Seeing all the constraints as a single multivalued function also has its advantages, such as:

  • we can evaluate the full Jacobian J efficiently using forward-mode AD to compute p Jacobian-vector products Jv, with p the number of colors associated with the sparsity pattern of the Jacobian. If we look at the individual constraints, evaluating the full Jacobian would instead require m reverse-mode AD evaluations to compute the gradients of the m constraints one by one.
  • we can evaluate the Hessian of the Lagrangian efficiently using forward-over-reverse AD, with the same coloring trick to compress the number of Hessian-vector products required.

@odow (Member, Author) commented Jan 24, 2023

We should check and confirm support for complex-valued nonlinear functions (polynomials of complex variables).

@odow changed the title from "WIP: explore ScalarNonlinearFunction support" to "Add ScalarNonlinearFunction support" on Jan 30, 2023
@odow marked this pull request as ready for review on January 31, 2023
@odow (Member, Author) commented Feb 1, 2023

Here's a proof of concept of PATHSolver solving a nonlinear complementarity problem with VectorNonlinearFunction-in-Complements: chkwon/PATHSolver.jl#82

Still an open question as to what the proper definition is. We can also revert the commit and start it in a separate PR if we decide to merge ScalarNonlinearFunction first.
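
For reference, a hedged sketch of what such a constraint could look like under the vector-of-scalar-functions definition explored here (the model and data are made up; this is not the PATHSolver code):

```julia
import MathOptInterface as MOI

# A cached model wrapped in UniversalFallback so the sketch does not depend on
# any particular solver's support for this constraint type.
model = MOI.Utilities.UniversalFallback(MOI.Utilities.Model{Float64}())
x, _ = MOI.add_constrained_variable(model, MOI.GreaterThan(0.0))

# F(x) = (x + 2) * (x - 2), complementary to x >= 0.
F = MOI.ScalarNonlinearFunction(
    :*,
    Any[
        MOI.ScalarNonlinearFunction(:+, Any[x, 2.0]),
        MOI.ScalarNonlinearFunction(:-, Any[x, 2.0]),
    ],
)
f = MOI.VectorNonlinearFunction(Any[F, MOI.ScalarNonlinearFunction(:+, Any[x])])
MOI.add_constraint(model, f, MOI.Complements(2))
```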

Review thread on src/functions.jl (outdated, resolved).
@chriscoey (Contributor) commented

Still an open question as to what the proper definition is.

For VNF here you mean? This seems like the obvious approach to me - what alternatives are there?

@odow (Member, Author) commented Feb 1, 2023

what alternatives are there

Some sort of vector-valued expression tree? Not sure. You could imagine A * x being VectorNonlinearExpr(:*, Any[A, x]).

@odow changed the title from "Add ScalarNonlinearFunction support" to "Add {Vector,Scalar}NonlinearFunction support" on Feb 2, 2023
@odow (Member, Author) commented Feb 15, 2023

#2094 is a resolution to the "how do we deal with parameters" question. Ipopt would declare support for MOI.Parameter, and then it could add parameters to the MOI.Nonlinear.Model object and replace any parameter variables in the expression with the Nonlinear.ParameterIndex. It doesn't resolve the nonlinear subexpression question, but perhaps that's for the solver or MOI.Nonlinear.Model to deal with.
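
A hedged sketch of the substitution described above, using the existing MOI.Nonlinear parameter API (the expression and values are illustrative):

```julia
import MathOptInterface as MOI

nlp = MOI.Nonlinear.Model()
x = MOI.VariableIndex(1)
# Inside the solver, a variable constrained to MOI.Parameter would be replaced
# by a Nonlinear.ParameterIndex like this one:
p = MOI.Nonlinear.add_parameter(nlp, 2.0)
MOI.Nonlinear.add_constraint(nlp, :($x^2 - $p), MOI.LessThan(0.0))

# Updating the parameter value later does not require rebuilding the expression graph.
nlp[p] = 3.0
```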

@odow (Member, Author) commented Feb 19, 2023

Parameter is added, so I'll have a go at updating this PR to use it.

@mlubin (Member) left a comment

LGTM to merge.

In terms of breaking this in the future, I think we should do our best to stick to semver. So let's test this out and see what needs to change. Then MOI 2.0 can include any breaking changes to this API, plus deleting the old NLPBlockData interface?

@odow (Member, Author) commented May 28, 2023

So let's test this out and see what needs to change. Then MOI 2.0 can include any breaking changes to this API, plus deleting the old NLPBlockData interface?

Sounds good. The actual interface is pretty minimal, so I think there are probably a bunch of missing methods and things that we'll need to add, rather than breaking changes.

@odow (Member, Author) commented May 29, 2023

I'll merge now, but hold off from tagging for a couple of days to see if anything else pops up once I update the other packages to MathOptInterface#master.

@odow merged commit a189a89 into master on May 29, 2023
@odow deleted the od/nlp-expr branch on May 29, 2023
@odow mentioned this pull request on Jun 9, 2023
Labels: Project: next-gen nonlinear support (Issues relating to nonlinear support)