
User supplied sparse Hessian is converted to dense matrix. #1106

Open
rsenne opened this issue Oct 27, 2024 · 2 comments


rsenne commented Oct 27, 2024

I'm currently trying to solve an optimization problem in which I know the Hessian to be sparse, so I have coded my analytical Hessian to return a sparse matrix for use in Newton's method. Despite returning a sparse Hessian, however, it ends up being represented as a dense matrix. I noticed this because the optimization routine spends a lot of time performing a Cholesky decomposition in update_state!:

function update_state!(d, state::NewtonState, method::Newton)
    # Search direction is always the negative gradient divided by
    # a matrix encoding the absolute values of the curvatures
    # represented by H. It deviates from the usual "add a scaled
    # identity matrix" version of the modified Newton method. More
    # information can be found in the discussion at issue #153.
    T = eltype(state.x)

    if typeof(NLSolversBase.hessian(d)) <: AbstractSparseMatrix
        state.s .= .-(NLSolversBase.hessian(d)\convert(Vector{T}, gradient(d)))
    else
        state.F = cholesky!(Positive, NLSolversBase.hessian(d))
        if typeof(gradient(d)) <: Array
            # is this actually StridedArray?
            ldiv!(state.s, state.F, -gradient(d))
        else
            # not Array, we can't do inplace ldiv
            gv = Vector{T}(undef, length(gradient(d)))
            copyto!(gv, -gradient(d))
            copyto!(state.s, state.F\gv)
        end
    end
    # Determine the distance of movement along the search line
    lssuccess = perform_linesearch!(state, method, d)

    # Update current position # x = x + alpha * s
    @. state.x = state.x + state.alpha * state.s
    lssuccess == false # break on linesearch error
end

A minimal reproducible example (MRE) is as follows:

using Optim, SparseArrays, LinearAlgebra

# Example problem: minimize f(x) = (x₁ - 1)² + 100(x₂ - x₁²)²
function rosenbrock_f(x)
    return (x[1] - 1)^2 + 100 * (x[2] - x[1]^2)^2
end

function rosenbrock_g!(g, x)
    g[1] = 2 * (x[1] - 1) - 400 * x[1] * (x[2] - x[1]^2)
    g[2] = 200 * (x[2] - x[1]^2)
    return g
end

function rosenbrock_h!(h, x)
    # Return sparse Hessian directly
    return h .= sparse([1, 1, 2, 2],  # rows
                 [1, 2, 1, 2],   # cols
                 [2 - 400 * x[2] + 1200 * x[1]^2,  # values
                  -400 * x[1],
                  -400 * x[1],
                  200], 
                 2, 2)  # matrix dimensions
end

# Initial point
x0 = [-1.0, 1.0]

# Set up the optimization problem
od = TwiceDifferentiable(rosenbrock_f, rosenbrock_g!, rosenbrock_h!, x0)

# Run the optimization
result = optimize(od, x0, Newton())

This is obviously a silly example, since the Hessian here is not actually sparse, but it still reproduces the behavior I'm seeing: the Hessian is represented as a dense matrix, so instead of a direct sparse linear solve, a Cholesky decomposition is performed.
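A quick way to confirm this (assuming NLSolversBase is loaded alongside Optim) is to inspect the type of the Hessian cache the objective stores after construction:

using NLSolversBase   # provides hessian(); available as a dependency of Optim

od = TwiceDifferentiable(rosenbrock_f, rosenbrock_g!, rosenbrock_h!, x0)
# The cache allocated by this default constructor is dense, regardless of
# what rosenbrock_h! returns:
typeof(NLSolversBase.hessian(od))   # Matrix{Float64}, not a SparseMatrixCSC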

pkofod (Member) commented Oct 29, 2024

Thank you. This would be because

od = TwiceDifferentiable(rosenbrock_f, rosenbrock_g!, rosenbrock_h!, x0)

does not accurately reflect your Hessian format. Does it help if you read the test here? https://github.com/JuliaNLSolvers/NLSolversBase.jl/blob/master/test/sparse.jl We should write this down somewhere.
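Roughly, what that test does is construct the objective with an explicitly sparse Hessian cache. A minimal sketch along those lines is below; the exact positional argument order (scalar seed, gradient cache, Hessian cache) is my reading of the NLSolversBase constructor, not documented API:

using Optim, SparseArrays

# Sketch: supply a sparse Hessian cache explicitly so the objective stores
# (and Newton() sees) a SparseMatrixCSC instead of a dense Matrix.
x0 = [-1.0, 1.0]
od = TwiceDifferentiable(rosenbrock_f, rosenbrock_g!, rosenbrock_h!,
                         x0, zero(eltype(x0)), similar(x0), spzeros(2, 2))

result = optimize(od, x0, Newton())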

rsenne (Author) commented Oct 29, 2024

Yes, reading that test was very helpful; I ended up reading through the NLSolversBase code and came to a similar conclusion. I definitely agree that having this written down somewhere would be good, and I'd be happy to add it to the documentation.

The only other interesting aspect of this is that, once I figured out how to format the problem properly, convergence behaved in a bizarre way: the optimization would return as successful, but the reported solution would be identical to the initial point. The problem I was working with was strongly convex, so this was pretty strange. I got around it by setting stricter convergence criteria using Optim.Options(), which fixed it (a rough sketch is below). For some additional context you might enjoy this Discourse thread.
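Roughly, the workaround looked like this (the tolerance values are placeholders, not the ones from my actual problem):

# Tighter convergence criteria via Optim.Options; the values below are
# placeholders to illustrate the idea.
opts = Optim.Options(x_tol = 1e-10, f_tol = 1e-10, g_tol = 1e-10)
result = optimize(od, x0, Newton(), opts)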
