
[RFC] Add Hessians for ScaledInterpolation and tests #269

Merged (9 commits) on Nov 24, 2018

Conversation

dkarrasch
Contributor

@dkarrasch dkarrasch commented Nov 16, 2018

This adds hessian functionality to ScaledInterpolations. The inference issue seems to be due to symmatrix, as far as I can tell, but inference hasn't been tested on hessian for BSplineInterpolation either, so that may not be too bad.

The current implementation avoids unnecessary (by symmetry) rescaling operations and allocation of arrays (as would have been necessary in any recursive implementation I could come up with).

EDIT: (sort of) fixes #268.

@codecov-io

codecov-io commented Nov 16, 2018

Codecov Report

Merging #269 into master will decrease coverage by 0.29%.
The diff coverage is 26.66%.


@@            Coverage Diff            @@
##           master     #269     +/-   ##
=========================================
- Coverage   47.84%   47.54%   -0.3%     
=========================================
  Files          21       21             
  Lines        1045     1060     +15     
=========================================
+ Hits          500      504      +4     
- Misses        545      556     +11
Impacted Files Coverage Δ
src/extrapolation/extrapolation.jl 42.85% <0%> (-2.91%) ⬇️
src/scaling/scaling.jl 45.04% <36.36%> (-0.96%) ⬇️

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3909cfb...877e59f. Read the comment docs.

* fixed rescale_hessian
* added tests
Member

@timholy timholy left a comment

Looks very nice overall. I think you can basically use broadcasting to do the rescaling, and I think that will fix the inference. (It's hopefully much easier, too.)

I hadn't realized we were failing to test inference on hessian, but it is inferrable. While you're at it, could you perhaps add an @inferred or two to the relevant code? If you notice places where it is not inferrable, you can use @test_broken, unless you happen to care enough about making it inferrable (i.e., it has a huge performance penalty for real-world code that you personally run) to fix it.

@boundscheck (checkbounds(Bool, sitp, xs...) || Base.throw_boundserror(sitp, xs))
xl = maybe_clamp(sitp.itp, coordslookup(itpflag(sitp.itp), sitp.ranges, xs))
h = hessian(sitp.itp, xl...)
return symmatrix(rescale_hessian_components(itpflag(sitp.itp), sitp.ranges, Tuple(h), size(h, 1)))
Member

@timholy timholy Nov 16, 2018

Do you really need to use symmatrix again here? The calculation of the interpolation is expensive (4^N where N is the dimensionality), but the rescaling is only N^2. So you probably don't need to worry much about efficiency here. (The value of symmatrix is that you don't have to do redundant computations, but boy was it tricky to write.)

flags = getrest(flags)
ranges = Base.tail(ranges)
else
s1 = rescale_gradient_components(flags, ranges, Tuple(h[(idx-1)*n+idx:idx*n]))
Member

This line is presumably a major contributor to non-inferrability.

end

function rescale_hessian_components(flags, ranges, h, n)
hs = ()
Member

The "build-up-the-output" pattern is not usually inferrable, whereas the "process the head and then move on to the tail" pattern is. Reason: inference has to be able to prove that it will converge, and if arguments keep getting bigger then inference starts to worry that it is a non-terminating problem and will bail by returning Any. (Better to finish and yield a non-inferrable result than to never terminate.)
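A hedged sketch of the contrast (helper names hypothetical, not from this PR):

```julia
# Illustration only: two ways to collect step sizes from a tuple of ranges.

# "Build up the output": the accumulator `acc` grows with each recursive
# call, so inference may give up and return Any for heterogeneous tuples.
steps_buildup(ranges, acc=()) =
    isempty(ranges) ? acc : steps_buildup(Base.tail(ranges), (acc..., step(ranges[1])))

# "Process the head, then move on to the tail": the argument shrinks with
# each call, so inference can prove termination and infer a concrete type.
steps_headtail(ranges::Tuple{}) = ()
steps_headtail(ranges::Tuple) = (step(ranges[1]), steps_headtail(Base.tail(ranges))...)

steps_headtail((0.0:0.5:2.0, 1:3))  # (0.5, 1)
```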

@dkarrasch
Contributor Author

dkarrasch commented Nov 16, 2018

Thanks, @timholy, for the review and the comments! If I understood correctly, it would be okay to go through the flags to see which ones are NoInterp, collect the range steps of all the other axes into steps, and then do

h ./ steps ./ steps'

? After seeing how carefully you avoid any vector allocation in the gradient rescaling, I thought this might be crucial, and so did not go the "easy" way. The remaining question is how to make absolutely sure that the result is numerically symmetric. This is why I tried to operate only on the lower triangle and built the final result by means of symmatrix. I'll take another look after the weekend.

Edit: I think collecting the steps of the ranges can be done in a "process the head and then move on to the tail" pattern. Once one has that, it's really just the line above.
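A minimal sketch of that broadcasting rescale (illustrative values, not the PR's code):

```julia
# Illustration only: rescaling a Hessian by the axis step sizes via
# broadcasting. Entry h[i, j] ends up divided by steps[i] * steps[j].
h = [4.0 2.0; 2.0 1.0]   # Hessian in index units
steps = [2.0, 1.0]       # step sizes of the interpolated axes
h ./ steps ./ steps'     # == [1.0 1.0; 1.0 1.0]
```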

@timholy
Member

timholy commented Nov 16, 2018

Yes, I think that's right. And yes, you do have to pick out the NoInterp ones, but fortunately that's removing them from steps rather than h which is hopefully pretty straightforward.

With regards to symmetry, ooh, I see your point, there could be a roundoff issue. Would h ./ (steps .* steps') be safe?
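For what it's worth, a quick check of that idea with illustrative values: since IEEE multiplication is commutative, steps[i] * steps[j] and steps[j] * steps[i] are the same divisor bit for bit, so dividing a symmetric h by the precomputed outer product keeps the result exactly symmetric.

```julia
# Illustration only: dividing by the precomputed outer product uses the
# identical divisor for the (i, j) and (j, i) entries, preserving symmetry.
steps = [0.1, 0.3]
h = [1.0 0.7; 0.7 2.0]      # exactly symmetric input
a = h ./ (steps .* steps')
a == a'                      # true: bit-for-bit symmetric
```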

@dkarrasch
Contributor Author

I'm not 100% sure about the symmetry, but tests pass now, including inference! 🎉 I haven't gotten to checking for missing inference tests yet.

works if the point is inbounds, throws an informative error otherwise
@dkarrasch
Contributor Author

The current status is as follows:

  • Inference for hessian works whenever there is no NoInterp axis, but fails otherwise; comments are very welcome.
  • I added a hessian(::Extrapolation, ...) method which works as expected when the point in question is in bounds, and throws an informative error otherwise; this fixes my original issue, since I had "accidentally" created an Extrapolation object via the convenience function CubicSplineInterpolation, but did not intend to evaluate out of bounds.
  • Somehow hessian gives poor results at the boundaries of the interval in the test; that's why the test is restricted to xs[6:end-5] or so. This cannot be due to the rescaling and should be investigated further, perhaps in a different issue?

Member

@timholy timholy left a comment

Very nice work!

With regards to the accuracy at the boundary, presumably this is not a problem, but it could be worth checking. The logic here is that as you get closer to the boundary, your choice of boundary condition grows ever more important, and there is no reason you should recapitulate an arbitrary analytic function if that function doesn't satisfy the same behavior at the boundary. If you really want to be sure that this is behaving properly, options are:

  • choose a function that has the same boundary conditions as the interpolation
  • compare against ForwardDiff.hessian(itp, x).

But you're not required to do either; I'd merge this as-is. Perhaps we should fix the StaticArrays issue first, though.
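The second option might look roughly like this (a sketch, assuming the interpolant can be evaluated at ForwardDiff dual numbers; values and tolerances illustrative):

```julia
using Interpolations, ForwardDiff

# Sketch: cross-check the interpolant's hessian against automatic
# differentiation. ForwardDiff.hessian takes a vector argument, so the
# scalar interpolant is wrapped in a closure.
xs = range(-pi, pi; length=101)
sitp = scale(interpolate(sin.(xs), BSpline(Cubic(Periodic(OnGrid())))), xs)

x = 0.3
h_itp = Interpolations.hessian(sitp, x)[1, 1]
h_fd  = ForwardDiff.hessian(v -> sitp(v[1]), [x])[1, 1]
h_itp ≈ h_fd
```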


function rescale_hessian_components(flags, ranges, h)
steps = SVector(get_steps(flags, ranges))
return h ./ (steps .* steps')
Member

It turns out this is the source of non-inferrability, and it's really the fault of StaticArrays. You can replicate the problem with

using StaticArrays, Test
h = SMatrix{1,1}([1.0])
steps = SVector(1.0)
testinf(h, steps) = h ./ (steps .* steps')

julia> testinf(h, steps)
1×1 SArray{Tuple{1,1},Float64,2,1}:
 1.0

julia> @inferred testinf(h, steps)
ERROR: return type SArray{Tuple{1,1},Float64,2,1} does not match inferred return type Any
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] top-level scope at none:0

I can look into fixing this.

(This also implies that NoInterp is a red herring.)

Member

if y in (1,2)
@test_broken h = @inferred(Interpolations.hessian(sitp, x, y))
h = Interpolations.hessian(sitp, x, y)
@test ≈(h[1], -f(x,y), atol=0.05)
Member

This is fine as-is, but just FYI you can also write @test h[1] ≈ -f(x,y) atol=0.05 if you like that better. (This file probably uses the "call" form for historical reasons, from before that syntax was possible, and was updated by search/replace.)

Contributor Author

Yes, I'm familiar with this syntax, but didn't want to change style halfway. I have changed it now to this "modern" form for better readability in the test files related to this PR.

@@ -67,6 +67,25 @@ end
end
end

@inline function hessian(etp::AbstractExtrapolation{T,N}, x::Vararg{Number,N}) where {T,N}
Member

I like this, thanks for adding it.

@dkarrasch
Contributor Author

Many thanks, @timholy. Regarding the boundary accuracy, what confused me is the fact that the true functions do match the boundary conditions of the interpolant:

xs = -pi:2pi/10:pi
f1(x) = sin(x)
f2(x) = cos(x)
f3(x) = sin(x) .* cos(x)
f(x,y) = y == 1 ? f1(x) : (y == 2 ? f2(x) : (y == 3 ? f3(x) : error("invalid value for y (must be 1, 2 or 3, you used $y)")))
ys = 1:3
A = hcat(map(f1, xs), map(f2, xs), map(f3, xs))
itp = interpolate(A, (BSpline(Cubic(Periodic(OnGrid()))), NoInterp()))
sitp = scale(itp, xs, ys)

I realized that this is a boundary-only issue after plotting the interpolation Hessian against the analytical one. In the interior they match really well, but close to the periodic boundary there is a ridiculous zigzag pattern, way off from what it should be.

@dkarrasch
Contributor Author

I'll push updated (inference) tests when the StaticArrays fix is tagged.

@timholy
Member

timholy commented Nov 19, 2018

Just FYI this is proving more difficult than expected (see JuliaLang/METADATA.jl#19433 and JuliaLang/METADATA.jl#19596). I think I know what I need to fix in StaticArrays to break the logjam.

@timholy
Member

timholy commented Nov 22, 2018

The new StaticArrays is finally tagged. Thanks for your patience.

@dkarrasch
Contributor Author

Inference in the NoInterp case still fails... :-( I had a look at code_warntype, but couldn't make much sense out of it. It seems as if the type instability occurs in the hessian(::BSpline) call, but that can't be, because tests in the normal hessian test file confirm that the type is correctly inferred there.

@dkarrasch
Contributor Author

Hallelujah, tests pass!!! Thanks for your patience and support, @timholy.

@timholy timholy merged commit d8fbd11 into JuliaMath:master Nov 24, 2018
@timholy
Member

timholy commented Nov 24, 2018

Thanks for a first-rate contribution, @dkarrasch!

Successfully merging this pull request may close these issues.

Hessian of CubicSplineInterpolation broken?