
Fixes initial value for reduce in logsumexp #82

Merged · 4 commits · Dec 16, 2019

Conversation

@itsdfish (Contributor) commented Dec 9, 2019

This PR fixes the initial value for reduce in logsumexp. The error was discovered in an issue in Turing (see the final comment there).

src/basicfuns.jl (outdated)
@@ -191,7 +191,7 @@ function logsumexp(X)
     reduce(logaddexp, X)
 end
 function logsumexp(X::AbstractArray{T}; dims=:) where {T<:Real}
-    u = reduce(max, X, dims=dims, init=log(zero(T)))
+    u = reduce(max, X, dims=dims, init=zero(T))


This is not right, X can be all negative.

@itsdfish

Thanks for catching the error. Do you know what the correct initial value should be?


I think we should not have init at all. Just check if X is empty and return log(zero(T)) then. Otherwise, keep going with the reduce.

@itsdfish

Good call. I made the changes you suggested and the results hold for both negative and positive values. Please review if you can.

@mohamed82008
It would be nice to add a ForwardDiff test to catch such an error if it happens again in the future.

@nalimilan (Member)

This isn't correct when dims != :. Can you describe the problem you're trying to fix (not knowing anything about what ForwardDiff is trying to do in that issue)? Anyway we'll need to add a test for it.

@mohamed82008 commented Dec 11, 2019

The problem is that log(ForwardDiff.Dual{Nothing}(0.0, 0.0)) is ForwardDiff.Dual{Nothing}(-Inf, NaN). The NaN then propagates to u when doing the reduce.
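To see where the NaN comes from: the chain rule for log makes the partial der / val, and at val = 0.0 with der = 0.0 that is 0.0 / 0.0 == NaN. A minimal hand-rolled dual number makes the mechanism concrete (MiniDual and its fields are illustrative stand-ins, not ForwardDiff API):

```julia
# Minimal dual number, just enough to reproduce the log(Dual(0.0, 0.0)) result.
struct MiniDual
    val::Float64  # primal value
    der::Float64  # partial derivative
end

# Chain rule for log: d/dx log(x) = 1/x, so the partial is der / val.
# At val = 0.0 with der = 0.0 this is 0.0 / 0.0 == NaN, matching
# ForwardDiff's Dual{Nothing}(-Inf, NaN).
Base.log(d::MiniDual) = MiniDual(log(d.val), d.der / d.val)

d = log(MiniDual(0.0, 0.0))
# d.val == -Inf, and isnan(d.der) is true
```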

@mohamed82008
ForwardDiff.gradient(logsumexp, rand(3)) now gives a vector of NaN.

@nalimilan (Member)

I guess we could use another way to get -Inf of the right type, e.g. convert(T, -Inf). Would that work? Or is there another solution which is OK for ForwardDiff?

@mohamed82008
Yup init=convert(T, -Inf) works like a charm!

@mohamed82008
I am curious though, why is the solution here not correct when dims != :? Could you please give an example?

@nalimilan (Member)

> Yup init=convert(T, -Inf) works like a charm!

Cool. Actually this won't work for integer inputs, so it's better to use something like oftype(log(zero(T)), -Inf). Would you update the PR to use that? It would also be nice to have a test, or at least a comment explaining why we don't use the simplest syntax.

> I am curious though, why is the solution here not correct when dims != :? Could you please give an example?

julia> reduce(max, zeros(0, 2), dims=1, init=log(zero(Int)))
1×2 Array{Float64,2}:
 -Inf  -Inf

julia> reduce(max, zeros(0, 2), dims=1)
ERROR: ArgumentError: reducing over an empty collection is not allowed
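As a quick sketch of why oftype is preferred over convert here (assuming integer element types should keep working, since the signature allows any T<:Real):

```julia
T = Int

# convert(T, -Inf) throws an InexactError for integer T,
# because -Inf has no Int representation:
# convert(T, -Inf)

# oftype(log(zero(T)), -Inf) first evaluates log(zero(T)), which follows
# log's usual promotion to floating point, so the init is always a
# floating-point -Inf regardless of T.
init = oftype(log(zero(T)), -Inf)
# init === -Inf, a Float64, even though T == Int
```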

@mohamed82008 commented Dec 11, 2019

@nalimilan the reduce won't be called if isempty(X) is true, which it is for zeros(0, 2). Oh, but then the shape will still be wrong, and it is not type stable.

@mohamed82008 commented Dec 11, 2019

I don't have write access to this PR, so @itsdfish please use:

u = reduce(max, X, dims=dims, init=oftype(log(zero(T)), -Inf))

instead of

u = reduce(max, X, dims=dims, init=log(zero(T)))
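For context, a minimal sketch of the patched method: only the reduce line is quoted from this thread; the shift-and-sum body is the standard logsumexp trick and is an assumption here, not copied from the PR.

```julia
# Sketch of logsumexp with the suggested init (function name is illustrative).
function logsumexp_sketch(X::AbstractArray{T}; dims=:) where {T<:Real}
    # -Inf of the type log would produce, so integer element types work
    # and ForwardDiff never has to differentiate log at zero.
    u = reduce(max, X, dims=dims, init=oftype(log(zero(T)), -Inf))
    # Standard shift: subtract the max before exponentiating for stability.
    u .+ log.(sum(exp.(X .- u); dims=dims))
end

logsumexp_sketch([1.0, 2.0, 3.0])   # ≈ log(exp(1) + exp(2) + exp(3))
logsumexp_sketch([-5.0, -6.0])      # works for all-negative inputs too
```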

@itsdfish

The updates seem to work on my end.

@nalimilan (Member) commented Dec 12, 2019

Thanks. One last check: is it really expected/correct that log(ForwardDiff.Dual{Nothing}(0.0, 0.0)) gives ForwardDiff.Dual{Nothing}(-Inf, NaN)? Where does that NaN come from? Maybe we should discuss this with ForwardDiff developers?

Anyway, if we make this change, we should add a comment to ensure somebody doesn't revert this without knowing the reason why it exists.

EDIT: I've found http://www.juliadiff.org/ForwardDiff.jl/stable/user/advanced/#Fixing-NaN/Inf-Issues-1, so this seems to be expected.

@itsdfish commented Dec 12, 2019

To make things easier, I added you both as collaborators in case you want to make changes.

@mohamed82008
Perhaps merge this?

@nalimilan nalimilan merged commit 180a2ae into JuliaStats:master Dec 16, 2019
@nalimilan (Member)

#83

@nalimilan nalimilan mentioned this pull request Dec 16, 2019