Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure that binomlogpdf returns non-positive values, t distributions with infinite parameter are supported, and add integration tests #126

Merged
merged 6 commits into from
Sep 29, 2021

Conversation

devmotion
Copy link
Member

@devmotion devmotion commented Sep 28, 2021

It seems that #125 broke some tests in Distributions (e.g. https://github.com/JuliaStats/Distributions.jl/pull/1387/checks?check_run_id=3732360108) since binomlogpdf(5, 0.0, 0) and binomlogpdf(5, 1.0, 5) return values that are slightly above zero (around 4e-16). The PR sets an upper bound for these values and adds integration tests with Distributions.jl to ensure that we avoid such breakages in the future.

Edit: Additionally, the PR fixes the (log)pdf of a t distribution with infinite parameter which Rmath handles and Distributions tests for.

@codecov-commenter
Copy link

codecov-commenter commented Sep 28, 2021

Codecov Report

Merging #126 (7f08d4f) into master (e3fe89d) will increase coverage by 0.14%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #126      +/-   ##
==========================================
+ Coverage   37.41%   37.55%   +0.14%     
==========================================
  Files          12       12              
  Lines         417      418       +1     
==========================================
+ Hits          156      157       +1     
  Misses        261      261              
Impacted Files Coverage Δ
src/distrs/binom.jl 100.00% <100.00%> (ø)
src/distrs/tdist.jl 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e3fe89d...7f08d4f. Read the comment docs.

@devmotion devmotion changed the title Ensure that binomlogpdf returns non-positive values and add integration tests Ensure that binomlogpdf returns non-positive values, t distributions with infinite parameter are supported, and add integration tests Sep 28, 2021
@andreasnoack
Copy link
Member

I'm not sure why those examples would give nonzero values. Aren't the two examples

julia> Distributions.betalogpdf(1, 6, 0.0) - log(6)
0.0

julia> Distributions.betalogpdf(6, 1, 1.0) - log(6)
0.0

@devmotion
Copy link
Member Author

This is what one gets with StatsFuns 0.9.10 in which betalogpdf uses the Rmath implementation. With the Julia implementation in StatsFuns 0.9.11 one gets the approx 4e-16 I mentioned above:

julia> Distributions.betalogpdf(1, 6, 0.0) - log(6)
4.440892098500626e-16

julia> Distributions.betalogpdf(6, 1, 1.0) - log(6)
4.440892098500626e-16

Apparently the Rmath implementation of the density of the Beta distribution handles the cases x = 0 and x = 1 explicitly and hence returns exactly log(6) for the logpdf: https://github.com/JuliaStats/Rmath-julia/blob/5c5dfd6baca358103fbb47cc03dc0ecee04fb1ff/src/dbeta.c#L71 and https://github.com/JuliaStats/Rmath-julia/blob/5c5dfd6baca358103fbb47cc03dc0ecee04fb1ff/src/dbeta.c#L76

Whereas the new Julia implementation returns -SpecialFunctions.logbeta(1, 6) and -SpecialFunctions.logbeta(6, 1) (the other terms are exactly zero) which is slightly different from log(6):

julia> -SpecialFunctions.logbeta(1, 6)
1.7917594692280554

julia> -SpecialFunctions.logbeta(6, 1)
1.7917594692280554

julia> log(6)
1.791759469228055

I guess the more general (and better?) fix would be to handle this special case (one argument equal to 1) in the implementation of SpecialFunctions.logbeta.

@devmotion
Copy link
Member Author

I opened a PR with the more general fix of logabsbeta (and beta): JuliaMath/SpecialFunctions.jl#349

@@ -0,0 +1,52 @@
name: IntegrationTest
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's this file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It runs integration tests. Currently only with Distributions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same action is eg used by ChainRulesCore, DiffRules, and many Turing and SciML packages.

@andreasnoack
Copy link
Member

I've been back and forth on this tonight. I'm not sure it's appropriate to add the extra branches in logabsbeta since it doesn't really improve the precision for that function in isolation. Hence, I think the fix here might be better with the fix in this PR to ensure that probabilities are one at most.

Any idea why the PoissonBinomial tests are failing in Distributions. The tolerance there does look a bit tight.

@devmotion
Copy link
Member Author

I'm not sure it's appropriate to add the extra branches in logabsbeta since it doesn't really improve the precision for that function in isolation.

To me it seems it does improve the precision. Currently, we have e.g.

julia> beta(1.0, 200.0)
0.005000000000000002

julia> logabsbeta(1.0, 4.0)
(-1.3862943611198908, 1)

julia> -log(4.0)
-1.3862943611198906

whereas with the PR to SpecialFunctions one gets

julia> beta(1.0, 200.0)
0.005

julia> logabsbeta(1.0, 4.0)
(-1.3862943611198906, 1)

julia> -log(4.0)
-1.3862943611198906

However, I think the min(0, ...) fix in this PR is useful in addition since it guarantees that the resulting (log) probabilities are in the correct domain also for other values and even if any other floating point issues might occur.

@devmotion
Copy link
Member Author

devmotion commented Sep 28, 2021

Any idea why the PoissonBinomial tests are failing in Distributions. The tolerance there does look a bit tight.

I think it is just caused by slightly different values from the Julia implementation of binomlogpdf and/or floating point errors. The tests would pass with atol=1e-14, so I think we should just adjust the tolerances in Distributions.

@devmotion
Copy link
Member Author

This PR to Distributions fixes the remaining test errors: JuliaStats/Distributions.jl#1398

@andreasnoack
Copy link
Member

To me it seems it does improve the precision.

My point is that it's just one ULP so less than the general precision for the function and therefore not worth the branches. Those branches then happen to then give exact zeros in a downstream function but I don't thing that is really the right evaluation metric for these kinds of functions. Here, the simpler approach is to just cap the log-probabilities. Alternatively, we could add branches in the binomial pdf for the degenerate cases.

@andreasnoack andreasnoack merged commit f13b618 into master Sep 29, 2021
@andreasnoack andreasnoack deleted the dw/binom_special branch September 29, 2021 08:09
@devmotion
Copy link
Member Author

An additional advantage is that it is much faster for this special case but does not impact performance significantly in the other cases. With SpecialFunctions#master I get

julia> using SpecialFunctions, BenchmarkTools

julia> @btime beta(1.0, 200.0);
  47.907 ns (0 allocations: 0 bytes)

julia> @btime logabsbeta(1.0, 4.0);
  60.584 ns (0 allocations: 0 bytes)

julia> @btime beta(3.4, 200.0);
  66.080 ns (0 allocations: 0 bytes)

julia> @btime logabsbeta(3.4, 4.0);
  78.036 ns (0 allocations: 0 bytes)

whereas with the PR I get

julia> using SpecialFunctions, BenchmarkTools

julia> @btime beta(1.0, 200.0); # seems it is compiled away completely...
  0.019 ns (0 allocations: 0 bytes)

julia> @btime logabsbeta(1.0, 4.0);
  1.451 ns (0 allocations: 0 bytes)

julia> @btime beta(3.4, 200.0);
  67.098 ns (0 allocations: 0 bytes)

julia> @btime logabsbeta(3.4, 4.0);
  79.259 ns (0 allocations: 0 bytes)

The timings are consistent with

julia> @btime inv(200.0);
  0.018 ns (0 allocations: 0 bytes)

julia> @btime -log(4.0);
  1.450 ns (0 allocations: 0 bytes)

We also handle special cases in other functions, e.g., https://github.com/JuliaMath/SpecialFunctions.jl/blob/0c4181254cf664923c1504de599c5d1f2c243831/src/gamma.jl#L234-L235, hence to me it does not seem completely unusual to exploit this mathematical property and handle this case separately.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants