Is Mean for NegativeBinomial distribution correct? #455

dsyme · 2016-12-05T22:17:59Z

I was playing around with distributions, and it seems that the Mean of the NegativeBinomial distribution doesn't match the mean of its samples.

open MathNet.Numerics
let b7 = Distributions.NegativeBinomial(2.0, 0.5)

// Sample mean: this gives about 4.0
[ for i in 0 .. 100 -> b7.Sample() ] |> List.averageBy float

// Distribution mean: this says 2.0
b7.Mean

cdrnet · 2016-12-05T22:39:00Z

That looks suspicious, thanks for reporting!

cdrnet · 2016-12-07T06:42:32Z

It turns out the mean is correct, but the samples return the number of trials (for reaching r successes) instead of the number of failures (until r successes are reached). Both are common definitions of the distribution, but of course they should be used consistently within the implementation. We chose the second definition since it can be extended to all positive real values of r rather than just integers.

Quick workaround until it is fixed: subtract r (here: 2.0) from all the generated samples.

Unfortunately, since r is real, we cannot actually subtract it from our samples, because the discrete distribution interface assumes samples are integers. We may need to split this distribution into one that accepts only integer r (and thus also samples integers) and an extended one for real r with another interface.
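To make the two definitions concrete, here is a minimal sketch that simulates the underlying Bernoulli trials directly. It uses only Python's standard library rather than MathNet, it covers integer r only, and the function name simulate_until_r_successes is made up for this illustration.

```python
import random


def simulate_until_r_successes(rng, r, p):
    """Run Bernoulli(p) trials until r successes have occurred.

    Returns (failures, trials): the two quantities that the two common
    negative binomial definitions count. Hypothetical helper, integer r only.
    """
    successes = 0
    failures = 0
    while successes < r:
        if rng.random() < p:
            successes += 1
        else:
            failures += 1
    # Every trial is either a success or a failure, so trials = failures + r.
    return failures, failures + successes


def mean_counts(rng, r, p, n):
    """Average failure count and average trial count over n simulated runs."""
    draws = [simulate_until_r_successes(rng, r, p) for _ in range(n)]
    mean_failures = sum(f for f, _ in draws) / n
    mean_trials = sum(t for _, t in draws) / n
    return mean_failures, mean_trials
```

With r = 2 and p = 0.5, the failure count should average about r(1-p)/p = 2 and the trial count about r/p = 4. A trial count can never be smaller than r, whereas a failure count can be 0.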

thamilto · 2018-07-02T17:10:52Z

I'd like to point out that the quick workaround doesn't seem to work. When I run dsyme's example on my PC, I get samples that are 0 or positive integers. I don't follow how these can be interpreted as "the number of trials (for reaching r successes)". Knocking off 2.0 from a 0 sample, regardless of types, doesn't seem to make sense.

Can someone elaborate?

vermorel · 2019-03-08T20:19:24Z

I might have pinpointed the problem in NegativeBinomial, the code is:

        static int SampleUnchecked(System.Random rnd, double r, double p)
        {
            var lambda = Gamma.SampleUnchecked(rnd, r, p);
            var c = Math.Exp(-lambda);
            var p1 = 1.0;
            var k = 0;
            do
            {
                k = k + 1;
                p1 = p1*rnd.NextDouble();
            }
            while (p1 >= c);
            return k - 1;
        }

The first code line should be replaced by

var lambda = Gamma.SampleUnchecked(rnd, r, (1 - p)/p);

Also, it would be nice to point out in the code that the second part is actually drawing a Poisson sample (which could benefit from the optimized implementation in PoissonDistribution.cs, which distinguishes small lambdas from big lambdas).

tlatarche · 2021-08-17T08:14:57Z

Is there a plan to implement the fix suggested by @vermorel above? This would solve a similar issue I'm seeing.

EDIT: Correction - having given it further thought, it should be p/(1-p), not (1-p)/p.
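The effect of the rate parameter can be checked empirically. The following is a rough sketch in Python using only the standard library, not MathNet code; the function names and the rate override are made up for illustration. It assumes that the second argument of the Gamma sampler in the quoted C# is a rate. Python's random.gammavariate takes a scale instead, hence the 1/rate.

```python
import math
import random


def sample_poisson_knuth(rng, lam):
    """Draw a Poisson(lam) sample by multiplying uniforms (Knuth's method).

    Mirrors the do/while loop in the quoted SampleUnchecked. Only suitable
    for small lam, since exp(-lam) underflows for very large values.
    """
    c = math.exp(-lam)
    p1 = 1.0
    k = 0
    while True:
        k += 1
        p1 *= rng.random()
        if p1 < c:
            return k - 1


def sample_negative_binomial(rng, r, p, rate=None):
    """Gamma-Poisson mixture sample of the number of failures before r successes.

    rate defaults to p / (1 - p); passing rate=p imitates the sampler quoted
    in this thread (assumption: its Gamma argument is a rate).
    """
    if rate is None:
        rate = p / (1.0 - p)
    # random.gammavariate(alpha, beta) uses shape alpha and SCALE beta.
    lam = rng.gammavariate(r, 1.0 / rate)
    return sample_poisson_knuth(rng, lam)


def empirical_mean(rng, r, p, n, rate=None):
    """Average of n samples, for comparison against r * (1 - p) / p."""
    return sum(sample_negative_binomial(rng, r, p, rate) for _ in range(n)) / n
```

With r = 2.0 and p = 0.5, the default rate p/(1-p) = 1 should give a sample mean near 2.0, while passing rate=p should reproduce the mean near 4.0 reported at the top of the thread.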

Arlofin · 2022-09-29T00:38:11Z

The sample method is simply incorrect; @tlatarche's proposal is correct.
To recall the relationships: with the chosen definition (r is the number of successes, p is the probability of success), the mean is r(1-p)/p, and the NB distribution can be expressed as a Poisson distribution whose parameter is a sample from the Gamma distribution with shape r and rate p/(1-p).
The current sampling implementation can instead be interpreted as using a p', implicitly defined by p = p'/(1-p'). With a little bit of calculus, it can be shown that this leads to a sampling mean of r/p, which is larger than the true mean by a factor of 1/(1-p), consistent with @dsyme's observation.
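Spelled out, the mean calculation behind this goes as follows. This is a sketch that assumes the Gamma distribution is parameterised by shape r and rate β, so that its mean is r/β; the symbol β is introduced here and does not appear in the code.

```latex
\[
\begin{aligned}
X \mid \lambda \sim \mathrm{Poisson}(\lambda),\quad
\lambda \sim \mathrm{Gamma}(r, \beta)
  &\;\Longrightarrow\; \mathbb{E}[X] = \mathbb{E}[\lambda] = \frac{r}{\beta}, \\
\beta = \frac{p}{1-p}
  &\;\Longrightarrow\; \mathbb{E}[X] = \frac{r(1-p)}{p}
  \quad \text{(intended mean)}, \\
\beta = p
  &\;\Longrightarrow\; \mathbb{E}[X] = \frac{r}{p}
  = \frac{1}{1-p} \cdot \frac{r(1-p)}{p}
  \quad \text{(current sampler)}.
\end{aligned}
\]
```

For r = 2 and p = 0.5 this gives 2 for the intended mean and 4 for the current sampler.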

cdrnet added About Correctness BUG needs investigation labels Dec 5, 2016

cdrnet self-assigned this Dec 5, 2016

This was referenced Dec 7, 2016:

Error in sampling from Negative Binomial #320 (Open)

Negative Binomial parameter wrong #263 (Open)

cdrnet added needs decision and removed needs investigation labels Dec 7, 2016

cdrnet modified the milestones: Numerics v4.0, Numerics v4.x Feb 8, 2018

cdrnet removed this from the Numerics v4.x milestone Jul 24, 2021

cdrnet removed the About Correctness label Jul 24, 2021

Arlofin mentioned this issue Sep 29, 2022:

Fix and improve negative binomial distribution #960 (Open)
