According to the documentation, pool.scalar() assumes an infinite sample (n = Inf) by default. But that doesn’t match the actual behaviour: with the default, the returned degrees of freedom are NaN. Example:
library(mice)
pool.scalar(13:17, 3:7)$df
#> [1] NaN
The expected result would be approx. the df one gets when one uses a very large n, e.g.:
pool.scalar(13:17, 3:7, n = 10^6)$df
#> [1] 28.44315
The bug is caused by the barnard.rubin() function (which pool.scalar() uses internally):
When dfcom = Inf, (dfcom + 1) / (dfcom + 3) in the dfobs <- line equals Inf/Inf, which is NaN (not 1), and it is still NaN when multiplied by dfcom * (1 - lambda). It should instead be Inf.
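The arithmetic can be checked in isolation. The snippet below is only a sketch of the two expressions quoted above, not the package source. The value `lambda <- 0.375` is an assumption: it is what the standard formula `(1 + 1/m) * b / t` gives for the example data.

```r
# Reproduce the problematic arithmetic outside of mice.
dfcom <- Inf
lambda <- 0.375  # assumed value for the example 13:17, 3:7

# Inf / Inf is undefined in IEEE arithmetic, so R returns NaN here.
ratio <- (dfcom + 1) / (dfcom + 3)

# NaN propagates through the remaining multiplication.
dfobs <- ratio * dfcom * (1 - lambda)

print(ratio)
print(dfobs)
```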
Since the factor dfobs / (dfold + dfobs) in the last line tends to 1 as dfobs goes to Inf, the correct behaviour would be to just output dfold whenever dfcom is Inf (and perhaps the default value dfcom = 999999 should be changed to dfcom = Inf). For the above example, the resulting value is (exactly) 28.44444…, which is in line with what you get with the large value n = 10^6 (28.44315).
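A minimal sketch of the proposed behaviour follows. `barnard_rubin_sketch` is a hypothetical stand-alone helper, not the function shipped in mice. The `lambda` and `dfold` lines assume the usual Barnard-Rubin definitions, the `dfobs` line and the return line follow the expressions quoted in this report, and `dfcom = 10^6 - 1` assumes that one parameter is subtracted from n.

```r
# Hypothetical helper illustrating the proposed fix; not the mice source.
barnard_rubin_sketch <- function(m, b, t, dfcom = Inf) {
  # Assumed standard definitions of lambda and the "old" Rubin (1987) df.
  lambda <- (1 + 1 / m) * b / t
  dfold <- (m - 1) / lambda^2

  # Proposed special case: with infinite complete-data df, return dfold.
  if (is.infinite(dfcom)) {
    return(dfold)
  }

  dfobs <- (dfcom + 1) / (dfcom + 3) * dfcom * (1 - lambda)
  dfold * dfobs / (dfold + dfobs)
}

# Example data from this report: estimates 13:17, variances 3:7.
q <- 13:17
u <- 3:7
m <- length(q)
b <- var(q)                       # between-imputation variance
t <- mean(u) + (1 + 1 / m) * b    # total variance

df_inf <- barnard_rubin_sketch(m, b, t)                      # dfcom = Inf
df_large <- barnard_rubin_sketch(m, b, t, dfcom = 10^6 - 1)  # assumed n - 1

print(df_inf)
print(df_large)
```

Under these assumptions the infinite case gives 256/9 (28.44444…), and the large finite case stays close to it, as described above.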
Summary:
Whenever dfcom = Inf, barnard.rubin() should output dfold instead of dfold * dfobs / (dfold + dfobs).
The default and arbitrary value of dfcom = 999999 should be changed to dfcom = Inf.