Unexpected behaviour from mice.mids when argument newdata is used #313
Comments
@gerkovink definitely not deliberate/expected behaviour. I'll have a look at it and post an update when I find the culprit.
Having looked at this in some more detail, maybe it isn't quite as unexpected. The main problem arises from when random numbers are generated, and how many of them. Difference in results between
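The point about when random numbers are generated, and how many of them, can be illustrated with base R alone. This is a minimal sketch, not the code from `mice`: the extra `runif(1)` call is a hypothetical stand-in for any step that consumes random numbers before the draws of interest.

```r
# Same seed, no draws consumed beforehand.
set.seed(123)
a <- runif(3)

# Same seed again: the draws are reproduced exactly.
set.seed(123)
a_again <- runif(3)

# Same seed, but one extra draw is consumed first
# (hypothetical stand-in for an earlier step that uses the RNG).
set.seed(123)
invisible(runif(1))
b <- runif(3)

identical(a, a_again)        # same seed, same number of draws
identical(a, b)              # same seed, stream shifted by one draw
identical(b[1:2], a[2:3])    # b is a shifted by one position
```

Two calls can therefore share a seed and still produce different values if one of them draws from the stream earlier or more often than the other.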
@stefvanbuuren @gerkovink what's your view on this? Would you be happy for me to create a PR, moving the setting of
The original proposal foresees the use of two streams of random numbers (in steps 1 and 5). These weren't implemented (because it's tricky), but perhaps that idea could help to achieve exact reproduction. But should we do this? As long as the statistical properties do not change, it's probably not hugely important to have exact reproducibility of the imputed values. Do we have a case where "the problem" affects practice and interpretation?
I vote that we don't do this. Everything is exactly reproducible in simulation given a session-dependent seed; this flags simulation purposes. For inferences, I agree with Stef: exact reproducibility of the code can be achieved again by setting a seed for the session and not being able to exactly reproduce the imputations by means of the
Agree. Perhaps we can re-open if we obtain evidence that the issue affects practice.
This is also my preferred option. The only change I would advocate for is to move the setting of the I will further try to finally try and create a short vignette for
Yes, agree to move up
Now merged. Thanks!
@stefvanbuuren @prockenschaub
@paulinavonstackelberg and I seem to have stumbled upon some inconsistency in the usage of the `newdata` argument in `mice.mids()`. Take e.g.
Created on 2021-02-25 by the reprex package (v1.0.0)
There are two unexpected results and I would like to check if this is deliberate/expected behaviour:
1. `imp1` and `imp2` are not identical, yet they start from the same basis and have the same data. One is entered through the data/ignore argument, the other through the `newdata` argument in `mice.mids()`.
2. `imp2` and `imp2b` are not identical, yet `imp1` and `imp1b` are.

It seems that `mice.mids()` does not inherit the seed in the `mids` object `imp0`. If we fix it manually before running `mice.mids()` this inconsistency disappears, yet the regular `mice.mids` object still differs from the one with the `newdata` argument:
Created on 2021-02-25 by the reprex package (v1.0.0)
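The manual seed fix described above might look roughly like the following. This is a hedged sketch, not the original reprex: the `nhanes` data, the seed values, and the object names `imp_a` and `imp_b` are assumptions chosen for illustration, and the `newdata` comparison itself is not reproduced here.

```r
library(mice)

# Starting point: a mids object (the data set and seed are illustrative choices).
imp0 <- mice(nhanes, m = 2, maxit = 1, seed = 1, printFlag = FALSE)

# Set the session seed by hand immediately before each continuation,
# so that both calls start from the same random number state.
set.seed(1)
imp_a <- mice.mids(imp0, maxit = 2, printFlag = FALSE)

set.seed(1)
imp_b <- mice.mids(imp0, maxit = 2, printFlag = FALSE)

# Compare the first completed data set from each continuation.
same <- identical(complete(imp_a, 1), complete(imp_b, 1))
same
```

Without the two `set.seed()` calls, the second continuation would start from wherever the first one left the random number stream.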
As it is currently programmed, using the `newdata` argument seems to yield an 'all-bets-are-off' approach to replicability. This makes one wonder if the inferences can then be trusted when `newdata` is used.