Fix `lrtest` for model families with dispersion (#261)
Conversation
`lrtest` relied on the deviance rather than the log-likelihood, which is not correct for model families where a dispersion parameter needs to be taken into account. Scaling the deviance would be more efficient than computing the log-likelihood, but there is currently no generic API for this and it may not work for non-GLM models, so simply call `loglikelihood`. We could imagine defining a `likelihoodratio(m1, m2) = loglikelihood(m1) - loglikelihood(m2)` method that packages could override for performance, but this may not be worth it. Also relax the check that more complex models have a strictly better fit than simpler nested ones: the more complex model may have the same deviance, and due to approximations it may even have a slightly higher deviance.
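The override point proposed above could be sketched as follows. Note that `likelihoodratio` is a hypothetical name from the discussion, not an actual API, and the "models" here are plain named tuples carrying a precomputed log-likelihood:

```julia
# Stand-in `loglikelihood` for the sketch: real models would come from
# GLM.jl or another StatsAPI-compatible package.
loglikelihood(m) = m.ll

# Hypothetical generic fallback: packages could override this method for
# performance, e.g. by scaling the deviance when a dispersion estimate
# is available.
likelihoodratio(m1, m2) = loglikelihood(m1) - loglikelihood(m2)

m_simple  = (ll = -120.5,)  # simpler nested model
m_complex = (ll = -118.2,)  # more complex model

likelihoodratio(m_complex, m_simple)  # positive when the complex model fits better
```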
@palday I've noticed that MixedModels.jl uses the deviance for
src/lrtest.jl (outdated)
```julia
for i in 2:length(ll)
    if ((forward && ll[i-1] > ll[i]) ||
        (!forward && ll[i-1] < ll[i])) &&
        ll[i-1] ≉ ll[i]
```
This was actually part of my earlier comment but got lost because the remainder of the comment was beefy. Should we allow the user to provide a tolerance for this check? We already have an `atol` argument used for checking whether the models are nested; perhaps it would make sense to reuse that here?
Yeah why not. I've pushed a commit to do that. BTW I discovered that the condition was backwards!
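The reuse of `atol` discussed above could look like the following sketch. The function name `check_ordering` and its standalone form are illustrative only; in the actual PR this logic lives inside `lrtest`, and `≉` is replaced by `isapprox` so the tolerance can be passed through:

```julia
# Sketch: verify that fits do not get worse as models grow more complex,
# assuming `ll` holds log-likelihoods and `forward` means the models are
# ordered from simplest to most complex. Violations within `atol` are
# tolerated, reusing the tolerance from the nestedness check.
function check_ordering(ll; forward = true, atol = 1e-8)
    for i in 2:length(ll)
        if ((forward && ll[i-1] > ll[i]) ||
            (!forward && ll[i-1] < ll[i])) &&
            !isapprox(ll[i-1], ll[i], atol = atol)
            throw(ArgumentError("models are not ordered by fit"))
        end
    end
    return true
end
```

With this form, a slightly higher deviance for the more complex model (within `atol`) no longer triggers an error, matching the relaxed check described in the PR description.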
@nalimilan we should change that (and will 😉), although currently we don't support GLMMs with a dispersion parameter, so we don't have to worry about the comparison to the GLM deviance. (The problem here with deviance vs. loglikelihood is actually deeply intertwined with the problem of fitting GLMMs with a dispersion parameter.) For LMMs it doesn't matter, since the "deviance" is indeed just the objective, which is -2 loglikelihood. (In practice, we actually compute -2 loglikelihood directly and then use that to compute the loglikelihood.)
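The LMM relationship described above can be sketched in one line; `objective_to_loglikelihood` is an illustrative name, not a MixedModels.jl function:

```julia
# For linear mixed models, the reported "deviance" is the optimizer
# objective, i.e. -2 * loglikelihood, so the log-likelihood can be
# recovered directly from it.
objective_to_loglikelihood(objective) = -objective / 2

objective_to_loglikelihood(246.8)  # recovers a log-likelihood near -123.4
```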
Can you make a release if you're OK with the PR? I won't be on my computer for the next two weeks.
Fixes JuliaStats/GLM.jl#490, #260.