ArgumentError: F test is only valid for nested models #489
For similar data I have a question: should the values for Subject: 15 be dropped from the results? (RegressionTables.jl does not work with these results because the p-value is NaN.)
This looks like two questions.
If you want them removed from the table, then you should remove the columns from the data before fitting.
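One way to read that advice, as a minimal sketch with a toy 2x2 crossover data set and hypothetical column names (`subject`, `period`, `trt`, `y`): drop the problematic subject's observations before fitting, rather than trying to edit the fitted table afterwards.

```julia
using DataFrames, GLM

# Toy stand-in for the crossover data (hypothetical column names).
df = DataFrame(subject = repeat(string.(1:4), inner = 2),
               period  = repeat(["1", "2"], 4),
               trt     = ["T", "R", "T", "R", "R", "T", "R", "T"],
               y       = randn(8))

# Remove one subject's rows entirely, then fit on the reduced data.
df_sub = filter(:subject => !=("3"), df)
m = lm(@formula(y ~ trt + period + subject), df_sub)
```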
Hi! I can't remove data from the table. Also, if I remove Subject: 15, the same problem appears for subject 14 or 12, etc. Bioequivalence is a common task in clinical trials, and right now it can only be done partially because we can't run the F-test on these models. Main question: is it possible to avoid this for similar models:
If it is not possible and not on the to-do list, please just say "yes, it is not possible", because some people have made many unsuccessful attempts to do this.
I think we're talking past each other. In other words, there is no unexpected behavior, and we are not going to change it.

For the F-test, I will also note that the F-test (like the likelihood ratio test) is mathematically only well defined for nested models, so we're not going to add support for computing the F-test on non-nested models.

I think the bigger latent issue is numerical tolerance and floating point error. In the case of the F-test, the nesting detection may fail for truly nested models because of floating point error. Likewise for collinearity/rank deficiency, a model may be numerically rank deficient, even if it is not "platonically" / truly rank deficient.

For the nesting detection performed by the F-test, we provide the `atol` keyword argument. For linear regression, there are two options. One, if you know that your model is not rank deficient, then you can set `dropcollinear=false` when fitting.
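A hedged sketch of those two knobs, reusing the toy `df` from the sketch above and assuming a GLM.jl version where `ftest` accepts an `atol` keyword and `lm` accepts `dropcollinear`:

```julia
using GLM

m0 = lm(@formula(y ~ trt), df)
m1 = lm(@formula(y ~ trt + period + subject), df)

# Loosen the numerical tolerance used by the F-test's nesting detection.
ftest(m0, m1; atol = 1e-8)

# If you are certain the design is full rank, turn off the automatic
# collinearity/rank-deficiency handling when fitting.
m1_full = lm(@formula(y ~ trt + period + subject), df; dropcollinear = false)
```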
Sorry, I really don't understand. For these models I get the error:
But they are nested... or not?
Can you post your `versioninfo()` output? Note that nesting and rank deficiency is one of those problems that is well defined in theory, but not always in practice, due to the vagaries of floating point.
Yes, I understand :) and I want to explore this problem because I am trying to write a "recipe" for students on how to switch from R to Julia for common tasks in bioequivalence and other crossover design trials. I think the model matrix is not built quite correctly here, because we have
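A small diagnostic sketch for that suspicion (reusing the toy `df` and the model `m1` from the sketches above): compare the number of model-matrix columns against its numerical rank to check whether the design is numerically rank deficient.

```julia
using GLM, LinearAlgebra

X = modelmatrix(m1)
# Full column rank when these two numbers agree; a smaller rank means the
# design is numerically rank deficient and a column will be pivoted out.
size(X, 2), rank(X)
```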
Maybe this can help: https://github.com/PharmCat/edu/blob/main/ipynb/bioequivalence.ipynb
The problem is definitely numerical instability. I have now tested this example on:
and am unable to replicate. I strongly suspect that some of the SIMD instructions on the Ryzen are causing this problem -- we've seen similar issues with AVX instructions on some Intel chips. If you set
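For comparing machines, a hedged diagnostic sketch using only standard-library calls (not GLM.jl API) to capture the CPU and BLAS configuration that such SIMD/AVX differences depend on:

```julia
using InteractiveUtils, LinearAlgebra

versioninfo(verbose = true)        # Julia version, OS, and CPU details
LinearAlgebra.BLAS.get_config()    # which BLAS backend is loaded (Julia 1.7+)
```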
In this case the results are definitely wrong:
Also for this model I get the following (the same on an Intel processor, Intel(R) Core(TM) i7-7600U):
I will post versioninfo() for the Intel machine about 10 minutes later...
I think we should switch to using the QR decomposition by default in GLM.jl 2.0 as it seems to be more reliable. @palday If a predictor was dropped due to collinearity, in theory it adds no information at all, so we could consider that models are nested if their predictors are equal, including the dropped one even if its coefficient is 0? We could also add an argument to disable the nesting check, for people to use at their own risk.
@nalimilan completely agree on the QR decomposition. I wanted to mock up how to do it for this example, but it was actually not nearly as polished as the Cholesky stuff and I had more pressing commitments.

Regarding the dropped predictor ... it depends! Concretely, the rank deficiency/collinearity detection drops the predictor before any fitting happens, so it actually fits the reduced model and just re-inserts the pivoted coefficients as zeros. So in a certain sense, the reduced model is the one being compared for nesting purposes. I mostly agree with this -- the full, unpivoted model isn't well defined, because one of the assumptions of OLS is that the model has full column rank, so instead pivoting fits the closest well-defined model and inserts zeroed coefficients to highlight how to obtain this model from the originally specified model. The difficulty is that the theoretical and numerical assessments of "full column rank" can differ.

There is another way to view this: dropping predictors effectively reduces the model degrees of freedom by the number of dropped predictors. Now when you consider a likelihood-ratio or F-test, how does this impact the (denominator) degrees of freedom for the comparison?
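A small self-contained illustration of that pivoting behavior (not GLM.jl internals, just the observable result): fit a deliberately rank-deficient linear model and inspect the zeroed coefficient and the reported degrees of freedom.

```julia
using DataFrames, GLM

d = DataFrame(x1 = randn(20))
d.x2 = 2 .* d.x1                  # x2 is exactly collinear with x1
d.y  = d.x1 .+ randn(20)

m = lm(@formula(y ~ x1 + x2), d)  # dropcollinear defaults to true in recent GLM.jl
coef(m)                           # one of the collinear coefficients is reported as 0.0
dof(m), dof_residual(m)           # how the pivoted fit is counted, per the question above
```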
Yes, and we could do this in a minor release. But I don't know whether that will really solve the problem here, since there appear to be other numerical issues before that point.
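For completeness, a hedged sketch of what the switch looks like from the user side, assuming a GLM.jl release that already exposes the factorization choice through a `method` keyword (newer releases do; only the default is under discussion here). It reuses the toy data `d` from the previous sketch.

```julia
using GLM

m_chol = lm(@formula(y ~ x1 + x2), d; method = :cholesky)
m_qr   = lm(@formula(y ~ x1 + x2), d; method = :qr)   # pivoted QR is more robust to near-collinearity
```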
I think for the current case the DOF will be correct. PS: the example above is a basic crossover case. As factor
I am trying to use GLM for standard bioequivalence testing with a crossover design, to get an ANOVA table (a protocol-defined test) using ftest, and I get:
models:
Is there any way to perform ANOVA using GLM with a crossover design?
Example data:
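A hedged sketch (hypothetical column names, simulated response) of one way to get the treatment line of such an ANOVA via a nested-model F-test. With subject entered as a fixed effect, the sequence term is absorbed by the subject dummies (exactly the rank-deficiency issue discussed above), so it is omitted from these formulas.

```julia
using DataFrames, GLM

be = DataFrame(subject = repeat(string.(1:12), inner = 2),
               period  = repeat(["1", "2"], 12),
               trt     = repeat(["T", "R", "R", "T"], 6),   # TR / RT sequences
               logPK   = randn(24))

full   = lm(@formula(logPK ~ period + trt + subject), be)
no_trt = lm(@formula(logPK ~ period + subject), be)

ftest(no_trt, full)   # F-test for the treatment effect; these models are nested
```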