I faced a surprising issue today when using GLM to fit a basic cubic polynomial. See my script and plot below. For my dataset, fitting a quadratic polynomial works fine, but when I move up to a cubic polynomial, the intercept term is removed (apparently detected as collinear). The good news is that I can set dropcollinear=false as a workaround; however, given such a basic dataset, I was very surprised to see this issue in the first place. As another check, I threw the same dataset at R's lm function and it handled it without issue.
using Plots
using DataFrames
using GLM
x=[0.0,16.54252507,36.85132953,58.06647333,85.62460607,123.8759051,174.8138864,238.2577034,312.5741294,385.2595299,451.7838571,523.0189254,575.7680507,621.6300705,677.641035]
y=[2.802571697,2.607979564,2.403339032,2.202060006,2.010878422,1.813030653,1.60853479,1.400739363,1.209780073,1.012102908,0.800961457,0.603279074,0.405503624,0.204338282,0.0]
data = DataFrame(x=x, y=y)
plt = scatter(x,y; label="data")
ols1 = lm(@formula(y ~ x + x^2), data)
plot!(plt, x, predict(ols1); label="quadratic fit")
ols2 = lm(@formula(y ~ x + x^2 + x^3), data)
plot!(plt, x, predict(ols2); label="cubic fit")
ols2fixed = lm(@formula(y ~ x + x^2 + x^3), data; dropcollinear=false)
plot!(plt, x, predict(ols2fixed); label="fixed cubic fit")
ols2 variable results:
ols2fixed variable results:
Potentially related issues: #426 and #420
...while this may be related to the previous issues, I feel this warranted a new issue given the severity of the problem with such a simple and standard use case.
Ok, after reviewing #426 further, I think this is indeed a duplicate, sorry about that!
Also, I see that on the dev version, QR is available as an alternative to Cholesky. I tried the following and it does seem to resolve the issue, with no need to turn off the collinearity check:
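The snippet from this comment is not preserved here. As a rough sketch of what such a call might look like, assuming a GLM.jl version whose `lm` accepts a `method=:qr` keyword (the variable name `ols2qr` is made up for illustration):

```julia
using DataFrames
using GLM

# Same data as in the script above
x = [0.0, 16.54252507, 36.85132953, 58.06647333, 85.62460607, 123.8759051,
     174.8138864, 238.2577034, 312.5741294, 385.2595299, 451.7838571,
     523.0189254, 575.7680507, 621.6300705, 677.641035]
y = [2.802571697, 2.607979564, 2.403339032, 2.202060006, 2.010878422,
     1.813030653, 1.60853479, 1.400739363, 1.209780073, 1.012102908,
     0.800961457, 0.603279074, 0.405503624, 0.204338282, 0.0]
data = DataFrame(x=x, y=y)

# Assumption: `method=:qr` is available in the installed GLM.jl version.
# dropcollinear is left at its default, so the collinearity check stays on.
ols2qr = lm(@formula(y ~ x + x^2 + x^3), data; method=:qr)

# Inspect the coefficients (intercept, x, x^2, x^3)
println(coef(ols2qr))
```

If the installed GLM.jl version does not accept `method`, the `dropcollinear=false` workaround from the original script still applies.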
Given the severity of the issue, I think it may make sense to make QR the default, or to turn off dropcollinear by default, or something else entirely, in order to improve the out-of-the-box alignment with other software (R, Python, Excel, etc.). My line of thinking is that the current defaults are just too conservative, and it's apparently too easy to get false positives in the collinearity check...
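To illustrate why a Cholesky-based fit might flag a column here while QR does not, here is a minimal sketch using only the `LinearAlgebra` standard library. It builds the cubic design matrix by hand and compares the conditioning of `X` with that of `X'X`. This is not GLM's internal code path, just the general numerical point that forming `X'X` roughly squares the condition number.

```julia
using LinearAlgebra

x = [0.0, 16.54252507, 36.85132953, 58.06647333, 85.62460607, 123.8759051,
     174.8138864, 238.2577034, 312.5741294, 385.2595299, 451.7838571,
     523.0189254, 575.7680507, 621.6300705, 677.641035]
y = [2.802571697, 2.607979564, 2.403339032, 2.202060006, 2.010878422,
     1.813030653, 1.60853479, 1.400739363, 1.209780073, 1.012102908,
     0.800961457, 0.603279074, 0.405503624, 0.204338282, 0.0]

# Design matrix for y ~ 1 + x + x^2 + x^3 (columns: intercept, x, x^2, x^3)
X = hcat(ones(length(x)), x, x .^ 2, x .^ 3)

# Cholesky-based least squares works on X'X, whose condition number is
# roughly the square of cond(X); QR works on X directly.
cond_X = cond(X)
cond_XtX = cond(X' * X)

# Least-squares solve on X itself; for a rectangular dense matrix,
# `\` uses a pivoted QR factorization.
beta_qr = X \ y

println("cond(X)   = ", cond_X)
println("cond(X'X) = ", cond_XtX)
println("rank(X)   = ", rank(X))
println("QR coefficients = ", beta_qr)
```

Because the columns span very different magnitudes (the `x^3` column reaches the order of 1e8 while the intercept column is all ones), centering or rescaling `x` before building the polynomial terms would likely improve the conditioning as well.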