QR decomposition method in GLM. (#507)
* PowerLink refers to a class of techniques that use a power function

* added a few more test cases

* Fixing doctests for PowerLink

* Making subheading smaller than heading.

* Rounding off a test case result to 2 digits

Test was failing in Julia Nightly as follows:

1) GLM.linkinv(InverseLink(), 10) was 0.01 while GLM.linkinv(PowerLink(-1), 10) was 0.010000000000000002
2) GLM.linkinv(InverseSquareLink(), 10) was -10.01 while GLM.linkinv(PowerLink(-2), 10) was -0.010000000000000002

Rounding off to 2 digits should solve this. 

Note: These tests were passing in other versions of Julia.

* adding Optim dependency to docs

* No need to have Optim as a dependency for the entire package. Only the docs need it

* Rounding off GLM.linkinv(PowerLink(-1), 10) and GLM.linkinv(PowerLink(-2), 10) to 2 digits

* Rounding off GLM.mueta(PowerLink(-1), 10) to make the test work on Julia Nightly

* Using inexact equality comparison instead of rounding.

* Apply suggestions from code review

Correcting order of PowerLink, ProbitLink, SqrtLink.

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Apply suggestions from code review

Using ≈ instead of isapprox.

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Apply suggestions from code review

Using alphabetical order in references.

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Update docs/src/examples.md

Writing the example description in plain text.

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Update docs/src/examples.md

Removing extra space to be consistent with the style.

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Adding the missing comma.

* Update src/glmtools.jl

Using a better variable name for 1 by lambda.

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Update src/glmtools.jl

Using inline function.

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Apply suggestions from code review

Making the doctest code concise.

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Following alphabetical order in references

* Adding suggested changes by nalimilan

* Update test/runtests.jl

Removing the R code.

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Update test/runtests.jl

Removing test of hashes.

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* replaced the `isapprox` function by the `≈` symbol wherever possible, added a few more metrics to test, and replaced all `=` by `≈` for real numbers

* Update src/glmfit.jl

Since we are storing the link - the `GLM.Link` function can be defined uniformly for all link functions

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Update src/glmfit.jl

Changing the sentence to make it compact

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* updated test cases, added a few more metrics to check, moved the `link` member after `d` (distribution) and changed the corresponding constructor, removed the `updateμ!` method for PowerLink, and modified the general `updateμ!` function

* Rephrased the description of PowerLink

* Update docs/src/examples.md

Adding break line and removing full stop

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Removed `GLM.Link(mm::AbstractGLM) = mm.l`, redefined PowerLink and updated test cases

* update test case

* Update src/glmtools.jl

The suggestion is committed.

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* changed the `inverselink` function for PowerLink as suggested in PR

* Add files via upload

NIST datasets for testing linear models

* added qrpred

* use qrpred instead of cholpred

* removed tests which use chol

* added qr decomp with weights

* added dropcollinear for qr decomp

* added pivot cholesky for pivot qr

* added dof for qr

* added chol for qr

* updated test cases related to QR method

* added float conversion

* fixed test

* updated QR decompositions

* removed redundant functions

* updating documentation

* fixed bug

* refactored DensePredQR constructor

* changed DensePredQR constructor to take Abstract types

* removed unused functions

* removed unused functions

* updated doc string

* conditional pivoting in QR

* conditional QR pivoting

* updated example of linear model with QR decomposition

* removed nasty.csv file and commented code

* Update src/GLM.jl

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Update src/linpred.jl

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Update src/linpred.jl

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Update src/lm.jl

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Update src/linpred.jl

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Update src/linpred.jl

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Update src/linpred.jl

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Update src/linpred.jl

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* some changes suggested by @nalimilan

* Changed pivoted_qr to pivoted_qr!

* Changed `:stable` to `:qr` and `:fast` to `:cholesky`

* Changed the predict function in lm for the QR method; also added some test cases

* rearranged all tests related to linear models (Cholesky and QR)

* An intermediate commit after merging with master branch

* Without the predict-with-QR test case; need to rewrite the predict function in lm

* updated example.md file

* Added test cases with predictions for QR method

* Removed some comments

* removed comments, replaced `Rsub\I` by `inv(Rsub)`

* Update test/runtests.jl

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Update test/runtests.jl

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Update test/runtests.jl

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Update test/runtests.jl

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Update src/GLM.jl

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Update src/linpred.jl

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Update src/linpred.jl

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Update src/lm.jl

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Update src/lm.jl

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Update src/lm.jl

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Update src/linpred.jl

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Update src/lm.jl

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Update src/linpred.jl

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Update src/linpred.jl

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Update src/linpred.jl

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Update test/runtests.jl

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* updated test cases, qrpred method

* removed extra lines in lm file

* updated predict! function in lm in compact form

* Optimizing the performance for full-rank matrices

* changed outer constructor to inner

* Incorporated QR decomposition method in glm related functions

* NegativeBinomial parameter estimation test is failing in a few systems. Trying a different minstepfac value

* NegativeBinomial parameter estimation test is failing in a few systems. Trying another, smaller minstepfac value

* NegativeBinomial parameter estimation test is failing in a few systems

* added one more example to show when QR method works better

* update for doc test failure

* Fixing doc test

* Fixing doc test failure

* Fixing doc test failure

* Fixing for doc test failure

* updated example for :qr method

* Updated example for QR decomposition

* Updated example for QR decomposition

* Updated example for QR decomposition

* Updated  instead of generic error

* Trying to resolve the conflict

* Updated example to show QR performs better

* Update src/GLM.jl

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Update src/linpred.jl

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

* Updated test cases. Test cases with the Cholesky and QR methods are written in compact form

* Updated test case due to different pivot in different systems

* replaced equality check with egality check in glmfit

* updated example and added doctest

* updated example without doctest

* Avoiding multiple copies of the design matrix

* added one more example in qr vs cholesky

* updated example, remove unused constructor and added one more test case

* updated example to pass doctest

* updated example to pass doctest

* updated example to pass doctest

* updated example to pass doctest

* example

* updated example with rdchem data

* Update docs/src/examples.md

Co-authored-by: Bogumił Kamiński <bkamins@sgh.waw.pl>

---------

Co-authored-by: Mousum <mousum@spotle.ai>
Co-authored-by: ayushpatnaikgit <ayushpatnaik@gmail.com>
Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>
Co-authored-by: Ayush Patnaik <u6012645@anu.edu.au>
Co-authored-by: harsharora21 <harsharora1337@gmail.com>
Co-authored-by: Bogumił Kamiński <bkamins@sgh.waw.pl>
7 people authored May 4, 2023
1 parent 9f5276e commit a8016bd
Showing 7 changed files with 679 additions and 237 deletions.
123 changes: 123 additions & 0 deletions docs/src/examples.md
Original file line number Diff line number Diff line change
Expand Up @@ -86,6 +86,129 @@ julia> round.(vcov(ols); digits=5)
0.38889 -0.16667
-0.16667 0.08333
```
By default, the `lm` method uses the Cholesky factorization, which is fast but numerically unstable, especially for ill-conditioned design matrices. The Cholesky method also detects multicollinearity aggressively. You can use the `method` keyword argument to select the more stable QR factorization instead.

```jldoctest
julia> data = DataFrame(X=[1,2,3], Y=[2,4,7]);
julia> ols = lm(@formula(Y ~ X), data; method=:qr)
LinearModel
Y ~ 1 + X
Coefficients:
─────────────────────────────────────────────────────────────────────────
Coef. Std. Error t Pr(>|t|) Lower 95% Upper 95%
─────────────────────────────────────────────────────────────────────────
(Intercept) -0.666667 0.62361 -1.07 0.4788 -8.59038 7.25704
X 2.5 0.288675 8.66 0.0732 -1.16797 6.16797
─────────────────────────────────────────────────────────────────────────
```
The following example shows that the QR decomposition works better for an ill-conditioned design matrix. The model fitted with the QR method is a better fit than the one fitted with the Cholesky method, since its estimated log-likelihood is higher.
Note that the condition number of the design matrix is quite high (≈ 3.52e7).

```
julia> X = [-0.4011512997627107 0.6368622664511552;
-0.0808472925693535 0.12835204623364604;
-0.16931095045225217 0.2687956795496601;
-0.4110745650568839 0.6526163576003452;
-0.4035951747670475 0.6407421349445884;
-0.4649907741370211 0.7382129928076485;
-0.15772708898883683 0.25040532268222715;
-0.38144358562952446 0.6055745630707645;
-0.1012787681395544 0.16078875117643368;
-0.2741403589052255 0.4352214984054432];
julia> y = [4.362866166172215,
0.8792840060172619,
1.8414020451091684,
4.470790758717395,
4.3894454833815395,
5.0571760643993455,
1.7154177874916376,
4.148527704012107,
1.1014936742570425,
2.9815131910316097];
julia> modelqr = lm(X, y; method=:qr)
LinearModel
Coefficients:
────────────────────────────────────────────────────────────────
Coef. Std. Error t Pr(>|t|) Lower 95% Upper 95%
────────────────────────────────────────────────────────────────
x1 5.00389 0.0560164 89.33 <1e-12 4.87472 5.13307
x2 10.0025 0.035284 283.48 <1e-16 9.92109 10.0838
────────────────────────────────────────────────────────────────
julia> modelchol = lm(X, y; method=:cholesky)
LinearModel
Coefficients:
────────────────────────────────────────────────────────────────
Coef. Std. Error t Pr(>|t|) Lower 95% Upper 95%
────────────────────────────────────────────────────────────────
x1 5.17647 0.0849184 60.96 <1e-11 4.98065 5.37229
x2 10.1112 0.053489 189.03 <1e-15 9.98781 10.2345
────────────────────────────────────────────────────────────────
julia> loglikelihood(modelqr) > loglikelihood(modelchol)
true
```
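A condition number like the one quoted above can be checked directly with the standard library's `cond`. Here is a minimal sketch on a synthetic near-collinear matrix (not the matrix from the example above):

```julia
using LinearAlgebra

# Two nearly parallel columns make the matrix ill-conditioned:
# the second column differs from the first only by a tiny perturbation.
X = [1.0 1.0;
     1.0 1.0 + 1e-8;
     1.0 1.0 - 1e-8]

κ = cond(X)   # ratio of the largest to the smallest singular value
```

Cholesky-based solvers work with `X'X`, which roughly squares the condition number, while QR factorizes `X` directly; that is why the QR route is preferred for matrices like this one.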
Since the Cholesky method with `dropcollinear = true` aggressively detects multicollinearity,
if you ever encounter multicollinearity in any GLM model fitted with Cholesky,
it is worth trying the same model with the QR decomposition.
The following example is taken from "Introductory Econometrics: A Modern Approach, 7e" by Jeffrey M. Wooldridge.
The dataset is used to study the relationship between firm size (often measured by annual sales) and spending on
research and development (R&D).
The example shows that for the given model,
the Cholesky method with `dropcollinear=true` detects multicollinearity in the design matrix
and hence does not estimate all parameters, as opposed to QR.

```jldoctest
julia> y = [9.42190647125244, 2.084805727005, 3.9376676082611, 2.61976027488708, 4.04761934280395, 2.15384602546691,
2.66240668296813, 4.39475727081298, 5.74520826339721, 3.59616208076477, 1.54265284538269, 2.59368276596069,
1.80476510524749, 1.69270837306976, 3.04201245307922, 2.18389105796813, 2.73844122886657, 2.88134002685546,
2.46666669845581, 3.80616021156311, 5.12149810791015, 6.80378007888793, 3.73669862747192, 1.21332454681396,
2.54629635810852, 5.1612901687622, 1.86798071861267, 1.21465551853179, 6.31019830703735, 1.02669405937194,
2.50623273849487, 1.5936255455017];
julia> x = [4570.2001953125, 2830, 596.799987792968, 133.600006103515, 42, 390, 93.9000015258789, 907.900024414062,
19773, 39709, 2936.5, 2513.80004882812, 1124.80004882812, 921.599975585937, 2432.60009765625, 6754,
1066.30004882812, 3199.89990234375, 150, 509.700012207031, 1452.69995117187, 8995, 1212.30004882812,
906.599975585937, 2592, 201.5, 2617.80004882812, 502.200012207031, 2824, 292.200012207031, 7621, 1631.5];
julia> rdchem = DataFrame(rdintens=y, sales=x);
julia> mdl = lm(@formula(rdintens ~ sales + sales^2), rdchem; method=:cholesky)
LinearModel
rdintens ~ 1 + sales + :(sales ^ 2)
Coefficients:
───────────────────────────────────────────────────────────────────────────────────────
Coef. Std. Error t Pr(>|t|) Lower 95% Upper 95%
───────────────────────────────────────────────────────────────────────────────────────
(Intercept) 0.0 NaN NaN NaN NaN NaN
sales 0.000852509 0.000156784 5.44 <1e-05 0.000532313 0.00117271
sales ^ 2 -1.97385e-8 4.56287e-9 -4.33 0.0002 -2.90571e-8 -1.04199e-8
───────────────────────────────────────────────────────────────────────────────────────
julia> mdl = lm(@formula(rdintens ~ sales + sales^2), rdchem; method=:qr)
LinearModel
rdintens ~ 1 + sales + :(sales ^ 2)
Coefficients:
─────────────────────────────────────────────────────────────────────────────────
Coef. Std. Error t Pr(>|t|) Lower 95% Upper 95%
─────────────────────────────────────────────────────────────────────────────────
(Intercept) 2.61251 0.429442 6.08 <1e-05 1.73421 3.49082
sales 0.000300571 0.000139295 2.16 0.0394 1.56805e-5 0.000585462
sales ^ 2 -6.94594e-9 3.72614e-9 -1.86 0.0725 -1.45667e-8 6.7487e-10
─────────────────────────────────────────────────────────────────────────────────
```
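When the Cholesky path drops a redundant column, its coefficient is reported as `0.0` with `NaN` statistics. A small sketch of how one might detect that programmatically, using made-up data (assumes GLM.jl and DataFrames.jl are installed):

```julia
using GLM, DataFrames

# x2 is an exact multiple of x1, so the design matrix is rank deficient.
df = DataFrame(x1 = 1.0:5.0)
df.x2 = 2 .* df.x1
df.y  = [3.9, 5.1, 5.9, 7.2, 7.9]

m = lm(@formula(y ~ x1 + x2), df)       # Cholesky with dropcollinear=true
dropped = findall(isnan, stderror(m))   # indices of redundant coefficients
```

If `dropped` is nonempty, refitting the same model with `method=:qr` is a reasonable next step.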


## Probit regression
```jldoctest
Expand Down
10 changes: 10 additions & 0 deletions src/GLM.jl
Expand Up @@ -91,13 +91,23 @@ module GLM
pivoted_cholesky!(A; kwargs...) = cholesky!(A, RowMaximum(); kwargs...)
end

@static if VERSION < v"1.7.0"
pivoted_qr!(A; kwargs...) = qr!(A, Val(true); kwargs...)
else
pivoted_qr!(A; kwargs...) = qr!(A, ColumnNorm(); kwargs...)
end

const COMMON_FIT_KWARGS_DOCS = """
- `dropcollinear::Bool=true`: Controls whether or not a model matrix
    that is less than full rank is accepted.
If `true` (the default) the coefficient for redundant linearly dependent columns is
`0.0` and all associated statistics are set to `NaN`.
Typically from a set of linearly-dependent columns the last ones are identified as redundant
(however, the exact selection of columns identified as redundant is not guaranteed).
- `method::Symbol=:cholesky`: Controls which decomposition method to use.
If `method=:cholesky` (the default), then the `Cholesky` decomposition method will be used.
If `method=:qr`, then the `QR` decomposition method (which is more stable
but slower) will be used.
- `wts::Vector=similar(y,0)`: Prior frequency (a.k.a. case) weights of observations.
Such weights are equivalent to repeating each observation a number of times equal
to its weight. Do note that this interpretation gives equal point estimates but
Expand Down
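The `pivoted_qr!` shim above selects the column-pivoted factorization, whose `R` factor pushes negligible diagonal entries to the end; that ordering is what makes numerical rank detection possible. A quick illustration with the Julia ≥ 1.7 spelling, on a deliberately rank-deficient matrix:

```julia
using LinearAlgebra

# Rank-1 matrix: every column is a multiple of the first.
A = [1.0 2.0 3.0;
     2.0 4.0 6.0;
     3.0 6.0 9.0]

F = qr(A, ColumnNorm())                 # column-pivoted QR
r = count(>(1e-8), abs.(diag(F.R)))     # numerical rank from R's diagonal
```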
28 changes: 24 additions & 4 deletions src/glmfit.jl
Expand Up @@ -286,7 +286,8 @@ function nulldeviance(m::GeneralizedLinearModel)
X = fill(1.0, length(y), hasint ? 1 : 0)
nullm = fit(GeneralizedLinearModel,
X, y, d, r.link; wts=wts, offset=offset,
dropcollinear=isa(m.pp.chol, CholeskyPivoted),
dropcollinear=ispivoted(m.pp),
method=decomposition_method(m.pp),
maxiter=m.maxiter, minstepfac=m.minstepfac,
atol=m.atol, rtol=m.rtol)
dev = deviance(nullm)
Expand Down Expand Up @@ -341,7 +342,8 @@ function nullloglikelihood(m::GeneralizedLinearModel)
X = fill(1.0, length(y), hasint ? 1 : 0)
nullm = fit(GeneralizedLinearModel,
X, y, d, r.link; wts=wts, offset=offset,
dropcollinear=isa(m.pp.chol, CholeskyPivoted),
dropcollinear=ispivoted(m.pp),
method=decomposition_method(m.pp),
maxiter=m.maxiter, minstepfac=m.minstepfac,
atol=m.atol, rtol=m.rtol)
ll = loglikelihood(nullm)
Expand Down Expand Up @@ -559,6 +561,7 @@ function fit(::Type{M},
d::UnivariateDistribution,
l::Link = canonicallink(d);
dropcollinear::Bool = true,
method::Symbol = :cholesky,
dofit::Union{Bool, Nothing} = nothing,
wts::AbstractVector{<:Real} = similar(y, 0),
offset::AbstractVector{<:Real} = similar(y, 0),
Expand All @@ -575,7 +578,15 @@ function fit(::Type{M},
end

rr = GlmResp(y, d, l, offset, wts)
res = M(rr, cholpred(X, dropcollinear), nothing, false)

if method === :cholesky
res = M(rr, cholpred(X, dropcollinear), nothing, false)
elseif method === :qr
res = M(rr, qrpred(X, dropcollinear), nothing, false)
else
throw(ArgumentError("The only supported values for keyword argument `method` are `:cholesky` and `:qr`."))
end

return dofit ? fit!(res; fitargs...) : res
end

Expand All @@ -594,6 +605,7 @@ function fit(::Type{M},
offset::Union{AbstractVector, Nothing} = nothing,
wts::Union{AbstractVector, Nothing} = nothing,
dropcollinear::Bool = true,
method::Symbol = :cholesky,
dofit::Union{Bool, Nothing} = nothing,
contrasts::AbstractDict{Symbol}=Dict{Symbol,Any}(),
fitargs...) where {M<:AbstractGLM}
Expand All @@ -613,7 +625,15 @@ function fit(::Type{M},
off = offset === nothing ? similar(y, 0) : offset
wts = wts === nothing ? similar(y, 0) : wts
rr = GlmResp(y, d, l, off, wts)
res = M(rr, cholpred(X, dropcollinear), f, false)

if method === :cholesky
res = M(rr, cholpred(X, dropcollinear), f, false)
elseif method === :qr
res = M(rr, qrpred(X, dropcollinear), f, false)
else
throw(ArgumentError("The only supported values for keyword argument `method` are `:cholesky` and `:qr`."))
end

return dofit ? fit!(res; fitargs...) : res
end

Expand Down
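With the dispatch above, the same `method` keyword now reaches `glm`. A hedged sketch of the intended call pattern, on toy data (assumes GLM.jl with this patch and DataFrames.jl installed):

```julia
using GLM, DataFrames

df = DataFrame(x = 1.0:10.0,
               y = [1, 0, 2, 1, 3, 2, 4, 5, 4, 6])

# Poisson regression solved through the new QR path; any value other
# than :cholesky or :qr raises an ArgumentError, per the dispatch above.
m = glm(@formula(y ~ x), df, Poisson(); method = :qr)
coef(m)   # intercept and slope on the log scale
```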
