Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a flag to LKJCorr to return the unpacked correlation matrix #7100

Merged
merged 2 commits into from
Jan 23, 2024

Conversation

velochy
Copy link
Contributor

@velochy velochy commented Jan 11, 2024

Description

Provide a wrapper around previous LKJCorr that reshapes the output to a full correllation matrix instead of the somewhat awkward vector of upper-triangular values.
This is based on the discussion with @ricardoV94 and @jessegrabowski in https://discourse.pymc.io/t/using-lkjcorr-together-with-mvnormal/13606/27 .

Related Issue

Checklist

Type of change

  • New feature / enhancement
  • Bug fix
  • Documentation
  • Maintenance
  • Other (please specify):

📚 Documentation preview 📚: https://pymc--7100.org.readthedocs.build/en/7100/

Copy link

welcome bot commented Jan 11, 2024

Thank You Banner
💖 Thanks for opening this pull request! 💖 The PyMC community really appreciates your time and effort to contribute to the project. Please make sure you have read our Contributing Guidelines and filled in our pull request template to the best of your ability.

sd = pm.Exponential("sd", 1.0, shape=3)

corr = pm.LKJCorr("corr", n=3, eta=2, return_matrix=True)
# pylint: enable=unpacking-non-sequence
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# pylint: enable=unpacking-non-sequence

mv = pm.MvNormal("mv", mu, cov=sd * (sd * corr).T, size=4)
prior = pm.sample_prior_predictive(samples=10, return_inferencedata=False)

assert prior["mv"].shape == (10, 4, 3)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to directly assert the values of corr are correct

return _LKJCorr(name, eta=eta, n=n, **kwargs)
else:
c_vec = _LKJCorr(name + "_raw", eta=eta, n=n, **kwargs)
return pm.Deterministic(name, cls.vec_to_corr_mat(c_vec, n))
Copy link
Member

@ricardoV94 ricardoV94 Jan 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about the name change and wrapping in a Deterministic by default, but not completely against it either

Copy link
Member

@ricardoV94 ricardoV94 Jan 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The argument for having the packed form is that it saves memory. This actually more than doubles it because we save both the packed form and the dense one. We could leave to the users to wrap in a Deterministic if they need to have it after sampling.

OTOH this is similar to what LKJ does by default now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe check the discussion #6828 about setting internal deterministics/coords?

Reminds me I need to finish that PR

def lkjcorr_default_transform(op, rv):
return MultivariateIntervalTransform(floatX(-1.0), floatX(1.0))


# Thin wrapper around _LKJCorr
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments like this have a tendency to get left behind. I would exclude or add it inside the class definition

Suggested change
# Thin wrapper around _LKJCorr

@ricardoV94
Copy link
Member

Thanks for picking this up @velochy!

@ricardoV94 ricardoV94 changed the title Add a flag to LKJCorr to let it return fully formed correllation matrix Add a flag to LKJCorr to let it return fully formed correlation matrix Jan 11, 2024
@ricardoV94 ricardoV94 changed the title Add a flag to LKJCorr to let it return fully formed correlation matrix Add a flag to LKJCorr to let it return unpacked correlation matrix Jan 11, 2024
@ricardoV94 ricardoV94 changed the title Add a flag to LKJCorr to let it return unpacked correlation matrix Add a flag to LKJCorr to return unpacked correlation matrix Jan 11, 2024
@ricardoV94 ricardoV94 changed the title Add a flag to LKJCorr to return unpacked correlation matrix Add a flag to LKJCorr to return the unpacked correlation matrix Jan 11, 2024
Copy link

codecov bot commented Jan 11, 2024

Codecov Report

Attention: 4 lines in your changes are missing coverage. Please review.

Comparison is base (7bb2ccd) 92.21% compared to head (ab63678) 92.21%.
Report is 3 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #7100   +/-   ##
=======================================
  Coverage   92.21%   92.21%           
=======================================
  Files         101      101           
  Lines       16912    16901   -11     
=======================================
- Hits        15595    15586    -9     
+ Misses       1317     1315    -2     
Files Coverage Δ
pymc/distributions/multivariate.py 93.80% <77.77%> (+0.26%) ⬆️

... and 52 files with indirect coverage changes

implies a uniform distribution of the correlation matrices;
larger values put more weight on matrices with few correlations.
return_matrix : bool, default=False
If True, returns the full correllation matrix.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
If True, returns the full correllation matrix.
If True, returns the full correlation matrix.

@velochy velochy force-pushed the lkj_corr_return_matrix branch from 9a64569 to 98a7f48 Compare January 12, 2024 06:38
@velochy
Copy link
Contributor Author

velochy commented Jan 12, 2024

Made changes corresponding to all the comments so far.

Copy link
Member

@ricardoV94 ricardoV94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good! Left some minor suggestions


@classmethod
def dist(cls, n, eta, *, return_matrix=True, **kwargs):
# compute Cholesky decomposition
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# compute Cholesky decomposition


def __new__(cls, name, n, eta, *, return_matrix=False, **kwargs):
if not return_matrix:
return _LKJCorr(name, eta=eta, n=n, **kwargs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can just call once and then branch in the return now that they're the same

return_matrix : bool, default=False
If True, returns the full correlation matrix.
False only returns the values of the upper triangular matrix excluding
diagonal in a single vector of length n(n-1)/2 for backwards compatibility
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
diagonal in a single vector of length n(n-1)/2 for backwards compatibility
diagonal in a single vector of length n(n-1)/2 for memory efficiency

@velochy velochy force-pushed the lkj_corr_return_matrix branch from 98a7f48 to ab63678 Compare January 12, 2024 10:43
Copy link
Member

@ricardoV94 ricardoV94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks like a great improvement in UX

@ricardoV94 ricardoV94 merged commit ef1f91f into pymc-devs:main Jan 23, 2024
22 checks passed
Copy link

welcome bot commented Jan 23, 2024

Congratulations Banner]
Congrats on merging your first pull request! 🎉 We here at PyMC are proud of you! 💖 Thank you so much for your contribution 🎁

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: LKJCorr should return matrix rather than vector
4 participants