Add sampling with dense mass matrix #9

Closed
jdehning opened this issue Apr 17, 2020 · 14 comments
Labels
enhancement New feature or request

Comments

@jdehning
Member

For better performance, it could be useful to sample with a full (dense) mass matrix instead of only its diagonal. The newest, not yet released version of PyMC3 (3.9) offers it as an option in pm.sample, but the functions it needs are already available in earlier releases.

The goal would be to implement dense mass matrix sampling that works in PyMC3 3.7 and 3.8 and to test whether it performs better (higher effective sample size).
References to get started:
pymc-devs/pymc#3596
pymc-devs/pymc#3845
https://dfm.io/posts/pymc3-mass-matrix/
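
As background for the proposal above, the difference between the two adaptations can be sketched in plain Python. This is not PyMC3 code and makes no claim about its internals; the helper names `covariance_matrix` and `diagonal_only` and the toy draws are assumptions made up for illustration. A diagonal adaptation only uses the per-variable variances estimated from tuning draws, while a dense one also uses the covariances between variables:

```python
import random


def covariance_matrix(draws):
    """Sample covariance of a list of equal-length parameter vectors."""
    n, dim = len(draws), len(draws[0])
    means = [sum(d[i] for d in draws) / n for i in range(dim)]
    return [
        [
            sum((d[i] - means[i]) * (d[j] - means[j]) for d in draws) / (n - 1)
            for j in range(dim)
        ]
        for i in range(dim)
    ]


def diagonal_only(cov):
    """Keep the variances and drop every covariance."""
    dim = len(cov)
    return [[cov[i][j] if i == j else 0.0 for j in range(dim)] for i in range(dim)]


# Toy "tuning draws" (made up): y depends strongly on x.
rng = random.Random(0)
draws = []
for _ in range(2000):
    x = rng.gauss(0.0, 1.0)
    y = 0.9 * x + rng.gauss(0.0, 0.3)
    draws.append((x, y))

dense = covariance_matrix(draws)
diag = diagonal_only(dense)
print("dense:", dense)
print("diag: ", diag)
```

The off-diagonal entry of `dense` is the information a diagonal matrix throws away. If it is close to zero for all pairs of variables, both choices describe the posterior about equally well.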

@jdehning jdehning added the enhancement New feature or request label Apr 17, 2020
@jdehning
Member Author

The way to go would be:

  • Make a new example notebook, starting as a copy of example_one_bundesland.ipynb.
  • Test whether sampling with a dense mass matrix is better. The effective sample size is the relevant statistic (pm.stats.ess).
  • If yes, test it for example_bundeslaender.ipynb.
  • If it is also better there, change the example_notebooks to use the full dense matrix.
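
To make the statistic in the list above concrete, here is a deliberately simplified single-chain estimate of the effective sample size. It is only a sketch: `simple_ess` and `autocorr` are hypothetical helpers, and the truncation rule (stop at the first non-positive autocorrelation) is an assumption chosen for brevity, so the numbers will not match what pm.stats.ess reports.

```python
import random


def autocorr(x, lag):
    """Sample autocorrelation of the sequence x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag)) / n
    return cov / var


def simple_ess(x, max_lag=200):
    """n / (1 + 2 * sum of autocorrelations), truncated at the first rho <= 0."""
    n = len(x)
    tau = 1.0
    for lag in range(1, min(max_lag, n - 1) + 1):
        rho = autocorr(x, lag)
        if rho <= 0.0:
            break
        tau += 2.0 * rho
    return n / tau


rng = random.Random(1)
n = 2000
# Independent draws: ESS should be close to n.
iid = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Strongly autocorrelated AR(1) chain: ESS should be far below n.
ar1 = [0.0]
for _ in range(n - 1):
    ar1.append(0.9 * ar1[-1] + rng.gauss(0.0, 1.0))

ess_iid = simple_ess(iid)
ess_ar1 = simple_ess(ar1)
print(f"iid: {ess_iid:.0f} of {n}, AR(1): {ess_ar1:.0f} of {n}")
```

A sampler that produces less autocorrelated draws gives a higher ESS for the same number of samples, which is why it is the number to compare between the diagonal and the dense run.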

@emilIftekhar
Collaborator

I will try that.

@jdehning
Member Author

Perfect

@emilIftekhar
Collaborator

I tried it in the one_bundesland notebook in my fork: https://github.com/emilIftekhar/covid19_inference
One can open the netCDF output files with xarray.open_dataset()

@jdehning
Member Author

Thanks. Could you make some plots to compare the ESS of the variables?
Perhaps a bar plot for every variable, or at least for the ones that aren't near 500 samples.
As it is, they are very difficult to compare...
Otherwise, it would perhaps be most reasonable to first test the latest master of pymc3, since they have probably improved on the version published in the blog post.
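
A rough idea of the comparison asked for here, as a text-only sketch that needs no plotting library. The function names, the 90% cutoff for "near 500", and above all the ESS numbers are made-up placeholders, not results from the notebooks; real values would come from the two sampling runs.

```python
def ess_comparison(ess_diag, ess_dense, n_draws=500, near=0.9):
    """Rows (name, ess_diag, ess_dense) for variables whose ESS in either
    run is below near * n_draws, worst variable first."""
    rows = []
    for name in sorted(set(ess_diag) & set(ess_dense)):
        a, b = ess_diag[name], ess_dense[name]
        if min(a, b) < near * n_draws:
            rows.append((name, a, b))
    rows.sort(key=lambda row: min(row[1], row[2]))
    return rows


def text_bars(rows, n_draws=500, width=40):
    """One bar per variable and run, scaled so that n_draws fills the width."""
    lines = []
    for name, a, b in rows:
        for label, value in (("diag", a), ("dense", b)):
            bar = "#" * round(width * min(value, n_draws) / n_draws)
            lines.append(f"{name:>10} {label:<5} {bar} {value:.0f}")
    return "\n".join(lines)


# Placeholder numbers for illustration only.
ess_diag = {"var_a": 480, "var_b": 120, "var_c": 300}
ess_dense = {"var_a": 490, "var_b": 250, "var_c": 280}

rows = ess_comparison(ess_diag, ess_dense)
print(text_bars(rows))
```

Variables that already sit near the number of draws in both runs are left out, so only the ones where the mass matrix could make a difference remain.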

@emilIftekhar
Collaborator

How urgently do you need it? If it is OK, I would first tackle some of my other tasks today and probably get back to this issue tomorrow.

@jdehning
Member Author

No, it isn't that urgent. And yes, this issue takes a bit of time to get right.

@emilIftekhar
Collaborator

To try the module from the master repo, I created a new environment on my computer, cloned the repo, and installed it. But now my Jupyter notebook has trouble importing from pymc3. Do you know what could have gone wrong?

@jdehning
Member Author

jdehning commented Apr 25, 2020 via email

@emilIftekhar
Collaborator

ImportError: cannot import name 'Model' from 'pymc3' (unknown location)

Where is my new pymc3 folder supposed to be? Maybe that is the problem. I cloned it into the main covid19_inference directory.
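
One possible explanation, stated as an assumption since the thread does not confirm it: if the cloned repository folder is itself named pymc3 and sits in the notebook's working directory, Python may pick up that plain folder as an empty namespace package instead of the installed library. The "(unknown location)" part of the message is what Python prints when the module it found has no file of its own. A small stdlib-only check, with `describe_import` being a hypothetical helper name:

```python
import importlib.util


def describe_import(name):
    """Report where `import name` would resolve to, without importing it."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return f"{name}: not found on sys.path"
    if spec.origin is None:
        # A directory without __init__.py is treated as a namespace package.
        locations = list(spec.submodule_search_locations or [])
        return f"{name}: namespace package (plain directory) at {locations}"
    return f"{name}: regular module loaded from {spec.origin}"


print(describe_import("pymc3"))
```

If this reports a namespace package pointing at the clone, starting the notebook from a different directory or installing the clone into the environment should make the real package visible again.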

@jdehning
Member Author

@emilIftekhar
Collaborator

OK, I reinstalled it, but the full mass matrix option does not seem to work yet.
It gives me a LinAlgError.

`~/anaconda3/envs/githubPymc3/lib/python3.8/site-packages/scipy/linalg/decomp_cholesky.py in _cholesky(a, lower, overwrite_a, clean, check_finite)
37 c, info = potrf(a1, lower=lower, overwrite_a=overwrite_a, clean=clean)
38 if info > 0:
---> 39 raise LinAlgError("%d-th leading minor of the array is not positive "
40 "definite" % info)
41 if info < 0:

LinAlgError: 2-th leading minor of the array is not positive definite`

Should I get back to doing it manually or do you think it is worthwhile to keep trying it with the new module?
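
For context on what the message means: a Cholesky factorization only exists for a positive definite matrix, and the number in the error is the size of the first leading block where the factorization breaks down. The pure-Python sketch below reproduces the same kind of failure on a toy 2x2 matrix. It is not the SciPy or PyMC3 code path; `cholesky` and `add_jitter` are hypothetical helpers, and adding a small value to the diagonal is a generic workaround shown as an assumption, not something that was tried in this thread.

```python
import math


def cholesky(a):
    """Lower-triangular L with L @ L.T == a. Raises ValueError naming the
    first (1-based) leading minor that is not positive definite."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = a[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError(
                f"{j + 1}-th leading minor of the array is not positive definite"
            )
        L[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = (a[i][j] - s) / L[j][j]
    return L


def add_jitter(a, eps):
    """Return a copy of a with eps added to every diagonal entry."""
    n = len(a)
    return [[a[i][j] + (eps if i == j else 0.0) for j in range(n)] for i in range(n)]


# Toy covariance of two perfectly correlated variables: singular.
bad = [[1.0, 1.0], [1.0, 1.0]]

try:
    cholesky(bad)
except ValueError as err:
    print("failed:", err)

print("with jitter:", cholesky(add_jitter(bad, 1e-6)))
```

A covariance estimate can end up like `bad` when two variables are (almost) perfectly correlated or when the draws it was estimated from contain non-finite or degenerate values, so the error by itself does not say whether the model or the adaptation is at fault.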

@jdehning
Member Author

Hmm, this error could also be due to our model, e.g. some gradient that can't be calculated. You could first check whether a model without change points works. Those are the tricky bits.

@jdehning
Member Author

With the new make_I_prior, the correlation between variables in the models is generally relatively low. As such, one wouldn't gain much from a dense matrix. Closing it for now.
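
The reasoning behind this decision can be turned into a quick check: convert a posterior covariance estimate into correlations and look at the largest off-diagonal value. This is only a sketch; `max_abs_correlation` is a hypothetical helper and both matrices are toy inputs, not values taken from the models.

```python
import math


def max_abs_correlation(cov):
    """Largest absolute pairwise correlation implied by a covariance matrix."""
    dim = len(cov)
    worst = 0.0
    for i in range(dim):
        for j in range(i + 1, dim):
            corr = cov[i][j] / math.sqrt(cov[i][i] * cov[j][j])
            worst = max(worst, abs(corr))
    return worst


# Toy covariance matrices (made up for illustration).
weak = [[1.0, 0.05], [0.05, 2.0]]
strong = [[1.0, 0.9], [0.9, 0.9]]

print(f"weakly correlated:   {max_abs_correlation(weak):.3f}")
print(f"strongly correlated: {max_abs_correlation(strong):.3f}")
```

When the largest value is small, as in the first case, a diagonal matrix already captures nearly everything a dense one would, so the extra cost of the dense adaptation buys little.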
