fix AR(1) log likelihood #2285
Conversation
```python
boundary = Normal.dist(0., tau=tau_e).logp
innov_like = Normal.dist(k * x_im1, tau=tau_e).logp(x_i)
return boundary(x[0]) + tt.sum(innov_like) + boundary(x[-1])
```
Yeah, this looks like a bug to me.
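To make the bug concrete, here is a plain NumPy/SciPy transcription of the quoted code (a sketch, not PyMC3 code; `tau_e` is assumed to be the innovation precision). The quoted version adds a `boundary` term for both `x[0]` and `x[-1]`, so the last observation is effectively counted twice:

```python
import numpy as np
from scipy import stats

def ar1_logp_conditional(x, k, tau_e):
    """AR(1) log-likelihood conditional on the first observation:
    sum of Normal(k * x[t-1], tau_e) terms for t >= 1."""
    sd = 1.0 / np.sqrt(tau_e)
    return stats.norm(loc=k * x[:-1], scale=sd).logpdf(x[1:]).sum()

def ar1_logp_buggy(x, k, tau_e):
    """Transcription of the quoted code: marginal terms for BOTH endpoints."""
    sd = 1.0 / np.sqrt(tau_e)
    boundary = stats.norm(loc=0.0, scale=sd).logpdf
    innov = stats.norm(loc=k * x[:-1], scale=sd).logpdf(x[1:]).sum()
    return boundary(x[0]) + innov + boundary(x[-1])

x = np.array([0.5, -0.2, 0.3])
# The two versions differ by exactly the two spurious boundary terms.
diff = ar1_logp_buggy(x, 0.9, 1.0) - ar1_logp_conditional(x, 0.9, 1.0)
```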
```python
def AR1_logpdf(value, k, tau_e):
    return (sp.norm(loc=0, scale=1/np.sqrt(tau_e)).logpdf(value[0]) +
```
PEP 8 wants spaces around math operators.
But yeah, this test is a bit odd as it basically replicates the likelihood. statsmodels is an idea, but we don't want that as a hard dependency; it could be optional. @junpenglao?
I don't think we should use statsmodels in the test. I am thinking more of rewriting our AR1 logp to match the (exact) unconditional maximum likelihood as in statsmodels. We should take this opportunity to extend our AR1 model into AR(n).
- Sorry about the test. I didn't know the best way to write one without introducing additional dependencies.
- As it is written now, the AR(1) likelihood calculation is conditional on $x_0 = 0$, which seems a bit arbitrary to me. It seems more natural to condition on the initial observation.
- You could use the unconditional distribution to calculate the likelihood of the initial observation (as in statsmodels), but this only makes sense if the process is stationary ($|k| < 1$). Maybe something like `init` in GaussianRandomWalk?
- I am trying to write a more general AR RV, but I'm getting a bit tripped up by creating the matrix of lagged observations in Theano.
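For reference, the unconditional variant being discussed can be sketched in plain SciPy: for a stationary zero-mean AR(1) with $|k| < 1$, the marginal of $x_0$ is Normal with variance $\sigma^2 / (1 - k^2)$. This is an illustrative sketch with assumed names, not the PR's code:

```python
import numpy as np
from scipy import stats

def ar1_logp_exact(x, k, tau_e):
    """Exact (unconditional) AR(1) log-likelihood.
    Only valid for a stationary process, i.e. |k| < 1."""
    assert abs(k) < 1, "stationarity required for the unconditional term"
    sd = 1.0 / np.sqrt(tau_e)
    sd0 = sd / np.sqrt(1.0 - k ** 2)   # marginal sd of x[0]
    init = stats.norm(loc=0.0, scale=sd0).logpdf(x[0])
    innov = stats.norm(loc=k * x[:-1], scale=sd).logpdf(x[1:]).sum()
    return init + innov
```

Sanity check: with `k = 0` the process is iid Normal, so the exact likelihood reduces to a sum of independent Normal log-densities.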
@eph Any progress?

@twiecki I think this is waiting for after 3.1rc4.

@junpenglao Definitely, but we can start moving some things into the 3.2 branch when they are done.

Is it ready?

I think we can merge this and improve the test later.

Don't you think we'll forget about the test? Or do we need a test here?

pinging @eph
I have written a more general (p lag) autoregressive model. Should I replace AR1 with this class (after tests are written, of course)? What do you want the tests written against? I know statsmodels was mentioned, but 1) the log-likelihood calculation there is not completely clear to me and 2) I thought we didn't want to keep that dependency.

```python
class AR(distribution.Continuous):
    R"""
    Autoregressive process with p lags.

    .. math::

        x_t = \rho_0 + \rho_1 x_{t-1} + \ldots + \rho_p x_{t-p} + \epsilon_t,
        \epsilon_t \sim N(0, \sigma^2)

    The innovation can be parameterized either in terms of precision
    or standard deviation. The link between the two parametrizations is
    given by

    .. math::

        \tau = \dfrac{1}{\sigma^2}

    Parameters
    ----------
    rho : tensor
        Vector of autoregressive coefficients.
    sd : float
        Standard deviation of innovation (sd > 0).
    tau : float
        Precision of innovation (tau > 0).
    constant : bool (optional, default=False)
        Whether to include a constant.
    """

    def __init__(self, rho, sd=None, tau=None,
                 constant=False, *args, **kwargs):
        super(AR, self).__init__(*args, **kwargs)
        tau, sd = get_tau_sd(tau=tau, sd=sd)
        self.sd = tt.as_tensor_variable(sd)
        self.tau = tt.as_tensor_variable(tau)
        self.mean = tt.as_tensor_variable(0.)
        rho = tt.as_tensor_variable(rho, ndim=1)
        if constant:
            self.p = rho.shape[0] - 1
        else:
            self.p = rho.shape[0]
        # if self.p <= 0:
        #     raise ValueError('too few lags.')
        self.constant = constant
        self.rho = rho

    def logp(self, value):
        y = value[self.p:]
        # Build the matrix of lagged observations: row l holds the lag-l
        # values aligned with y.
        results, _ = scan(lambda l, obs, p: obs[p - l:-l],
                          outputs_info=None,
                          sequences=[tt.arange(1, self.p + 1)],
                          non_sequences=[value, self.p])
        x = tt.stack(results)
        if self.constant:
            y = y - self.rho[0]
            eps = y - self.rho[1:].dot(x)
        else:
            eps = y - self.rho.dot(x)
        innov_like = Normal.dist(mu=0.0, tau=self.tau).logp(eps)
        return tt.sum(innov_like)
```
Yes, general AR would be great. Tests are difficult here; I wouldn't block on it. Perhaps just a regression test with fixed values? Or maybe a simple case where the result can be confirmed by hand?
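One hand-checkable option along those lines: a plain-NumPy stand-in for the conditional AR(p) logp (names here are illustrative, not the class's code), checked against the p = 1 case, which can be verified by hand:

```python
import numpy as np
from scipy import stats

def ar_logp_conditional(x, rho, tau_e):
    """Conditional AR(p) log-likelihood: residuals of x[t] against its
    p lagged values, summed over t = p .. n-1 (no constant term)."""
    rho = np.asarray(rho)
    p = len(rho)
    # column l-1 holds the lag-l values aligned with x[p:]
    lags = np.column_stack([x[p - l:len(x) - l] for l in range(1, p + 1)])
    eps = x[p:] - lags @ rho
    return stats.norm(scale=1.0 / np.sqrt(tau_e)).logpdf(eps).sum()

# With p = 1 and tau_e = 1 this reduces to the AR(1) case:
x = np.array([1.0, 2.0, 3.0])
by_hand = stats.norm.logpdf(2.0 - 0.5 * 1.0) + stats.norm.logpdf(3.0 - 0.5 * 2.0)
```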
There are conflicts @eph
pymc3/distributions/timeseries.py
Outdated
```python
----------
rho : tensor
    Vector of autoregressive coefficients.
```
We usually do not use blank lines between
pymc3/distributions/distribution.py
Outdated
```python
def _repr_latex_(self, name=None, dist=None):
    return None
```
This should not appear.
```python
dtype = 'int64'
if dtype != 'int16' and dtype != 'int64':
    raise TypeError('Discrete classes expect dtype to be int16 or int64.')
```
This should not be deleted.
Hi @eph, could you please edit the PR following the review? We should merge it after the tests pass.
The notebook needs to be refined.
Hi @eph, hope you don't mind that I stepped in to fix the conflict and the notebook. As for this PR, how should we proceed? The test should be further refined, and the general AR logp could be improved as well (without using
Thanks for the contribution @eph!
Hi -- sorry for dropping the ball at the end of this PR. Thanks for merging it in! My hope is to implement a vector autoregression next. How would I build the lag matrix without looping (
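In plain NumPy the lag matrix can be built without an explicit loop via a sliding window (requires NumPy >= 1.20 for `sliding_window_view`); whether the same trick maps cleanly onto Theano is a separate question, but it shows the shape of the computation:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def lag_matrix(x, p):
    """Row t-p holds [x[t-1], x[t-2], ..., x[t-p]] for t = p .. n-1,
    built from a strided view instead of a Python loop."""
    # windows[i] = x[i:i+p]; drop the last window, which ends at x[n-1]
    windows = sliding_window_view(x, p)[:-1]
    # reverse columns so lag 1 comes first
    return windows[:, ::-1]

out = lag_matrix(np.arange(6.0), 2)   # shape (4, 2)
```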
Fixes #2284
I've never contributed to an open source project, so I'm not sure if a pull request is overkill here.