Questions regarding parallel implementation of BVAR #76

Open

thestockman27 opened this issue May 16, 2022 · 4 comments
Labels
enhancement New feature or request

Comments

thestockman27 commented May 16, 2022

Hi there,

I am wondering whether any work has been done on a function that, rather than using parallelization to construct several BVAR models, distributes the tasks of a single model across multiple clusters -- that is, distributing the MCMC sampling for each variable or hyperparameter. Is this feasible?

I am running several large back-tests and have found that the MCMCs are simply too time-consuming for my current modelling effort, even at just 10k draws. I have even built out a connection to Spark, but it was still too time-consuming and became quite costly. (Yes, I recognize that this may more properly serve as an indictment of the scale of my back-test process. Duly noted! I am considering refinement.)

Alternatively, would it be possible to bypass the MCMC process and instead derive the posterior density analytically? I believe SAS's PROC VARMAX does just this.

Just spit-balling here...Thank you for all your work on this package!

nk027 (Owner) commented May 17, 2022

Hey @thestockman27,
Great to hear that you're working with the package!

First, if you want the full posterior (of this type of VAR), you will need some kind of sampling approach. Of course, there are other natural-conjugate models for which the full posterior is available in closed form (let me know if that's what you need). If you want point estimators, there are many approximate approaches that could help (e.g. INLA, variational Bayes), but they're out of scope for the package. However, if point estimates of the AR coefficients are what you need, the starting draws may be enough for you, since the initial hyper-parameters are essentially set via maximum likelihood.
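
In case it helps, here is a rough sketch of pulling those maximum-likelihood starting values out of a short run. The element names "optim" and "hyper" are assumptions about the returned object, so check str() against your installed version of BVAR:

library("BVAR")

set.seed(42)
Y <- matrix(rnorm(200 * 3), 200, 3)   # placeholder data -- substitute your own series
colnames(Y) <- paste0("y", 1:3)

# A short run is enough if only the optimisation step is of interest
run <- bvar(Y, lags = 2, n_draw = 1000L, n_burn = 200L, verbose = FALSE)

run$optim$par   # hyper-parameter values from the quasi-Newton optimisation (assumed element)
head(run$hyper) # sampled hyper-parameter draws, for comparison (assumed element)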

Regarding the parallelisation -- there already is some in the linear algebra operations (if the BLAS supports it), but MCMC is a fundamentally sequential algorithm where step i (for all variables) depends on step i-1 and the variables depend on each other, so there's little room for parallelisation. However, if you have many (under-used) cores, you can run parallel chains and aggregate the draws (see the sketch after the list below)! In that case, you can keep the number of burn-ins to a minimum (that shouldn't be too bad -- a high number of burns is just very conservative). Two small notes to myself on that -- it would be nice to have:

  • a function to merge multiple draws from a parallel run
  • the option of setting other starting values for the hyper-parameters (it would be nice to spawn parallel processes after the chain has converged)
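
Something along these lines could already work today. This is an untested sketch -- again, the "hyper" element name is an assumption, and mclapply() forks, so use parLapply() with a cluster on Windows:

library("BVAR")
library("parallel")

set.seed(1)
Y <- matrix(rnorm(200 * 3), 200, 3)   # placeholder data -- substitute your own series
colnames(Y) <- paste0("y", 1:3)

# Fit one chain with its own seed; burn-ins can stay low since the
# optimisation already starts the chain near the mode
fit_chain <- function(seed, data) {
  set.seed(seed)
  bvar(data, lags = 2, n_draw = 5000L, n_burn = 500L, verbose = FALSE)
}

chains <- mclapply(1:4, fit_chain, data = Y, mc.cores = 4L)

# Pool the hyper-parameter draws by stacking the per-chain draw matrices
hyper_pooled <- do.call(rbind, lapply(chains, `[[`, "hyper"))
dim(hyper_pooled)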

Best,

@nk027 nk027 added the enhancement New feature or request label May 17, 2022
thestockman27 (Author) commented

Thanks for the quick and thorough response.

Yes, I think point estimates for the AR coefficients would satisfy my immediate use case. I believe this occurs within bvar() via bv_ml() and calc_logml(), and that the initial values of the hyperparameters are generated via rmvn_proposal(). I'll admit that much of this calculation is beyond my current level of comprehension...

How would I then override the sampling, perhaps by setting ml_store and hyper_store equal to the initial draws of ml_draw() and hyper_draw()?

Merging chains run in parallel sounds like a neat idea.

nk027 (Owner) commented May 18, 2022

So the most straightforward way would be to just use a standard Normal Inverse-Wishart prior, like so

N <- 100
P <- 4

# Coefficients ---
beta <- matrix(rgamma(P^2, 1, 1), P)
# Scale them so the VAR is stationary (spectral radius below one)
beta <- beta * 0.95 / max(abs(eigen(beta)$values))

# VAR data ---
D <- matrix(NA_real_, N, P)
D[1, ] <- rnorm(P)
for(i in seq(2, N)) {
  D[i, ] <- D[i - 1, ] %*% beta + rnorm(P)
}

X <- cbind(1, BVAR:::lag_var(D, lags = 1L)[-1, ])
Y <- D[-1, ]
plot.ts(Y)

# Priors ---
beta_p0 <- diag(P + 1) * 1e-8
beta_mu0 <- matrix(0, P + 1, P)
sigma_d0 <- 0.1

# Beta ---
# Posterior precision (inverse of variance)
beta_p1 <- crossprod(X) + beta_p0
# Posterior mean -- weighted combination of data and prior mean
beta_mu1 <- solve(beta_p1, (crossprod(X, Y) + beta_p0 %*% beta_mu0))

# Sigma ---
# Posterior shape
sigma_c1 <- 1 + N / 2
# Posterior rate
res <- Y - X %*% beta_mu1
sigma_d1 <- sigma_d0 + crossprod(res) / 2
# Posterior mean of sigma (inverse-gamma mean: rate / (shape - 1))
sigma_d1 / (sigma_c1 - 1)
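
Just to illustrate how the point estimate would be used -- a one-step-ahead forecast from the posterior mean above (a quick sketch with the objects already defined, nothing package-specific):

# Forecast the next period using the posterior mean of the coefficients
x_last <- c(1, D[nrow(D), ])   # constant plus the most recent observation
y_hat <- x_last %*% beta_mu1   # 1 x P vector of point forecasts
y_hat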

If you absolutely want to use the GLP model in an empirical Bayes fashion, you can just follow the steps up to the optimisation (line #241) and then use the output of bv_ml() with those hyper-parameters -- that should contain all you need to calculate the point estimates.

Best,

thestockman27 (Author) commented

Hi there, hope you have had a nice summer.

I was able to implement your suggestion for point estimates using the initial draws calculated by optim. Seen here.

I have noticed that adjusting the seed via set.seed() significantly impacts the results of my back-testing. It would seem that the quasi-Newton optimization does not reach consistent estimates of the ML.

I am considering following the methodology of SAS's VARMAX procedure, which uses the Kalman filter to calculate the exact ML.

I suppose I just wanted to run this by you -- any thoughts? I would appreciate any suggestions or criticisms that could better guide me.
