Parallel Adaptive Nuts #3033

SteveBronder · 2021-03-24T23:03:18Z

Submission Checklist

Run unit tests: ./runTests.py src/test/unit
Run cpplint: make cpplint
Declare copyright holder and open-source license: see below

Summary

Okay so this PR has the cmdstan facing version of hmc_nuts_diag_e_adapt() (and I plan to also do dense in this PR). The signature differs from the original hmc_nuts_diag_e_adapt() in that init, init_inv_metric, sample_writer, and diagnostic_writer are std::vector<>s of what the normal signature takes.

My suggestion is that for cmdstan we should have all of these be std::vectors when they are created. Then for the algorithms that are not parallel yet we set vec_argument[0] as the input argument for the algorithm. That will let us get diag/dense metric adaptive nuts into cmdstan and then start adding parallel versions of all the other algorithms. Does that work for everyone?
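A minimal sketch of the idea, using hypothetical stand-in types (ToyWriter, run_single_chain, run_multi_chain, dispatch) rather than the real Stan services signatures: the caller always builds vectors, a parallel-ready algorithm consumes the whole vector, and an algorithm that is not parallel yet only receives element 0.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a per-chain writer; not a Stan type.
struct ToyWriter {
  std::vector<std::string> lines;
  void write(const std::string& s) { lines.push_back(s); }
};

// An algorithm that is not parallel yet: takes a single writer.
inline int run_single_chain(ToyWriter& writer) {
  writer.write("draws for the only chain");
  return 0;
}

// A parallel-ready algorithm: takes one writer per chain.
inline int run_multi_chain(std::vector<ToyWriter>& writers) {
  for (std::size_t i = 0; i < writers.size(); ++i) {
    writers[i].write("draws for chain " + std::to_string(i));
  }
  return 0;
}

// Caller side: arguments are always created as vectors.
inline int dispatch(bool algorithm_is_parallel,
                    std::vector<ToyWriter>& writers) {
  assert(!writers.empty());
  if (algorithm_is_parallel) {
    return run_multi_chain(writers);
  }
  // Not parallel yet: pass vec_argument[0] as the input argument.
  return run_single_chain(writers[0]);
}
```

The point of the sketch is only that the calling code keeps one shape (vectors everywhere) while each algorithm decides how much of the vector it uses.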

Intended Effect

Allow multiple chains to be invoked for adaptive nuts with diag/dense metrics

How to Verify

Test added for parallel diag e adapt

STAN_NUM_THREADS=4 ./runTests.py ./src/test/unit/services/sample/hmc_nuts_diag_e_adapt_parallel_test.cpp

Side Effects

Documentation

Still need to add docs

Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): Steve Bronder

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

stan-buildbot · 2021-03-25T05:03:23Z

Name	Old Result	New Result	Ratio	Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan	3.33	3.43	0.97	-2.83% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan	0.02	0.02	1.0	0.0% slower
eight_schools/eight_schools.stan	0.11	0.12	0.98	-1.96% slower
gp_regr/gp_regr.stan	0.16	0.16	0.98	-1.82% slower
irt_2pl/irt_2pl.stan	5.44	5.4	1.01	0.68% faster
performance.compilation	91.22	89.08	1.02	2.35% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan	8.6	8.6	1.0	-0.03% slower
pkpd/one_comp_mm_elim_abs.stan	30.02	30.8	0.97	-2.6% slower
sir/sir.stan	122.21	123.24	0.99	-0.84% slower
gp_regr/gen_gp_data.stan	0.04	0.04	1.0	-0.03% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan	3.24	3.02	1.07	6.86% faster
pkpd/sim_one_comp_mm_elim_abs.stan	0.37	0.39	0.96	-3.7% slower
arK/arK.stan	2.02	1.87	1.08	7.7% faster
arma/arma.stan	0.95	0.65	1.47	31.86% faster
garch/garch.stan	0.53	0.58	0.9	-10.89% slower
Mean result: 1.02817362089

Jenkins Console Log
Blue Ocean
Commit hash: 5055e21

Machine information

ProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

bbbales2

Comments. I don't like having the 1 chain/n-chain code paths be separate.

src/stan/services/util/create_unit_e_dense_inv_metric.hpp

bbbales2 · 2021-03-25T13:46:05Z

src/stan/services/sample/hmc_nuts_diag_e_adapt.hpp

-    Model& model, const stan::io::var_context& init,
-    const stan::io::var_context& init_inv_metric, unsigned int random_seed,
+    Model& model, const InitContext& init,
+    const InitMetricContext& init_inv_metric, unsigned int random_seed,


Are the types changing here? I don't see an advantage to templating this. Now the template hides even the base polymorphic type.

Ah they are! Notice that now when an initial inverse metric is not given we actually return a stan::io::dump (or a vector of them). This is nice for the compiler because now it knows the real type coming in, so devirtualization happens a lot more easily.

I don't like it. It's so hard to read code when the type info disappears like here. I doubt the virtual function calls here are killing our performance. Both of these things are used once at initialization and that is it.

Is a var_context more informative? It's just an abstract base class so you still need to go look at the callee to see what's happening when this object's member functions are called. What if I included in the docs that this is either going to be a stan::io::dump or stan::io::json_context etc.?

Is a var_context more informative

Well it's still easier to search for (you can find that it's an abstract base class, and then you can find what inherits from it and whatnot) and this is how everything is kinda done in the rest of services.

I think the reason we'd go templating here is if we were doing something we couldn't just solve with the existing polymorphism, or it's just easier to code the templated thing.

If that's the case, let's do templating. If that's not the case, let's still do the templating if you really want to. Not the biggest deal but I prefer the existing stuff.
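For readers following the trade-off in this thread, here is a small sketch with hypothetical types (ToyContext and ToyDump, not the real stan::io classes). Both functions accept the same object. The first names the abstract base in its signature; the second is handed the concrete type, which is what can make devirtualization easier for the compiler, at the cost of a signature that no longer says what the argument must provide.

```cpp
#include <cassert>
#include <string>

// Hypothetical abstract base, playing the role var_context plays above.
struct ToyContext {
  virtual ~ToyContext() = default;
  virtual std::string name() const = 0;
};

// Hypothetical concrete type, playing the role of a dump or JSON context.
struct ToyDump final : ToyContext {
  std::string name() const override { return "dump"; }
};

// Polymorphic version: the signature documents the interface, and
// name() is in general resolved through virtual dispatch.
inline std::string describe_virtual(const ToyContext& init) {
  return "init=" + init.name();
}

// Templated version: the concrete type is visible at compile time, but
// the signature alone does not say which members InitContext needs.
template <typename InitContext>
inline std::string describe_templated(const InitContext& init) {
  return "init=" + init.name();
}
```

Both calls give the same result for a ToyDump; the difference is only in what the signature tells a reader and what the compiler can see.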

bbbales2 · 2021-03-25T13:48:37Z

src/stan/services/sample/hmc_nuts_diag_e_adapt.hpp

+    } catch (const std::domain_error& e) {
+      return error_codes::CONFIG;
+    }
+    util::run_adaptive_sampler(samplers, model, cont_vectors, num_warmup,


Similarly to the other code, I don't like how the 1 chain and N chain implementations are separate. It seems like they could be the same code and the 1 chain thing would just have 1 chain and that's fine.

We have two right now mostly because of cmdstan. Once we add the parallel version to cmdstan we can remove the 1 chain version.

SteveBronder · 2021-03-25T22:36:41Z

@bbbales2 I think this PR should have everything in it that we need to do the cmdstan stuff. Once we think the hmc_nuts_diag_e_adapt version looks good I can pretty easily update the other samplers and add tests for them

stan-buildbot · 2021-07-01T22:22:33Z

Name	Old Result	New Result	Ratio	Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan	3.13	3.16	0.99	-0.91% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan	0.02	0.02	0.97	-3.02% slower
eight_schools/eight_schools.stan	0.12	0.11	1.01	0.96% faster
gp_regr/gp_regr.stan	0.16	0.16	0.97	-2.92% slower
irt_2pl/irt_2pl.stan	5.89	5.96	0.99	-1.06% slower
performance.compilation	87.48	86.99	1.01	0.57% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan	8.62	8.53	1.01	1.04% faster
pkpd/one_comp_mm_elim_abs.stan	29.23	30.63	0.95	-4.81% slower
sir/sir.stan	128.64	143.65	0.9	-11.67% slower
gp_regr/gen_gp_data.stan	0.04	0.04	1.01	0.81% faster
low_dim_gauss_mix/low_dim_gauss_mix.stan	3.02	2.98	1.01	1.43% faster
pkpd/sim_one_comp_mm_elim_abs.stan	0.4	0.4	0.99	-0.76% slower
arK/arK.stan	2.56	1.87	1.37	26.85% faster
arma/arma.stan	0.64	0.93	0.69	-43.98% slower
garch/garch.stan	0.64	0.64	1.0	0.47% faster
Mean result: 0.991987978995

Jenkins Console Log
Blue Ocean
Commit hash: c50cf1a

stan-buildbot · 2021-07-02T22:52:00Z

Name	Old Result	New Result	Ratio	Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan	3.09	3.09	1.0	0.02% faster
low_dim_corr_gauss/low_dim_corr_gauss.stan	0.02	0.02	1.0	-0.45% slower
eight_schools/eight_schools.stan	0.11	0.11	1.0	0.13% faster
gp_regr/gp_regr.stan	0.16	0.16	1.01	0.66% faster
irt_2pl/irt_2pl.stan	5.93	5.87	1.01	1.06% faster
performance.compilation	89.16	86.73	1.03	2.73% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan	9.06	8.56	1.06	5.51% faster
pkpd/one_comp_mm_elim_abs.stan	30.01	29.83	1.01	0.59% faster
sir/sir.stan	128.31	135.96	0.94	-5.96% slower
gp_regr/gen_gp_data.stan	0.03	0.03	0.98	-1.7% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan	3.0	3.01	1.0	-0.35% slower
pkpd/sim_one_comp_mm_elim_abs.stan	0.39	0.39	1.01	0.5% faster
arK/arK.stan	2.53	1.88	1.35	25.89% faster
arma/arma.stan	0.64	0.92	0.69	-44.29% slower
garch/garch.stan	0.63	0.63	1.0	-0.1% slower
Mean result: 1.00509786129

Jenkins Console Log
Blue Ocean
Commit hash: 91d51e1

Wds15 taking over

wds15 · 2021-07-06T18:58:18Z

The ball is in your court, right?

SteveBronder · 2021-07-06T19:53:56Z

Lol oh I was waiting on you, is there something in the review I missed?

wds15 · 2021-07-06T20:18:59Z

Sorry.. no, you did not miss things... I missed your update. The last thing to sort out is the chain id output labeling. See above.

Once we got that, then I need to check that pre-compilation of the services is still all doing its job. Quick to do by just timing the bernoulli example compile time.

Then we are good from my view.

stan-buildbot · 2021-07-07T15:57:10Z

Name	Old Result	New Result	Ratio	Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan	3.35	3.13	1.07	6.64% faster
low_dim_corr_gauss/low_dim_corr_gauss.stan	0.02	0.02	0.96	-4.34% slower
eight_schools/eight_schools.stan	0.12	0.11	1.04	3.52% faster
gp_regr/gp_regr.stan	0.16	0.16	0.98	-1.91% slower
irt_2pl/irt_2pl.stan	5.87	5.87	1.0	-0.03% slower
performance.compilation	89.6	86.93	1.03	2.98% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan	8.58	8.55	1.0	0.33% faster
pkpd/one_comp_mm_elim_abs.stan	29.39	30.23	0.97	-2.88% slower
sir/sir.stan	129.33	130.63	0.99	-1.0% slower
gp_regr/gen_gp_data.stan	0.03	0.03	1.01	0.6% faster
low_dim_gauss_mix/low_dim_gauss_mix.stan	2.99	3.01	0.99	-0.63% slower
pkpd/sim_one_comp_mm_elim_abs.stan	0.4	0.39	1.02	2.09% faster
arK/arK.stan	2.55	1.9	1.34	25.49% faster
arma/arma.stan	0.65	0.94	0.69	-45.63% slower
garch/garch.stan	0.64	0.64	1.0	-0.04% slower
Mean result: 1.00617287845

Jenkins Console Log
Blue Ocean
Commit hash: 569df85

SteveBronder · 2021-07-07T19:27:18Z

Alrighty @wds15 I also updated stan-dev/cmdstan#987 with this branch + the change for the init being set to 1.

I moved the chains argument to be under sample as for now that's the only one that uses it. I think the only other one that would use it would be variational and we can make chains an argument for variational once we have a version of VB that uses multiple chains. So calling is now like

examples/diamonds/diamonds sample num_samples=150 num_warmup=150 chains=8 \
 data file=examples/diamonds/diamonds2.json threads=8

wds15 · 2021-07-12T19:38:20Z

So one of my last tests is to compare the compile time. With 2.27.0 I got:

[21:35:41][sebi@sebastians-macbook-pro-1:~/work/cmdstan-2.27.0]$ time make examples/bernoulli/bernoulli

--- Translating Stan model to C++ code ---
bin/stanc  --o=examples/bernoulli/bernoulli.hpp examples/bernoulli/bernoulli.stan

--- Compiling, linking C++ code ---
clang++ -DSTAN_THREADS -std=c++1y -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT -Wno-ignored-attributes      -I stan/lib/stan_math/lib/tbb_2020.3/include   -O3 -I src -I stan/src -I lib/rapidjson_1.1.0/ -I lib/CLI11-1.9.1/ -I stan/lib/stan_math/ -I stan/lib/stan_math/lib/eigen_3.3.9 -I stan/lib/stan_math/lib/boost_1.75.0 -I stan/lib/stan_math/lib/sundials_5.7.0/include    -DBOOST_DISABLE_ASSERTS         -c -include-pch stan/src/stan/model/model_header.hpp.gch -x c++ -o examples/bernoulli/bernoulli.o examples/bernoulli/bernoulli.hpp
clang++ -DSTAN_THREADS -std=c++1y -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT -Wno-ignored-attributes      -I stan/lib/stan_math/lib/tbb_2020.3/include   -O3 -I src -I stan/src -I lib/rapidjson_1.1.0/ -I lib/CLI11-1.9.1/ -I stan/lib/stan_math/ -I stan/lib/stan_math/lib/eigen_3.3.9 -I stan/lib/stan_math/lib/boost_1.75.0 -I stan/lib/stan_math/lib/sundials_5.7.0/include    -DBOOST_DISABLE_ASSERTS               -Wl,-L,"/Users/sebi/work/cmdstan-2.27.0/stan/lib/stan_math/lib/tbb" -Wl,-rpath,"/Users/sebi/work/cmdstan-2.27.0/stan/lib/stan_math/lib/tbb"      examples/bernoulli/bernoulli.o src/cmdstan/main.o        -Wl,-L,"/Users/sebi/work/cmdstan-2.27.0/stan/lib/stan_math/lib/tbb" -Wl,-rpath,"/Users/sebi/work/cmdstan-2.27.0/stan/lib/stan_math/lib/tbb"   stan/lib/stan_math/lib/sundials_5.7.0/lib/libsundials_nvecserial.a stan/lib/stan_math/lib/sundials_5.7.0/lib/libsundials_cvodes.a stan/lib/stan_math/lib/sundials_5.7.0/lib/libsundials_idas.a stan/lib/stan_math/lib/sundials_5.7.0/lib/libsundials_kinsol.a  stan/lib/stan_math/lib/tbb/libtbb.dylib stan/lib/stan_math/lib/tbb/libtbbmalloc.dylib stan/lib/stan_math/lib/tbb/libtbbmalloc_proxy.dylib -o examples/bernoulli/bernoulli
rm -f examples/bernoulli/bernoulli.o

real	0m6.449s
user	0m5.979s
sys	0m0.449s

and with this branch I got

--- Translating Stan model to C++ code ---
bin/stanc  --o=examples/bernoulli/bernoulli-2.hpp examples/bernoulli/bernoulli-2.stan

--- Compiling, linking C++ code ---
clang++ -std=c++1y -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT -Wno-ignored-attributes     -DSTAN_THREADS -I stan/lib/stan_math/lib/tbb_2020.3/include   -O3 -I src -I stan/src -I lib/rapidjson_1.1.0/ -I lib/CLI11-1.9.1/ -I stan/lib/stan_math/ -I stan/lib/stan_math/lib/eigen_3.3.9 -I stan/lib/stan_math/lib/boost_1.75.0 -I stan/lib/stan_math/lib/sundials_5.7.0/include    -DBOOST_DISABLE_ASSERTS         -c -include-pch stan/src/stan/model/model_header_threads.hpp.gch -x c++ -o examples/bernoulli/bernoulli-2.o examples/bernoulli/bernoulli-2.hpp
clang++ -std=c++1y -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT -Wno-ignored-attributes     -DSTAN_THREADS -I stan/lib/stan_math/lib/tbb_2020.3/include   -O3 -I src -I stan/src -I lib/rapidjson_1.1.0/ -I lib/CLI11-1.9.1/ -I stan/lib/stan_math/ -I stan/lib/stan_math/lib/eigen_3.3.9 -I stan/lib/stan_math/lib/boost_1.75.0 -I stan/lib/stan_math/lib/sundials_5.7.0/include    -DBOOST_DISABLE_ASSERTS               -Wl,-L,"/Users/sebi/work/cmdstan/stan/lib/stan_math/lib/tbb" -Wl,-rpath,"/Users/sebi/work/cmdstan/stan/lib/stan_math/lib/tbb"      examples/bernoulli/bernoulli-2.o src/cmdstan/main_threads.o        -Wl,-L,"/Users/sebi/work/cmdstan/stan/lib/stan_math/lib/tbb" -Wl,-rpath,"/Users/sebi/work/cmdstan/stan/lib/stan_math/lib/tbb"   stan/lib/stan_math/lib/sundials_5.7.0/lib/libsundials_nvecserial.a stan/lib/stan_math/lib/sundials_5.7.0/lib/libsundials_cvodes.a stan/lib/stan_math/lib/sundials_5.7.0/lib/libsundials_idas.a stan/lib/stan_math/lib/sundials_5.7.0/lib/libsundials_kinsol.a  stan/lib/stan_math/lib/tbb/libtbb.dylib stan/lib/stan_math/lib/tbb/libtbbmalloc.dylib stan/lib/stan_math/lib/tbb/libtbbmalloc_proxy.dylib -o examples/bernoulli/bernoulli-2
rm -f examples/bernoulli/bernoulli-2.o

real	0m6.941s
user	0m6.432s
sys	0m0.458s

So possibly this branch slows compilation down a tiny bit (on the second compile, to be clear), but I'd assume this is within noise.

SteveBronder · 2021-07-12T21:07:13Z

Yeah imo I'd expect possibly a tiny slowdown in compilation, but overall that seems to be fine.

You think this is ready to merge?

wds15 · 2021-07-13T07:19:35Z

From my memory this is good now. Let me do one last round over this.

This is my first bigger PR review for Stan, so that's why I take a bit more time. I'd guess we are good.

wds15

Sorry, did not get to finish the review. Leaving these comments for now on the doc.

wds15 · 2021-07-17T20:05:40Z

src/stan/services/sample/hmc_nuts_dense_e_adapt.hpp

+ * @param[in] random_seed random seed for the random number generator
+ * @param[in] init_chain_id first chain id. The pseudo random number generator
+ will advance by for each chain by an integer sequence from `init_chain_id` to
+ `num_chains`


the last chain id won't be num_chains, but init_chain_id+num_chains-1
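The corrected range can be checked with a short sketch. The helper name chain_ids is hypothetical and only illustrates the arithmetic: ids start at init_chain_id and there are num_chains of them, so the last one is init_chain_id + num_chains - 1.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper: ids run from init_chain_id to
// init_chain_id + num_chains - 1, inclusive.
inline std::vector<unsigned int> chain_ids(unsigned int init_chain_id,
                                           unsigned int num_chains) {
  assert(num_chains > 0);
  std::vector<unsigned int> ids(num_chains);
  // Fill with init_chain_id, init_chain_id + 1, ...
  std::iota(ids.begin(), ids.end(), init_chain_id);
  return ids;
}
```

With init_chain_id = 3 and num_chains = 4 the ids are 3, 4, 5, 6, so the last id is 6, not 4.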

wds15 · 2021-07-17T20:18:56Z

src/stan/services/util/create_rng.hpp

@@ -19,14 +19,17 @@ namespace util {
 * duplicated.
 *
 * @param[in] seed the random seed
- * @param[in] chain the chain id
+ * @param[in] init_chain_id the chain id
+ * @param[in] chain_num For multi-chain, the ch


But the doc "For multi-chain, the ch" is not meaningful to me. What about

@param[in] init_chain_id start of chain ids
@param[in] chain_num chain id offset such that chain_id is init_chain_id+chain_num

Also... reading the comment here suggests to me that we should actually not allow for a chain id of 0, right??? (I mean the comment to the function)

wds15

Small laundry items, not more. Then we are good. Will look at the cmdstan num_thread thing now.

wds15 · 2021-07-19T19:08:39Z

src/stan/services/sample/hmc_nuts_dense_e_adapt.hpp

+        stepsize, stepsize_jitter, max_depth, delta, gamma, kappa, t0,
+        init_buffer, term_buffer, window, interrupt, logger, init_writer[0],
+        sample_writer[0], diagnostic_writer[0]);
+  } else {


The else branch is not needed. A simple if for the num_chains=1 case is sufficient (same for the other functions).
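A sketch of the shape being asked for, with hypothetical functions (run_single, run_chains) standing in for the real sampler entry points: because the num_chains == 1 branch returns, the multi-chain code can simply follow it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-chain path: marks the one writer with 1.
inline int run_single(int& writer) {
  writer = 1;
  return 0;
}

// Hypothetical wrapper. The num_chains == 1 branch returns, so the
// multi-chain code follows it directly and no else branch is required.
inline int run_chains(std::size_t num_chains, std::vector<int>& writers) {
  assert(num_chains > 0 && writers.size() == num_chains);
  if (num_chains == 1) {
    return run_single(writers[0]);
  }
  // Multi-chain path: marks writer i with 10 + i.
  for (std::size_t i = 0; i < num_chains; ++i) {
    writers[i] = 10 + static_cast<int>(i);
  }
  return 0;
}
```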

wds15 · 2021-07-19T19:09:42Z

src/stan/services/sample/hmc_nuts_dense_e_adapt.hpp

+        interrupt, logger, init_writer[0], sample_writer[0],
+        diagnostic_writer[0]);
+  } else {
+    std::vector<std::unique_ptr<stan::io::dump>> unit_e_metrics;


else is not needed.

wds15 · 2021-07-19T19:11:14Z

src/stan/services/sample/hmc_nuts_diag_e_adapt.hpp

+ * @param[in] random_seed random seed for the random number generator
+ * @param[in] init_chain_id first chain id. The pseudo random number generator
+ will advance by for each chain by an integer sequence from `init_chain_id` to
+ `num_chains`


doc needs correction as before. The range is init_chain_id to init_chain_id+num_chains-1

wds15 · 2021-07-19T19:13:07Z

src/stan/services/sample/hmc_nuts_diag_e_adapt.hpp

+ * @param[in,out] sample_writer std vector of Writers for draws of each chain.
+ * @param[in,out] diagnostic_writer std vector of Writers for diagnostic
+ information of each chain.
+ * @param[in] num_chains The number of chains to run in parallel. `init`,


num_chains doc string should appear after Model so that it matches the order of the arguments.

wds15 · 2021-07-19T19:13:31Z

src/stan/services/sample/hmc_nuts_diag_e_adapt.hpp

+        init_buffer, term_buffer, window, interrupt, logger, init_writer[0],
+        sample_writer[0], diagnostic_writer[0]);
+  } else {
+    using sample_t = stan::mcmc::adapt_diag_e_nuts<Model, boost::ecuyer1988>;


else not needed

wds15 · 2021-07-19T19:17:21Z

src/stan/services/util/create_rng.hpp

@@ -19,14 +19,17 @@ namespace util {
 * duplicated.
 *
 * @param[in] seed the random seed
- * @param[in] chain the chain id
+ * @param[in] init_chain_id the chain id
+ * @param[in] chain_num For multi-chain, the ch


BTW, do we actually need to have create_rng take a init_chain_id and a num_chains? Why don't we leave the function as is and just pre-compute the chain_id's in the parallel sample functions? That seems simpler to me.
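A sketch of that alternative, under loud assumptions: toy_create_rng is a hypothetical stand-in for a two-argument create_rng(seed, chain) and is not Stan's implementation (it uses std::mt19937_64 and a made-up stride). The only point is that the parallel function can compute init_chain_id + i itself and leave the factory's signature alone.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Hypothetical stand-in for a two-argument create_rng(seed, chain).
// NOT Stan's implementation: the engine and the stride of 1000 are
// assumptions made only so each chain id lands at a different position.
inline std::mt19937_64 toy_create_rng(unsigned int seed, unsigned int chain) {
  std::mt19937_64 rng(seed);
  rng.discard(1000ULL * chain + 1);
  return rng;
}

// The parallel sample function pre-computes each chain id, so the
// RNG factory keeps its original (seed, chain) signature.
inline std::vector<std::mt19937_64> make_rngs(unsigned int seed,
                                              unsigned int init_chain_id,
                                              unsigned int num_chains) {
  std::vector<std::mt19937_64> rngs;
  rngs.reserve(num_chains);
  for (unsigned int i = 0; i < num_chains; ++i) {
    rngs.push_back(toy_create_rng(seed, init_chain_id + i));
  }
  return rngs;
}
```

Here make_rngs(42, 1, 3) yields generators for chain ids 1, 2, and 3, each identical to what a direct call with that id would return.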

wds15 · 2021-07-19T19:17:48Z

src/stan/services/sample/hmc_nuts_diag_e_adapt.hpp

+        interrupt, logger, init_writer[0], sample_writer[0],
+        diagnostic_writer[0]);
+  } else {
+    std::vector<std::unique_ptr<stan::io::dump>> unit_e_metrics;


no else here needed

wds15 · 2021-07-19T19:21:21Z

src/stan/services/util/run_adaptive_sampler.hpp

@@ -66,7 +71,7 @@ void run_adaptive_sampler(Sampler& sampler, Model& model,
  auto start_warm = std::chrono::steady_clock::now();
  util::generate_transitions(sampler, num_warmup, 0, num_warmup + num_samples,
                             num_thin, refresh, save_warmup, true, writer, s,
-                             model, rng, interrupt, logger);
+                             model, rng, interrupt, logger, chain_id, n_chain);


Sorry for the late catch...but why is it called n_chain here and not num_chain? num_chain looks more consistent with all the other variable conventions...would need to be addressed in the entire PR.

stan-buildbot · 2021-07-20T01:56:23Z

Name	Old Result	New Result	Ratio	Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan	3.08	3.11	0.99	-0.95% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan	0.02	0.02	0.98	-1.65% slower
eight_schools/eight_schools.stan	0.11	0.11	1.04	3.84% faster
gp_regr/gp_regr.stan	0.16	0.16	1.0	0.01% faster
irt_2pl/irt_2pl.stan	5.95	5.9	1.01	0.79% faster
performance.compilation	88.93	86.8	1.02	2.39% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan	8.57	8.55	1.0	0.25% faster
pkpd/one_comp_mm_elim_abs.stan	29.99	30.02	1.0	-0.09% slower
sir/sir.stan	128.5	128.07	1.0	0.33% faster
gp_regr/gen_gp_data.stan	0.03	0.03	1.01	1.46% faster
low_dim_gauss_mix/low_dim_gauss_mix.stan	3.02	3.04	0.99	-0.74% slower
pkpd/sim_one_comp_mm_elim_abs.stan	0.4	0.38	1.04	3.67% faster
arK/arK.stan	2.55	1.89	1.35	26.01% faster
arma/arma.stan	0.65	0.92	0.7	-42.0% slower
garch/garch.stan	0.64	0.63	1.01	0.86% faster
Mean result: 1.01078599896

Jenkins Console Log
Blue Ocean
Commit hash: a11a48a

wds15 · 2021-07-20T20:13:03Z

It's on me again?

SteveBronder · 2021-07-20T21:15:28Z

Yep go for it!

SteveBronder · 2021-07-23T19:16:24Z

@wds15 bump!

wds15 · 2021-07-23T20:54:40Z

Monday…sorry…

SteveBronder · 2021-07-23T22:25:47Z

All good!

wds15 · 2021-07-26T18:22:39Z

LGTM!

SteveBronder and others added 17 commits March 20, 2021 15:19

adds file_stream_writer and run_adaptive_sampler method for parallel

5edf4e2

fix grammar error for parallel run_adaptive_sampler

faa5eb0

[Jenkins] auto-formatting by clang-format version 6.0.0-1ubuntu2~16.0…

751403f

…4.1 (tags/RELEASE_600/final)

adds test for parallel adaptive

a08ba35

[Jenkins] auto-formatting by clang-format version 6.0.0-1ubuntu2~16.0…

a4c9679

…4.1 (tags/RELEASE_600/final)

include mutex

04836ef

make stream_writer not final

0608601

init threadpool

3ebbc5c

update generate_transitions and cleanup run_adaptive_sampler

a3bf12b

Merge commit '7eeaf3c58fdd1c40aa62ba7158106529f2fd9563' into HEAD

9056101

[Jenkins] auto-formatting by clang-format version 6.0.0-1ubuntu2~16.0…

5c7b0bd

…4.1 (tags/RELEASE_600/final)

start diag_e_adapt parallel

117274f

adds tests for parallel adapter

99d64c1

Merge remote-tracking branch 'origin/develop' into feature/parallel-a…

35f88a7

…daptive

remove statics from softmax metric and cleanup tests

ebeb1f9

update to feature/parallel-adapt

142a94a

use normal diag_e_adapt if n_chain == 0

558eeb0

SteveBronder mentioned this pull request Mar 24, 2021

Parallel Run Adaptive Sampler #3028

Closed

3 tasks

SteveBronder changed the title ~~Feature/parallel nuts~~ [WIP] Parallel Adaptive Nuts Mar 24, 2021

[Jenkins] auto-formatting by clang-format version 6.0.0-1ubuntu2~16.0…

5055e21

…4.1 (tags/RELEASE_600/final)

bbbales2 requested changes Mar 25, 2021

SteveBronder added 3 commits March 25, 2021 12:52

use parallel loop around run_adaptive_sampler

34114be

remove run_adaptive_sampler parallel version

8458782

update to remove parallel run_adaptive_sampler

38e168d

SteveBronder changed the base branch from feature/parallel-adaptive to develop March 25, 2021 22:30

SteveBronder marked this pull request as ready for review March 25, 2021 22:30

yashikno and others added 2 commits March 25, 2021 22:40

Merge commit 'f3bf21bc20271ebb9f7c9613bdb17c16d5cc0c1b' into HEAD

e72409b

[Jenkins] auto-formatting by clang-format version 6.0.0-1ubuntu2~16.0…

a2daea0

…4.1 (tags/RELEASE_600/final)

SteveBronder and others added 2 commits July 2, 2021 17:59

update printing logic

c09ca13

[Jenkins] auto-formatting by clang-format version 6.0.0-1ubuntu2~16.0…

91d51e1

…4.1 (tags/RELEASE_600/final)

update to remove +1 from chain number

569df85

wds15 reviewed Jul 17, 2021

wds15 requested changes Jul 19, 2021

SteveBronder and others added 3 commits July 19, 2021 15:32

Merge remote-tracking branch 'origin/develop' into feature/parallel-nuts

bbf4d37

Make sure to use num_chains instead of n_chain everywhere, fix docs, …

4f82e24

…removes unneeded else branch

[Jenkins] auto-formatting by clang-format version 6.0.0-1ubuntu2~16.0…

a11a48a

…4.1 (tags/RELEASE_600/final)

wds15 approved these changes Jul 26, 2021

SteveBronder merged commit 2edd18c into develop Jul 26, 2021

SteveBronder mentioned this pull request Oct 6, 2021

Changed variational output in 2.28.0 stan-dev/cmdstan#1049

Closed

WardBrian mentioned this pull request Feb 24, 2023

create_rng with chain=0 appears to return biased first draw #3167

Closed
