
Generalize */fun starting with a-chol2 #1732

Merged
10 commits merged into stan-dev:develop on Mar 2, 2020

Conversation

Contributor

@t4c1 t4c1 commented Feb 21, 2020

Summary

Generalizes inputs to the functions that come alphabetically between a and chol2. chol2inv depends on mdivide_left, mdivide_left_tri, and mdivide_left_tri_low, so these were generalized as well.

Tests

assign() had different requirements on compile-time sizes for Eigen matrices and Eigen blocks: for blocks it accepted every assignment Eigen does, while for matrices it required identical compile-time sizes. I changed it so that it accepts all assignments Eigen does in both cases, which also changed some tests.

Side Effects

None.

Checklist

  • Math issue Generalize matrix function signatures #1470

  • Copyright holder: Tadej Ciglarič
    The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder agrees to license the submitted work under the following licenses:
    - Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
    - Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

  • the basic tests are passing

    • unit tests pass (to run, use: ./runTests.py test/unit)
    • header checks pass, (make test-headers)
    • dependencies checks pass, (make test-math-dependencies)
    • docs build, (make doxygen)
    • code passes the built in C++ standards checks (make cpplint)
  • the code is written in idiomatic C++ and changes are documented in the doxygen

  • the new changes are tested

@t4c1 t4c1 changed the title Generalize prim/fun starting with a-chol2 Generalize */fun starting with a-chol2 Feb 24, 2020
Collaborator

@andrjohns andrjohns left a comment

Looks great, just some clarifications around templating and Eigen::Ref usage

for (int j = 0; j < A.cols(); j++) {
for (int i = 0; i < A.rows(); i++) {
val_A(i, j) = A(i, j).val_;
deriv_A(i, j) = A(i, j).d_;
val_A(i, j) = A_ref(i, j).val_;
Collaborator

If the values and derivatives are getting copied anyway you probably don't need the intermediary Eigen::Ref, because that could also induce a copy depending on the input.

Also, you could use the .val() and .d() member functions here as well

Contributor Author

I am pretty sure that if I do val_A = A.val(); deriv_A = A.d(), A would be evaluated twice. I think we cannot avoid either a copy or two evaluations.

Collaborator

Ah gotcha, makes sense

Contributor Author
@t4c1 t4c1 Feb 25, 2020

Hey, I think you have some experience with adding expressions to the Eigen library. Do you think it would be possible to make some kind of expression that builds an fvar "matrix" by referencing two matrices of doubles, so that we could directly evaluate an fvar expression into two matrices like this:

make_fvar_ref_matrix(val_A, deriv_A) = A;

Collaborator

Yep I could do that. I think the easiest approach would be a custom CwiseBinaryOp, something like:

val_A.make_fvar(deriv_A)

I could just drop that into the current MatrixBase addons. The tricky part will be the order of includes, since the fvar header will need to be included before the eigen_plugins.h header, but I can do some testing and see what works.

for (int j = 0; j < b.cols(); j++) {
for (int i = 0; i < b.rows(); i++) {
val_b(i, j) = b(i, j).val_;
deriv_b(i, j) = b(i, j).d_;
val_b(i, j) = b_ref(i, j).val_;
Collaborator

val_b, deriv_b, and deriv_A only get used once below, so you don't need the copies here.

For example:

inv_A_mult_b = mdivide_left(val_A, b_ref.val());

I'll only comment here, but there are similar cases throughout

Contributor Author

fixed.

mdivide_left(const T1& A, const T2& b) {
using T = typename value_type_t<T1>::Scalar;
constexpr int R1 = T1::RowsAtCompileTime;
constexpr int C1 = T1::ColsAtCompileTime;
Collaborator

I don't think you need all of these, since C1 and R2 should be the same for the matrices to be multiplicable; similarly, R1 and C1 should be the same for the matrix to be square.

Contributor Author

fixed.

const Eigen::Matrix<double, R1, C1> &A,
const Eigen::Matrix<fvar<T>, R2, C2> &b) {
template <typename T1, typename T2, require_eigen_t<T1>* = nullptr,
require_same_vt<double, T1>* = nullptr,
Collaborator

Could these templates be collapsed to: require_eigen_vt<std::is_arithmetic, T1>?

Contributor Author

fixed.

@@ -56,10 +66,18 @@ inline Eigen::Matrix<fvar<T>, R1, C2> mdivide_left_tri_low(
return to_fvar(inv_A_mult_b, deriv);
Collaborator

If the eventual goal is to propagate expression templates, it could be good to start collapsing some of these function calls:

 deriv = mdivide_left(val_A, deriv_b)
            - multiply(mdivide_left(val_A, deriv_A),
                       mdivide_left(val_A, val_b));

Contributor Author
@t4c1 t4c1 Feb 25, 2020

I'll do that, but don't expect any significant performance benefit in this case. The majority of the time is spent in those O(n^3) LU decompositions in prim mdivide_left.

const Eigen::Matrix<double, R2, C2> &b) {
template <typename T1, typename T2, require_eigen_vt<is_var, T1> * = nullptr,
require_eigen_t<T2> * = nullptr,
require_same_vt<double, T2> * = nullptr>
Collaborator

require_eigen_vt here too

Contributor Author

fixed

@@ -73,9 +73,6 @@ TEST(AgradFwdMatrixAssign, eigen_row_vector_fvar_double_shape_mismatch) {
Matrix<fvar<double>, Dynamic, Dynamic> zzz(3, 1);
zzz << 1, 2, 3;
EXPECT_THROW(assign(x, zzz), std::invalid_argument);

Matrix<fvar<double>, Dynamic, Dynamic> zzzz(1, 3);
EXPECT_THROW(assign(x, zzzz), std::invalid_argument);
Collaborator

This is a tricky one. Size-wise the two objects are the same, but one is declared as a matrix and one a vector. This means that throwing in the math library will be dimension-based, but throwing in the Stan language will be type-dependent (although I'm not sure if assign is even exposed?). Not sure if that's an important distinction to make somewhere.

Contributor Author

Indeed. I think there is no reason to have different requirements for Eigen::Block and Eigen::Matrix. I assume it is better to allow assignment of matrices with different compile-time sizes. Do you think it is preferable to fail assignments for blocks with different sizes? Or is there a reason for the difference in handling of matrices and blocks?

@stan-buildbot
Contributor


Name Old Result New Result Ratio Performance change (1 - new/old)
gp_pois_regr/gp_pois_regr.stan 4.84 4.81 1.01 0.74% faster
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 0.98 -1.94% slower
eight_schools/eight_schools.stan 0.09 0.09 0.97 -3.11% slower
gp_regr/gp_regr.stan 0.22 0.22 1.0 0.36% faster
irt_2pl/irt_2pl.stan 6.07 6.07 1.0 0.05% faster
performance.compilation 87.82 86.88 1.01 1.06% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 7.28 7.29 1.0 -0.13% slower
pkpd/one_comp_mm_elim_abs.stan 20.68 20.54 1.01 0.64% faster
sir/sir.stan 89.42 91.88 0.97 -2.76% slower
gp_regr/gen_gp_data.stan 0.04 0.05 0.96 -3.94% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan 2.95 2.98 0.99 -1.03% slower
pkpd/sim_one_comp_mm_elim_abs.stan 0.32 0.33 0.96 -4.08% slower
arK/arK.stan 1.73 1.72 1.01 0.55% faster
arma/arma.stan 0.73 0.77 0.94 -5.88% slower
garch/garch.stan 0.63 0.62 1.03 2.7% faster
Mean result: 0.989451970538

Commit hash: 71a6c03


Machine information:
ProductName: Mac OS X
ProductVersion: 10.11.6
BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Contributor Author

t4c1 commented Feb 28, 2020

@andrjohns This is ready for another review.

Collaborator

@andrjohns andrjohns left a comment

All looks good to me, merge when ready

@t4c1 t4c1 merged commit f034117 into stan-dev:develop Mar 2, 2020
@t4c1 t4c1 deleted the generalize_prim_fun_a_chol2 branch November 30, 2020 09:25