-
-
Notifications
You must be signed in to change notification settings - Fork 190
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Generalize */fun starting with a-chol2 #1732
Conversation
…stable/2017-11-14)
# Conflicts: # stan/math/rev/fun/mdivide_left.hpp # stan/math/rev/fun/mdivide_left_tri.hpp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, just some clarifications around templating and Eigen::Ref
usage
for (int j = 0; j < A.cols(); j++) { | ||
for (int i = 0; i < A.rows(); i++) { | ||
val_A(i, j) = A(i, j).val_; | ||
deriv_A(i, j) = A(i, j).d_; | ||
val_A(i, j) = A_ref(i, j).val_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the values and derivatives are getting copied anyway you probably don't need the intermediary Eigen::Ref
, because that could also induce a copy depending on the input.
Also, you could use the .val()
and .d()
member functions here as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am pretty sure if a do val_A = A.val(); deriv_A = A.d()
, A would be evaluated twice. I think we can not avoid either a copy or two evaluations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah gotcha, makes sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, I think you have some experience with adding expressions to Eigen library. Do you think it would be possible to make some kind of expression that would make a fvar
"Matrix" by referencing two matrices of double
s. So that we could directley evaluate a fvar expression into two matrices like this:
make_fvar_ref_matrix(val_A, deriv_A) = A;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep I could do that. I think the easiest approach would be a custom CwiseBinaryOp
, something like:
val_A.make_fvar(deriv_A)
I could just drop that into the current matrixbase addons. The tricky part will be the order of includes, since the fvar
header will need to be included before the eigen_plugins.h
header. but I can do some testing and see what works.
stan/math/fwd/fun/mdivide_left.hpp
Outdated
for (int j = 0; j < b.cols(); j++) { | ||
for (int i = 0; i < b.rows(); i++) { | ||
val_b(i, j) = b(i, j).val_; | ||
deriv_b(i, j) = b(i, j).d_; | ||
val_b(i, j) = b_ref(i, j).val_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
val_b
, deriv_b
, and deriv_A
only get used once below, so you don't need the copies here.
For example:
inv_A_mult_b = mdivide_left(val_A, b_ref.val());
I'll only comment here, but there are similar cases throughout
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
stan/math/fwd/fun/mdivide_left.hpp
Outdated
mdivide_left(const T1& A, const T2& b) { | ||
using T = typename value_type_t<T1>::Scalar; | ||
constexpr int R1 = T1::RowsAtCompileTime; | ||
constexpr int C1 = T1::ColsAtCompileTime; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need all of these since C1 and R2 should be the same for the matrices to be multiplicable, similarly R1 and C1 should be the same for the matrix to be square
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
stan/math/fwd/fun/mdivide_left.hpp
Outdated
const Eigen::Matrix<double, R1, C1> &A, | ||
const Eigen::Matrix<fvar<T>, R2, C2> &b) { | ||
template <typename T1, typename T2, require_eigen_t<T1>* = nullptr, | ||
require_same_vt<double, T1>* = nullptr, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could these templates be collapsed to: require_eigen_vt<std::is_arithmetic, T1>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
@@ -56,10 +66,18 @@ inline Eigen::Matrix<fvar<T>, R1, C2> mdivide_left_tri_low( | |||
return to_fvar(inv_A_mult_b, deriv); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the eventual goal is to propagate expression templates, it could be good to start collapsing some of these function calls:
deriv = mdivide_left(val_A, deriv_b)
- multiply(mdivide_left(val_A, deriv_A),
mdivide_left(val_A, val_b));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll do that, but don't expect any significant performance benefit in this case. The majority of time is spent in those O(n**3) LU decompositions in prim mdivide_left
.
stan/math/rev/fun/mdivide_left.hpp
Outdated
const Eigen::Matrix<double, R2, C2> &b) { | ||
template <typename T1, typename T2, require_eigen_vt<is_var, T1> * = nullptr, | ||
require_eigen_t<T2> * = nullptr, | ||
require_same_vt<double, T2> * = nullptr> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
require_eigen_vt
here too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
@@ -73,9 +73,6 @@ TEST(AgradFwdMatrixAssign, eigen_row_vector_fvar_double_shape_mismatch) { | |||
Matrix<fvar<double>, Dynamic, Dynamic> zzz(3, 1); | |||
zzz << 1, 2, 3; | |||
EXPECT_THROW(assign(x, zzz), std::invalid_argument); | |||
|
|||
Matrix<fvar<double>, Dynamic, Dynamic> zzzz(1, 3); | |||
EXPECT_THROW(assign(x, zzzz), std::invalid_argument); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a tricky one. Size-wise the two objects are the same, but one is declared as a matrix and one a vector. This means that throwing in the math library will be dimension-based, but throwing in the Stan language will be type-dependent (although I'm not sure if assign
is even exposed?). Not sure if that's an important distinction to make somewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. I think there is no reason to have different requirements for Eigen::Block
and Eigen::Matrix
. I assume it is better to allow assigning of matrices woth different compile time sizes. Do you think it is prefereble to fail assigments for blocks with different sizes? Or is there a reason for difference in handling of matrices and blocks?
Jenkins Console Log Machine informationProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010CPU: G++: Clang: |
@andrjohns This is ready for another review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All looks good to me, merge when ready
Summary
Generalizes inputs to functions that start with a - chol2.
chol2inv
depends onmdivide_left
,mdivide_left_tri
andmdivide_left_tri_low
so these were generalized as well.Tests
assign()
had different requirements wrt compile time size for Eigen matrices and Eigen blocks. I changed that so it accepts all assignments Eigen does (It previously did that for blocks. For matrices it required same compile time sizes). That also changed some tests.Side Effects
None.
Checklist
Math issue Generalize matrix function signatures #1470
Copyright holder: Tadej Ciglarič
The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to the license the submitted work under the following licenses:
- Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
- Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
the basic tests are passing
./runTests.py test/unit
)make test-headers
)make test-math-dependencies
)make doxygen
)make cpplint
)the code is written in idiomatic C++ and changes are documented in the doxygen
the new changes are tested