You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: lectures/calvo_gradient.md
+40-12Lines changed: 40 additions & 12 deletions
Original file line number
Diff line number
Diff line change
@@ -944,22 +944,22 @@ in the code.**
944
944
945
945
**Response note to Humphrey** Shouldn't it instead be $ \vec \beta \cdot \beta \cdot (\vec \mu @ B)^T(\vec \mu @ B)$?
946
946
947
-
**Response note to Tom**: Thanks so much for pointing this out, you are right! That is what in my code. Sorry for the typo. I think in every case, we have $\vec{\beta} \cdot \vec{\beta}$. Perhaps we can just define:
947
+
**Response note to Tom**: Thanks so much for pointing this out. You are right! That is what is in my code. Sorry for the typo. I think in every case, we have $\vec{\beta} \cdot \vec{\beta}$. Perhaps we can just define:
\frac{\partial J}{\partial \vec{\mu}} = h_1 B^T \vec{\beta} + 2 h_2 (\vec{\beta} \cdot B^T B \vec{\mu}) - c \vec{\beta} \vec{\mu}
997
995
$$
998
996
999
-
But I think it is safe to ask `JAX` to compute the gradient of $J$ with respect to $\vec \mu$, so we can avoid the manual computation above.
997
+
Hence
998
+
999
+
$$
1000
+
\vec{\mu}^R = \left(2h_2 B^T \operatorname{diag}(\vec{\beta}) B - c \operatorname{diag}(\vec{\beta})\right)^{-1} \left(-h_1 B^T \vec{\beta}\right) \; \text{where } \operatorname{diag}(\vec{\beta}) = \vec{\beta} \cdot \mathbf{I}
1001
+
$$
1002
+
1003
+
The function `compute_μ` tries to implement this analytical solution, but the result agrees with our original method at first, then differs at the last few values (please find the function below).
1004
+
1005
+
I am not yet sure where this discrepancy comes from, but I think it is safe to ask `JAX` to compute the gradient of $J$ with respect to $\vec{\mu}$ because it agrees with our previous lecture and generates the same $V$ value. Hence, it helps us avoid the manual computation above.
0 commit comments