Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Deploying to gh-pages from @ c917bf9 🚀
1 parent 0838d6b, commit d7b5f99
Showing 93 changed files with 5,321 additions and 6,607 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1 @@ | ||
<!doctype html> <html lang=en > <meta charset=UTF-8 > <meta name=viewport content="width=device-width, initial-scale=1"> <link rel=stylesheet href="/css/franklin.css"> <link rel=stylesheet href="/css/tufte.css"> <link rel=stylesheet href="/css/latex.css"> <link rel=stylesheet href="/css/adjust.css"> <link rel=icon href="/assets/favicon.png"> <link rel=stylesheet href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css"> <title>404 - MIT Parallel Computing and Scientific Machine Learning (SciML)</title> <div id=layout > <div id=menu > <ul> <li><a style="font-size:larger;" href=https://github.com/SciML/SciMLBook><i class="fa fa-github"></i></a> <li><a style="font-size:larger;" href="/">Home</a> <li><a style="font-size:larger;" href="/course/">Course</a> <li><a style="font-size:larger;" href="/homework/">Homework</a> <li><a style="font-size:larger;" href="/lectures/">Lectures</a> <li><a style="font-size:larger;" href="/notes/">Notes</a> <ul style="font-size:smaller"> <li><a href="/notes/02-Optimizing_Serial_Code/">02: Serial Code</a> <li><a href="/notes/03-Introduction_to_Scientific_Machine_Learning_through_Physics-Informed_Neural_Networks/">03: SciML Intro</a> <li><a href="/notes/04-How_Loops_Work-An_Introduction_to_Discrete_Dynamics/">04: How Loops Work</a> <li><a href="/notes/05-The_Basics_of_Single_Node_Parallel_Computing/">05: Basics of Parallelism</a> <li><a href="/notes/06-The_Different_Flavors_of_Parallelism/">06: Flavors of Parallelism</a> <li><a href="/notes/07-Ordinary_Differential_Equations-Applications_and_Discretizations/">07: ODEs</a> <li><a href="/notes/08-Forward-Mode_Automatic_Differentiation_(AD)_via_High_Dimensional_Algebras/">08: Forward AD</a> <li><a href="/notes/09-Solving_Stiff_Ordinary_Differential_Equations/">09: Stiff ODEs</a> <li><a href="/notes/10-Basic_Parameter_Estimation-Reverse-Mode_AD-and_Inverse_Problems/">10: Reverse AD</a> <li><a 
href="/notes/11-Differentiable_Programming_and_Neural_Differential_Equations/">11: δP</a> <li><a href="/notes/12-Description_of_MPI_and_MPI/">12: MPI</a> <li><a href="/notes/13-GPU_programming/">13: GPUs</a> <li><a href="/notes/14-PDEs_Convolutions_and_the_Mathematics_of_Locality/">14: PDEs</a> <li><a href="/notes/15-Mixing_Differential_Equations_and_Neural_Networks_for_Physics-Informed_Learning/">15: Physics Informed Learning</a> <li><a href="/notes/16-From_Optimization_to_Probabilistic_Programming/">16: Probabilistic Programming</a> <li><a href="/notes/17-Global_Sensitivity_Analysis/">17: Global Sensitivity Analysis</a> <li><a href="/notes/18-Code_Profiling_and_Optimization/">18: Profiling & Optimization</a> <li><a href="/notes/19-Uncertainty_Programming-Generalized_Uncertainty_Quantification/">19: Uncertainty Programming</a> </ul> </ul> </div> <div id=main > <div class=franklin-content > <div style="margin-top: 40px; font-size: 40px; text-align: center;"> <br> <div style="font-weight: bold;"> 404 </div> <br> <br> The requested page was not found <br> <br> <br> <br> <div style="margin-bottom: 300px; font-size: 24px"> <a href="/">Click here</a> to go back to the homepage. </div> </div> <div class=back-to-top > <span><a href="#" title="Back to Top"><i class="fa fa-chevron-circle-up"></i></a></span> </div> <div class=page-foot > <div class=copyright > <a href=https://github.com/SciML/SciMLBook>SciML Book source code</a> <br> © Chris Rackauckas. Last modified: October 01, 2024. <br> Built with <a href="https://github.com/tlienart/Franklin.jl">Franklin.jl</a> and the <a href="https://julialang.org">Julia programming language</a>. </div> </div> </div> </div> </div> | ||
<!doctype html> <html lang=en > <meta charset=UTF-8 > <meta name=viewport content="width=device-width, initial-scale=1"> <link rel=stylesheet href="/css/franklin.css"> <link rel=stylesheet href="/css/tufte.css"> <link rel=stylesheet href="/css/latex.css"> <link rel=stylesheet href="/css/adjust.css"> <link rel=icon href="/assets/favicon.png"> <link rel=stylesheet href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css"> <title>404 - MIT Parallel Computing and Scientific Machine Learning (SciML)</title> <div id=layout > <div id=menu > <ul> <li><a style="font-size:larger;" href=https://github.com/SciML/SciMLBook><i class="fa fa-github"></i></a> <li><a style="font-size:larger;" href="/">Home</a> <li><a style="font-size:larger;" href="/course/">Course</a> <li><a style="font-size:larger;" href="/homework/">Homework</a> <li><a style="font-size:larger;" href="/lectures/">Lectures</a> <li><a style="font-size:larger;" href="/notes/">Notes</a> <ul style="font-size:smaller"> <li><a href="/notes/02-Optimizing_Serial_Code/">02: Serial Code</a> <li><a href="/notes/03-Introduction_to_Scientific_Machine_Learning_through_Physics-Informed_Neural_Networks/">03: SciML Intro</a> <li><a href="/notes/04-How_Loops_Work-An_Introduction_to_Discrete_Dynamics/">04: How Loops Work</a> <li><a href="/notes/05-The_Basics_of_Single_Node_Parallel_Computing/">05: Basics of Parallelism</a> <li><a href="/notes/06-The_Different_Flavors_of_Parallelism/">06: Flavors of Parallelism</a> <li><a href="/notes/07-Ordinary_Differential_Equations-Applications_and_Discretizations/">07: ODEs</a> <li><a href="/notes/08-Forward-Mode_Automatic_Differentiation_(AD)_via_High_Dimensional_Algebras/">08: Forward AD</a> <li><a href="/notes/09-Solving_Stiff_Ordinary_Differential_Equations/">09: Stiff ODEs</a> <li><a href="/notes/10-Basic_Parameter_Estimation-Reverse-Mode_AD-and_Inverse_Problems/">10: Reverse AD</a> <li><a 
href="/notes/11-Differentiable_Programming_and_Neural_Differential_Equations/">11: δP</a> <li><a href="/notes/12-Description_of_MPI_and_MPI/">12: MPI</a> <li><a href="/notes/13-GPU_programming/">13: GPUs</a> <li><a href="/notes/14-PDEs_Convolutions_and_the_Mathematics_of_Locality/">14: PDEs</a> <li><a href="/notes/15-Mixing_Differential_Equations_and_Neural_Networks_for_Physics-Informed_Learning/">15: Physics Informed Learning</a> <li><a href="/notes/16-From_Optimization_to_Probabilistic_Programming/">16: Probabilistic Programming</a> <li><a href="/notes/17-Global_Sensitivity_Analysis/">17: Global Sensitivity Analysis</a> <li><a href="/notes/18-Code_Profiling_and_Optimization/">18: Profiling & Optimization</a> <li><a href="/notes/19-Uncertainty_Programming-Generalized_Uncertainty_Quantification/">19: Uncertainty Programming</a> </ul> </ul> </div> <div id=main > <div class=franklin-content > <div style="margin-top: 40px; font-size: 40px; text-align: center;"> <br> <div style="font-weight: bold;"> 404 </div> <br> <br> The requested page was not found <br> <br> <br> <br> <div style="margin-bottom: 300px; font-size: 24px"> <a href="/">Click here</a> to go back to the homepage. </div> </div> <div class=back-to-top > <span><a href="#" title="Back to Top"><i class="fa fa-chevron-circle-up"></i></a></span> </div> <div class=page-foot > <div class=copyright > <a href=https://github.com/SciML/SciMLBook>SciML Book source code</a> <br> © Chris Rackauckas. Last modified: December 11, 2024. <br> Built with <a href="https://github.com/tlienart/Franklin.jl">Franklin.jl</a> and the <a href="https://julialang.org">Julia programming language</a>. </div> </div> </div> </div> </div> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1 @@ | ||
<h1 class=title >Neural Ordinary Differential Equation Adjoints</h1> <h5>Chris Rackauckas</h5> <h5>November 20th, 2020</h5> <p>In this homework, we will write an implementation of neural ordinary differential equations from scratch. You may use the <code>DifferentialEquations.jl</code> ODE solver, but not the adjoint sensitivities functionality. Optionally, a second problem is to add GPU support to your implementation.</p> <p>Due December 9th, 2020 at midnight.</p> <p>Please email the results to <code>18337.mit.psets@gmail.com</code>.</p> <h2>Problem 1: Neural ODE from Scratch</h2> <p>In this problem we will work through the development of a neural ODE.</p> <h3>Part 1: Gradients as vjps</h3> <p>Use the definition of the pullback as a vector-Jacobian product (vjp) to show that <span class=math >$B_f^x(1) = \left( \nabla f(x) \right)^{T}$</span> for a function <span class=math >$f: \mathbb{R}^n \rightarrow \mathbb{R}$</span>.</p> <p>(Hint: if you put 1 into the pullback, what kind of function is it? What does the Jacobian look like?)</p> <h3>Part 2: Backpropagation of a neural network</h3> <p>Implement a simple <span class=math >$NN: \mathbb{R}^2 \rightarrow \mathbb{R}^2$</span> neural network</p> <p class=math >\[ NN(u;W_i,b_i) = W_2 tanh.(W_1 u + b_1) + b_2 \]</p> <p>where <span class=math >$W_1$</span> is <span class=math >$50 \times 2$</span>, <span class=math >$b_1$</span> is length 50, <span class=math >$W_2$</span> is <span class=math >$2 \times 50$</span>, and <span class=math >$b_2$</span> is length 2. Implement the pullback of the neural network: <span class=math >$B_{NN}^{u,W_i,b_i}(y)$</span> to calculate the derivative of the neural network with respect to each of these inputs. 
Check for correctness by using ForwardDiff.jl to calculate the gradient.</p> <h3>Part 3: Implementing an ODE adjoint</h3> <p>The adjoint of an ODE can be described as the set of vector equations:</p> <p class=math >\[ \begin{align} u' &= f(u,p,t)\\ \end{align} \]</p> <p>forward, and then</p> <p class=math >\[ \begin{align} \lambda' &= -\lambda^\ast \frac{\partial f}{\partial u}\\ \mu' &= -\lambda^\ast \frac{\partial f}{\partial p}\\ \end{align} \]</p> <p>solved in reverse time from <span class=math >$T$</span> to <span class=math >$0$</span> for some cost function <span class=math >$C(p)$</span>. For this problem, we will use the L2 loss function.</p> <p>Note that <span class=math >$\mu(T) = 0$</span> and <span class=math >$\lambda(T) = \frac{\partial C}{\partial u(T)}$</span>. This is written in the form where the only data point is at time <span class=math >$T$</span>. If that is not the case, the reverse solve needs to add the jump <span class=math >$\frac{\partial C}{\partial u(t_i)}$</span> to <span class=math >$\lambda$</span> at each data point <span class=math >$u(t_i)$</span>. <a href="https://diffeq.sciml.ai/stable/features/callback_functions/#Example-1:-Interventions-at-Preset-Times">Use this example</a> for how to add these jumps to the equation.</p> <p>Using this formulation of the adjoint, it holds that <span class=math >$\mu(0) = \frac{\partial C}{\partial p}$</span>, and thus solving these ODEs in reverse gives the solution for the gradient as a part of the system at time zero.</p> <p>Notice that <span class=math >$B_f^u(\lambda) = \lambda^\ast \frac{\partial f}{\partial u}$</span> and similarly for <span class=math >$\mu$</span>. Implement an adjoint calculation for a neural ordinary differential equation where</p> <p class=math >\[ u' = NN(u) \]</p> <p>from above. 
Solve the ODE forwards using OrdinaryDiffEq.jl's <code>Tsit5()</code> integrator, then use the interpolation from the forward pass for the <code>u</code> values of the backpass and solve.</p> <p>(Note: you will want to double check this gradient by using something like ForwardDiff! Start with only measuring the datapoint at the end, then try multiple data points.)</p> <h3>Part 4: Training the neural ODE</h3> <p>Generate data from the ODE <span class=math >$u' = Au$</span> where <code>A = [-0.1 2.0; -2.0 -0.1]</code> at <code>t=0.0:0.1:1.0</code> (use <code>saveat</code>) with <span class=math >$u(0) = [2,0]$</span>. Define the cost function <code>C(θ)</code> to be the Euclidean distance between the neural ODE's solution and the data. Optimize this cost function by using gradient descent where the gradient is your adjoint method's output.</p> <p>(Note: calculate the cost and the gradient at the same time by using the forward pass to calculate the cost, and then use it in the adjoint for the interpolation. Note that you should not use <code>saveat</code> in the forward pass then, because otherwise the interpolation is linear. Instead, post-interpolate the data points.)</p> <h2>(Optional) Problem 2: Array-Based GPU Computing</h2> <p>If you have access to a GPU, you may wish to try the following.</p> <h3>Part 1: GPU Neural Network</h3> <p>Change your neural network to be GPU-accelerated by using CuArrays.jl for the underlying array types.</p> <h3>Part 2: GPU Neural ODE</h3> <p>Change the initial condition of the ODE solves to a CuArray to make your neural ODE GPU-accelerated.</p> <div class=footer > <p> Published from <a href=hw3.jmd >hw3.jmd</a> using <a href="http://github.com/JunoLab/Weave.jl">Weave.jl</a> v0.10.12 on 2024-10-01. </p> </div> | ||
<h1 class=title >Neural Ordinary Differential Equation Adjoints</h1> <h5>Chris Rackauckas</h5> <h5>November 20th, 2020</h5> <p>In this homework, we will write an implementation of neural ordinary differential equations from scratch. You may use the <code>DifferentialEquations.jl</code> ODE solver, but not the adjoint sensitivities functionality. Optionally, a second problem is to add GPU support to your implementation.</p> <p>Due December 9th, 2020 at midnight.</p> <p>Please email the results to <code>18337.mit.psets@gmail.com</code>.</p> <h2>Problem 1: Neural ODE from Scratch</h2> <p>In this problem we will work through the development of a neural ODE.</p> <h3>Part 1: Gradients as vjps</h3> <p>Use the definition of the pullback as a vector-Jacobian product (vjp) to show that <span class=math >$B_f^x(1) = \left( \nabla f(x) \right)^{T}$</span> for a function <span class=math >$f: \mathbb{R}^n \rightarrow \mathbb{R}$</span>.</p> <p>(Hint: if you put 1 into the pullback, what kind of function is it? What does the Jacobian look like?)</p> <h3>Part 2: Backpropagation of a neural network</h3> <p>Implement a simple <span class=math >$NN: \mathbb{R}^2 \rightarrow \mathbb{R}^2$</span> neural network</p> <p class=math >\[ NN(u;W_i,b_i) = W_2 tanh.(W_1 u + b_1) + b_2 \]</p> <p>where <span class=math >$W_1$</span> is <span class=math >$50 \times 2$</span>, <span class=math >$b_1$</span> is length 50, <span class=math >$W_2$</span> is <span class=math >$2 \times 50$</span>, and <span class=math >$b_2$</span> is length 2. Implement the pullback of the neural network: <span class=math >$B_{NN}^{u,W_i,b_i}(y)$</span> to calculate the derivative of the neural network with respect to each of these inputs. 
Check for correctness by using ForwardDiff.jl to calculate the gradient.</p> <h3>Part 3: Implementing an ODE adjoint</h3> <p>The adjoint of an ODE can be described as the set of vector equations:</p> <p class=math >\[ \begin{align} u' &= f(u,p,t)\\ \end{align} \]</p> <p>forward, and then</p> <p class=math >\[ \begin{align} \lambda' &= -\lambda^\ast \frac{\partial f}{\partial u}\\ \mu' &= -\lambda^\ast \frac{\partial f}{\partial p}\\ \end{align} \]</p> <p>solved in reverse time from <span class=math >$T$</span> to <span class=math >$0$</span> for some cost function <span class=math >$C(p)$</span>. For this problem, we will use the L2 loss function.</p> <p>Note that <span class=math >$\mu(T) = 0$</span> and <span class=math >$\lambda(T) = \frac{\partial C}{\partial u(T)}$</span>. This is written in the form where the only data point is at time <span class=math >$T$</span>. If that is not the case, the reverse solve needs to add the jump <span class=math >$\frac{\partial C}{\partial u(t_i)}$</span> to <span class=math >$\lambda$</span> at each data point <span class=math >$u(t_i)$</span>. <a href="https://diffeq.sciml.ai/stable/features/callback_functions/#Example-1:-Interventions-at-Preset-Times">Use this example</a> for how to add these jumps to the equation.</p> <p>Using this formulation of the adjoint, it holds that <span class=math >$\mu(0) = \frac{\partial C}{\partial p}$</span>, and thus solving these ODEs in reverse gives the solution for the gradient as a part of the system at time zero.</p> <p>Notice that <span class=math >$B_f^u(\lambda) = \lambda^\ast \frac{\partial f}{\partial u}$</span> and similarly for <span class=math >$\mu$</span>. Implement an adjoint calculation for a neural ordinary differential equation where</p> <p class=math >\[ u' = NN(u) \]</p> <p>from above. 
Solve the ODE forwards using OrdinaryDiffEq.jl's <code>Tsit5()</code> integrator, then use the interpolation from the forward pass for the <code>u</code> values of the backpass and solve.</p> <p>(Note: you will want to double check this gradient by using something like ForwardDiff! Start with only measuring the datapoint at the end, then try multiple data points.)</p> <h3>Part 4: Training the neural ODE</h3> <p>Generate data from the ODE <span class=math >$u' = Au$</span> where <code>A = [-0.1 2.0; -2.0 -0.1]</code> at <code>t=0.0:0.1:1.0</code> (use <code>saveat</code>) with <span class=math >$u(0) = [2,0]$</span>. Define the cost function <code>C(θ)</code> to be the Euclidean distance between the neural ODE's solution and the data. Optimize this cost function by using gradient descent where the gradient is your adjoint method's output.</p> <p>(Note: calculate the cost and the gradient at the same time by using the forward pass to calculate the cost, and then use it in the adjoint for the interpolation. Note that you should not use <code>saveat</code> in the forward pass then, because otherwise the interpolation is linear. Instead, post-interpolate the data points.)</p> <h2>(Optional) Problem 2: Array-Based GPU Computing</h2> <p>If you have access to a GPU, you may wish to try the following.</p> <h3>Part 1: GPU Neural Network</h3> <p>Change your neural network to be GPU-accelerated by using CuArrays.jl for the underlying array types.</p> <h3>Part 2: GPU Neural ODE</h3> <p>Change the initial condition of the ODE solves to a CuArray to make your neural ODE GPU-accelerated.</p> <div class=footer > <p> Published from <a href=hw3.jmd >hw3.jmd</a> using <a href="http://github.com/JunoLab/Weave.jl">Weave.jl</a> v0.10.12 on 2024-12-11. </p> </div> |
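For reference, the pullback requested in Part 2 of the homework above can be written out by applying the chain rule one layer at a time. This is a sketch rather than anything taken from the homework itself: the intermediate names $z$ and $h$, the bar notation for cotangents, and the elementwise product $\odot$ are introduced here for illustration, and $y \in \mathbb{R}^2$ denotes the vector passed into the pullback.

```latex
\begin{align}
z &= W_1 u + b_1, \qquad h = \tanh(z) \ \text{(elementwise)}, \qquad NN(u) = W_2 h + b_2 \\
\bar{b}_2 &= y \\
\bar{W}_2 &= y \, h^{T} \\
\bar{h} &= W_2^{T} y \\
\bar{z} &= \bar{h} \odot \left( 1 - h \odot h \right) \\
\bar{b}_1 &= \bar{z} \\
\bar{W}_1 &= \bar{z} \, u^{T} \\
\bar{u} &= W_1^{T} \bar{z}
\end{align}
```

Under these assumptions, $B_{NN}^{u,W_i,b_i}(y)$ returns the tuple $(\bar{u}, \bar{W}_1, \bar{b}_1, \bar{W}_2, \bar{b}_2)$, with shapes matching the inputs ($\bar{W}_1$ is $50 \times 2$, $\bar{W}_2$ is $2 \times 50$). Comparing each entry against a ForwardDiff.jl Jacobian contracted with $y$, as Part 2 suggests, is a reasonable way to check an implementation.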
Binary file not shown.
Binary file not shown.