Deploying to gh-pages from @ 4792fbc 🚀

Commit 9cbfba6 (1 parent: 85968e7), showing 93 changed files with 941 additions and 948 deletions.
The built 404 page (@@ -1 +1 @@) is identical before and after except for the last-modified date in its footer:

- … © Chris Rackauckas. Last modified: July 15, 2023. …
+ … © Chris Rackauckas. Last modified: October 29, 2023. …

The rest of the minified HTML (the site menu with Home, Course, Homework, Lectures, and Notes 02 through 19; the centered "404: The requested page was not found" message with a homepage link; and the Franklin.jl/Julia footer) is unchanged.
The built homework page (from hw3.jmd, @@ -1 +1 @@) likewise changes only its footer: the Weave.jl publish date moves from 2023-07-15 to 2023-10-29. The page reads:
Neural Ordinary Differential Equation Adjoints

Chris Rackauckas
November 20th, 2020

In this homework, we will write an implementation of neural ordinary differential equations from scratch. You may use the DifferentialEquations.jl ODE solver, but not its adjoint sensitivity functionality. Optionally, a second problem is to add GPU support to your implementation.

Due December 9th, 2020 at midnight. Please email the results to 18337.mit.psets@gmail.com.

Problem 1: Neural ODE from Scratch

In this problem we will work through the development of a neural ODE.

Part 1: Gradients as vjps

Use the definition of the pullback as a vector-Jacobian product (vjp) to show that $B_f^x(1) = \left( \nabla f(x) \right)^{T}$ for a function $f: \mathbb{R}^n \rightarrow \mathbb{R}$.

(Hint: if you put 1 into the pullback, what kind of function is it? What does the Jacobian look like?)
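For orientation, here is how the pieces line up, a hint sketch only, with the usual vjp convention $B_f^x(v) = v^{T} J_f(x)$ assumed:

\[ B_f^x(1) = 1 \cdot J_f(x) = \begin{pmatrix} \frac{\partial f}{\partial x_1} & \cdots & \frac{\partial f}{\partial x_n} \end{pmatrix} = \left( \nabla f(x) \right)^{T}, \]

since for $f: \mathbb{R}^n \rightarrow \mathbb{R}$ the Jacobian is a single $1 \times n$ row and the seed being multiplied in is the scalar $1$.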
Part 2: Backpropagation of a neural network

Implement a simple $NN: \mathbb{R}^2 \rightarrow \mathbb{R}^2$ neural network

\[ NN(u; W_i, b_i) = W_2 \tanh.(W_1 u + b_1) + b_2 \]

where $W_1$ is $50 \times 2$, $b_1$ is length 50, $W_2$ is $2 \times 50$, and $b_2$ is length 2. Implement the pullback of the neural network, $B_{NN}^{u,W_i,b_i}(y)$, to calculate the derivative of the neural network with respect to each of these inputs. Check for correctness by using ForwardDiff.jl to calculate the gradient.
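For reference, a minimal Julia sketch of this network and a hand-derived pullback, checked against ForwardDiff.jl; the random initialization and the named-tuple return shape are illustrative choices, not part of the assignment:

using ForwardDiff

# Network parameters at the sizes given above (random initialization is
# an illustrative choice).
W1, b1 = randn(50, 2), randn(50)
W2, b2 = randn(2, 50), randn(2)

NN(u, W1, b1, W2, b2) = W2 * tanh.(W1 * u .+ b1) .+ b2

# Pullback: given the output sensitivity y, return the vjp with respect
# to each input, derived by hand from the chain rule.
function NN_pullback(y, u, W1, b1, W2, b2)
    z = W1 * u .+ b1          # pre-activation
    h = tanh.(z)              # hidden layer
    h̄ = W2' * y              # sensitivity flowing into the tanh output
    z̄ = h̄ .* (1 .- h .^ 2)  # tanh'(z) = 1 - tanh(z)^2
    (u = W1' * z̄, W1 = z̄ * u', b1 = z̄, W2 = y * h', b2 = y)
end

# Correctness check: the u-gradient of the scalar y' * NN(u) must match.
u, y = randn(2), randn(2)
g = ForwardDiff.gradient(v -> y' * NN(v, W1, b1, W2, b2), u)
@assert NN_pullback(y, u, W1, b1, W2, b2).u ≈ g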
Part 3: Implementing an ODE adjoint

The adjoint of an ODE can be described as the set of vector equations

\[ u' = f(u, p, t) \]

solved forward in time, and then

\[ \begin{aligned} \lambda' &= -\lambda^{\ast} \frac{\partial f}{\partial u} \\ \mu' &= -\lambda^{\ast} \frac{\partial f}{\partial p} \end{aligned} \]

solved in reverse time from $T$ to $0$, for some cost function $C(p)$. For this problem, we will use the L2 loss function.

Note that $\mu(T) = 0$ and $\lambda(T) = \frac{\partial C}{\partial u(T)}$. This is written in the form where the only data point is at time $T$. If that is not the case, the reverse solve needs to add the jump $\frac{\partial C}{\partial u(t_i)}$ to $\lambda$ at each data point $u(t_i)$. See https://diffeq.sciml.ai/stable/features/callback_functions/#Example-1:-Interventions-at-Preset-Times for how to add these jumps to the equation.

Using this formulation of the adjoint, it holds that $\mu(0) = \frac{\partial C}{\partial p}$, so solving these ODEs in reverse yields the gradient as part of the system's state at time zero.

Notice that $B_f^u(\lambda) = \lambda^{\ast} \frac{\partial f}{\partial u}$, and similarly for the $\mu$ equation. Implement an adjoint calculation for a neural ordinary differential equation

\[ u' = NN(u) \]

using the network from above. Solve the ODE forward using OrdinaryDiffEq.jl's Tsit5() integrator, then use the interpolation from the forward pass for the u values of the backpass and solve.

(Note: you will want to double-check this gradient using something like ForwardDiff.jl! Start by measuring only the data point at the end, then try multiple data points.)
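A minimal sketch of how such an adjoint solve can be wired together, assuming the single-data-point case (L2 loss against a target udata measured only at $T$). The function name is hypothetical, and ForwardDiff.jl stands in for the hand-written vjps from Part 2:

using OrdinaryDiffEq, ForwardDiff

# f(u, p) is the neural network with its parameters flattened into p.
function adjoint_gradient(f, u0, p, T, udata)
    # Forward pass; the default dense output lets fwd(t) interpolate u.
    fwd = solve(ODEProblem((u, p, t) -> f(u, p), u0, (0.0, T), p), Tsit5())

    nu, np = length(u0), length(p)
    λT = fwd(T) .- udata      # λ(T) = ∂C/∂u(T) for C = ½‖u(T) - udata‖²
    z0 = vcat(λT, zeros(np))  # μ(T) = 0

    function adj!(dz, z, p, t)
        λ = z[1:nu]
        u = fwd(t)            # interpolated forward solution
        # λ* ∂f/∂u and λ* ∂f/∂p as vjps; swap in your Part 2 pullback here.
        dz[1:nu]     .= -(ForwardDiff.jacobian(v -> f(v, p), u)' * λ)
        dz[nu+1:end] .= -(ForwardDiff.jacobian(q -> f(u, q), p)' * λ)
    end

    # Reverse-time solve from T back to 0; μ(0) = ∂C/∂p.
    bwd = solve(ODEProblem(adj!, z0, (T, 0.0), p), Tsit5())
    bwd(0.0)[nu+1:end]
end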
Part 4: Training the neural ODE

Generate data from the ODE $u' = Au$ where A = [-0.1 2.0; -2.0 -0.1] at t = 0.0:0.1:1.0 (use saveat) with $u(0) = [2, 0]$. Define the cost function C(θ) to be the Euclidean distance between the neural ODE's solution and the data. Optimize this cost function by gradient descent, where the gradient is your adjoint method's output.

(Note: calculate the cost and the gradient at the same time by using the forward pass to compute the cost and then reusing its solution in the adjoint for the interpolation. In that case, do not use saveat in the forward pass, because the saved solution would then only interpolate linearly between the save points; instead, post-interpolate at the data points.)
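A sketch of the data generation and a plain gradient-descent loop; cost_and_grad is a hypothetical helper wrapping the forward solve (for the cost) and the adjoint solve (for the gradient), and the learning rate and iteration count are illustrative:

using OrdinaryDiffEq

# Training data from the linear ODE u' = Au, saved on the requested grid.
A = [-0.1 2.0; -2.0 -0.1]
dataprob = ODEProblem((u, p, t) -> A * u, [2.0, 0.0], (0.0, 1.0))
data = Array(solve(dataprob, Tsit5(), saveat = 0.0:0.1:1.0))

# cost_and_grad(p) is assumed to run the forward solve (without saveat, so
# the interpolation stays high-order), compute the Euclidean-distance cost
# against `data`, and return that cost alongside the adjoint gradient.
function train(p; η = 1e-2, iters = 500)
    for i in 1:iters
        C, g = cost_and_grad(p)
        p = p .- η .* g        # plain gradient descent
        i % 50 == 0 && @show (i, C)
    end
    return p
end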
(Optional) Problem 2: Array-Based GPU Computing

If you have access to a GPU, you may wish to try the following.

Part 1: GPU Neural Network

Change your neural network to be GPU-accelerated by using CuArrays.jl for the underlying array types.

Part 2: GPU Neural ODE

Change the initial condition of the ODE solves to a CuArray to make your neural ODE GPU-accelerated.
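A minimal sketch of the GPU move. CuArrays.jl has since been absorbed into CUDA.jl, so the cu conversions below are the modern spelling of the same idea; the Float32 element type is an assumption (GPUs strongly favor single precision):

using CUDA

# Move the Part 2 parameters to the GPU; broadcasting in the network
# definition then runs as GPU kernels.
W1g, b1g, W2g, b2g = cu(W1), cu(b1), cu(W2), cu(b2)
NNg(u) = W2g * tanh.(W1g * u .+ b1g) .+ b2g

# A CuArray initial condition makes the ODE solve itself GPU-accelerated.
u0g = cu(Float32[2.0, 0.0])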
Published from hw3.jmd using Weave.jl v0.10.9 on 2023-10-29.