Goal
High-level: Support structures beyond arrays and numbers in DifferentiationInterface.
Low-level: Write tests for taking gradients of neural networks with Flux and Lux.
Steps
- Read the slides of my autodiff intro https://gdalle.github.io/JuliaCon2024-AutoDiff/#/title-slide
- Read some of the DifferentiationInterface documentation to understand the general ideas
- Read the DifferentiationInterfaceTest source code to understand how test scenarios are defined
- Fork the DifferentiationInterface repository and open a pull request
- Add a file to `DifferentiationInterfaceTest/src/scenarios` called `flux.jl`
- Define a `GradientScenario` involving a very simple neural network built with Flux.jl, for instance the one in this tutorial.
  - Usually in deep learning we differentiate with respect to the parameters of a layer. In Flux, these parameters are stored within the layer itself. So the gradient we need is the gradient of `layer(input)` with respect to `layer`! In other words, for your `GradientScenario`, the function will be `f(layer) = layer(fixed_input)` (it only applies the layer to a fixed input). See the first sketch after this list.
- Add a file to `DifferentiationInterface/test/Single/Zygote` called `flux.jl` and test your scenario with `DifferentiationInterfaceTest.test_differentiation`. Take inspiration from the other test files. See the second sketch after this list.
- Since `layer` is not an array, the gradient will not be an array either: it will be some form of Flux layer as well (I think), so you probably want to compute the ground truth with Zygote at first to see how it is structured.
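To make the `f(layer) = layer(fixed_input)` idea concrete, here is a minimal sketch in plain Flux + Zygote, outside of any scenario. One caveat: gradient operators expect a scalar output, so the sketch sums the layer's output; the layer size and activation are illustrative choices, not part of the task.

```julia
using Flux, Zygote

# a very simple Flux layer; size and activation are arbitrary choices
layer = Dense(2 => 3, relu)

# the input is held constant: we differentiate with respect to the layer
fixed_input = randn(Float32, 2)

# gradient operators expect a scalar output, hence the sum
f(l) = sum(l(fixed_input))

# Zygote returns a 1-tuple; its element mirrors the layer's structure
# (for Dense: a NamedTuple with fields weight, bias and σ)
true_grad = only(Zygote.gradient(f, layer))
```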
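From there, the scenario and the test call could look roughly like the following. The exact `GradientScenario` fields and `test_differentiation` keywords are assumptions on my part: copy the real signatures from the existing scenario and test files rather than from this sketch.

```julia
using DifferentiationInterfaceTest
using ADTypes: AutoZygote  # backend type defined in ADTypes.jl

# hypothetical constructor call: check src/scenarios for the actual fields
scen = GradientScenario(f; x=layer, y=f(layer), grad=true_grad)

# run the standard correctness checks against the Zygote backend
test_differentiation(AutoZygote(), [scen]; correctness=true)
```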
If you need help
- Come find me (@gdalle)
- Check out https://modernjuliaworkflows.github.io/ for the Julia development tricks
- Check out https://kshyatt.github.io/post/firstjuliapr/ for the GitHub-related issues