Define a GradientScenario involving a very simple neural network built with Flux.jl, for instance the one in this tutorial.
Usually in deep learning we differentiate with respect to the parameters of a layer. In Flux, these parameters are stored within the layer itself. So the gradient we need is the gradient of layer(input) with respect to layer!!! In other words, for your GradientScenario, you will have f(layer) = layer(fixed_input) as the function (it only applies the layer to a fixed input).
Since layer is not an array, the returned type will not be an array either: the gradient will be some form of Flux layer as well (I think), so you probably want to compute the ground truth with Zygote at first to see how it is structured.
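To make the last point concrete, here is a minimal sketch of how one might inspect the ground-truth gradient with Zygote before writing the scenario. It assumes Flux and Zygote are installed and a Flux version that accepts the `Dense(in => out, activation)` pair syntax. The layer sizes, the `tanh` activation, and the `sum` reduction are illustrative choices that are not in the issue; the `sum` is there only because a gradient needs a scalar output, while `layer(fixed_input)` returns a vector.

```julia
using Flux    # assumed installed
using Zygote  # assumed installed

# Illustrative layer and input; sizes and activation are assumptions
layer = Dense(2 => 3, tanh)
fixed_input = Float32[1.0, 2.0]

# Apply the layer to the fixed input, then reduce to a scalar.
# The `sum` is an assumption added so that a gradient is well defined.
f(l) = sum(l(fixed_input))

# Zygote returns a tuple with one entry per argument of `f`
grad = Zygote.gradient(f, layer)[1]

# Inspect how the gradient is structured
println(typeof(grad))
println(propertynames(grad))
```

With these choices the result should be a NamedTuple that mirrors the fields of the layer (for example `grad.weight` and `grad.bias`) rather than an array, which is the structure the reference gradient in the scenario would have to match.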
Thanks for the experience.
I'm still traveling but I hope to look into what you did afterwards, no
promises though.
See you likely at the next JuliaCon (global).
Alain Marcotte
Above all, respect: for oneself, for others, for the environment
On Wed, 17 Jul 2024 at 10:14, Guillaume Dalle ***@***.***> wrote:
Goal
High-level: Support structures beyond arrays and numbers in DifferentiationInterface.
Low-level: Write tests for taking gradients of neural networks with Flux and Lux.
Steps
1. Create a file in DifferentiationInterfaceTest/src/scenarios called flux.jl.
2. Define a GradientScenario involving a very simple neural network built with Flux.jl, for instance the one in this tutorial. Usually in deep learning we differentiate with respect to the parameters of a layer. In Flux, these parameters are stored within the layer itself. So the gradient we need is the gradient of layer(input) with respect to layer!!! In other words, for your GradientScenario, you will have f(layer) = layer(fixed_input) as the function (it only applies the layer to a fixed input).
3. Create a file in DifferentiationInterface/test/Single/Zygote called flux.jl and test your scenario with DifferentiationInterfaceTest.test_differentiation. Take inspiration from the other test files.
4. Since layer is not an array, the returned type will not be an array either: the gradient will be some form of Flux layer as well (I think), so you probably want to compute the ground truth with Zygote at first to see how it is structured.
If you need help