Add neural network gradient tests [JuliaCon 2024 Hackathon]

**Goal**

High-level: Support structures beyond arrays and numbers in DifferentiationInterface.

Low-level: Write tests for taking gradients of neural networks with Flux and Lux.

**Steps**

1. Read the slides of my autodiff intro: https://gdalle.github.io/JuliaCon2024-AutoDiff/#/title-slide
2. Read some of the DifferentiationInterface documentation to understand the general ideas: 
    - [DI tutorial](https://gdalle.github.io/DifferentiationInterface.jl/DifferentiationInterface/stable/tutorial1/)
    - [DI operators](https://gdalle.github.io/DifferentiationInterface.jl/DifferentiationInterface/stable/operators/)
    - [DITest tutorial](https://gdalle.github.io/DifferentiationInterface.jl/DifferentiationInterfaceTest/stable/tutorial/)
    - [DITest Scenario API](https://gdalle.github.io/DifferentiationInterface.jl/DifferentiationInterfaceTest/stable/api/#DifferentiationInterfaceTest.Scenario)
3. Read [DITest source code](https://github.com/gdalle/DifferentiationInterface.jl/blob/60cdb043fe58ef3431513c90a19e20846d979614/DifferentiationInterfaceTest/src/scenarios/default.jl#L160-L246) to understand how test scenarios are defined
4. Fork the DifferentiationInterface repository and open a pull request
5. Add a file to [`DifferentiationInterfaceTest/src/scenarios`](https://github.com/gdalle/DifferentiationInterface.jl/tree/main/DifferentiationInterfaceTest/src/scenarios) called `flux.jl`
6. Define a `GradientScenario` involving a very simple neural network built with [Flux.jl](https://github.com/FluxML/Flux.jl), for instance the one in [this tutorial](https://fluxml.ai/Flux.jl/stable/guide/models/basics/#Stacking-It-Up).
7. Usually in deep learning we differentiate with respect to the parameters of a layer. In Flux, these parameters are stored within the layer itself. So the gradient we need is the gradient of `layer(input)` with respect to `layer`! In other words, for your `GradientScenario`, the function will be `f(layer) = layer(fixed_input)`: it only applies the layer to a fixed input. Note that a gradient only makes sense for a scalar output, so either pick a network with a single output or reduce the output to a number, e.g. with `sum`.
8. Add a file to [`DifferentiationInterface/test/Single/Zygote`](https://github.com/gdalle/DifferentiationInterface.jl/tree/main/DifferentiationInterface/test/Single/Zygote) called `flux.jl` and test your scenario with [`DifferentiationInterfaceTest.test_differentiation`](https://gdalle.github.io/DifferentiationInterface.jl/DifferentiationInterfaceTest/stable/api/#DifferentiationInterfaceTest.test_differentiation). Take inspiration from the other test files.
9. Since `layer` is not an array, the returned gradient will not be an array either: it will be some structure mirroring the Flux layer (with Zygote, typically a nested `NamedTuple` of the layer's fields, I think), so you probably want to compute the ground truth with Zygote first to see how it is structured.
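
To make steps 7 and 9 concrete, here is a minimal sketch of differentiating with respect to a Flux layer using Zygote directly. It is only an illustration, not the scenario definition itself: the layer size, the name `fixed_input`, and the use of `sum` to get a scalar output are assumptions chosen for the example, and the `Dense(2 => 3)` syntax assumes a reasonably recent Flux version.

```julia
using Flux, Zygote

# A single dense layer with 2 inputs and 3 outputs (identity activation by default).
layer = Dense(2 => 3)

# Hypothetical fixed input, chosen only for illustration.
fixed_input = Float32[1, 2]

# The function takes the layer as its argument.
# `sum` reduces the 3-element output to a scalar so that a gradient is defined.
f(l) = sum(l(fixed_input))

# Zygote returns a tuple with one entry per argument of `f`; take the first.
grad = Zygote.gradient(f, layer)[1]

# Inspect how the gradient is structured before writing the ground truth.
@show typeof(grad)
@show propertynames(grad)
```

Looking at `typeof(grad)` and its fields should show how a reference gradient for the scenario needs to be shaped. Through DifferentiationInterface, the analogous call would presumably be `gradient(f, AutoZygote(), layer)`, but check the DI documentation linked above for the exact signature in your version.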

**If you need help**

- Come find me (@gdalle)
- Check out https://modernjuliaworkflows.github.io/ for Julia development tips
- Check out https://kshyatt.github.io/post/firstjuliapr/ for GitHub-related questions

**Participants**

- @nialamarcotte

Add neural network gradient tests [JuliaCon 2024 Hackathon] #346