Added CPUStatic backend and implemented new initialparameters interface. #22

Merged · 12 commits · Dec 5, 2024
19 changes: 19 additions & 0 deletions .githooks/pre-push
@@ -0,0 +1,19 @@
# pre-push git hook that runs all tests before pushing

red='\033[0;31m'
green='\033[0;32m'
no_color='\033[0m'

reponame=$(basename `git rev-parse --show-toplevel`)


echo "\nRunning pre-push hook\n"
echo "Testing $reponame"
julia --project=@. -e "using Pkg; Pkg.test(\"AbstractNeuralNetworks\")"

if [[ $? -ne 0 ]]; then
echo "\n${red}ERROR - Tests must pass before push!\n${no_color}"
exit 1
fi

echo "\n${green}Git hook was SUCCESSFUL!${no_color}\n"
4 changes: 4 additions & 0 deletions Project.toml
@@ -4,14 +4,18 @@ authors = ["Michael Kraus"]
version = "0.4.0"

[deps]
GPUArraysCore = "46192b85-c4d5-4398-a991-12ede77f4527"
HDF5 = "f67ccb44-e63f-5c2f-98bd-6dc0ccc4ba2f"
KernelAbstractions = "63c18a36-062a-441e-b654-da1e3ab1ce7c"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
StaticArrays = "90137ffa-7385-5640-81b9-e52037218182"

[compat]
GPUArraysCore = "0.2.0"
HDF5 = "0.17.2"
KernelAbstractions = "0.9"
StaticArrays = "1.9.8"
julia = "1.6"

[extras]
8 changes: 8 additions & 0 deletions README.md
@@ -8,3 +8,11 @@

This package implements abstract and general data structures for the construction of neural networks, e.g., layers, chains, and architectures.
It mainly serves as a common base package for [GeometricMachineLearning.jl](https://github.com/JuliaGNI/GeometricMachineLearning.jl) and [SymbolicNetworks.jl](https://github.com/JuliaGNI/SymbolicNetworks.jl).


## Development

We use git hooks, e.g., to enforce that all tests pass before pushing. To activate these hooks, run the following command once:
```
git config core.hooksPath .githooks
```
2 changes: 2 additions & 0 deletions docs/Project.toml
@@ -1,3 +1,5 @@
[deps]
AbstractNeuralNetworks = "60874f82-5ada-4c70-bd1c-fa6be7711c8a"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
DocumenterCitations = "daee34ce-89f3-4625-b898-19384cb65244"
StaticArrays = "90137ffa-7385-5640-81b9-e52037218182"
19 changes: 16 additions & 3 deletions docs/make.jl
@@ -1,13 +1,24 @@
using AbstractNeuralNetworks
using Documenter
using DocumenterCitations
import Pkg

PROJECT_TOML = Pkg.TOML.parsefile(joinpath(@__DIR__, "..", "Project.toml"))
VERSION = PROJECT_TOML["version"]
NAME = PROJECT_TOML["name"]
AUTHORS = join(PROJECT_TOML["authors"], ", ") * " and contributors"
GITHUB = "https://github.com/JuliaGNI/AbstractNeuralNetworks.jl"

bib = CitationBibliography(joinpath(@__DIR__, "src", "AbstractNeuralNetworks.bib"))

DocMeta.setdocmeta!(AbstractNeuralNetworks, :DocTestSetup, :(using AbstractNeuralNetworks); recursive=true)

makedocs(;
plugins=[bib],
modules=[AbstractNeuralNetworks],
authors="Michael Kraus",
authors=AUTHORS,
repo="https://github.com/JuliaGNI/AbstractNeuralNetworks.jl/blob/{commit}{path}#{line}",
sitename="AbstractNeuralNetworks.jl",
sitename=NAME,
format=Documenter.HTML(;
prettyurls=get(ENV, "CI", "false") == "true",
canonical="https://JuliaGNI.github.io/AbstractNeuralNetworks.jl",
@@ -16,11 +27,13 @@ makedocs(;
),
pages=[
"Home" => "index.md",
"Static Neural Network Parameters" => "static_neural_network_parameters.md",
"References" => "bibliography.md"
],
)

deploydocs(;
repo = "github.com/JuliaGNI/AbstractNeuralNetworks.jl",
repo = GITHUB,
devurl = "latest",
devbranch = "main",
)
8 changes: 8 additions & 0 deletions docs/src/AbstractNeuralNetworks.bib
@@ -0,0 +1,8 @@
@inproceedings{glorot2010understanding,
title={Understanding the difficulty of training deep feedforward neural networks},
author={Glorot, Xavier and Bengio, Yoshua},
booktitle={Proceedings of the thirteenth international conference on artificial intelligence and statistics},
pages={249--256},
year={2010},
organization={JMLR Workshop and Conference Proceedings}
}
5 changes: 5 additions & 0 deletions docs/src/bibliography.md
@@ -0,0 +1,5 @@
# References

```@bibliography
*
```
51 changes: 51 additions & 0 deletions docs/src/static_neural_network_parameters.md
@@ -0,0 +1,51 @@
# Static Neural Network Parameters

We can also allocate neural network parameters using [`StaticArrays`](https://github.com/JuliaArrays/StaticArrays.jl). To do so, we simply need to set the keyword `static` to `true` in the [`NeuralNetwork`](@ref) constructor.

!!! warning
    Static neural network parameters are only supported for dense CPU arrays. `AbstractNeuralNetworks` defines the backend type `CPUStatic`, but there are no equivalent GPU objects.

```@example static_parameters
using AbstractNeuralNetworks
import Random
Random.seed!(123)

backend = AbstractNeuralNetworks.CPUStatic()
input_dim = 2
n_hidden_layers = 100
c = Chain(Dense(input_dim, 10, tanh), Tuple(Dense(10, 10, tanh) for _ in 1:n_hidden_layers)..., Dense(10, 1, tanh))
nn = NeuralNetwork(c, backend)
typeof(nn.params.L1.W)
```

We can now compare evaluation times for the static and the regular CPU backend:
```@example static_parameters
nn_cpu = changebackend(CPU(), nn)
second_dim = 200
x = rand(input_dim, second_dim)
nn(x); # hide
@time nn(x);
nothing # hide
```

```@example static_parameters
nn_cpu(x); # hide
@time nn_cpu(x);
nothing # hide
```

If we also make the *input* static, we get:

```@example static_parameters
using StaticArrays
x = @SMatrix rand(input_dim, second_dim)
nn(x);
@time nn(x);
nothing # hide
```

```@example static_parameters
nn_cpu(x); # hide
@time nn_cpu(x);
nothing # hide
```
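For more robust measurements than a single `@time` call, one could also use [BenchmarkTools.jl](https://github.com/JuliaCI/BenchmarkTools.jl); this is only a sketch, and BenchmarkTools is not a dependency of this package:

```julia
using BenchmarkTools  # assumed to be installed separately

@btime nn($x);      # static parameters, static input
@btime nn_cpu($x);  # regular CPU parameters
```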
12 changes: 11 additions & 1 deletion src/AbstractNeuralNetworks.jl
@@ -3,11 +3,13 @@ module AbstractNeuralNetworks
using HDF5
using HDF5: H5DataStore
using KernelAbstractions
using GPUArraysCore: AbstractGPUArray
using LinearAlgebra
using StaticArrays
using Random

export CPU, GPU

include("utils/add.jl")
include("utils/zero_vector.jl")

@@ -23,6 +25,11 @@

include("parameters.jl")

include("static_cpu_backend.jl")

export NeuralNetworkBackend, networkbackend

include("neural_network_backend.jl")

export OneInitializer, ZeroInitializer, GlorotUniform

@@ -67,4 +74,7 @@ module AbstractNeuralNetworks
include("pullback.jl")

export AbstractPullback

export changebackend
include("utils/changebackend.jl")
end
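As a rough sketch of how the new exports fit together (reusing the constructors shown in the documentation page above; the names `CPUStatic`, `changebackend` and `networkbackend` all come from this PR), one can build a network on the static CPU backend and move it to the regular one:

```julia
using AbstractNeuralNetworks

c  = Chain(Dense(2, 10, tanh), Dense(10, 1, tanh))
nn = NeuralNetwork(c, AbstractNeuralNetworks.CPUStatic())

nn_cpu = changebackend(CPU(), nn)   # convert the static parameters to plain CPU arrays
networkbackend(nn_cpu.params.L1.W)  # expected to return CPU()
```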
4 changes: 3 additions & 1 deletion src/architecture.jl
@@ -1,4 +1,6 @@

"""
Architecture
"""
abstract type Architecture end

struct UnknownArchitecture <: Architecture end
2 changes: 1 addition & 1 deletion src/cells/abstract.jl
@@ -5,7 +5,7 @@ An `AbstractCell` is a map from $\mathbb{R}^{M}×\mathbb{R}^{N} \rightarrow \mat

Concrete cell types should implement the following functions:

- `initialparameters(backend::Backend, ::Type{T}, cell::AbstractCell; init::Initializer = default_initializer(), rng::AbstractRNG = Random.default_rng())`
- `initialparameters(backend::NeuralNetworkBackend, ::Type{T}, cell::AbstractCell; init::Initializer = default_initializer(), rng::AbstractRNG = Random.default_rng())`
- `update!(::AbstractLayer, θ::NamedTuple, dθ::NamedTuple, η::AbstractFloat)`

and the functors
2 changes: 1 addition & 1 deletion src/cells/grid.jl
@@ -31,7 +31,7 @@
return Expr(:block, calls...)
end

function initialparameters(gridcell::GridCell, backend::Backend, ::Type{T}; kwargs...) where {T}
function initialparameters(gridcell::GridCell, backend::NeuralNetworkBackend, ::Type{T}; kwargs...) where {T}

M, N = size(gridcell)
[initialparameters(cell(gridcell, i, j), backend, T; kwargs...) for i in 1:M, j in 1:N]
end
2 changes: 1 addition & 1 deletion src/cells/gru.jl
@@ -17,7 +17,7 @@
end


function initialparameters(cell::GRU{M, N, O, P}, backend::Backend, ::Type{T}; init::Initializer = default_initializer(), rng::AbstractRNG = Random.default_rng()) where {M,N,O,P,T}
function initialparameters(cell::GRU{M, N, O, P}, backend::NeuralNetworkBackend, ::Type{T}; init::Initializer = default_initializer(), rng::AbstractRNG = Random.default_rng()) where {M,N,O,P,T}

Wᵣₓ = KernelAbstractions.zeros(backend, T, N, M)
Wᵣₕ = KernelAbstractions.zeros(backend, T, N, N)
Wᵤₓ = KernelAbstractions.zeros(backend, T, N, M)
2 changes: 1 addition & 1 deletion src/cells/identity.jl
@@ -7,7 +7,7 @@
return (x, st)
end

function initialparameters(cell::IdentityCell{M, N, O, P}, backend::Backend, ::Type{T}; init::Initializer = default_initializer(), rng::AbstractRNG = Random.default_rng()) where {M,N,O, P, T}
function initialparameters(cell::IdentityCell{M, N, O, P}, backend::NeuralNetworkBackend, ::Type{T}; init::Initializer = default_initializer(), rng::AbstractRNG = Random.default_rng()) where {M,N,O, P, T}

NamedTuple()
end

2 changes: 1 addition & 1 deletion src/cells/lstm.jl
@@ -20,7 +20,7 @@
end


function initialparameters(cell::LSTM{M, N, O, P}, backend::Backend, ::Type{T}; init::Initializer = default_initializer(), rng::AbstractRNG = Random.default_rng()) where {M,N,O,P,T}
function initialparameters(cell::LSTM{M, N, O, P}, backend::NeuralNetworkBackend, ::Type{T}; init::Initializer = default_initializer(), rng::AbstractRNG = Random.default_rng()) where {M,N,O,P,T}

Wfₓ = KernelAbstractions.zeros(backend, T, O, M)
Wfₕ = KernelAbstractions.zeros(backend, T, O, O)
Wᵢₓ = KernelAbstractions.zeros(backend, T, O, M)
4 changes: 2 additions & 2 deletions src/cells/recurrent.jl
@@ -30,7 +30,7 @@

usebias(::Recurrent{M, N, O, P, BIAS}) where {M, N, O, P, BIAS} = BIAS

function initialparameters(cell::Recurrent{M, N, O, P}, backend::Backend, ::Type{T}; init::Initializer = default_initializer(), rng::AbstractRNG = Random.default_rng()) where {M,N,O,P,T}
function initialparameters(cell::Recurrent{M, N, O, P}, backend::NeuralNetworkBackend, ::Type{T}; init::Initializer = default_initializer(), rng::AbstractRNG = Random.default_rng()) where {M,N,O,P,T}

Wₛₛ = KernelAbstractions.zeros(backend, T, P, N)
Wₛₓ = KernelAbstractions.zeros(backend, T, P, M)
Wₒₛ = KernelAbstractions.zeros(backend, T, O, P)
@@ -44,7 +44,7 @@
(Wₛₛ = Wₛₛ, Wₛₓ = Wₛₓ, Wₒₛ = Wₒₛ, bₛ = bₛ, bₒ = bₒ)
end

function initialparameters(cell::Recurrent{M, N, 0, P}, backend::Backend, ::Type{T}; init::Initializer = default_initializer(), rng::AbstractRNG = Random.default_rng()) where {M,N,P,T}
function initialparameters(cell::Recurrent{M, N, 0, P}, backend::NeuralNetworkBackend, ::Type{T}; init::Initializer = default_initializer(), rng::AbstractRNG = Random.default_rng()) where {M,N,P,T}

Wₛₛ = KernelAbstractions.zeros(backend, T, P, N)
Wₛₓ = KernelAbstractions.zeros(backend, T, P, M)
bₛ = KernelAbstractions.zeros(backend, T, P)
16 changes: 4 additions & 12 deletions src/chain.jl
@@ -9,7 +9,7 @@ Chain(layers...)
```
or a neural network architecture together with a backend and a parameter type:
```
Chain(::Architecture, ::Backend, ::Type; kwargs...)
Chain(::Architecture, ::NeuralNetworkBackend, ::Type; kwargs...)
Chain(::Architecture, ::Type; kwargs...)
```
If the backend is omitted, the default backend `CPU()` is chosen.
@@ -46,20 +46,12 @@ end

@inline applychain(layers::Tuple, x, ps::Union{NamedTuple,NeuralNetworkParameters}) = applychain(layers, x, values(ps))

function initialparameters(model::Chain, backend::Backend, ::Type{T}; kwargs...) where {T <: Number}
function initialparameters(rng::AbstractRNG, initializer::Initializer, model::Chain, backend::NeuralNetworkBackend, ::Type{T}; kwargs...) where T
keys = Tuple(Symbol("L$(i)") for i in eachindex(model))
vals = Tuple(initialparameters(layer, backend, T; kwargs...) for layer in model)
NamedTuple{keys}(vals)
vals = Tuple(initialparameters(rng, initializer, layer, backend, T; kwargs...) for layer in model)
NeuralNetworkParameters{keys}(vals)
end

initialparameters(model::Chain, ::Type{T}; kwargs...) where {T <: Number} = initialparameters(model, CPU(), T; kwargs...)

initialparameters(model::Chain, backend::Backend; kwargs...) = initialparameters(model, backend, Float32; kwargs...)

initialparameters(model::Chain, backend::CPU; kwargs...) = initialparameters(model, backend, Float64; kwargs...)

initialparameters(model::Chain; kwargs...) = initialparameters(model, CPU(); kwargs...)

function update!(chain::Chain, params::Tuple, grad::Tuple, η::AbstractFloat)
for (layer, θ, dθ) in zip(chain, params, grad)
update!(layer, θ, dθ, η)
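For illustration, a minimal sketch of the new `initialparameters` interface for a `Chain`, with the RNG, the initializer, the backend and the number type all passed explicitly (the convenience methods with default arguments were removed above):

```julia
using AbstractNeuralNetworks
import Random

c  = Chain(Dense(2, 5, tanh), Dense(5, 1, tanh))
ps = AbstractNeuralNetworks.initialparameters(Random.default_rng(), GlorotUniform(), c, CPU(), Float64)

ps.L1.W  # 5×2 weight matrix of the first layer
```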
35 changes: 27 additions & 8 deletions src/initializer.jl
@@ -1,24 +1,43 @@
"""
Initializer

abstract type AbstractInitializer end
Determines how neural network weights are initialized.
"""
abstract type Initializer end

const Initializer = Union{AbstractInitializer, Base.Callable}
"""
ZeroInitializer <: Initializer
"""
struct ZeroInitializer <: Initializer end

struct ZeroInitializer <: AbstractInitializer end
function (::ZeroInitializer)(_, x)
x .= KernelAbstractions.zero(x)

nothing

end

struct OneInitializer <: AbstractInitializer end
"""
OneInitializer <: Initializer
"""
struct OneInitializer <: Initializer end

function (::OneInitializer)(_, x::AbstractArray{T}) where T
backend = get_backend(x)
backend = networkbackend(x)
x .= KernelAbstractions.ones(backend, T, size(x))

nothing
end

default_initializer() = randn!
"""
GlorotUniform <: Initializer

struct GlorotUniform <: AbstractNeuralNetworks.AbstractInitializer end
Glorot uniform was introduced by [glorot2010understanding](@cite).
"""
struct GlorotUniform <: Initializer end

function (::GlorotUniform)(rng, x::AbstractVecOrMat{T}) where T
rand!(rng, x)
x .= sqrt(T(24.0) / sum(size(x))) * (x .- T(0.5))
end
end

const DefaultInitializer = GlorotUniform
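A short sketch of how these initializers act as functors: each is called as `init(rng, x)` and fills the array `x` in place (using a plain `Matrix` here is an assumption about what `networkbackend` accepts):

```julia
using AbstractNeuralNetworks
import Random

W = zeros(Float64, 5, 3)
GlorotUniform()(Random.default_rng(), W)   # uniform samples in ±sqrt(6 / (5 + 3))
OneInitializer()(Random.default_rng(), W)  # overwrites W with ones
```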
2 changes: 1 addition & 1 deletion src/layers/abstract.jl
@@ -5,7 +5,7 @@ An `AbstractLayer` is a map from $\mathbb{R}^{M} \rightarrow \mathbb{R}^{N}$.

Concrete layer types should implement the following functions:

- `initialparameters(backend::Backend, ::Type{T}, layer::AbstractLayer; init::Initializer = default_initializer(), rng::AbstractRNG = Random.default_rng())`
- `initialparameters(backend::NeuralNetworkBackend, ::Type{T}, layer::AbstractLayer; init::Initializer = default_initializer(), rng::AbstractRNG = Random.default_rng())`
- `update!(::AbstractLayer, θ::NamedTuple, dθ::NamedTuple, η::AbstractFloat)`

and the functors
4 changes: 2 additions & 2 deletions src/layers/dense.jl
@@ -25,15 +25,15 @@ end

usebias(::Dense{M, N, BIAS}) where {M, N, BIAS} = BIAS

function initialparameters(layer::Dense{M,N,true}, backend::Backend, ::Type{T}; init::Initializer = default_initializer(), rng::AbstractRNG = Random.default_rng()) where {M,N,T}
function initialparameters(rng::AbstractRNG, init::Initializer, ::Dense{M,N,true}, backend::NeuralNetworkBackend, ::Type{T}) where {M,N,T}
W = KernelAbstractions.zeros(backend, T, N, M)
b = KernelAbstractions.zeros(backend, T, N)
init(rng, W)
init(rng, b)
(W = W, b = b)
end

function initialparameters(layer::Dense{M,N,false}, backend::Backend, ::Type{T}; init::Initializer = default_initializer(), rng::AbstractRNG = Random.default_rng()) where {M,N,T}
function initialparameters(rng::AbstractRNG, init::Initializer, ::Dense{M,N,false}, backend::NeuralNetworkBackend, ::Type{T}) where {M,N,T}
W = KernelAbstractions.zeros(backend, T, N, M)
init(rng, W)
(W = W,)