Add Parallel and Bi layer #662
Conversation
To enable a bidirectional model, `Base.reverse()` is defined for common Flux data structures.
I messed my original fork up and therefore created a new one.
The approach is still the same: the user adds some layers and defines custom mappings and inverses for the input. After the mapping phase comes the reduce phase, which defaults to concatenation of the parallel outputs:

```julia
mutable struct Parallel
    layers::Vector
    map::Vector{Function}
    inv::Vector{Function}
    reduce::Function
end

function (p::Parallel)(xs)
    # y = map(x) |> f |> map^-1
    apply(l) = p.inv[l](p.layers[l](p.map[l](xs)))
    Z = map(l -> apply(l), eachindex(p.layers))
    p.reduce(Z)
end
```

For the full implementation see my fork and the unit test.
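For concreteness, a minimal usage sketch of the struct above, assuming the default positional constructor and a `vcat`-based reduce (both are assumptions; the branch's actual constructor and default reduce may differ):

```julia
using Flux

# Two branches over the same input; identity mappings and inverses,
# outputs concatenated by the reduce function.
p = Parallel([Dense(10, 5), Dense(10, 3)],  # layers
             [identity, identity],          # map: applied to each branch's input
             [identity, identity],          # inv: applied to each branch's output
             Z -> vcat(Z...))               # reduce: concatenate branch outputs

x = rand(Float32, 10)
y = p(x)                                    # 8-element output
```

A bidirectional wrapper would then pair `identity` with `reverse` as the map/inv of the backward branch, which is where the `Base.reverse()` definitions for Flux data structures come in.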
Is there a reason that …
@Tokazama you mean something like this? I just was unaware of where … see my …. I've added a …
Well, technically it is possible, but I don't know whether this would hinder more than help support for `Flux.gpu()`? @MikeInnes @ViralBShah Also it would add …
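(For reference, the usual mechanism for letting `Flux.gpu` and `params` reach into a custom layer in this era of Flux was to declare the struct tree-like; whether that traversal behaves sensibly with the `Vector`/`Vector{Function}` fields here is exactly the open question above, so this is only a sketch of the mechanism, not a claim that it works for `Parallel` as written.)

```julia
# Flux ~0.8 API (later renamed to `@functor`): registers the struct's fields
# as children so `gpu`, `params`, etc. traverse into the wrapped layers.
Flux.@treelike Parallel
```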
If I try to add GPU support with an explicit mapping function, which preallocates the intermediate vector:

```julia
# explicit mapping - preallocate the result vector instead of building it with `map`
first = apply(1)
Z = Vector{typeof(first)}(UndefInitializer(), length(p.layers))
for l in eachindex(p.layers)
    if l == 1
        Z[l] = first
    else
        Z[l] = apply(l)
    end
end
```

then I run into the following error (without the explicit mapping I run into the same error, but I thought this would maybe help with the GPU inference stuff):

```
models using a `Parallel` layer: Error During Test at /home/sebastian/develop/julia/Flux.jl/test/layers/parallel.jl:40
Got exception outside of a @test
ReadOnlyMemoryError()
Stacktrace:
[1] gemv!(::Char, ::Float32, ::Array{Float32,2}, ::CuArray{Float32,1}, ::Float32, ::CuArray{Float32,1}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/LinearAlgebra/src/blas.jl:577
[2] gemv!(::CuArray{Float32,1}, ::Char, ::Array{Float32,2}, ::CuArray{Float32,1}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/LinearAlgebra/src/matmul.jl:360
[3] mul! at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/LinearAlgebra/src/matmul.jl:64 [inlined]
[4] * at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/LinearAlgebra/src/matmul.jl:46 [inlined]
[5] _forward at /home/sebastian/develop/julia/Flux.jl/src/tracker/lib/array.jl:370 [inlined]
[6] #track#1 at /home/sebastian/develop/julia/Flux.jl/src/tracker/Tracker.jl:51 [inlined]
[7] track at /home/sebastian/develop/julia/Flux.jl/src/tracker/Tracker.jl:51 [inlined]
[8] * at /home/sebastian/develop/julia/Flux.jl/src/tracker/lib/array.jl:362 [inlined]
[9] Dense at /home/sebastian/develop/julia/Flux.jl/src/layers/basic.jl:82 [inlined]
[10] (::Dense{typeof(identity),TrackedArray{…,Array{Float32,2}},TrackedArray{…,Array{Float32,1}}})(::CuArray{Float32,1}) at /home/sebastian/develop/julia/Flux.jl/src/layers/basic.jl:122
[11] (::getfield(Flux, Symbol("#apply#104")){Parallel,CuArray{Float32,1}})(::Int64) at /home/sebastian/develop/julia/Flux.jl/src/layers/parallel.jl:67
[12] (::Parallel)(::CuArray{Float32,1}) at /home/sebastian/develop/julia/Flux.jl/src/layers/parallel.jl:73
[13] applychain(::Tuple{Parallel,Dense{typeof(identity),TrackedArray{…,Array{Float32,2}},TrackedArray{…,Array{Float32,1}}}}, ::CuArray{Float32,1}) at /home/sebastian/develop/julia/Flux.jl/src/layers/basic.jl:31
[14] (::Chain{Tuple{Parallel,Dense{typeof(identity),TrackedArray{…,Array{Float32,2}},TrackedArray{…,Array{Float32,1}}}}})(::CuArray{Float32,1}) at /home/sebastian/develop/julia/Flux.jl/src/layers/basic.jl:33
[15] top-level scope at /home/sebastian/develop/julia/Flux.jl/test/layers/parallel.jl:46
[16] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Test/src/Test.jl:1156
[17] top-level scope at /home/sebastian/develop/julia/Flux.jl/test/layers/parallel.jl:40
[18] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Test/src/Test.jl:1083
[19] top-level scope at /home/sebastian/develop/julia/Flux.jl/test/layers/parallel.jl:10
[20] include at ./boot.jl:326 [inlined]
[21] include_relative(::Module, ::String) at ./loading.jl:1038
[22] include(::Module, ::String) at ./sysimg.jl:29
[23] include(::String) at ./client.jl:403
[24] top-level scope at none:0
[25] eval(::Module, ::Any) at ./boot.jl:328
[26] eval_user_input(::Any, ::REPL.REPLBackend) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/REPL/src/REPL.jl:85
[27] macro expansion at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/REPL/src/REPL.jl:117 [inlined]
[28] (::getfield(REPL, Symbol("##26#27")){REPL.REPLBackend})() at ./task.jl:259
```
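Reading the types in the trace, the `Dense` parameters are still plain CPU arrays (`TrackedArray{…,Array{Float32,2}}`) while the input reaching them is a `CuArray{Float32,1}`, so CPU BLAS is handed a GPU pointer. A minimal sketch of that mismatch (an interpretation of the trace, not a verified reproduction):

```julia
using Flux, CuArrays

d = Dense(10, 5)            # parameters live on the CPU
x = cu(rand(Float32, 10))   # input lives on the GPU

d(x)                        # CPU weight matrix * GPU vector -> memory error

gpu(d)(x)                   # moving the layer to the GPU avoids the mismatch
```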
To clarify, I'm in no position to say whether this should or shouldn't have …
There isn't much reason to use … I'm kind of unsure about this. I see the motivation is to have more convenient bidirectional RNNs. It'd be fine to just fold …
Sure, I am also not very convinced that this is very compatible with non-sequential layers like the …. I had seen that @thebhatman tried to go another way by just using …
I modified `flip` by using `reverse` for multidimensional arrays so that it reverses only a specific dimension. I created the bidirectional layer using this.
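A sketch of that idea, assuming a `flip` that reverses a chosen dimension with `reverse(x; dims = ...)` instead of reversing a vector of timesteps (the actual signature in the branch may differ):

```julia
# Reverse the chosen (e.g. time) dimension, apply the wrapped function,
# then reverse the result back.
flip(f, x::AbstractArray; dims = ndims(x)) =
    reverse(f(reverse(x; dims = dims)); dims = dims)

x = reshape(collect(1:6), 2, 3)   # 2 features x 3 timesteps
flip(identity, x) == x            # reversing twice restores the order
```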
@thebhatman can you link to your approach? I can't find it right now.
Is each element of …
@Tokazama I think it depends on what you put into the …
Since we now have a …
Thanks all
@MikeInnes I've cleaned the parallel branch and separated the peephole LSTM into another branch for another pull request.