
Convolution with Bias CUDNN #335

Closed · wants to merge 21 commits

Conversation

avik-pal (Member):

This version is buggy and needs fixes before it can be merged. It depends directly on JuliaGPU/CuArrays.jl#100.
On the positive side, the major speed issues discussed in avik-pal/DeepLearningBenchmarks#1 are resolved by this.

@avik-pal avik-pal changed the title [WIP] Convolution with Bias CUDNN Convolution with Bias CUDNN Aug 1, 2018
MikeInnes (Member):

What's the status of this? Is it still WIP?

avik-pal (Member, Author) commented Nov 8, 2018:

Would require some updates and small fixes.

@avik-pal avik-pal changed the title Convolution with Bias CUDNN [WIP] Convolution with Bias CUDNN Nov 14, 2018
@avik-pal avik-pal changed the title [WIP] Convolution with Bias CUDNN Convolution with Bias CUDNN Jan 19, 2019
@@ -51,11 +51,13 @@ function Base.show(io::IO, l::Conv)
print(io, ")")
end

#=
avik-pal (Member, Author):

If these functions are present, I get an "ambiguous function" exception. How do I fix this?

Contributor:

I hit the same thing while trying to implement this.

One way to fix this is to let (c::Conv)(x) directly call something like _conv(c::Conv, x) and then have a specialized _conv(c::Conv, x::CuParam).
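A minimal sketch of that split (illustrative only: _conv is a made-up helper name, the CPU branch mirrors what Flux's Conv forward pass does at the moment, and the GPU branch assumes the convbias function and CuParam alias from this PR's cuda code, with Flux/NNlib names in scope):

# Forward the call to a helper so the GPU case can be dispatched on the input
# type without clashing with the generic (c::Conv)(x) method.
(c::Conv)(x) = _conv(c, x)

# Generic (CPU) path: same as the existing Conv forward pass.
function _conv(c::Conv, x)
    σ, b = c.σ, reshape(c.bias, map(_ -> 1, c.stride)..., :, 1)
    σ.(conv(x, c.weight, stride = c.stride, pad = c.pad, dilation = c.dilation) .+ b)
end

# Specialised GPU path: only this method calls into CUDNN; the activation is
# applied by the broadcast here.
function _conv(c::Conv, x::CuParam)
    c.σ.(convbias(x, c.weight, c.bias, stride = c.stride, pad = c.pad, dilation = c.dilation))
end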

Comment:

@avik-pal Another solution is to change line 270 in cudnn.jl to something like

(m::Flux.Conv{<:Any, <:Any, W})(x::Union{CuParam{T,4},CuParam{T,5}}) where {T<:CUDNNFloat, W<:CuParam}

This should remove the "ambiguous function" exception.
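Spelled out as a full method, that might look something like the following (a sketch; only the signature comes from the suggestion above, the body is assumed and uses the PR's convbias plus the Conv fields σ, weight, bias, stride, pad, dilation):

# Restricting the weight type W to CuParam means this method only applies when
# the layer's parameters are on the GPU, which should resolve the ambiguity
# with the generic (c::Conv)(x) method.
(m::Flux.Conv{<:Any,<:Any,W})(x::Union{CuParam{T,4},CuParam{T,5}}) where {T<:CUDNNFloat, W<:CuParam} =
    m.σ.(convbias(x, m.weight, m.bias, pad = m.pad, stride = m.stride, dilation = m.dilation))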

alpha = alpha, workspace = workspace, algo = algo, activationMode = activationMode)
end

∇conv_bias(Δ::CuArray{T}, b::CuArray{T}; pad = 0, beta = 0,

avik-pal (Member, Author):

Thanks, I had overlooked that function.

KristofferC (Contributor):

I get

Closest candidates are:
  unsafe_convert(::Type{Ptr{Nothing}}, !Matched::Base.CFunction) at c.jl:36
  unsafe_convert(::Type{Ptr{Nothing}}, !Matched::IOStream) at iostream.jl:27
  unsafe_convert(::Type{Ptr{Nothing}}, !Matched::Timer) at event.jl:370
  ...
Stacktrace:
 [1] macro expansion at C:\Users\Kristoffer\.julia\packages\CuArrays\bf9Yf\src\dnn\error.jl:17 [inlined]
 [2] cudnnConvolutionBiasActivationForward(::Base.RefValue{Float32}, ::CuArrays.CUDNN.TensorDesc, ::CuArray{Float32,4}, ::CuArrays.CUDNN.FilterDesc, ::CuArray{Float32,4}, ::CuArrays.CUDNN.ConvDesc, ::Int64, ::Nothing, ::Int64, ::Base.RefValue{Float32}, ::CuArrays.CUDNN.TensorDesc, ::CuArray{Float32,4}, ::CuArrays.CUDNN.ActivationDesc, ::CuArrays.CUDNN.TensorDesc, ::CuArray{Float32,4}) at C:\Users\Kristoffer\.julia\packages\CuArrays\bf9Yf\src\dnn\libcudnn.jl:224
 [3] #cudnnConvolutionBiasActivationForward#7(::Int64, ::Nothing, ::Int64, ::Int64, ::Int64, ::Tuple{Int64,Int64}, ::Tuple{Int64,Int64}, ::Int64, ::Int64, ::Int64, ::Float64, ::UInt32, ::typeof(CuArrays.CUDNN.cudnnConvolutionBiasActivationForward), ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}) at C:\Users\Kristoffer\.julia\packages\CuArrays\bf9Yf\src\dnn\libcudnn.jl:242
 [4] (::getfield(CuArrays.CUDNN, Symbol("#kw##cudnnConvolutionBiasActivationForward")))(::NamedTuple{(:padding, :stride, :mode, :alpha1, :activationMode, :algo, :workspace, :workspace_size),Tuple{Tuple{Int64,Int64},Tuple{Int64,Int64},Int64,Int64,Int64,Int64,Nothing,Int64}}, ::typeof(CuArrays.CUDNN.cudnnConvolutionBiasActivationForward), ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}) at .\none:0
 [5] #convbias!#33(::Tuple{Int64,Int64}, ::Tuple{Int64,Int64}, ::Int64, ::Int64, ::Tuple{Int64,Int64}, ::Nothing, ::Int64, ::Int64, ::typeof(Flux.CUDA.convbias!), ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}) at C:\Users\Kristoffer\.julia\packages\Flux\tQQa4\src\cuda\cudnn.jl:253
 [6] #convbias! at .\none:0 [inlined]
 [7] #convbias#36(::Tuple{Int64,Int64}, ::Tuple{Int64,Int64}, ::Int64, ::Int64, ::Tuple{Int64,Int64}, ::Nothing, ::Int64, ::Int64, ::typeof(Flux.CUDA.convbias), ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}) at C:\Users\Kristoffer\.julia\packages\Flux\tQQa4\src\cuda\cudnn.jl:261
 [8] (::getfield(Flux.CUDA, Symbol("#kw##convbias")))(::NamedTuple{(:pad, :stride, :dilation),Tuple{Tuple{Int64,Int64},Tuple{Int64,Int64},Tuple{Int64,Int64}}}, ::typeof(Flux.CUDA.convbias), ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}) at .\none:0
 [9] #_forward#46(::Base.Iterators.Pairs{Symbol,Tuple{Int64,Int64},Tuple{Symbol,Symbol,Symbol},NamedTuple{(:pad, :stride, :dilation),Tuple{Tuple{Int64,Int64},Tuple{Int64,Int64},Tuple{Int64,Int64}}}}, ::Function, ::typeof(Flux.CUDA.convbias), ::CuArray{Float32,4}, ::TrackedArray{…,CuArray{Float32,4}}, ::TrackedArray{…,CuArray{Float32,1}}) at C:\Users\Kristoffer\.julia\packages\Flux\tQQa4\src\cuda\cudnn.jl:299
 [10] (::getfield(Flux.Tracker, Symbol("#kw##_forward")))(::NamedTuple{(:pad, :stride, :dilation),Tuple{Tuple{Int64,Int64},Tuple{Int64,Int64},Tuple{Int64,Int64}}}, ::typeof(Flux.Tracker._forward), ::typeof(Flux.CUDA.convbias), ::CuArray{Float32,4}, ::TrackedArray{…,CuArray{Float32,4}}, ::TrackedArray{…,CuArray{Float32,1}}) at .\none:0
 [11] #track#1(::Base.Iterators.Pairs{Symbol,Tuple{Int64,Int64},Tuple{Symbol,Symbol,Symbol},NamedTuple{(:pad, :stride, :dilation),Tuple{Tuple{Int64,Int64},Tuple{Int64,Int64},Tuple{Int64,Int64}}}}, ::Function, ::typeof(Flux.CUDA.convbias), ::CuArray{Float32,4}, ::Vararg{Any,N} where N) at C:\Users\Kristoffer\.julia\packages\Flux\tQQa4\src\tracker\Tracker.jl:51
 [12] #track at .\none:0 [inlined]

using this. I think the workspace is Nothing or something.

avik-pal (Member, Author):

JuliaGPU/CuArrays.jl#260 should fix this error.

else
workspace_size = length(workspace[])
end
cudnnConvolutionBiasActivationForward(y, x, w, b, padding=pad, stride=stride, mode=flipkernel, alpha1=alpha, activationMode=activationMode, algo=algo, workspace=workspace, workspace_size=workspace_size)
Contributor:

AFAIU, using the identity activation function here only works with one convolution algorithm (https://docs.nvidia.com/deeplearning/sdk/cudnn-developer-guide/index.html#cudnnConvolutionBiasActivationForward):

Note: Only the CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM algo is enabled with CUDNN_ACTIVATION_IDENTITY. In other words, in the cudnnActivationDescriptor_t structure of the input activationDesc, if the mode of the cudnnActivationMode_t field is set to the enum value CUDNN_ACTIVATION_IDENTITY, then the input cudnnConvolutionFwdAlgo_t of this function cudnnConvolutionBiasActivationForward() must be set to the enum value CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM. See also the documentation for the function cudnnSetActivationDescriptor().
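In enum terms: activationMode = CUDNN_ACTIVATION_IDENTITY (5) is only valid together with algo = CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM (1). A hypothetical guard, with the cuDNN enum values written out locally (whether CuArrays exposes these constants is not assumed here):

const CUDNN_ACTIVATION_IDENTITY = 5                         # cudnnActivationMode_t
const CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM = 1  # cudnnConvolutionFwdAlgo_t

# Sanity check before launching the fused conv + bias + activation kernel.
function check_fused_algo(activationMode, algo)
    if activationMode == CUDNN_ACTIVATION_IDENTITY &&
       algo != CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM
        throw(ArgumentError("CUDNN_ACTIVATION_IDENTITY requires the IMPLICIT_PRECOMP_GEMM algorithm"))
    end
    return nothing
end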

avik-pal (Member, Author) commented Jan 22, 2019:

Yes, that is correct.
So if we want to use any other algorithm, we should use the cudnnAddTensor function for the bias (the @grad for convbias can do it easily).
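A rough sketch of that split (not this PR's code): run the convolution with whatever algorithm is chosen, add the bias separately, and let the @grad return ∇conv_bias for the bias term. The names convbias and ∇conv_bias are the ones used in this PR; the broadcasted add stands in for a cudnnAddTensor call, and the 4-D reshape, the pad/stride/dilation keywords and the CuArray-only signature are assumptions:

using Flux, NNlib, CuArrays
using Flux.Tracker: TrackedArray, track, data, @grad

# Forward pass: convolution with any algorithm, then the bias (the broadcast
# here stands in for cudnnAddTensor on the GPU).
addbias(y, b) = y .+ reshape(b, 1, 1, length(b), 1)
convbias(x, w, b; kw...) = addbias(NNlib.conv(x, w; kw...), b)

# Track the call when the weights and bias are parameters.
convbias(x::CuArray, w::TrackedArray, b::TrackedArray; kw...) =
    track(convbias, x, w, b; kw...)

# Backward pass: the bias enters additively after the convolution, so its
# gradient is just the ∇conv_bias reduction added in this PR; the rest is
# NNlib's usual ∇conv_data / ∇conv_filter.
@grad function convbias(x, w, b; kw...)
    y = convbias(data(x), data(w), data(b); kw...)
    y, Δ -> (NNlib.∇conv_data(data(Δ), data(x), data(w); kw...),
             NNlib.∇conv_filter(data(Δ), data(x), data(w); kw...),
             Flux.CUDA.∇conv_bias(data(Δ), reshape(data(b), 1, 1, length(b), 1)))
end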

Contributor:

Yes, but if we are using a relu activation we want to call this with algo=1 and do the whole relu.(conv .+ b) in one call.
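For example (a hypothetical call: it assumes convbias forwards algo and activationMode keywords down to cudnnConvolutionBiasActivationForward, which the stack trace above suggests; the integers are the cuDNN enum values):

# One fused kernel launch computing relu.(conv(x, w) .+ b):
y = convbias(x, w, b, pad = pad, stride = stride, dilation = dilation,
             algo = 1,            # CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM
             activationMode = 1)  # CUDNN_ACTIVATION_RELU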

avik-pal (Member, Author):

That is actually a bit problematic for implementing the backward pass (https://docs.nvidia.com/deeplearning/sdk/cudnn-developer-guide/index.html#cudnnActivationBackward). From what I understand, we need the intermediate state from before the activation is applied in order to calculate the gradient.
However, if we are guaranteed that the user doesn't want gradients (i.e. there are no tracked arrays), we can use it.

KristofferC (Contributor) commented Jan 22, 2019:

In the case of the relu (and probably identity) activation functions, I am not sure the intermediate value is needed, since their derivative can be recovered from the output alone (for relu, the gradient mask is just output > 0):

julia> m = Conv((2,2), 1=>1, relu)
Conv((2, 2), 1=>1, NNlib.relu)

julia> x = rand(Float32, 4, 4, 1, 1); xt = TrackedArray(x);

julia> t = m(xt);

julia> Flux.Tracker.back!(t, t.data);

julia> xt.grad # data gradient after relu
4×4×1×1 Array{Float32,4}:
[:, :, 1, 1] =
 -0.0206988  0.323376    0.0        0.0
 -0.211878   0.763559    0.27797    0.224908
 -0.363751   0.3568     -0.0381117  0.540027
 -0.138073   0.0981863  -0.239143   0.17006

julia> m.weight.grad # weight gradient after relu
2×2×1×1 Array{Float32,4}:
[:, :, 1, 1] =
 1.17692  1.02465
 1.54479  1.2704

julia> m.bias.grad # bias gradient after relu
1-element Array{Float32,1}:
 2.5020714

julia> NNlib.∇conv_data(t.data, x, m.weight.data)
4×4×1×1 Array{Float32,4}:
[:, :, 1, 1] =
 -0.0206988  0.323376    0.0        0.0
 -0.211878   0.763559    0.27797    0.224908
 -0.363751   0.3568     -0.0381117  0.540027
 -0.138073   0.0981863  -0.239143   0.17006

julia> NNlib.∇conv_filter(t.data, x, m.weight.data)
2×2×1×1 Array{Float32,4}:
[:, :, 1, 1] =
 1.17692  1.02465
 1.54479  1.2704

julia> Flux.CUDA.∇conv_bias(cu(t.data), reshape(cu(m.bias.data), 1, 1, 1, 1))
1-element CuArray{Float32,1}:
 2.5020714

So we get the same answer whether we use backpropagation or the "analytical" gradients computed directly from the value output by the activation function.

Contributor:

Basically something like KristofferC@fd656bc.

avik-pal (Member, Author):

Why do we need to redefine all the functions? Can't we simply pass a different argument in activationMode, since we do not have to do anything different for the backward pass?

Contributor:

Yeah, it's very possible that it can be done more cleanly. The tricky part is getting the backward pass to dispatch properly.
