
Convolution with Bias CUDNN #335

Closed · wants to merge 21 commits

Conversation

avik-pal (Member):

This version is buggy and needs fixes before it can be merged. It depends directly on JuliaGPU/CuArrays.jl#100.
On the positive side, the major speed issues discussed in avik-pal/DeepLearningBenchmarks#1 are resolved by this.

@avik-pal avik-pal changed the title [WIP] Convolution with Bias CUDNN Convolution with Bias CUDNN Aug 1, 2018
MikeInnes (Member):

What's the status of this? Is it still WIP?

avik-pal (Member, Author) commented Nov 8, 2018:

Would require some updates and small fixes.

@avik-pal avik-pal changed the title Convolution with Bias CUDNN [WIP] Convolution with Bias CUDNN Nov 14, 2018
@avik-pal avik-pal changed the title [WIP] Convolution with Bias CUDNN Convolution with Bias CUDNN Jan 19, 2019
@@ -51,11 +51,13 @@ function Base.show(io::IO, l::Conv)
print(io, ")")
end

#=
avik-pal (Member, Author):

If these functions are present, I get an "ambiguous function" exception. How do I fix this?

Contributor:

I hit the same thing while trying to implement this.

One way to fix this is to let (c::Conv)(x) directly call something like _conv(c::Conv, x) and then have a specialized _conv(c::Conv, x::CuParam).
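A minimal sketch of that split (illustrative only: _conv is a made-up helper name, the CPU branch mirrors what Flux's Conv forward pass does at the moment, and the GPU branch assumes the convbias function and CuParam alias from this PR's cuda code, with Flux/NNlib names in scope):

# Forward the call to a helper so the GPU case can be dispatched on the input
# type without clashing with the generic (c::Conv)(x) method.
(c::Conv)(x) = _conv(c, x)

# Generic (CPU) path: same as the existing Conv forward pass.
function _conv(c::Conv, x)
    σ, b = c.σ, reshape(c.bias, map(_ -> 1, c.stride)..., :, 1)
    σ.(conv(x, c.weight, stride = c.stride, pad = c.pad, dilation = c.dilation) .+ b)
end

# Specialised GPU path: only this method calls into CUDNN; the activation is
# applied by the broadcast here.
function _conv(c::Conv, x::CuParam)
    c.σ.(convbias(x, c.weight, c.bias, stride = c.stride, pad = c.pad, dilation = c.dilation))
end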

Comment:

@avik-pal Another solution is to change line 270 in cudnn.jl to something like

(m::Flux.Conv{<:Any, <:Any, W})(x::Union{CuParam{T,4},CuParam{T,5}}) where {T<:CUDNNFloat, W<:CuParam}

This should remove the "ambiguous function" exception.
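Spelled out as a full method, that might look something like the following (a sketch; only the signature comes from the suggestion above, the body is assumed and uses the PR's convbias plus the Conv fields σ, weight, bias, stride, pad, dilation):

# Restricting the weight type W to CuParam means this method only applies when
# the layer's parameters are on the GPU, which should resolve the ambiguity
# with the generic (c::Conv)(x) method.
(m::Flux.Conv{<:Any,<:Any,W})(x::Union{CuParam{T,4},CuParam{T,5}}) where {T<:CUDNNFloat, W<:CuParam} =
    m.σ.(convbias(x, m.weight, m.bias, pad = m.pad, stride = m.stride, dilation = m.dilation))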

alpha = alpha, workspace = workspace, algo = algo, activationMode = activationMode)
end

∇conv_bias(Δ::CuArray{T}, b::CuArray{T}; pad = 0, beta = 0,

avik-pal (Member, Author):

Thanks, I had overlooked that function.

KristofferC (Contributor):

I get

Closest candidates are:
  unsafe_convert(::Type{Ptr{Nothing}}, !Matched::Base.CFunction) at c.jl:36
  unsafe_convert(::Type{Ptr{Nothing}}, !Matched::IOStream) at iostream.jl:27
  unsafe_convert(::Type{Ptr{Nothing}}, !Matched::Timer) at event.jl:370
  ...
Stacktrace:
 [1] macro expansion at C:\Users\Kristoffer\.julia\packages\CuArrays\bf9Yf\src\dnn\error.jl:17 [inlined]
 [2] cudnnConvolutionBiasActivationForward(::Base.RefValue{Float32}, ::CuArrays.CUDNN.TensorDesc, ::CuArray{Float32,4}, ::CuArrays.CUDNN.FilterDesc, ::CuArray{Float32,4}, ::CuArrays.CUDNN.ConvDesc, ::Int64, ::Nothing, ::Int64, ::Base.RefValue{Float32}, ::CuArrays.CUDNN.TensorDesc, ::CuArray{Float32,4}, ::CuArrays.CUDNN.ActivationDesc, ::CuArrays.CUDNN.TensorDesc, ::CuArray{Float32,4}) at C:\Users\Kristoffer\.julia\packages\CuArrays\bf9Yf\src\dnn\libcudnn.jl:224
 [3] #cudnnConvolutionBiasActivationForward#7(::Int64, ::Nothing, ::Int64, ::Int64, ::Int64, ::Tuple{Int64,Int64}, ::Tuple{Int64,Int64}, ::Int64, ::Int64, ::Int64, ::Float64, ::UInt32, ::typeof(CuArrays.CUDNN.cudnnConvolutionBiasActivationForward), ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}) at C:\Users\Kristoffer\.julia\packages\CuArrays\bf9Yf\src\dnn\libcudnn.jl:242
 [4] (::getfield(CuArrays.CUDNN, Symbol("#kw##cudnnConvolutionBiasActivationForward")))(::NamedTuple{(:padding, :stride, :mode, :alpha1, :activationMode, :algo, :workspace, :workspace_size),Tuple{Tuple{Int64,Int64},Tuple{Int64,Int64},Int64,Int64,Int64,Int64,Nothing,Int64}}, ::typeof(CuArrays.CUDNN.cudnnConvolutionBiasActivationForward), ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}) at .\none:0
 [5] #convbias!#33(::Tuple{Int64,Int64}, ::Tuple{Int64,Int64}, ::Int64, ::Int64, ::Tuple{Int64,Int64}, ::Nothing, ::Int64, ::Int64, ::typeof(Flux.CUDA.convbias!), ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}) at C:\Users\Kristoffer\.julia\packages\Flux\tQQa4\src\cuda\cudnn.jl:253
 [6] #convbias! at .\none:0 [inlined]
 [7] #convbias#36(::Tuple{Int64,Int64}, ::Tuple{Int64,Int64}, ::Int64, ::Int64, ::Tuple{Int64,Int64}, ::Nothing, ::Int64, ::Int64, ::typeof(Flux.CUDA.convbias), ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}) at C:\Users\Kristoffer\.julia\packages\Flux\tQQa4\src\cuda\cudnn.jl:261
 [8] (::getfield(Flux.CUDA, Symbol("#kw##convbias")))(::NamedTuple{(:pad, :stride, :dilation),Tuple{Tuple{Int64,Int64},Tuple{Int64,Int64},Tuple{Int64,Int64}}}, ::typeof(Flux.CUDA.convbias), ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}) at .\none:0
 [9] #_forward#46(::Base.Iterators.Pairs{Symbol,Tuple{Int64,Int64},Tuple{Symbol,Symbol,Symbol},NamedTuple{(:pad, :stride, :dilation),Tuple{Tuple{Int64,Int64},Tuple{Int64,Int64},Tuple{Int64,Int64}}}}, ::Function, ::typeof(Flux.CUDA.convbias), ::CuArray{Float32,4}, ::TrackedArray{…,CuArray{Float32,4}}, ::TrackedArray{…,CuArray{Float32,1}}) at C:\Users\Kristoffer\.julia\packages\Flux\tQQa4\src\cuda\cudnn.jl:299
 [10] (::getfield(Flux.Tracker, Symbol("#kw##_forward")))(::NamedTuple{(:pad, :stride, :dilation),Tuple{Tuple{Int64,Int64},Tuple{Int64,Int64},Tuple{Int64,Int64}}}, ::typeof(Flux.Tracker._forward), ::typeof(Flux.CUDA.convbias), ::CuArray{Float32,4}, ::TrackedArray{…,CuArray{Float32,4}}, ::TrackedArray{…,CuArray{Float32,1}}) at .\none:0
 [11] #track#1(::Base.Iterators.Pairs{Symbol,Tuple{Int64,Int64},Tuple{Symbol,Symbol,Symbol},NamedTuple{(:pad, :stride, :dilation),Tuple{Tuple{Int64,Int64},Tuple{Int64,Int64},Tuple{Int64,Int64}}}}, ::Function, ::typeof(Flux.CUDA.convbias), ::CuArray{Float32,4}, ::Vararg{Any,N} where N) at C:\Users\Kristoffer\.julia\packages\Flux\tQQa4\src\tracker\Tracker.jl:51
 [12] #track at .\none:0 [inlined]

using this. I think the workspace is Nothing or something.

avik-pal (Member, Author):

JuliaGPU/CuArrays.jl#260 should fix this error.

else
workspace_size = length(workspace[])
end
cudnnConvolutionBiasActivationForward(y, x, w, b, padding=pad, stride=stride, mode=flipkernel, alpha1=alpha, activationMode=activationMode, algo=algo, workspace=workspace, workspace_size=workspace_size)
Contributor:

AFAIU, using the identity activation function here only works with one convolution algorithm (https://docs.nvidia.com/deeplearning/sdk/cudnn-developer-guide/index.html#cudnnConvolutionBiasActivationForward):

Note: Only the CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM algo is enabled with CUDNN_ACTIVATION_IDENTITY. In other words, in the cudnnActivationDescriptor_t structure of the input activationDesc, if the mode of the cudnnActivationMode_t field is set to the enum value CUDNN_ACTIVATION_IDENTITY, then the input cudnnConvolutionFwdAlgo_t of this function cudnnConvolutionBiasActivationForward() must be set to the enum value CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM. See also the documentation for the function cudnnSetActivationDescriptor().
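In enum terms: activationMode = CUDNN_ACTIVATION_IDENTITY (5) is only valid together with algo = CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM (1). A hypothetical guard, with the cuDNN enum values written out locally (whether CuArrays exposes these constants is not assumed here):

const CUDNN_ACTIVATION_IDENTITY = 5                         # cudnnActivationMode_t
const CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM = 1  # cudnnConvolutionFwdAlgo_t

# Sanity check before launching the fused conv + bias + activation kernel.
function check_fused_algo(activationMode, algo)
    if activationMode == CUDNN_ACTIVATION_IDENTITY &&
       algo != CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM
        throw(ArgumentError("CUDNN_ACTIVATION_IDENTITY requires the IMPLICIT_PRECOMP_GEMM algorithm"))
    end
    return nothing
end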

avik-pal (Member, Author) commented Jan 22, 2019:

Yes, that is correct.
So if we want to use any other algorithm, we should use the cudnnAddTensor function for the bias (the @grad for convbias can do it easily).
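A rough sketch of that split (not this PR's code): run the convolution with whatever algorithm is chosen, add the bias separately, and let the @grad return ∇conv_bias for the bias term. The names convbias and ∇conv_bias are the ones used in this PR; the broadcasted add stands in for a cudnnAddTensor call, and the 4-D reshape, the pad/stride/dilation keywords and the CuArray-only signature are assumptions:

using Flux, NNlib, CuArrays
using Flux.Tracker: TrackedArray, track, data, @grad

# Forward pass: convolution with any algorithm, then the bias (the broadcast
# here stands in for cudnnAddTensor on the GPU).
addbias(y, b) = y .+ reshape(b, 1, 1, length(b), 1)
convbias(x, w, b; kw...) = addbias(NNlib.conv(x, w; kw...), b)

# Track the call when the weights and bias are parameters.
convbias(x::CuArray, w::TrackedArray, b::TrackedArray; kw...) =
    track(convbias, x, w, b; kw...)

# Backward pass: the bias enters additively after the convolution, so its
# gradient is just the ∇conv_bias reduction added in this PR; the rest is
# NNlib's usual ∇conv_data / ∇conv_filter.
@grad function convbias(x, w, b; kw...)
    y = convbias(data(x), data(w), data(b); kw...)
    y, Δ -> (NNlib.∇conv_data(data(Δ), data(x), data(w); kw...),
             NNlib.∇conv_filter(data(Δ), data(x), data(w); kw...),
             Flux.CUDA.∇conv_bias(data(Δ), reshape(data(b), 1, 1, length(b), 1)))
end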

Contributor:

Yes, but if we are using a relu activation we want to call this with algo=1 and do the whole relu.(conv .+ b) in one call.
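For example (a hypothetical call: it assumes convbias forwards algo and activationMode keywords down to cudnnConvolutionBiasActivationForward, which the stack trace above suggests; the integers are the cuDNN enum values):

# One fused kernel launch computing relu.(conv(x, w) .+ b):
y = convbias(x, w, b, pad = pad, stride = stride, dilation = dilation,
             algo = 1,            # CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM
             activationMode = 1)  # CUDNN_ACTIVATION_RELU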

avik-pal (Member, Author):

That is actually a bit problematic for implementing the backward pass (https://docs.nvidia.com/deeplearning/sdk/cudnn-developer-guide/index.html#cudnnActivationBackward). From what I understand, we need the intermediate state from before the activation is applied in order to calculate the gradient.
However, if we are guaranteed that the user doesn't want gradients (i.e. there are no tracked arrays), we can use it.

KristofferC (Contributor) commented Jan 22, 2019:

In the case of the relu (and probably identity) activation functions, I am not sure the intermediate value is needed, since their derivative can be recovered from the output alone (for relu, the gradient mask is just output > 0):

julia> m = Conv((2,2), 1=>1, relu)
Conv((2, 2), 1=>1, NNlib.relu)

julia> x = rand(Float32, 4, 4, 1, 1); xt = TrackedArray(x);

julia> t = m(xt);

julia> Flux.Tracker.back!(t, t.data);

julia> xt.grad # data gradient after relu
4×4×1×1 Array{Float32,4}:
[:, :, 1, 1] =
 -0.0206988  0.323376    0.0        0.0
 -0.211878   0.763559    0.27797    0.224908
 -0.363751   0.3568     -0.0381117  0.540027
 -0.138073   0.0981863  -0.239143   0.17006

julia> m.weight.grad # weight gradient after relu
2×2×1×1 Array{Float32,4}:
[:, :, 1, 1] =
 1.17692  1.02465
 1.54479  1.2704

julia> m.bias.grad # bias gradient after relu
1-element Array{Float32,1}:
 2.5020714

julia> NNlib.∇conv_data(t.data, x, m.weight.data)
4×4×1×1 Array{Float32,4}:
[:, :, 1, 1] =
 -0.0206988  0.323376    0.0        0.0
 -0.211878   0.763559    0.27797    0.224908
 -0.363751   0.3568     -0.0381117  0.540027
 -0.138073   0.0981863  -0.239143   0.17006

julia> NNlib.∇conv_filter(t.data, x, m.weight.data)
2×2×1×1 Array{Float32,4}:
[:, :, 1, 1] =
 1.17692  1.02465
 1.54479  1.2704

julia> Flux.CUDA.∇conv_bias(cu(t.data), reshape(cu(m.bias.data), 1, 1, 1, 1))
1-element CuArray{Float32,1}:
 2.5020714

So we get the same answer whether we use backpropagation or the "analytical" gradients computed directly from the value output by the activation function.

Contributor:

Basically something like KristofferC@fd656bc.

avik-pal (Member, Author):

Why do we need to redefine all the functions? Can't we simply pass a different argument in activationMode, since we do not have to do anything different for the backward pass?

Contributor:

Yeah, it's very possible that it can be done more cleanly. The tricky part is getting the backward pass to dispatch properly.
