Faster path for onehotbatch(::CUArray{Int}, ::UnitRange) #29

Merged
merged 5 commits into from
Dec 31, 2022

mcabbott
Member

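Follow-up to #27. This moves the in-bounds check into the map that computes the indices, so the GPU path no longer needs separate minimum and maximum calls, each of which synchronizes: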

julia> inds100 = rand(1:100, 100); inds100cu = inds100 |> cu;

julia> @btime onehotbatch($inds100, 1:100);
  348.811 ns (1 allocation: 496 bytes)

julia> @btime onehotbatch($inds100cu, 1:100);  # with #27, minimum + maximum sync twice
  70.267 μs (86 allocations: 4.02 KiB)

julia> @btime unsafe_onehotbatch($inds100, 1:100);
  231.916 ns (1 allocation: 496 bytes)

julia> @btime unsafe_onehotbatch($inds100cu, 1:100);  # version with no checks
  8.889 μs (28 allocations: 1.11 KiB)

julia> function fused_onehotbatch(data::AbstractArray{<:Integer}, labels::AbstractUnitRange{<:Integer})
         offset = 1 - first(labels)
         indices = map(data) do datum
           i = UInt32(datum + offset)
           checkbounds(labels, i)  # like this PR
           i
         end
         return OneHotArray(indices, length(labels))
       end
fused_onehotbatch (generic function with 1 method)

julia> @btime fused_onehotbatch($inds100cu, 1:100);
  10.708 μs (31 allocations: 1.20 KiB)

julia> bad100 = copy(inds100); bad100[33] = 101; bad100cu = bad100 |> cu;

julia> fused_onehotbatch(bad100cu, 1:100)
ERROR: Out-of-bounds array access.
ERROR: a exception was thrown during kernel execution.
       Run Julia on debug level 2 for device stack traces.
ERROR: KernelException: exception thrown during kernel execution on device Tesla V100-PCIE-16GB
Stacktrace:
 [1] check_exceptions()
   @ CUDA ~/.julia/packages/CUDA/DfvRa/src/compiler/exceptions.jl:34
 [2] nonblocking_synchronize
   @ ~/.julia/packages/CUDA/DfvRa/lib/cudadrv/context.jl:331 [inlined]
 [3] device_synchronize()
   @ CUDA ~/.julia/packages/CUDA/DfvRa/lib/cudadrv/context.jl:319

julia> cu(ones(100))[bad100cu]  # getindex does much the same check, inside the kernel
ERROR: Out-of-bounds array access.
ERROR: a exception was thrown during kernel execution.
       Run Julia on debug level 2 for device stack traces.
ERROR: KernelException: exception thrown during kernel execution on device Tesla V100-PCIE-16GB
Stacktrace:
 [1] check_exceptions()
   @ CUDA ~/.julia/packages/CUDA/DfvRa/src/compiler/exceptions.jl:34
 [2] nonblocking_synchronize
   @ ~/.julia/packages/CUDA/DfvRa/lib/cudadrv/context.jl:331 [inlined]
 [3] device_synchronize()
   @ CUDA ~/.julia/packages/CUDA/DfvRa/lib/cudadrv/context.jl:319
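
The two checking strategies compared above can be sketched on the CPU with Base alone. This is a rough illustration, not the code merged in this PR: the names prechecked_indices and fused_indices are hypothetical, and both return only the UInt32 index vector that fused_onehotbatch passes on to OneHotArray(indices, length(labels)). The first version validates up front with two reductions, which is the pattern the #27 timing comment describes as syncing twice on a GPU array. The second checks each element inside the map, as fused_onehotbatch does.

```julia
# Hypothetical helper names; only Base is used, so this runs without a GPU.

# Strategy A: validate up front with two reductions over the data.
# On a GPU array, minimum and maximum would each force a synchronization.
function prechecked_indices(data::AbstractArray{<:Integer}, labels::AbstractUnitRange{<:Integer})
    lo, hi = minimum(data), maximum(data)
    (first(labels) <= lo && hi <= last(labels)) ||
        throw(ArgumentError("data contains values outside $labels"))
    offset = 1 - first(labels)
    return map(datum -> UInt32(datum + offset), data)
end

# Strategy B: check every element inside the same map that builds the indices.
function fused_indices(data::AbstractArray{<:Integer}, labels::AbstractUnitRange{<:Integer})
    offset = 1 - first(labels)
    return map(data) do datum
        i = UInt32(datum + offset)  # InexactError if the value does not fit, e.g. when negative
        checkbounds(labels, i)      # BoundsError unless 1 <= i <= length(labels)
        i
    end
end

good = [3, 1, 100, 42]
bad = [3, 1, 101, 42]

# Both strategies agree on valid input.
@assert prechecked_indices(good, 1:100) == fused_indices(good, 1:100)

# On invalid input the fused version fails from inside the map.
caught = try
    fused_indices(bad, 1:100)
catch err
    err
end
@assert caught isa BoundsError
```

One difference in this sketch is the error type. On the CPU the fused version raises a BoundsError for an index of 0 or above length(labels), but an InexactError when datum + offset is negative, because the UInt32 conversion fails before checkbounds runs. The pre-check version always raises the ArgumentError it constructs itself.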

codecov-commenter commented Dec 31, 2022

Codecov Report

Base: 96.21% // Head: 95.68% // Decreases project coverage by -0.52% ⚠️

Coverage data is based on head (49561f9) compared to base (32e06c8).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #29      +/-   ##
==========================================
- Coverage   96.21%   95.68%   -0.53%     
==========================================
  Files           4        4              
  Lines         132      139       +7     
==========================================
+ Hits          127      133       +6     
- Misses          5        6       +1     
Impacted Files        Coverage Δ
src/onehot.jl         96.61% <100.00%> (+0.45%) ⬆️
src/OneHotArrays.jl   0.00% <0.00%> (-100.00%) ⬇️


Co-authored-by: Brian Chen <ToucheSir@users.noreply.github.com>
@mcabbott mcabbott merged commit 8f447ff into FluxML:main Dec 31, 2022
@mcabbott mcabbott deleted the gpuarrays3 branch December 31, 2022 22:07