Failure of BenchmarkTools #19
Comments
Could the issue be related to CuStream? The CUDA.synchronize() function takes a CuStream as its input argument.
In your current code, the stream id seems to be the default value 0.
julia> @time (mul!(b, a, a); CUDA.synchronize(CUDA.context()))
  0.093036 seconds (2 allocations: 48 bytes)
This one works. Using the
The current benchmark in the README does not look good. When reading a benchmark, we always look at the minimum time, because it best reflects the true performance.
julia> using CuTropicalGEMM
julia> @benchmark CUDA.@sync $a * $a
BenchmarkTools.Trial: 93 samples with 4 evaluations.
Range (min … max): 6.653 μs … 158.961 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 13.535 ms ┊ GC (median): 0.00%
Time (mean ± σ): 13.499 ms ± 15.867 ms ┊ GC (mean ± σ): 0.00% ± 0.00%
█
▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
6.65 μs Histogram: frequency by time 13.5 ms <
Memory estimate: 256 bytes, allocs estimate: 7.
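
The gap between the 6.653 μs minimum and the 13.535 ms median above might mean that some samples return before the GPU work has finished. Below is a minimal sketch of one way to check this. It assumes ordinary Float32 `CuArray` matrices stand in for the CuTropicalGEMM matrices used in this thread, and the size `n = 1024` is arbitrary. It forces a single synchronized evaluation per sample and cross-checks the result against CUDA event timing.

```julia
using CUDA, BenchmarkTools, LinearAlgebra, Statistics

# Assumption: plain Float32 matrices replace the tropical matrices from the issue.
n = 1024
a = CUDA.rand(Float32, n, n)
b = similar(a)

# Warm up once so that compilation is not included in any timing.
CUDA.@sync mul!(b, a, a)

# One evaluation per sample; CUDA.@sync blocks until the GPU work
# launched inside it has finished.
trial = @benchmark CUDA.@sync(mul!($b, $a, $a)) evals=1 samples=100

# Cross-check with CUDA events, which time the GPU work
# independently of BenchmarkTools.
t_event = CUDA.@elapsed mul!(b, a, a)

println("BenchmarkTools minimum: ", minimum(trial).time / 1e9, " s")
println("BenchmarkTools median:  ", median(trial).time / 1e9, " s")
println("CUDA.@elapsed:          ", t_event, " s")
```

If the minimum and the event-based time agree to within noise, the benchmark is probably measuring the kernel itself. A minimum that is orders of magnitude smaller would suggest that some samples are not being synchronized.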
BenchmarkTools is not working correctly:
Compared to results taken directly from the C-CUDA tests, the result of @benchmark is correct.