@Abdelrahman912 Abdelrahman912 commented Mar 27, 2025

This PR adds a new extension module SparseMatricesCSRExt that enables dispatching SparseMatrixCSR from SparseMatricesCSR.jl to CuSparseMatrixCSR.

codecov bot commented Mar 27, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.62%. Comparing base (3d44e32) to head (0e80594).
Report is 4 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff             @@
##           master    #2720       +/-   ##
===========================================
+ Coverage   77.34%   89.62%   +12.28%     
===========================================
  Files         153      153               
  Lines       13108    13195       +87     
===========================================
+ Hits        10138    11826     +1688     
+ Misses       2970     1369     -1601     

@maleadt maleadt requested review from amontoison and kshyatt March 27, 2025 17:29
@Abdelrahman912 Abdelrahman912 marked this pull request as ready for review March 28, 2025 00:03

github-actions bot commented Mar 28, 2025

Your PR no longer requires formatting changes. Thank you for your contribution!

@maleadt maleadt requested a review from kshyatt March 29, 2025 13:48

@github-actions github-actions bot left a comment

CUDA.jl Benchmarks

| Benchmark suite | Current: 0e80594 | Previous: 7a83380 | Ratio |
|---|---|---|---|
| latency/precompile | 42831148863 ns | 42926176632.5 ns | 1.00 |
| latency/ttfp | 7055807029 ns | 7106699468 ns | 0.99 |
| latency/import | 3375773699 ns | 3389948934 ns | 1.00 |
| integration/volumerhs | 9597397 ns | 9608052 ns | 1.00 |
| integration/byval/slices=1 | 147336 ns | 147162 ns | 1.00 |
| integration/byval/slices=3 | 426120 ns | 425761 ns | 1.00 |
| integration/byval/reference | 145426 ns | 145313 ns | 1.00 |
| integration/byval/slices=2 | 286891 ns | 286425 ns | 1.00 |
| integration/cudadevrt | 103861 ns | 103604 ns | 1.00 |
| kernel/indexing | 14551 ns | 14311 ns | 1.02 |
| kernel/indexing_checked | 15115.5 ns | 15150.5 ns | 1.00 |
| kernel/occupancy | 719.3142857142857 ns | 707.8943661971831 ns | 1.02 |
| kernel/launch | 2538 ns | 2395.4444444444443 ns | 1.06 |
| kernel/rand | 18953 ns | 18678 ns | 1.01 |
| array/reverse/1d | 20039 ns | 19617 ns | 1.02 |
| array/reverse/2d | 24584 ns | 24183.5 ns | 1.02 |
| array/reverse/1d_inplace | 11270 ns | 10894 ns | 1.03 |
| array/reverse/2d_inplace | 12872 ns | 12860 ns | 1.00 |
| array/copy | 21059 ns | 20975 ns | 1.00 |
| array/iteration/findall/int | 159756.5 ns | 157315 ns | 1.02 |
| array/iteration/findall/bool | 140032 ns | 138316 ns | 1.01 |
| array/iteration/findfirst/int | 154689 ns | 154273.5 ns | 1.00 |
| array/iteration/findfirst/bool | 156124 ns | 154976.5 ns | 1.01 |
| array/iteration/scalar | 72823 ns | 73010 ns | 1.00 |
| array/iteration/logical | 220835.5 ns | 215446 ns | 1.03 |
| array/iteration/findmin/1d | 42153 ns | 41421 ns | 1.02 |
| array/iteration/findmin/2d | 94512 ns | 94396 ns | 1.00 |
| array/reductions/reduce/1d | 45220 ns | 43605.5 ns | 1.04 |
| array/reductions/reduce/2d | 51768.5 ns | 45135 ns | 1.15 |
| array/reductions/mapreduce/1d | 40197.5 ns | 40394 ns | 1.00 |
| array/reductions/mapreduce/2d | 52388.5 ns | 51842 ns | 1.01 |
| array/broadcast | 21246 ns | 21077 ns | 1.01 |
| array/copyto!/gpu_to_gpu | 12645 ns | 11034 ns | 1.15 |
| array/copyto!/cpu_to_gpu | 218110 ns | 216527 ns | 1.01 |
| array/copyto!/gpu_to_cpu | 284544 ns | 283217 ns | 1.00 |
| array/accumulate/1d | 110400 ns | 108949 ns | 1.01 |
| array/accumulate/2d | 81360 ns | 80328 ns | 1.01 |
| array/construct | 1237.2 ns | 1243.7 ns | 0.99 |
| array/random/randn/Float32 | 48013 ns | 47595 ns | 1.01 |
| array/random/randn!/Float32 | 25166 ns | 25281 ns | 1.00 |
| array/random/rand!/Int64 | 27536 ns | 27337 ns | 1.01 |
| array/random/rand!/Float32 | 8832 ns | 8944.333333333334 ns | 0.99 |
| array/random/rand/Int64 | 34423 ns | 33758 ns | 1.02 |
| array/random/rand/Float32 | 13124 ns | 13235 ns | 0.99 |
| array/permutedims/4d | 61784.5 ns | 61033.5 ns | 1.01 |
| array/permutedims/2d | 55629 ns | 54927 ns | 1.01 |
| array/permutedims/3d | 56340 ns | 55899 ns | 1.01 |
| array/sorting/1d | 2778494 ns | 2777029.5 ns | 1.00 |
| array/sorting/by | 3369759 ns | 3367967 ns | 1.00 |
| array/sorting/2d | 1087242 ns | 1085746 ns | 1.00 |
| cuda/synchronization/stream/auto | 1005 ns | 1030 ns | 0.98 |
| cuda/synchronization/stream/nonblocking | 7507.5 ns | 8027.8 ns | 0.94 |
| cuda/synchronization/stream/blocking | 803.6382978723404 ns | 803.8265306122449 ns | 1.00 |
| cuda/synchronization/context/auto | 1161.3 ns | 1173.1 ns | 0.99 |
| cuda/synchronization/context/nonblocking | 8219.7 ns | 7706.2 ns | 1.07 |
| cuda/synchronization/context/blocking | 899.6078431372549 ns | 906.5853658536586 ns | 0.99 |

This comment was automatically generated by workflow using github-action-benchmark.

@Abdelrahman912
Contributor Author

This PR should be ready now.

maleadt commented Apr 23, 2025

Extension packages need to be listed explicitly in the CI pipeline:

- group: ":telescope: Downstream"
  depends_on: "cuda"
  steps:
    #- label: "NNlib.jl"
    #  plugins:
    #    - JuliaCI/julia#v1:
    #        version: "1.11"
    #    - JuliaCI/julia-coverage#v1:
    #        dirs:
    #          - src
    #          - lib
    #          - examples
    #  command: |
    #    julia --project -e '
    #      using Pkg
    #
    #      cuda = pwd()
    #      cudnn = joinpath(cuda, "lib", "cudnn")
    #      devdir = mktempdir()
    #      nnlib = joinpath(devdir, "NNlib")
    #
    #      println("--- :julia: Installing TestEnv")
    #      Pkg.activate(; temp=true)
    #      Pkg.add("TestEnv")
    #      using TestEnv
    #
    #      println("--- :julia: Installing NNlib")
    #      withenv("JULIA_PKG_PRECOMPILE_AUTO" => 0,
    #              "JULIA_PKG_DEVDIR" => devdir) do
    #        Pkg.develop("NNlib")
    #        Pkg.activate(nnlib)
    #
    #        try
    #          Pkg.develop([PackageSpec(path=cuda), PackageSpec(path=cudnn)])
    #          TestEnv.activate()
    #        catch err
    #          @error "Could not install NNlib" exception=(err,catch_backtrace())
    #          exit(3)
    #        finally
    #          Pkg.activate(nnlib)
    #        end
    #      end
    #
    #      println("+++ :julia: Running tests")
    #      Pkg.test(; coverage=true)'
    #  env:
    #    NNLIB_TEST_CUDA: "true"
    #    NNLIB_TEST_CPU: "false"
    #  agents:
    #    queue: "juliagpu"
    #    cuda: "*"
    #  if: |
    #    build.message =~ /\[only tests\]/ ||
    #    build.message =~ /\[only downstream\]/ ||
    #    build.message !~ /\[only/ && !build.pull_request.draft &&
    #      build.message !~ /\[skip tests\]/ &&
    #      build.message !~ /\[skip downstream\]/
    #  timeout_in_minutes: 30
    #  soft_fail:
    #    - exit_status: 3
    - label: "Enzyme.jl"
      plugins:
        - JuliaCI/julia#v1:
            version: "1.10" # XXX: Enzyme.jl is broken on 1.11
        - JuliaCI/julia-coverage#v1:
            dirs:
              - src
              - lib
              - examples
      command: |
        julia -e '
          using Pkg
          println("--- :julia: Instantiating project")
          withenv("JULIA_PKG_PRECOMPILE_AUTO" => 0) do
            # add Enzyme to the test deps
            Pkg.activate("test")
            Pkg.add(["Enzyme", "EnzymeCore"])
            # to check compatibility, also add Enzyme to the main environment
            # (or Pkg.test, which merges both environments, could fail)
            Pkg.activate(".")
            # Try to co-develop Enzyme and KA, if that fails, try just to dev Enzyme
            try
              Pkg.develop([PackageSpec("Enzyme"), PackageSpec("KernelAbstractions")])
            catch err
              try
                Pkg.develop([PackageSpec("Enzyme")])
              catch err
                @error "Could not install Enzyme" exception=(err,catch_backtrace())
                exit(3)
              end
            end
          end
          println("+++ :julia: Running tests")
          Pkg.test(; coverage=true, test_args=`extensions/enzyme`)'
      agents:
        queue: "juliagpu"
        cuda: "*"
      if: |
        build.message =~ /\[only tests\]/ ||
        build.message =~ /\[only downstream\]/ ||
        build.message !~ /\[only/ && !build.pull_request.draft &&
          build.message !~ /\[skip tests\]/ &&
          build.message !~ /\[skip downstream\]/
      timeout_in_minutes: 60
      soft_fail: true

Alternatively, since this is a simple interface package, we could add it to the test project environment and test it unconditionally, much like SpecialFunctions or BFloat16s.

maleadt commented Apr 25, 2025

Tests in test/extensions aren't executed by default:

CUDA.jl/test/runtests.jl

Lines 115 to 117 in c113666

# package extensions often require additional dependencies,
# which we don't want to put in our test env by default.
startswith(test, "extensions") && return false

You'll have to move it to, e.g., libraries/cusparse.
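
To illustrate why the location matters, here is a minimal sketch of the filter quoted above, written as a standalone function. The helper name `runs_by_default` and both test paths are hypothetical and only stand in for the real logic in test/runtests.jl, which may apply further rules.

```julia
# Hypothetical stand-in for the default-selection rule quoted from runtests.jl.
# `runs_by_default` is an illustrative name, not a CUDA.jl function.
function runs_by_default(test::AbstractString)
    # package extensions need extra dependencies, so they are opt-in
    startswith(test, "extensions") && return false
    return true
end

# Illustrative test names (assumed, not taken from the PR)
tests = [
    "extensions/sparsematricescsr",          # would be skipped unless requested
    "libraries/cusparse/sparsematricescsr",  # would be picked up by default
]

for t in tests
    status = runs_by_default(t) ? "runs by default" : "skipped by default"
    println(rpad(t, 45), status)
end
```

Under this rule, a test file only runs in the default suite if its path does not start with `extensions`, which is why moving it under libraries/cusparse makes it execute without extra CI configuration.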

@maleadt maleadt merged commit 82c2074 into JuliaGPU:master May 8, 2025
3 checks passed
@Abdelrahman912 Abdelrahman912 deleted the csr-dispatch branch May 8, 2025 11:27