Update kref! for GPU support #217

Merged on Sep 16, 2020 (1 commit)

amontoison
Member

@amontoison amontoison commented Sep 13, 2020

@coveralls

Coverage increased (+0.002%) to 97.426% when pulling aa27b87 on amontoison:gpu_kref! into 2e3a8ed on JuliaSmoothOptimizers:master.

@codecov

codecov bot commented Sep 13, 2020

Codecov Report

Merging #217 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master     #217   +/-   ##
=======================================
  Coverage   97.42%   97.42%           
=======================================
  Files          30       30           
  Lines        3190     3190           
=======================================
  Hits         3108     3108           
  Misses         82       82           

@amontoison
Member Author

@mtanneau can you check that this PR solves the problem with MINRES-QLP? You need the gpu_kref! branch of amontoison/Krylov.jl and the master branch of CUDA.jl. I interfaced some CUBLAS functions, and Tim Besard merged that PR this morning.

@mtanneau

mtanneau commented Sep 16, 2020

> @mtanneau can you check that this PR solves the problem with MINRES-QLP? You need the gpu_kref! branch of amontoison/krylov.jl and the master branch of CUDA.jl. I interfaced some functions associated to CUBLAS and Tim Besard merged the PR this morning.

It works on my machine, yes.

using Random, LinearAlgebra, CUDA, Krylov

Random.seed!(0)
n = 64
A = Matrix(Symmetric(rand(n, n)))
cA = CuArray(A)

b = ones(n)
cb = CuArray(b)

x, stats = minres_qlp(A, b);

CUDA.allowscalar(false)
cx, cstats = minres_qlp(cA, cb);  # no error

Status `~/Git/Tulip.jl/Project.toml`
  [052768ef] CUDA v1.3.0 `https://github.com/JuliaGPU/CUDA.jl.git#master`
  [ba0b0d4f] Krylov v0.5.2 `https://github.com/amontoison/Krylov.jl.git#gpu_kref!`
  [40e66cde] LDLFactorizations v0.5.0
  [b8f27783] MathOptInterface v0.9.15
  [10f199a5] QPSReader v0.2.0
  [37e2e46d] LinearAlgebra
  [56ddb016] Logging
  [de0858da] Printf
  [2f01184e] SparseArrays
  [4607b0f0] SuiteSparse
  [8dfed614] Test

I have an Nvidia GTX 1080 GPU; you may want to check on another machine just to be sure.

@dpo
Member

dpo commented Sep 16, 2020

@amontoison give me the green light when this is ready to go.

@amontoison
Member Author

@mtanneau I checked with a Tesla P100 available at GERAD and it works well there too.
@dpo you can merge this PR.
Do you want to merge any other PRs, such as #206 and/or #128, before I create release 0.5.3?

@amontoison
Member Author

using Random, LinearAlgebra, CUDA, Krylov

Random.seed!(0)
n = 2^12
A = Matrix(Symmetric(rand(n, n)))
cA = CuArray(A)

b = ones(n)
cb = CuArray(b)

minres_qlp(A, b)  # warm up
CUDA.@time minres_qlp(A, b);

-> 104.331374 seconds

CUDA.allowscalar(false)
minres_qlp(cA, cb)  # warm up
CUDA.@time minres_qlp(cA, cb);

-> 3.651324 seconds 💪

@mtanneau

As a side remark, the CUDA version seems to allocate some memory on the CPU.

julia> CUDA.@time minres_qlp(A, b);
 68.133280 seconds (56 CPU allocations: 578.141 KiB)

julia> CUDA.@time  minres_qlp(cA, cb);
  5.153048 seconds (2.99 M CPU allocations: 117.499 MiB, 0.21% gc time) (6 GPU allocations: 192.000 KiB, 0.00% gc time)

(these are all post warm-up runs)

No idea whether it's expected or where it comes from.
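
One way to narrow this kind of thing down is to measure a single call after warm-up with Base's `@allocated` and compare variants. The sketch below is only an illustration on plain CPU arrays, not the code path inside `minres_qlp`. The names `update`, `update!`, and `allocated_after_warmup` are hypothetical, and the example does not claim to identify the source of the allocations reported above.

```julia
using LinearAlgebra

# Out-of-place update: builds new vectors on every call.
update(x, y, a) = x + a * y

# In-place update: overwrites x with a*y + x via axpy!, without creating a new vector.
update!(x, y, a) = axpy!(a, y, x)

# Run once so compilation is not counted, then report the bytes allocated by a second call.
function allocated_after_warmup(f, x, y, a)
    f(x, y, a)
    return @allocated f(x, y, a)
end

n = 2^12
x, y = ones(n), ones(n)

bytes_out = allocated_after_warmup(update, x, y, 2.0)
bytes_in  = allocated_after_warmup(update!, x, y, 2.0)

println("out-of-place: ", bytes_out, " bytes, in-place: ", bytes_in, " bytes")
```

Applying the same pattern to the individual vector operations used by a solver can show which ones still allocate on the host.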

@amontoison
Member Author

Without the modifications of this PR:

CUDA.@time minres_qlp(cA, cb);

-> 860.511457 seconds 🐌

As for the CPU allocations, I don't know where they come from. It's difficult to profile over SSH.
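
For reference, the ratios implied by the timings quoted in this thread (104.331374 s on the CPU, 3.651324 s on the GPU with this PR, 860.511457 s on the GPU without it) can be computed as below. The `speedup` helper is a hypothetical name used only for this illustration, and the figures are single post-warm-up runs, so treat the ratios as rough.

```julia
# Timings in seconds, copied from the CUDA.@time results quoted in this thread.
cpu_time        = 104.331374  # minres_qlp(A, b) on the CPU
gpu_time_after  = 3.651324    # minres_qlp(cA, cb) with this PR
gpu_time_before = 860.511457  # minres_qlp(cA, cb) without this PR

# Ratio helper (hypothetical name, for illustration only).
speedup(slow, fast) = slow / fast

vs_cpu     = speedup(cpu_time, gpu_time_after)
vs_old_gpu = speedup(gpu_time_before, gpu_time_after)
old_vs_cpu = speedup(gpu_time_before, cpu_time)

println("GPU with this PR vs CPU:             ", round(vs_cpu, digits = 1), "x")
println("GPU with this PR vs GPU without it:  ", round(vs_old_gpu, digits = 1), "x")
println("GPU without this PR vs CPU (slower): ", round(old_vs_cpu, digits = 1), "x")
```

That is roughly 29x faster than the CPU run and about 236x faster than the GPU run before this PR, which was itself about 8x slower than the CPU.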

@dpo dpo merged commit 3cd029c into JuliaSmoothOptimizers:master Sep 16, 2020
@dpo
Member

dpo commented Sep 16, 2020

Thanks. Let's release.

@amontoison amontoison deleted the gpu_kref! branch August 4, 2021 19:58

Successfully merging this pull request may close these issues.

minres_qlp performs scalar operations on GPU