Fixes grid equality for GPUs #2030

Merged 5 commits from tc/grid_equality2 into main on Nov 1, 2021
Conversation

@tomchor (Collaborator) commented Oct 28, 2021

For some reason, when I tested the code after merging #2028, it didn't work on my main code for GPUs.

It worked for a MWE when I tested it here, so I guess I must have done something wrong. In any case, I apologize!

I also expanded the tests to cover grids on GPUs (which would have caught this error), so I think this will help.

Given that we just released a new version, I didn't bump this to 0.63.4 here, but let me know if I should do that.

CC: @navidcy

@tomchor tomchor requested a review from navidcy October 28, 2021 03:23
@navidcy (Collaborator) commented Oct 28, 2021

@tomchor:

  • I don't understand what "really" means in "it didn't really work". Did it work or not?

  • In case there was an issue, can you elaborate what that was and why Adapt is the solution?

  • Last, if there is still an issue and this PR fixes it, then it warrants a patch release. There is no quota on patch releases; we can release as many as we need.

Before:
x1, y1, z1 = nodes((Face, Face, Face), grid1)
x2, y2, z2 = nodes((Face, Face, Face), grid2)

After:
x1, y1, z1 = Adapt.adapt(CPU(), nodes((Face, Face, Face), grid1))
x2, y2, z2 = Adapt.adapt(CPU(), nodes((Face, Face, Face), gr2))
Review comment from a Member:

I don't think this does anything. I did a little test and this is what I get:

julia> using Adapt, CUDA, Oceananigans

julia> a = CuArray(rand(3))
3-element CuArray{Float64, 1, CUDA.Mem.DeviceBuffer}:
 0.7359732430925523
 0.19547988326307286
 0.6593710335404803

julia> b = Adapt.adapt(CPU(), a)
3-element CuArray{Float64, 1, CUDA.Mem.DeviceBuffer}:
 0.7359732430925523
 0.19547988326307286
 0.6593710335404803

The reason is that the fallback for Adapt.adapt is the identity (e.g. adapt(to, x) = x). There's no special adapt(to, x) method for to::Oceananigans.AbstractArchitecture --- the variable to is a CUDA-specific object (this function isn't called when we run on the CPU).

What is this code trying to do --- is it converting to Array? That might be sensible, because a == b triggers scalar operations on the GPU if one of the arrays lives on the GPU and the other on the CPU. Otherwise, this should work:

julia> using CUDA

julia> a = 0.0:0.1:0.3
0.0:0.1:0.3

julia> b = Array(a)
4-element Vector{Float64}:
 0.0
 0.1
 0.2
 0.3

julia> c = CuArray(a)
4-element CuArray{Float64, 1, CUDA.Mem.DeviceBuffer}:
 0.0
 0.1
 0.2
 0.3

julia> a == b
true

julia> CUDA.@allowscalar a == c
true

julia> CUDA.@allowscalar b == c
true

julia> b == c
ERROR: Scalar indexing is disallowed.
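
The host-conversion idea above can be sketched as follows (a hypothetical snippet, assuming a CUDA-capable machine; it is not part of the PR itself):

```julia
# Sketch: avoid scalar GPU indexing in an equality check by copying the
# GPU array back to host memory before comparing.
using CUDA

a = 0.0:0.1:0.3      # CPU range
c = CuArray(a)       # GPU copy of the same data

# Array(c) materializes the GPU data on the host, so `==` runs entirely
# on the CPU and never hits the scalar-indexing guard.
Array(a) == Array(c)  # true
```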

Review comment from the Collaborator (Author):

I again don't have GPU access at this exact moment to test this in more detail, but it seemed to work for me. This is what I got (from my logs):

julia> x1, y1, z1 = Adapt.adapt(CPU(), nodes((Face, Face, Face), grid1))
([0.0, 0.5], [0.0, 0.5], [1.0, 2.0, 3.0])

julia> x1, y1, z1 = nodes((Face, Face, Face), grid1)
([0.0, 0.5], [0.0, 0.5], [Error showing value of type Tuple{SubArray{Float64, 1, OffsetArrays.OffsetVector{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}}}, Tuple{UnitRange{Int64}}, true}, SubArray{Float64, 1, OffsetArrays.OffsetVector{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}}}, Tuple{UnitRange{Int64}}, true}, SubArray{Float64, 1, OffsetArrays.OffsetVector{Float64, CUDA.CuArray{Float64, 1}}, Tuple{UnitRange{Int64}}, true}}:
ERROR: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore are only permitted from the REPL for prototyping purposes.
If you did intend to index this array, annotate the caller with @allowscalar.
Stacktrace:

Review comment from the Member:

Well, that sent me on a journey. There is this method:

Adapt.adapt_structure(::CPU, a::OffsetGPUArray) = OffsetArray(Array(a.parent), a.offsets...)

which I think gets called here. (I have myself to blame for that --- it seems misguided now.)

I think I prefer converting to Array here rather than using Adapt (which does basically the same thing for OffsetArray, but may not work in the future or for other array types). @allowscalar is also OK, but Array seems better here.
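
A minimal sketch of that preference, reusing the grid1/grid2 names from this PR (the map(Array, ...) form is an illustration, not the actual committed code):

```julia
# Sketch: materialize the node views as plain host Arrays before comparing,
# instead of relying on an Adapt.adapt_structure method that happens to
# copy OffsetArray-wrapped GPU data to the CPU.
x1, y1, z1 = map(Array, nodes((Face, Face, Face), grid1))
x2, y2, z2 = map(Array, nodes((Face, Face, Face), grid2))

x1 == x2 && y1 == y2 && z1 == z2  # safe: everything now lives on the host
```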

Review comment from the Collaborator (Author):

I see. I'll commit your suggestion and see if tests pass then.

@tomchor (Collaborator, Author) commented Oct 28, 2021

Quoting @navidcy:

  • I don't understand what "really" means in "it didn't really work". Did it work or not?
  • In case there was an issue, can you elaborate what it was and why Adapt is the solution?
  • Last, if there is still an issue and this PR fixes it, then it warrants a patch release. There is no quota on patch releases; we can release as many as we need.

Sorry, let me be more clear:

  • It worked when I tried it with a MWE. I didn't try it with my main research code at the time because I only had limited GPU time and that code takes a while (I've been having trouble getting my hands on GPUs). However, I assume I must have done something wrong with my MWE, because when I was finally able to test this with my main research code, I wasn't able to start the simulation.
  • The issue that popped up was a scalar-indexing one. I think the z1 == z2 equality basically uses scalar indexing, which doesn't work for GPUs. Wrapping the nodes() expression with Adapt.adapt() seemed to solve the issue in every instance I was able to test so far. I guess CUDA.@allowscalar would also be a possibility.
  • I will bump a patch release then. I just wasn't sure if we wanted to release another version with just one PR.

Review thread on src/Grids/Grids.jl (outdated, resolved)
Co-authored-by: Gregory L. Wagner <wagner.greg@gmail.com>
@tomchor tomchor requested a review from glwagner October 28, 2021 22:54
@tomchor (Collaborator, Author) commented Oct 29, 2021

@glwagner, for some reason your suggestion using tuples didn't work. I was getting a "scalar indexing disallowed" error on that same line, which was weird, so I changed it to an @allowscalar statement.

It's also weird that the test I added (which supposedly covers both CPUs and GPUs) didn't catch that... I'm not sure why.
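
For reference, a hedged sketch of what the @allowscalar version of the comparison might look like (the actual committed code may differ):

```julia
# Sketch: permit scalar indexing only for the duration of the equality
# check. This is acceptable for a comparison/testing path, but would be
# slow on large arrays, since each element is fetched from the GPU
# one at a time.
using CUDA

x1, y1, z1 = nodes((Face, Face, Face), grid1)
x2, y2, z2 = nodes((Face, Face, Face), grid2)

CUDA.@allowscalar x1 == x2 && y1 == y2 && z1 == z2
```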

@tomchor tomchor requested a review from francispoulin October 29, 2021 21:00
@navidcy (Collaborator) left a review comment:

I admit I lost the thread a bit during the discussion about Adapt, but from what I see now, Adapt isn't involved. The latest changes to the files make a lot of sense to me, so I'm happy.

@navidcy navidcy added bug 🐞 Even a perfect program still has bugs GPU 👾 Where Oceananigans gets its powers from labels Oct 31, 2021
@tomchor tomchor merged commit 8da06a2 into main Nov 1, 2021
@tomchor tomchor deleted the tc/grid_equality2 branch November 1, 2021 14:34