-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add GPU support for extruded 1D spaces #2172
base: main
Are you sure you want to change the base?
Conversation
6e1c215
to
ada7571
Compare
ada7571
to
78f655b
Compare
I[i, t] * | ||
I[i, s] * | ||
( | ||
I[i, t] * ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this a bug? Why aren't our unit tests catching this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't look like there are any unit tests for remapping 1D extruded spaces. I discovered this bug through CliMA/ClimaAtmos.jl#3593
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, good to know. Can we add a unit test in ClimaCore to exercise this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, that's easy to add. I only discovered this bug last night so I'll add some tests today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are unit tests for 1D extruded spaces, but they were skipped on GPU because some things were broken on GPUs for those spaces. Here they are:
ClimaCore.jl/test/Remapping/distributed_remapping.jl
Lines 46 to 147 in a1c6470
@testset "2D extruded" begin | |
vertdomain = Domains.IntervalDomain( | |
Geometry.ZPoint(0.0), | |
Geometry.ZPoint(1000.0); | |
boundary_names = (:bottom, :top), | |
) | |
vertmesh = Meshes.IntervalMesh(vertdomain, nelems = 30) | |
verttopo = Topologies.IntervalTopology( | |
ClimaComms.SingletonCommsContext(device), | |
vertmesh, | |
) | |
vert_center_space = Spaces.CenterFiniteDifferenceSpace(verttopo) | |
horzdomain = Domains.IntervalDomain( | |
Geometry.XPoint(-500.0) .. Geometry.XPoint(500.0), | |
periodic = true, | |
) | |
quad = Quadratures.GLL{4}() | |
horzmesh = Meshes.IntervalMesh(horzdomain, nelems = 10) | |
horztopology = Topologies.IntervalTopology( | |
ClimaComms.SingletonCommsContext(device), | |
horzmesh, | |
) | |
horzspace = Spaces.SpectralElementSpace1D(horztopology, quad) | |
hv_center_space = | |
Spaces.ExtrudedFiniteDifferenceSpace(horzspace, vert_center_space) | |
coords = Fields.coordinate_field(hv_center_space) | |
xpts = range(-500.0, 500.0, length = 21) | |
zpts = range(0.0, 1000.0, length = 21) | |
hcoords = [Geometry.XPoint(x) for x in xpts] | |
zcoords = [Geometry.ZPoint(z) for z in zpts] | |
remapper = Remapping.Remapper( | |
hv_center_space, | |
hcoords, | |
zcoords, | |
buffer_length = 2, | |
) | |
interp_x = Remapping.interpolate(remapper, coords.x) | |
interp_x2 = Remapping.interpolate(coords.x, hcoords, zcoords) | |
if ClimaComms.iamroot(context) | |
@test Array(interp_x) ≈ [x for x in xpts, z in zpts] | |
@test Array(interp_x2) ≈ [x for x in xpts, z in zpts] | |
end | |
interp_z = Remapping.interpolate(remapper, coords.z) | |
expected_z = [z for x in xpts, z in zpts] | |
if ClimaComms.iamroot(context) | |
@test Array(interp_z[:, 2:(end - 1)]) ≈ expected_z[:, 2:(end - 1)] | |
@test Array(interp_z[:, 1]) ≈ | |
[1000.0 * (0 / 30 + 1 / 30) / 2 for x in xpts] | |
@test Array(interp_z[:, end]) ≈ | |
[1000.0 * (29 / 30 + 30 / 30) / 2 for x in xpts] | |
end | |
# Remapping two fields | |
interp_xx = Remapping.interpolate(remapper, [coords.x, coords.x]) | |
if ClimaComms.iamroot(context) | |
@test interp_x ≈ interp_xx[:, :, 1] | |
@test interp_x ≈ interp_xx[:, :, 2] | |
end | |
# Remapping three fields (more than the buffer length) | |
interp_xxx = | |
Remapping.interpolate(remapper, [coords.x, coords.x, coords.x]) | |
if ClimaComms.iamroot(context) | |
@test interp_x ≈ interp_xxx[:, :, 1] | |
@test interp_x ≈ interp_xxx[:, :, 2] | |
@test interp_x ≈ interp_xxx[:, :, 3] | |
end | |
# Remapping in-place one field | |
dest = ArrayType(zeros(21, 21)) | |
Remapping.interpolate!(dest, remapper, coords.x) | |
if ClimaComms.iamroot(context) | |
@test interp_x ≈ dest | |
end | |
# Two fields | |
dest = ArrayType(zeros(21, 21, 2)) | |
Remapping.interpolate!(dest, remapper, [coords.x, coords.x]) | |
if ClimaComms.iamroot(context) | |
@test interp_x ≈ dest[:, :, 1] | |
@test interp_x ≈ dest[:, :, 2] | |
end | |
# Three fields (more than buffer length) | |
dest = ArrayType(zeros(21, 21, 3)) | |
Remapping.interpolate!(dest, remapper, [coords.x, coords.x, coords.x]) | |
if ClimaComms.iamroot(context) | |
@test interp_x ≈ dest[:, :, 1] | |
@test interp_x ≈ dest[:, :, 2] | |
@test interp_x ≈ dest[:, :, 3] | |
end | |
end | |
end |
@@ -275,9 +273,9 @@ function set_interpolated_values_kernel!( | |||
|
|||
h = local_horiz_indices[i] | |||
out[i, k] = 0 | |||
for t in 1:Nq, s in 1:Nq | |||
for t in 1:Nq |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, is this a bug?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is also an untested bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, cc @Sbozzolo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should been tested here:
ClimaCore.jl/test/Remapping/distributed_remapping.jl
Lines 666 to 760 in a1c6470
@testset "Purely vertical space" begin | |
vertdomain = Domains.IntervalDomain( | |
Geometry.ZPoint(0.0), | |
Geometry.ZPoint(1000.0); | |
boundary_names = (:bottom, :top), | |
) | |
vertmesh = Meshes.IntervalMesh(vertdomain, nelems = 30) | |
verttopo = Topologies.IntervalTopology( | |
ClimaComms.SingletonCommsContext(ClimaComms.device()), | |
vertmesh, | |
) | |
cspace = Spaces.CenterFiniteDifferenceSpace(verttopo) | |
fspace = Spaces.FaceFiniteDifferenceSpace(cspace) | |
zpts = range(0.0, 1000.0, 21) | |
zcoords = [Geometry.ZPoint(z) for z in zpts] | |
# Center space | |
cremapper = Remapping.Remapper( | |
cspace; | |
target_zcoords = zcoords, | |
buffer_length = 2, | |
) | |
ccoords = Fields.coordinate_field(cspace) | |
cinterp_z = Remapping.interpolate(cremapper, ccoords.z) | |
cexpected_z = ArrayType(zpts) | |
ClimaComms.allowscalar(device) do | |
cexpected_z[1] = 1000.0 * (0 / 30 + 1 / 30) / 2 | |
cexpected_z[end] = 1000.0 * (29 / 30 + 30 / 30) / 2 | |
end | |
@test cinterp_z ≈ cexpected_z | |
# Face space | |
fremapper = Remapping.Remapper( | |
fspace; | |
target_zcoords = zcoords, | |
buffer_length = 2, | |
) | |
fcoords = Fields.coordinate_field(fspace) | |
finterp_z = Remapping.interpolate(fremapper, fcoords.z) | |
fexpected_z = ArrayType(zpts) | |
finterp_z ≈ fexpected_z | |
# Remapping two fields | |
cinterp = Remapping.interpolate(cremapper, [ccoords.z, ccoords.z]) | |
@test cexpected_z ≈ cinterp[:, 1] | |
@test cexpected_z ≈ cinterp[:, 2] | |
finterp = Remapping.interpolate(fremapper, [fcoords.z, fcoords.z]) | |
@test fexpected_z ≈ finterp[:, 1] | |
@test fexpected_z ≈ finterp[:, 2] | |
# Remapping three fields (more than the buffer length) | |
cinterp = | |
Remapping.interpolate(cremapper, [ccoords.z, ccoords.z, ccoords.z]) | |
@test cexpected_z ≈ cinterp[:, 1] | |
@test cexpected_z ≈ cinterp[:, 2] | |
@test cexpected_z ≈ cinterp[:, 3] | |
finterp = | |
Remapping.interpolate(fremapper, [fcoords.z, fcoords.z, fcoords.z]) | |
@test fexpected_z ≈ finterp[:, 1] | |
@test fexpected_z ≈ finterp[:, 2] | |
@test fexpected_z ≈ finterp[:, 3] | |
# Remapping in-place one field | |
dest = ArrayType(zeros(21)) | |
Remapping.interpolate!(dest, cremapper, ccoords.z) | |
@test cexpected_z ≈ dest | |
Remapping.interpolate!(dest, fremapper, fcoords.z) | |
@test fexpected_z ≈ dest | |
# Three fields (more than buffer length) | |
dest = ArrayType(zeros(21, 3)) | |
Remapping.interpolate!( | |
dest, | |
cremapper, | |
[ccoords.z, ccoords.z, ccoords.z], | |
) | |
@test cexpected_z ≈ dest[:, 1] | |
@test cexpected_z ≈ dest[:, 2] | |
@test cexpected_z ≈ dest[:, 3] | |
Remapping.interpolate!( | |
dest, | |
fremapper, | |
[fcoords.z, fcoords.z, fcoords.z], | |
) | |
@test fexpected_z ≈ dest[:, 1] | |
@test fexpected_z ≈ dest[:, 2] | |
@test fexpected_z ≈ dest[:, 3] | |
end | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bump the number of allowed JET failures for SpectralElementSpace1D in opt_spaces.jl from 141 to 825 on GPUs and from 137 to 150 on CPUs
This is alarming, without manifest changes. Would you mind taking a look at what is causing the spike in inferences? (I think that script makes it pretty easy to modify & see)
The spike in inferences for |
src/Hypsography/Hypsography.jl
Outdated
surface::F | ||
end | ||
|
||
strip_field(::LinearAdaption) = LinearAdaption(nothing) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be strip_space
? And it should keep the field values
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no need to keep the field values in this function because they are passed in separately through ref_z_to_physical_z
. I think it is best to avoid passing duplicate field values through here. In general, the design of this section should probably be refactored to split the surface elevations from the adaption specification.
src/Topologies/dss.jl
Outdated
function dss!(data::DSSDataTypes, topology::Topology2D) | ||
function dss!(data::DSSTypes1D, topology::IntervalTopology) | ||
Nij = DataLayouts.get_Nij(data) | ||
length(parent(data)) == 0 && return nothing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there might be more cases here that we can short circuit. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean like whether we are using GL or GLL quadrature? I think we're just assuming GLL quadrature everywhere outside of a few unit tests. Or did you have something else in mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just checking Nh > 1
should be good (+GLL?)
end | ||
using Test | ||
@testset "Example broadcast expression on GPUs" begin | ||
@test !isnothing(@. ᶜgradᵥ(level_field + ᶠscalar_field)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉, thank you!
It turns out that these are all false warnings, induced by our |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I chatted with @dennisYatunin about this PR. These changes look fine, and I approve once we add some unit tests exercising the new pieces/features.
68be4de
to
128d4d1
Compare
This PR adds GPU support for extruded 1D spaces, with the following modifications:
Ni == Nj
frommapreduce_cuda_kernel
,columnwise_partition
,columnwise_universal_index
, andmultiple_field_solve_universal_index
operator_shmem
operator_fill_shmem!
, andoperator_evaluate
corresponding to the 1D versions of all SEM operatorsDSSDataTypes
toDSSTypes2D
, and define a new constantDSSTypes1D
dss_1d!
corresponding toCUDADevice
, and extenddss!
to support both 1D and 2D data typeslocal_geometry
anddss_weights
of aSpectralElementSpace1D
fromArray
s toCuArray
sThese changes are also tested through CI:
unit_2d.jl
andconvergence_2d.jl
IFH
,IHF
,VIFH
, andVIHF
tounit_mapreduce.jl
, which already has a CUDA jobTo fix several test failues, I had to make a few additional changes:
Gradient
by directly differentiating vectors instead of breaking them up into components, as the original design could not distinguish the gradient of a scalar from the gradient of a single-component vector, and one of the tests inunit_2d.jl
takes the gradient of aUVector
fieldstrip_space
toFiniteDifferenceOperator
boundary conditions, as one of the tests inunit_2d.jl
uses a 1D field for theCurlC2F
boundary conditionlevel(space::ExtrudedFiniteDifferenceSpace, v)
from 4 to 1 to avoid a GPU inference failure inreconstruct_placeholder_space
when evaluating boundary conditions (we hit some sort of compiler heuristic when we use 2 or 4 specialized methods, so we need to just have one method with union-splitting)CuArray
s toArray
s inunit_2d.jl
andunit_spaces.jl
to avoid scalar indexing errorsSpectralElementSpace1D
inopt_spaces.jl
from 141 to 825 on GPUs and from 137 to 150 on CPUs