feat: sharding with non-divisible axes #822
@simone-silvestri might have a comment about why we have data with odd dimension sizes even on an "even-dimensioned grid" (because of staggering) and what the implications are, if any, for parallelization.
Is there an MWE for this, @avik-pal?
function fn_set_1(x)
    x[:, 1:2] .= 1
    return x
end

res1 = @jit fn_set_1(x_padded)
res2 = @jit fn_set_1(x_test)
@test Array(res1) ≈ Array(res2)
@test_broken all(Array(x_padded)[:, 1:2] .== 1)
@jumerckx this broken test
Thanks, and is there a way to reproduce this on a machine where
julia> length(addressable_devices)
1
? 😅
XLA_FLAGS="--xla_force_host_platform_device_count=8"
set the backend to cpu
I haven't gotten to the bottom of it yet, but in codegen_unflatten! the result does not have a :resargs path like it should.
Variables that are located on
Therefore, if we want to split a
Therefore, if, for example, we have a grid of size (10, 10, 10), where the x direction is
For example, running the following script with
using Oceananigans
using Oceananigans.Grids: topology
arch = Distributed(CPU())
grid = RectilinearGrid(arch, size=10, x=(0, 1), topology=(Bounded, Flat, Flat))
u = XFaceField(grid)
@info arch.local_rank topology(grid) size(u)
@info grid
outputs
(base) simonesilvestri@Simones-MacBook-Pro ~ % mpiexecjl -np 5 julia --color=yes --project test_mpi.jl
[ Info: MPI has not been initialized, so we are calling MPI.Init().
[ Info: MPI has not been initialized, so we are calling MPI.Init().
[ Info: MPI has not been initialized, so we are calling MPI.Init().
[ Info: MPI has not been initialized, so we are calling MPI.Init().
[ Info: MPI has not been initialized, so we are calling MPI.Init().
┌ Info: 2
│ topology(grid) = (Oceananigans.Grids.FullyConnected, Flat, Flat)
└ size(u) = (2, 1, 1)
┌ Info: 0
│ topology(grid) = (Oceananigans.Grids.RightConnected, Flat, Flat)
└ size(u) = (2, 1, 1)
┌ Info: 4
│ topology(grid) = (Oceananigans.Grids.LeftConnected, Flat, Flat)
└ size(u) = (3, 1, 1)
┌ Info: 3
│ topology(grid) = (Oceananigans.Grids.FullyConnected, Flat, Flat)
└ size(u) = (2, 1, 1)
┌ Info: 1
│ topology(grid) = (Oceananigans.Grids.FullyConnected, Flat, Flat)
└ size(u) = (2, 1, 1)
┌ Info: 4
│ grid =
│ 2×1×1 RectilinearGrid{Float64, Oceananigans.Grids.LeftConnected, Flat, Flat} on Distributed{CPU} with 3×0×0 halo
│ ├── LeftConnected x ∈ [0.8, 1.0] regularly spaced with Δx=0.1
│ ├── Flat y
└ └── Flat z
┌ Info: 0
│ grid =
│ 2×1×1 RectilinearGrid{Float64, Oceananigans.Grids.RightConnected, Flat, Flat} on Distributed{CPU} with 3×0×0 halo
│ ├── RightConnected x ∈ [-1.58603e-17, 0.2) regularly spaced with Δx=0.1
│ ├── Flat y
└ └── Flat z
┌ Info: 3
│ grid =
│ 2×1×1 RectilinearGrid{Float64, Oceananigans.Grids.FullyConnected, Flat, Flat} on Distributed{CPU} with 3×0×0 halo
│ ├── FullyConnected x ∈ [0.6, 0.8) regularly spaced with Δx=0.1
│ ├── Flat y
└ └── Flat z
┌ Info: 2
│ grid =
│ 2×1×1 RectilinearGrid{Float64, Oceananigans.Grids.FullyConnected, Flat, Flat} on Distributed{CPU} with 3×0×0 halo
│ ├── FullyConnected x ∈ [0.4, 0.6) regularly spaced with Δx=0.1
│ ├── Flat y
└ └── Flat z
┌ Info: 1
│ grid =
│ 2×1×1 RectilinearGrid{Float64, Oceananigans.Grids.FullyConnected, Flat, Flat} on Distributed{CPU} with 3×0×0 halo
│ ├── FullyConnected x ∈ [0.2, 0.4) regularly spaced with Δx=0.1
│ ├── Flat y
└ └── Flat z
Note that the x-limits for the local grids explicitly include the rightmost boundary only for rank 4 (square bracket instead of round bracket).
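The local sizes printed above can be reproduced with a small piece of arithmetic. The sketch below is only an illustration in plain Julia: `local_face_sizes` is a hypothetical helper, not an Oceananigans function, and it assumes the cell count divides evenly among the ranks and that the last rank owns the extra face on the right boundary, as the log for rank 4 suggests.

```julia
# Hypothetical helper, not part of Oceananigans: number of x-face points
# owned by each rank when `ncells` cells in a Bounded direction are split
# evenly over `nranks` ranks.
function local_face_sizes(ncells::Int, nranks::Int)
    ncells % nranks == 0 || throw(ArgumentError("ncells must be divisible by nranks"))
    cells_per_rank = ncells ÷ nranks
    sizes = fill(cells_per_rank, nranks)
    # A Bounded direction with N cells has N + 1 faces; the extra face is
    # assumed to sit on the last rank, which holds the right boundary.
    sizes[end] += 1
    return sizes
end

sizes = local_face_sizes(10, 5)
println(sizes)       # per-rank sizes for ranks 0 through 4
println(sum(sizes))  # total number of faces in the global field
```

With 10 cells on 5 ranks this gives four ranks of size 2 and one of size 3, for 11 faces in total. The global face count (11) is odd and not divisible by the number of ranks, even though the cell count (10) is.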
Closing in favor of #825, which is a more general and less intrusive solution.
Can't seem to figure out why the padded view doesn't support mutation on the outer level.
fixes #820
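
For readers unfamiliar with the idea, here is a rough sketch of what a padded view means, using plain Julia arrays only. It does not use Reactant and is not the implementation in this PR: `padded_length`, `storage`, and `x_view` are illustrative names, and the assumption is that padding rounds the sharded axis up to the next multiple of the device count.

```julia
# Hypothetical helper: round `n` up to the next multiple of `ndevices`.
padded_length(n::Int, ndevices::Int) = cld(n, ndevices) * ndevices

ndevices = 4
x = reshape(collect(1.0:15.0), 3, 5)          # 5 columns, not divisible by 4
ncols = padded_length(size(x, 2), ndevices)   # next multiple of 4

# Padded backing storage whose second axis can be split evenly.
storage = zeros(size(x, 1), ncols)
storage[:, 1:size(x, 2)] .= x

# Logical array: a view onto the unpadded region of the storage.
x_view = view(storage, :, 1:size(x, 2))

# Mutating through the view writes into the padded storage.
x_view[:, 1:2] .= 1
println(size(storage))
println(all(storage[:, 1:2] .== 1))
```

With ordinary arrays the write through the view is visible in the backing storage. The `@test_broken` line in the test above checks the analogous property for the padded array after `@jit`.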