Implement AMR with MPI #361

Merged
sloede merged 30 commits into trixi-framework:master from 159-mpi-amr
Dec 16, 2020

@efaulhaber efaulhaber commented Nov 30, 2020

Implement WP3 of #159.
Resolve #330.


@sloede sloede left a comment

There's one small change (comment to docstring). However, I cannot find a test for the new functionality, or am I missing something? If we haven't talked about it yet: I think it would be good if you could add one test to the 2D parallel tests that exercises this parallel AMR functionality, so that we know it works and that it cannot be easily broken by future commits.

@efaulhaber

Taal's new modularity allows us to create new AMR indicators only for testing purposes without the need to write code into Trixi that will only be used for testing.
For example, the two tests I added use their own AMR indicators that don't have any practical use apart from testing. I don't think that it would be a good idea to add these non-practical examples to the Trixi examples. That's why I created a new folder examples inside the test folder. My idea is that we store all examples there that are only relevant for testing. What do you think?

sloede commented Dec 10, 2020

> Taal's new modularity allows us to create new AMR indicators only for testing purposes without the need to write code into Trixi that will only be used for testing.
> For example, the two tests I added use their own AMR indicators that don't have any practical use apart from testing. I don't think that it would be a good idea to add these non-practical examples to the Trixi examples. That's why I created a new folder examples inside the test folder. My idea is that we store all examples there that are only relevant for testing. What do you think?

Hm. My thinking is that these tests will become obsolete at some point (i.e., when we have proper parallel AMR tests), so they will not be there permanently. OTOH, storing them in a different location increases complexity and I'll certainly have forgotten about them by the time I have to change all elixirs, and then they get left out. Thus, I'd prefer to have them in their usual location for now - but you should add a comment at the very top of the elixirs (and also next to the tests themselves) that these elixirs and indicators do not make much practical sense and are only used for testing. Then it should be clear to everyone what this is all about.


@ranocha ranocha left a comment

Thanks for working on this, @erik-f - nice work! I didn't check all algorithms in detail since I assume @sloede had a closer look at everything.

> Thus, I'd prefer to have them in their usual location for now - but you should add a comment at the very top of the elixirs (and also next to the tests themselves) that these elixirs and indicators do not make much practical sense and are only used for testing. Then it should be clear to everyone what this is all about.

I think this would be good 👍

We have some depwarns:

Warning: `Allgatherv(sendbuf, counts::Vector{Cint}, comm::Comm)` is deprecated, use `Allgatherv!(sendbuf, VBuffer(similar(sendbuf, sum(counts)), counts), comm)` instead.
Warning: `Gatherv(sendbuf::AbstractArray, counts::Vector{Cint}, root::Integer, comm::Comm)` is deprecated, use `Gatherv!(view(sendbuf, 1:counts[MPI.Comm_rank(comm) + 1]), if Comm_rank(comm) == root
│         VBuffer(similar(sendbuf, sum(counts)), counts)
│     else
│         nothing
│     end, root, comm)` instead.
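
For context, the `counts` vector in these warnings holds the number of entries each rank contributes, and the replacement calls bundle it with the receive buffer in a `VBuffer`. The following is a minimal, MPI-free sketch in plain Julia of how such a variable-count gather lays out the receive buffer. The function name `simulate_allgatherv` and the per-rank data are hypothetical and only illustrate the semantics; this is not MPI.jl code.

```julia
# Hypothetical, MPI-free illustration of variable-count gather semantics.
# `chunks[r]` stands in for the send buffer of rank r-1.
function simulate_allgatherv(chunks::Vector{Vector{T}}) where {T}
    counts = Cint[length(c) for c in chunks]   # entries contributed by each rank
    displs = cumsum(counts) .- counts          # zero-based offset of each rank's block
    recvbuf = Vector{T}(undef, sum(counts))    # sized like `similar(sendbuf, sum(counts))`
    for (r, chunk) in enumerate(chunks)
        # Copy rank r-1's data into its contiguous block of the receive buffer.
        recvbuf[displs[r]+1:displs[r]+counts[r]] = chunk
    end
    return recvbuf, counts, displs
end

recvbuf, counts, displs = simulate_allgatherv([[1, 2, 3], [4], [5, 6]])
```

Here every "rank" ends up with the same concatenated buffer, which is why the receive side has to be sized with `sum(counts)`.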

@efaulhaber

Why does elixir_advection_amr_coarsen_once.jl fail?

elixir_advection_amr_coarsen_once.jl: elixir_advection_amr_coarsen_once.jl: Test Failed at Test Failed at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:636
/buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:636
  Expression: contains_warn(read(fname, String), $(Expr(:escape, :(r"^(?!.)"s))))
Stacktrace:
 [1] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:636
 [2] top-level scope at /home/runner/work/Trixi.jl/Trixi.jl/test/test_trixi.jl:36
 [3] top-level scope at /home/runner/work/Trixi.jl/Trixi.jl/test/test_examples_2d_parallel.jl:41
 [4] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
 [5] top-level scope at /home/runner/work/Trixi.jl/Trixi.jl/test/test_examples_2d_parallel.jl:41
 [6] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
 [7] top-level scope at /home/runner/work/Trixi.jl/Trixi.jl/test/test_examples_2d_parallel.jl:20
 [8] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
 [9] top-level scope at /home/runner/work/Trixi.jl/Trixi.jl/test/test_examples_2d_parallel.jl:18
  Expression: contains_warn(read(fname, String), $(Expr(:escape, :(r"^(?!.)"s))))
Stacktrace:
 [1] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:636
 [2] top-level scope at /home/runner/work/Trixi.jl/Trixi.jl/test/test_trixi.jl:36
 [3] top-level scope at /home/runner/work/Trixi.jl/Trixi.jl/test/test_examples_2d_parallel.jl:41
 [4] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
 [5] top-level scope at /home/runner/work/Trixi.jl/Trixi.jl/test/test_examples_2d_parallel.jl:41
 [6] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
 [7] top-level scope at /home/runner/work/Trixi.jl/Trixi.jl/test/test_examples_2d_parallel.jl:20
 [8] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
 [9] top-level scope at /home/runner/work/Trixi.jl/Trixi.jl/test/test_examples_2d_parallel.jl:18

It doesn't show warnings in the logs anymore, but it still fails. The other tests, however, do show warnings in the logs, yet they don't fail.
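
One thing that may explain the mismatch: the pattern in the failing expression, `r"^(?!.)"s`, only matches an empty string, so the check fails on any captured output, not just on lines that look like warnings. Below is a small sketch in plain Julia. The helper name `is_silent` is made up for illustration, and it is an assumption that the check boils down to an `occursin` call on the captured output.

```julia
# Hypothetical helper: true only if `output` is completely empty.
# `^` anchors at the start of the string, `(?!.)` requires that no character
# follows, and the `s` flag lets `.` match newlines, so even "\n" is rejected.
is_silent(output::AbstractString) = occursin(r"^(?!.)"s, output)

is_silent("")                          # empty output passes
is_silent("\n")                        # a lone newline already fails
is_silent("┌ Warning: deprecated")     # warning text fails
is_silent("some other stderr output")  # non-warning output fails as well
```

Under that assumption, anything written to the captured stream by any rank would be enough to fail the test, even if no warning shows up in the log.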

ranocha commented Dec 11, 2020

There are still some depwarns related to MPI. Could you please try to fix them? If that doesn't help, could you please run the specific test that fails locally with julia --depwarn=yes --check-bounds=yes?

Otherwise Julia will throw a warning if both examples are run together
in the same session.
A view must be passed for receiving.

@sloede sloede left a comment

This looks very, very good. Two minor issues to resolve and then we can merge asap.

@efaulhaber efaulhaber mentioned this pull request Dec 13, 2020
Add assertion to make sure that every rank has at least one element
even when allow_coarsening is set to true.
@efaulhaber

I didn't feel 100% confident with the new partition! function, so I added a few unit tests. Apart from that, I only implemented the requested changes.
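
To make the idea concrete, here is a simplified sketch of a block partition with the kind of assertion mentioned above. It is not Trixi's `partition!`: the function name `leaf_counts_per_rank` is hypothetical, and the sketch ignores the extra constraint that sibling cells have to stay on one rank when coarsening is allowed.

```julia
# Hypothetical sketch: split `n_leaves` cells into contiguous blocks, one per rank.
function leaf_counts_per_rank(n_leaves::Integer, n_ranks::Integer)
    # Every rank needs at least one element to work on.
    @assert n_leaves >= n_ranks "each rank needs at least one cell"
    counts = fill(div(n_leaves, n_ranks), n_ranks)
    # Hand out the remainder one cell at a time to the first ranks.
    for r in 1:rem(n_leaves, n_ranks)
        counts[r] += 1
    end
    return counts
end

leaf_counts_per_rank(10, 4)   # → [3, 3, 2, 2]
```

A unit test for such a function can check that the counts sum to the number of leaf cells and that no rank is left empty, without starting MPI at all.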

Co-authored-by: Hendrik Ranocha <ranocha@users.noreply.github.com>

@sloede sloede left a comment

LGTM and can be merged once all tests pass & coverage is OK.

Those who merge should also kick CompatHelper to create a compat entry for SimpleMock.jl in test/Project.toml.

ranocha commented Dec 16, 2020

> Those who merge should also kick CompatHelper to create a compat entry for SimpleMock.jl in test/Project.toml.

This could also be added manually (setting the lower version bound to the recent version of SimpleMock.jl). Otherwise, CompatHelper should be scheduled tonight.

@sloede sloede merged commit b351d09 into trixi-framework:master Dec 16, 2020
@efaulhaber efaulhaber deleted the 159-mpi-amr branch December 21, 2020 19:11
Successfully merging this pull request may close these issues.

MPI: Replace deprecated methods