Zoltan2: Refactor directory to use Teuchos comm #8045
Most of this PR is a straightforward conversion of raw MPI calls to their Teuchos equivalents. Teuchos does not have waitAny yet, so I serialized that segment, anticipating that we will upgrade Teuchos and then switch it back. I expect this has introduced a small performance loss. The PR also switches to Tpetra::ScopeGuard and cleans up some earlier hacks that allowed these tests to run in serial for debugging.
Status Flag 'Pre-Test Inspection' - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
LGTM Thanks, @MicheldeMessieres
Status Flag 'Pre-Test Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ kddevin ]!
Status Flag 'Master Merge AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING
Test Names: Trilinos_pullrequest_gcc_4.8.4, Trilinos_pullrequest_intel_17.0.1, Trilinos_pullrequest_gcc_4.9.3_SERIAL, Trilinos_pullrequest_gcc_7.2.0, Trilinos_pullrequest_gcc_8.3.0, Trilinos_pullrequest_clang_7.0.1, Trilinos_pullrequest_clang_9.0.0, Trilinos_pullrequest_clang_10.0.0, Trilinos_pullrequest_cuda_9.2, Trilinos_pullrequest_gcc_7.2.0_debug
Pull Request Author: MicheldeMessieres
Status Flag 'Master Merge AutoTester' - Jenkins Testing: 1 or more Jobs FAILED
Note: Testing will normally be attempted again in approx. 15 Mins. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run.
Pull Request Auto Testing has FAILED
Jenkins builds: Trilinos_pullrequest_gcc_4.8.4 # 7984, Trilinos_pullrequest_intel_17.0.1 # 7800, Trilinos_pullrequest_gcc_4.9.3_SERIAL # 6216, Trilinos_pullrequest_gcc_7.2.0 # 5697, Trilinos_pullrequest_gcc_8.3.0 # 2257, Trilinos_pullrequest_clang_7.0.1 # 368, Trilinos_pullrequest_clang_9.0.0 # 1325, Trilinos_pullrequest_clang_10.0.0 # 702, Trilinos_pullrequest_cuda_9.2 # 5552, Trilinos_pullrequest_gcc_7.2.0_debug # 436
Status Flag 'Master Merge AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING
Test Names: Trilinos_pullrequest_gcc_4.8.4, Trilinos_pullrequest_intel_17.0.1, Trilinos_pullrequest_gcc_4.9.3_SERIAL, Trilinos_pullrequest_gcc_7.2.0, Trilinos_pullrequest_gcc_8.3.0, Trilinos_pullrequest_clang_7.0.1, Trilinos_pullrequest_clang_9.0.0, Trilinos_pullrequest_clang_10.0.0, Trilinos_pullrequest_cuda_9.2, Trilinos_pullrequest_gcc_7.2.0_debug
Pull Request Author: MicheldeMessieres
Status Flag 'Master Merge AutoTester' - Jenkins Testing: 1 or more Jobs FAILED
Note: Testing will normally be attempted again in approx. 15 Mins. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run.
Pull Request Auto Testing has FAILED
Jenkins builds: Trilinos_pullrequest_gcc_4.8.4 # 7990, Trilinos_pullrequest_intel_17.0.1 # 7806, Trilinos_pullrequest_gcc_4.9.3_SERIAL # 6222, Trilinos_pullrequest_gcc_7.2.0 # 5698, Trilinos_pullrequest_gcc_8.3.0 # 2263, Trilinos_pullrequest_clang_7.0.1 # 369, Trilinos_pullrequest_clang_9.0.0 # 1326, Trilinos_pullrequest_clang_10.0.0 # 708, Trilinos_pullrequest_cuda_9.2 # 5558, Trilinos_pullrequest_gcc_7.2.0_debug # 442
Status Flag 'Master Merge AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing.
Status Flag 'Master Merge AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING
Test Names: Trilinos_pullrequest_gcc_4.8.4, Trilinos_pullrequest_intel_17.0.1, Trilinos_pullrequest_gcc_4.9.3_SERIAL, Trilinos_pullrequest_gcc_7.2.0, Trilinos_pullrequest_gcc_8.3.0, Trilinos_pullrequest_clang_7.0.1, Trilinos_pullrequest_clang_9.0.0, Trilinos_pullrequest_clang_10.0.0, Trilinos_pullrequest_cuda_9.2, Trilinos_pullrequest_gcc_7.2.0_debug
Pull Request Author: MicheldeMessieres
Status Flag 'Master Merge AutoTester' - Jenkins Testing: 1 or more Jobs FAILED
Note: Testing will normally be attempted again in approx. 15 Mins. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run.
Pull Request Auto Testing has FAILED
Jenkins builds: Trilinos_pullrequest_gcc_4.8.4 # 7997, Trilinos_pullrequest_intel_17.0.1 # 7813, Trilinos_pullrequest_gcc_4.9.3_SERIAL # 6229, Trilinos_pullrequest_gcc_7.2.0 # 5700, Trilinos_pullrequest_gcc_8.3.0 # 2270, Trilinos_pullrequest_clang_7.0.1 # 371, Trilinos_pullrequest_clang_9.0.0 # 1328, Trilinos_pullrequest_clang_10.0.0 # 715, Trilinos_pullrequest_cuda_9.2 # 5565, Trilinos_pullrequest_gcc_7.2.0_debug # 449
The base branch was changed.
Status Flag 'Pre-Test Inspection' - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
Status Flag 'Pre-Test Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ kddevin ]!
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING
Test Names: Trilinos_pullrequest_gcc_4.8.4, Trilinos_pullrequest_intel_17.0.1, Trilinos_pullrequest_gcc_4.9.3_SERIAL, Trilinos_pullrequest_gcc_7.2.0_debug, Trilinos_pullrequest_gcc_8.3.0, Trilinos_pullrequest_cuda_9.2, Trilinos_pullrequest_clang_10.0.0, Trilinos_pullrequest_python_2, Trilinos_pullrequest_python_3
Pull Request Author: MicheldeMessieres
Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED
Pull Request Auto Testing has PASSED
Test Names: Trilinos_pullrequest_gcc_4.8.4, Trilinos_pullrequest_intel_17.0.1, Trilinos_pullrequest_gcc_4.9.3_SERIAL, Trilinos_pullrequest_gcc_7.2.0_debug, Trilinos_pullrequest_gcc_8.3.0, Trilinos_pullrequest_cuda_9.2, Trilinos_pullrequest_clang_10.0.0, Trilinos_pullrequest_python_2, Trilinos_pullrequest_python_3
Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ kddevin ]!
Status Flag 'Pull Request AutoTester' - AutoMerge IS ENABLED, but the Label AT: AUTOMERGE is not set. Either set Label AT: AUTOMERGE or manually merge the PR...
@kddevin These are the conversions to Teuchos comm we discussed a while back. Most of this is a straightforward conversion from the MPI version to the Teuchos version.
Teuchos didn't have a logical OR (LOR) reduction, so I switched to bitwise OR (BOR). We're using an int set to 0 or 1 to represent a bool in this case, so the two give the same result. I'm not sure it matters much which we use, but I could easily add LOR to Teuchos.
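To illustrate why swapping LOR for BOR is safe for 0/1 flags, here is a minimal sketch that simulates both reductions over a vector of per-rank values in plain C++. It does not call Teuchos or MPI; the function names (`reduceLogicalOr`, `reduceBitwiseOr`, `reduceBooleanFlags`) are hypothetical and exist only for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Simulated logical-OR reduction: the result is 1 if any rank's value is nonzero.
int reduceLogicalOr(const std::vector<int>& perRankFlags) {
  return std::accumulate(perRankFlags.begin(), perRankFlags.end(), 0,
                         [](int a, int b) { return (a || b) ? 1 : 0; });
}

// Simulated bitwise-OR reduction: the result is the union of all set bits.
int reduceBitwiseOr(const std::vector<int>& perRankFlags) {
  return std::accumulate(perRankFlags.begin(), perRankFlags.end(), 0,
                         [](int a, int b) { return a | b; });
}

// Hypothetical helper: uses the bitwise reduction, but only for inputs that
// are strictly 0 or 1, where it matches the logical reduction exactly.
int reduceBooleanFlags(const std::vector<int>& perRankFlags) {
  assert(std::all_of(perRankFlags.begin(), perRankFlags.end(),
                     [](int f) { return f == 0 || f == 1; }));
  return reduceBitwiseOr(perRankFlags);
}
```

With inputs restricted to 0 and 1 the two reductions return identical values. With arbitrary ints they can differ in value (for example 2 and 4 reduce to 6 bitwise but 1 logically), although both results are nonzero in exactly the same cases.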
The only complication was that Teuchos does not have waitAny. As a short-term fix, I serialized that segment so it just calls wait on each request in order (with some loss of performance), and I left a note describing how we can convert it back to waitAny when that becomes available. I started looking at what Teuchos needs, and it should be fairly easy to set up, since we can copy the waitAll method as a pattern. There is one complication, though: the Teuchos request holds the MPI_Request internally. For waitAll, Teuchos releases all of them. For waitAny, we want to release only the one that responds, but we don't know which one that is until after waitAny returns. Maybe we release all of them and then restore all except the one we get. I tried this briefly and hit a bug, so I decided to leave it for a separate PR.
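The performance cost of waiting on each request in order can be sketched with a toy timing model. This is not the PR's code and uses no Teuchos or MPI calls; it assumes, purely for illustration, that request i completes at time `arrival[i]` and that handling each completed request takes a fixed `cost`. The function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Serialized strategy: block on each request in posted order, then handle it.
// Returns the time at which the last request has been handled.
int finishTimeInPostedOrder(const std::vector<int>& arrival, int cost) {
  assert(cost >= 0);
  int now = 0;
  for (int t : arrival) {
    now = std::max(now, t) + cost;  // wait until this request completes, then handle it
  }
  return now;
}

// waitAny-style strategy: always handle whichever request completes first,
// which in this model is the same as handling requests in completion order.
int finishTimeInCompletionOrder(std::vector<int> arrival, int cost) {
  std::sort(arrival.begin(), arrival.end());
  return finishTimeInPostedOrder(arrival, cost);
}
```

In this model, requests that complete at times 3, 1, 2 with a handling cost of 1 finish at time 6 when handled in posted order and at time 4 when handled in completion order. When requests happen to complete in the order they were posted, the two strategies finish at the same time.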
What I'm proposing is that we accept the loss of performance for now so this PR doesn't grow too much. Then I or someone else could add Teuchos::waitAny and restore the original behavior in a separate PR.
Other minor fixes: I switched to Tpetra::ScopeGuard and cleaned up a hack I had in place so the tests run in serial. I've preserved the serial tests, which we don't strictly need, because they really help with debugging, and the setup is much cleaner now.
There are some untested code blocks remaining, and I cleaned up the throw comments I had left for them.
@trilinos/zoltan2
Motivation
Clean up comm commands to use Teuchos instead of raw MPI calls.
Testing
Mac parallel and serial builds - Cuda on white - zoltan2 tests