Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dist2 initial improvements #623

Merged
merged 4 commits into from
Mar 1, 2020

Conversation

brian-kelley
Copy link
Contributor

  • Fix distance-2 algorithms not checking for columns being in-bounds (meaning it was not usable for MueLu aggregation on distributed problems)
  • Remove the serial distance-2 algorithm from the distance-1 handle
  • Implement a new net-based serial algorithm (asymptotically better than the neighbors-of-neighbors nested loop used in vertex-based). In practice, is faster than ColPack by >10x and both the previous serial algorithm and Zoltan by >18x.
  • Implement a new parallel net-based algorithm (COLORING_D2_NB_BIT). It gives a 2x speedup over VB and VB_BIT on GPU, but can still be improved.
  • Improve the user interface to the D1 and D2 coloring performance tests

Also, @srajama1 and I just found out about a serious bug in all D2 algorithms where D1 conflicts ignored (the algorithms assume that diagonal edges exist, but this is wrong). The D2 coloring verification has the same bug, so it was never caught in testing. This PR is still a step in the right direction (doesn't introduce bugs) so I will just submit it as-is and work on the next fix on top of these changes.

RIDE spot check:
#######################################################
PASSED TESTS
#######################################################
cuda-10.1.105-Cuda_OpenMP-release build_time=619 run_time=416
cuda-10.1.105-Cuda_Serial-release build_time=498 run_time=518
cuda-9.2.88-Cuda_OpenMP-release build_time=550 run_time=418
cuda-9.2.88-Cuda_Serial-release build_time=521 run_time=520
gcc-6.4.0-OpenMP_Serial-release build_time=215 run_time=393
gcc-7.2.0-OpenMP-release build_time=122 run_time=128
gcc-7.2.0-OpenMP_Serial-release build_time=194 run_time=360
gcc-7.2.0-Serial-release build_time=120 run_time=226
ibm-16.1.0-Serial-release build_time=490 run_time=392

Bowman spot check:
#######################################################
PASSED TESTS
#######################################################
intel-16.4.258-Pthread-release build_time=768 run_time=1033
intel-16.4.258-Pthread_Serial-release build_time=1112 run_time=1910
intel-16.4.258-Serial-release build_time=723 run_time=820
intel-17.2.174-OpenMP-release build_time=958 run_time=560
intel-17.2.174-OpenMP_Serial-release build_time=1364 run_time=1456
intel-17.2.174-Pthread-release build_time=870 run_time=896
intel-17.2.174-Pthread_Serial-release build_time=1298 run_time=1781
intel-17.2.174-Serial-release build_time=827 run_time=883

* Filtering remote columns, making MueLu aggregation actually work on
distributed problems (apart for a nondeterministic bug in MueLu
UncoupledAggregation_kokkos)
* Removed old distance-2 serial algorithm from distance-1 coloring
  handle
* Implemented a much faster (>= 18x) net-based serial algorithm to
replace the old one
* Implemented a modestly faster (2x) net-based parallel algorithm
 (COLORING_D2_VB_DYNAMIC) - more improvement should be possible in
future
meaning net-based, with bitset forbidden.
* Don't require a number after "--serial"
* Do require a number after "--cuda" (the device ID)
* If --flag expects another arg after it, but at the end of argc/argv,
  print a message saying that instead of crashign
@srajama1
Copy link
Contributor

Let me understand this, we still have the d2 bug when diagonal entries are not present. Otherwise, things are ok ?

@brian-kelley
Copy link
Contributor Author

@srajama1 Correct.

@srajama1
Copy link
Contributor

Ok, reviewing it now. Can we file two issues, one for these changes. one for the missing diagonal giving wrong results ?

@brian-kelley
Copy link
Contributor Author

Created those issues: #625, and #624.

Copy link
Contributor

@srajama1 srajama1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@brian-kelley : Thanks for the work on cleaning this up.

Two minor comments. No need to rerun spot check if you end up fixing the typo.

src/graph/KokkosGraph_Distance2ColorHandle.hpp Outdated Show resolved Hide resolved
const_lno_nnz_view_t Ccolinds;
const nnz_lno_type numVerts;
};

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to write this algorithm up in latex.

@srajama1
Copy link
Contributor

Thanks, @brian-kelley ! The issues will help @ndellingwood track these for the release. We also need to update the documentation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants