Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First iteration of HashmapAccumulator cleanup #731

Merged
merged 23 commits into from
Jul 8, 2020

Conversation

e10harvey
Copy link
Contributor

@e10harvey e10harvey commented May 29, 2020

The following changes were made to the HashmapAccumulator class:

  1. max_value_size changed to private member __max_value_size.
  2. hash_key_size was removed.
  3. used_size was removed.
  4. Hashes are now computed within the HashmapAccumulator insertion routines:
    • __hashOpRHS was added as a private member.
    • __compute_hash was added as a private member. This function is selected at compile time via a templated argument to the HashmapAccumulator constructor.
    • vector_atomic_insert_into_hash_mergeAdd_with_team_level_list_length does not compute hashes internally due to the special use-case in KokkosSparse_spgemm_impl_speed.hpp:operator GPUTag.

Fixes #508.

spot-checks

1

<snip>
WARNING!! THE FOLLOWING CHANGES ARE UNCOMMITTED!! :
?? build/
?? testing/

KokkosKernels Repository Status:  7ff59541f124a9756cabd2c6949758500db58afe common/HashmapAccumulator: cleanup insert fn declarations more

Kokkos Repository Status:  33f730c475802bc226fc17e42e58fe7612d86b41 Merge pull request #3073 from masterleinad/enable_travis_compiler_warnings


Going to test compilers:  gcc/6.4.0 gcc/7.2.0 ibm/16.1.1 cuda/9.2.88 cuda/10.1.105
<snip>
#######################################################
PASSED TESTS
#######################################################
cuda-10.1.105-Cuda_OpenMP-release build_time=647 run_time=170
cuda-10.1.105-Cuda_Serial-release build_time=587 run_time=186
cuda-9.2.88-Cuda_OpenMP-release build_time=664 run_time=185
cuda-9.2.88-Cuda_Serial-release build_time=650 run_time=201
gcc-6.4.0-OpenMP_Serial-release build_time=228 run_time=139
gcc-7.2.0-OpenMP-release build_time=158 run_time=61
gcc-7.2.0-OpenMP_Serial-release build_time=220 run_time=138
gcc-7.2.0-Serial-release build_time=141 run_time=71
ibm-16.1.1-Serial-release build_time=1012 run_time=74

2

KokkosKernels Repository Status:  7ff59541f124a9756cabd2c6949758500db58afe common/HashmapAccumulator: cleanup insert fn declarations more

Kokkos Repository Status:  33f730c475802bc226fc17e42e58fe7612d86b41 Merge pull request #3073 from masterleinad/enable_travis_compiler_warnings


Going to test compilers:  cuda/9.2
Testing compiler cuda/9.2
  Starting job cuda-9.2-Cuda_OpenMP-release
kokkos devices: Cuda,OpenMP
kokkos arch: Kepler35
kokkos options: 
kokkos cuda options: force_uvm
kokkos cxxflags: -O3 -Wall -Wshadow -pedantic -Werror -Wsign-compare -Wtype-limits -Wuninitialized 
extra_args: 
kokkoskernels scalars: 'double,complex_double'
kokkoskernels ordinals: int
kokkoskernels offsets: int,size_t
kokkoskernels layouts: LayoutLeft
[eharvey@kokkos-dev testing]$ tail -f do-test-issue-508-05-29-2020.out 
kokkos devices: Cuda,OpenMP
kokkos arch: Kepler35
kokkos options: 
kokkos cuda options: force_uvm
kokkos cxxflags: -O3 -Wall -Wshadow -pedantic -Werror -Wsign-compare -Wtype-limits -Wuninitialized 
extra_args: 
kokkoskernels scalars: 'double,complex_double'
kokkoskernels ordinals: int
kokkoskernels offsets: int,size_t
kokkoskernels layouts: LayoutLeft
  PASSED cuda-9.2-Cuda_OpenMP-release
#######################################################
PASSED TESTS
#######################################################
cuda-9.2-Cuda_OpenMP-release build_time=1095 run_time=264

3

<snip>
WARNING!! THE FOLLOWING CHANGES ARE UNCOMMITTED!! :
?? build-issue-508/
?? build.750fe245/
?? build/
?? do-cmake.sh
?? do-test.sh
?? issue-727/
?? testing/

KokkosKernels Repository Status:  7ff59541f124a9756cabd2c6949758500db58afe common/HashmapAccumulator: cleanup insert fn declarations more

Kokkos Repository Status:  cb9727fae308ce7ae2248dbb8168c430d958bc32 core/src/impl: Conditionally define get_gpu in Kokkos_Core
<snip>
#######################################################
PASSED TESTS
#######################################################
clang-8.0-Cuda_OpenMP-release build_time=687 run_time=159
clang-8.0-Pthread_Serial-release build_time=207 run_time=110
clang-9.0.0-Pthread-release build_time=126 run_time=61
clang-9.0.0-Serial-release build_time=207 run_time=56
cuda-10.1-Cuda_OpenMP-release build_time=806 run_time=149
cuda-9.2-Cuda_Serial-release build_time=753 run_time=180
gcc-4.8.4-OpenMP-release build_time=126 run_time=58
gcc-7.3.0-OpenMP-release build_time=136 run_time=57
gcc-7.3.0-Pthread-release build_time=111 run_time=52
gcc-8.3.0-Serial-release build_time=135 run_time=58
gcc-9.1-OpenMP-release build_time=170 run_time=58
gcc-9.1-Serial-release build_time=155 run_time=62
intel-17.0.1-Serial-release build_time=262 run_time=65
intel-18.0.5-OpenMP-release build_time=338 run_time=53
intel-19.0.5-Pthread-release build_time=469 run_time=56
<snip>
WARNING!! THE FOLLOWING CHANGES ARE UNCOMMITTED!! :
?? build-issue-508/
?? build.750fe245/
?? build/
?? do-cmake.sh
?? do-test.sh
?? issue-727/
?? testing/

KokkosKernels Repository Status:  7ff59541f124a9756cabd2c6949758500db58afe common/HashmapAccumulator: cleanup insert fn declarations more

Kokkos Repository Status:  33f730c475802bc226fc17e42e58fe7612d86b41 Merge pull request #3073 from masterleinad/enable_travis_compiler_warnings


Going to test compilers:  gcc/7.3.0 gcc/8.3.0 gcc/9.1 gcc/4.8.4 intel/17.0.1 intel/18.0.5 intel/19.0.5 clang/8.0 clang/9.0.0 cuda/10.1
<snip>
#######################################################
PASSED TESTS
#######################################################
clang-8.0-Cuda_OpenMP-release build_time=606 run_time=153
clang-8.0-Pthread_Serial-release build_time=197 run_time=111
clang-9.0.0-Pthread-release build_time=115 run_time=53
clang-9.0.0-Serial-release build_time=126 run_time=57
cuda-10.1-Cuda_OpenMP-release build_time=837 run_time=151
gcc-4.8.4-OpenMP-release build_time=116 run_time=60
gcc-7.3.0-OpenMP-release build_time=138 run_time=59
gcc-7.3.0-Pthread-release build_time=110 run_time=53
gcc-8.3.0-Serial-release build_time=136 run_time=59
gcc-9.1-OpenMP-release build_time=169 run_time=60
gcc-9.1-Serial-release build_time=153 run_time=59
intel-17.0.1-Serial-release build_time=245 run_time=56
intel-18.0.5-OpenMP-release build_time=363 run_time=53
intel-19.0.5-Pthread-release build_time=442 run_time=47

When enabling all scalars

$ ./KokkosKernels_sparse_cuda --gtest_filter=cuda.sparse_spmv_* still results in:

[==========] 32 tests from 1 test case ran. (233294 ms total)
[  PASSED  ] 20 tests.
[  FAILED  ] 12 tests, listed below:
[  FAILED  ] cuda.sparse_spmv_float_int_int_TestExecSpace
[  FAILED  ] cuda.sparse_spmv_struct_float_int_int_TestExecSpace
[  FAILED  ] cuda.sparse_spmv_float_int_size_t_TestExecSpace
[  FAILED  ] cuda.sparse_spmv_struct_float_int_size_t_TestExecSpace
[  FAILED  ] cuda.sparse_spmv_kokkos_complex_float_int_int_TestExecSpace
[  FAILED  ] cuda.sparse_spmv_struct_kokkos_complex_float_int_int_TestExecSpace
[  FAILED  ] cuda.sparse_spmv_kokkos_complex_float_int_size_t_TestExecSpace
[  FAILED  ] cuda.sparse_spmv_struct_kokkos_complex_float_int_size_t_TestExecSpace
[  FAILED  ] cuda.sparse_spmv_mv_float_int_int_LayoutLeft_TestExecSpace
[  FAILED  ] cuda.sparse_spmv_mv_float_int_size_t_LayoutLeft_TestExecSpace
[  FAILED  ] cuda.sparse_spmv_mv_kokkos_complex_float_int_int_LayoutLeft_TestExecSpace
[  FAILED  ] cuda.sparse_spmv_mv_kokkos_complex_float_int_size_t_LayoutLeft_TestExecSpace

12 FAILED TESTS

and $ ./KokkosKernels_sparse_cuda --gtest_filter=cuda.sparse_spgemm_* still results in:

[==========] 16 tests from 1 test case ran. (127993 ms total)
[  PASSED  ] 12 tests.
[  FAILED  ] 4 tests, listed below:
[  FAILED  ] cuda.sparse_spgemm_jacobi_float_int_int_TestExecSpace
[  FAILED  ] cuda.sparse_spgemm_jacobi_float_int_size_t_TestExecSpace
[  FAILED  ] cuda.sparse_spgemm_jacobi_kokkos_complex_float_int_int_TestExecSpace
[  FAILED  ] cuda.sparse_spgemm_jacobi_kokkos_complex_float_int_size_t_TestExecSpace

 4 FAILED TESTS

e10harvey added 20 commits May 20, 2020 11:37

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
  - Remove max_value_size as parameter to insertion routines.
  - Add vector_atomic_insert_into_hash_mergeAdd_with_team_level_list_length
  to support spgemm "speed" use-case with team-level list lengths.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
  - Remove unused params from vector_atomic_insert_into_hash_mergeOr
  - Conditionally assert and abort upon insertion of hash = -1
  - Add comments
@e10harvey e10harvey added the Cleanup Code maintenance that isn't a bugfix or new feature label May 29, 2020
@e10harvey e10harvey self-assigned this May 29, 2020
@seheracer
Copy link
Contributor

Which machine are these spot-check results from?

Copy link
Contributor

@brian-kelley brian-kelley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally everything looks great, there's just a couple minor changes I would like to see

@e10harvey e10harvey requested a review from brian-kelley July 7, 2020 19:06
@e10harvey
Copy link
Contributor Author

@ndellingwood, @srajama1: This is ready to merge.

@ndellingwood ndellingwood merged commit f89a3d9 into kokkos:develop Jul 8, 2020
@ndellingwood
Copy link
Contributor

Thanks @e10harvey !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Cleanup Code maintenance that isn't a bugfix or new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

HashmapAccumulator has several unused members, misnamed parameters
4 participants