Fixed block row formula in block GS test (Fix #648) #649

Merged: brian-kelley merged 1 commit into kokkos:develop from FixBlockGS_CUSPARSE on Mar 9, 2020

brian-kelley
Contributor

(This is an example of #478: integer division that is meant to round up but was computed incorrectly.)

This caused the X vector to have the wrong length. I replicated #648 locally and checked that this fixed it.
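
For readers unfamiliar with the pitfall, here is a minimal sketch of ceiling integer division and of how a wrong formula changes a derived vector length. The helper names (`ceilDiv`, `ceilDivWrong`, `paddedLength`) are hypothetical and are not taken from the Kokkos Kernels source. The "wrong" variant is just one common mistake, shown for illustration; it is not necessarily the exact expression that was in the test.

```cpp
#include <cstddef>

// Hypothetical helpers for illustration only; not the Kokkos Kernels code.

// Ceiling division for blockSize > 0: number of blocks needed to cover numRows.
constexpr std::size_t ceilDiv(std::size_t numRows, std::size_t blockSize) {
  return (numRows + blockSize - 1) / blockSize;
}

// One common incorrect variant (assumed here for illustration):
// it over-counts by one whenever blockSize divides numRows evenly.
constexpr std::size_t ceilDivWrong(std::size_t numRows, std::size_t blockSize) {
  return numRows / blockSize + 1;
}

// Length of a vector sized as (number of blocks) * (block size).
constexpr std::size_t paddedLength(std::size_t numRows, std::size_t blockSize) {
  return ceilDiv(numRows, blockSize) * blockSize;
}

// 10 rows, block size 3: four blocks, the last one partial.
static_assert(ceilDiv(10, 3) == 4, "partial last block is counted");
// 12 rows, block size 3: exactly four blocks.
static_assert(ceilDiv(12, 3) == 4, "even division adds no extra block");
// The incorrect variant reports five blocks for the same input.
static_assert(ceilDivWrong(12, 3) == 5, "off by one on even division");
```

Any vector whose length is derived from the block count inherits the error, which is consistent with the wrong-length X vector described above.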

RIDE:
#######################################################
PASSED TESTS
#######################################################
cuda-10.1.105-Cuda_OpenMP-release build_time=541 run_time=398
cuda-10.1.105-Cuda_Serial-release build_time=551 run_time=508
cuda-9.2.88-Cuda_OpenMP-release build_time=531 run_time=545
cuda-9.2.88-Cuda_Serial-release build_time=571 run_time=647
gcc-6.4.0-OpenMP_Serial-release build_time=207 run_time=382
gcc-7.2.0-OpenMP-release build_time=130 run_time=126
gcc-7.2.0-OpenMP_Serial-release build_time=183 run_time=356
gcc-7.2.0-Serial-release build_time=122 run_time=226
ibm-16.1.0-Serial-release build_time=504 run_time=392

Kokkos-dev2 (only failure is due to #645):
#######################################################
PASSED TESTS
#######################################################
clang-8.0-Pthread_Serial-release build_time=86 run_time=370
cuda-10.1-Cuda_OpenMP-release build_time=272 run_time=318
cuda-9.2-Cuda_Serial-release build_time=264 run_time=428
gcc-7.3.0-OpenMP-release build_time=60 run_time=117
gcc-7.3.0-Pthread-release build_time=58 run_time=186
gcc-8.3.0-Serial-release build_time=59 run_time=177
gcc-9.1-OpenMP-release build_time=72 run_time=109
gcc-9.1-Serial-release build_time=65 run_time=175
intel-18.0.5-OpenMP-release build_time=169 run_time=121
#######################################################
FAILED TESTS
#######################################################
clang-8.0-Cuda_OpenMP-release (test failed)

@brian-kelley self-assigned this Mar 9, 2020
Contributor

@ndellingwood left a comment

Thanks @brian-kelley! I like that you added the non-power-of-two block size test as well :)

@brian-kelley
Contributor Author

@ndellingwood Argh, I just noticed that I changed the block size in the rank-2 test so that it divides evenly into num rows, but that was not intentional. Block GS is intended to work when num rows is not a multiple of the block size; the issue here was the way ceiling integer division was done.

So I'll change that back and retest.
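
To make the uneven case concrete, here is a small sketch of how many rows fall into the last block when the block size does not divide the row count. The names (`numBlocksFor`, `lastBlockRows`) are hypothetical, and the sketch assumes the last block is simply shorter; the actual block GS code and test may handle the remainder differently (for example by padding).

```cpp
#include <cstddef>

// Hypothetical helpers for illustration only; not the Kokkos Kernels code.

// Number of blocks covering numRows rows (numRows > 0, blockSize > 0).
constexpr std::size_t numBlocksFor(std::size_t numRows, std::size_t blockSize) {
  return (numRows + blockSize - 1) / blockSize;
}

// Rows that actually belong to the final block under the stated assumption.
constexpr std::size_t lastBlockRows(std::size_t numRows, std::size_t blockSize) {
  return numRows - (numBlocksFor(numRows, blockSize) - 1) * blockSize;
}

// 10 rows with a non-power-of-two block size of 3: the fourth block holds 1 row.
static_assert(lastBlockRows(10, 3) == 1, "short final block");
// 12 rows with block size 3: the final block is full.
static_assert(lastBlockRows(12, 3) == 3, "full final block");
```

A test that only uses evenly dividing sizes exercises the second case alone, which is why restoring the uneven block size matters for coverage.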

(an example of kokkos#478, which is incorrect integer
division with rounding up)
@brian-kelley force-pushed the FixBlockGS_CUSPARSE branch from 7b5fa1b to fd03e31 on March 9, 2020 18:33
@brian-kelley
Contributor Author

Still good, so I'm merging:
#######################################################
PASSED TESTS
#######################################################
cuda-10.1.105-Cuda_OpenMP-release build_time=500 run_time=405
cuda-10.1.105-Cuda_Serial-release build_time=556 run_time=516
cuda-9.2.88-Cuda_OpenMP-release build_time=551 run_time=546
cuda-9.2.88-Cuda_Serial-release build_time=563 run_time=676
gcc-6.4.0-OpenMP_Serial-release build_time=221 run_time=386
gcc-7.2.0-OpenMP-release build_time=121 run_time=129
gcc-7.2.0-OpenMP_Serial-release build_time=196 run_time=361
gcc-7.2.0-Serial-release build_time=116 run_time=226
ibm-16.1.0-Serial-release build_time=521 run_time=392

#######################################################
PASSED TESTS
#######################################################
clang-8.0-Pthread_Serial-release build_time=92 run_time=423
cuda-10.1-Cuda_OpenMP-release build_time=272 run_time=327
cuda-9.2-Cuda_Serial-release build_time=274 run_time=441
gcc-7.3.0-OpenMP-release build_time=58 run_time=119
gcc-7.3.0-Pthread-release build_time=57 run_time=185
gcc-8.3.0-Serial-release build_time=59 run_time=179
gcc-9.1-OpenMP-release build_time=71 run_time=112
gcc-9.1-Serial-release build_time=66 run_time=181
intel-18.0.5-OpenMP-release build_time=186 run_time=121
#######################################################
FAILED TESTS
#######################################################
clang-8.0-Cuda_OpenMP-release (test failed)

@brian-kelley merged commit 89ebbdf into kokkos:develop on Mar 9, 2020
@brian-kelley deleted the FixBlockGS_CUSPARSE branch on July 30, 2020 17:03
2 participants