Fix the problem in QR. #696

Merged 3 commits into kokkos:develop on Apr 16, 2020

Conversation

kyungjoo-kim
Contributor

As seen in #691, three failure cases were reported.

  1. With TPL enabled, the test goes into my test code. This part is work in progress and is better left out of the test. Anyway, I fixed it.

  2. QR fails. It turns out that some random matrices are badly conditioned and the error exceeds the threshold. I made the random matrices diagonally dominant and now evaluate correctness element-wise so that the error does not grow with the problem size (see the sketch right after this list).

  3. Vectorization on Kokkos complex. When testing an odd vector length (e.g., 3), the aggressive vectorization with complex fails. This failure is expected, as the vectorization issues vector instructions without regard for correctness. Somehow it has passed the tests so far and only pops up now. I removed the aggressive vectorization for non-built-in types, i.e., complex (a generic sketch of this kind of type gating appears further below, after the commit message).
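
For item 2, here is a minimal standalone sketch of the idea (illustrative only; this is not the actual kokkos-kernels test code): boost the diagonal of the random matrix so it is diagonally dominant and well conditioned, and compare entries one at a time with a per-entry tolerance instead of a norm that grows with the matrix size.

#include <cmath>
#include <cstdio>
#include <limits>
#include <random>
#include <vector>

int main() {
  const int n = 10;
  std::mt19937 gen(0);
  std::uniform_real_distribution<double> dist(0.0, 1.0);

  // Random matrix with entries in [0,1]; adding n to the diagonal makes
  // |a_ii| larger than the sum of the off-diagonal magnitudes in its row.
  std::vector<double> A(n * n);
  for (auto &a : A) a = dist(gen);
  for (int i = 0; i < n; ++i) A[i * n + i] += n;

  // Element-wise residual check: each reconstructed entry (e.g., from Q*R
  // after a factorization, not computed here) is compared against A(i,j)
  // with a per-entry tolerance, instead of accumulating a norm over all
  // entries that would scale with the problem size.
  const double eps = 1.0e3 * std::numeric_limits<double>::epsilon();
  auto check = [&](int i, int j, double reconstructed) {
    const double diff = std::fabs(reconstructed - A[i * n + j]);
    const double tol  = eps * (1.0 + std::fabs(A[i * n + j]));
    if (diff > tol)
      std::printf("entry (%d,%d) fails: diff %e > tol %e\n", i, j, diff, tol);
  };
  check(0, 0, A[0]);  // trivially passes; a real test would pass Q*R entries
  return 0;
}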

@ndellingwood This fixes all the problems.

  As we use random matrices for testing, sometimes bad ones show up.
  We need to make the matrices easier to solve, and the residual check
  should be done element-wise so that it does not grow with the problem
  size.
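
Regarding item 3 above, here is a generic sketch of gating a vectorized code path on built-in arithmetic types (illustrative only; this is not the actual KokkosBatched implementation):

#include <complex>
#include <type_traits>

// Vectorized path: enabled only for built-in arithmetic types (float, double, ...).
// The pragma requires compiling with OpenMP (e.g., -fopenmp); it is shown only
// to illustrate the forced-vectorization path.
template <typename T>
typename std::enable_if<std::is_arithmetic<T>::value>::type
scale(T *x, const T alpha, const int n) {
#pragma omp simd
  for (int i = 0; i < n; ++i) x[i] *= alpha;
}

// Fallback for non-built-in types such as std::complex: a plain loop with no
// forced vector instructions, so correctness does not depend on the vector length.
template <typename T>
typename std::enable_if<!std::is_arithmetic<T>::value>::type
scale(T *x, const T alpha, const int n) {
  for (int i = 0; i < n; ++i) x[i] *= alpha;
}

int main() {
  double d[4] = {1.0, 2.0, 3.0, 4.0};
  std::complex<double> c[3] = {{1.0, 0.0}, {0.0, 1.0}, {1.0, 1.0}};
  scale(d, 2.0, 4);                        // takes the vectorized path
  scale(c, std::complex<double>(2.0), 3);  // takes the scalar fallback
  return 0;
}
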
@kyungjoo-kim self-assigned this Apr 16, 2020
Contributor

@ndellingwood left a comment

Looks good, thanks @kyungjoo-kim ! Do you have spot-check or test results to post?

@kyungjoo-kim
Contributor Author

kyungjoo-kim commented Apr 16, 2020

I used your reproducers to check that this resolves the problems. I am running the cm_test_all_sandia scripts on white and blake and will post the output.

@ndellingwood merged commit 94f2e5a into kokkos:develop Apr 16, 2020
@kyungjoo-kim
Contributor Author

kyungjoo-kim commented Apr 17, 2020

I ran the cm_test_all_sandia script on kokkos-dev-2. Almost all compilers are okay, except for a few failures that are not related to this PR. I am not sure whether this is a known issue, but I am reporting it here. The failure is a segfault in common_openmp and common_serial with Intel 17. I don't think it is deterministic, and I am not sure whether it is compiler specific (maybe it just happens with Intel 17). Sometimes it does not fail at all, sometimes only common_openmp fails, and sometimes both common_openmp and common_serial fail. However, all the failures are caught as a segfault in common_sort_crsgraph.

10: [ RUN      ] serial.common_sort_crsgraph
10/10 Test #10: common_serial ....................***Exception: SegFault  0.38 sec

Has anyone experienced the same error?

#######################################################
PASSED TESTS
#######################################################
cuda-10.0-Cuda_Pthread-release build_time=194 run_time=124
cuda-10.0-Cuda_Serial-release build_time=202 run_time=127
cuda-10.1-Cuda_Pthread-release build_time=197 run_time=124
cuda-10.1-Cuda_Serial-release build_time=203 run_time=127
cuda-9.2-Cuda_Pthread-release build_time=194 run_time=134
cuda-9.2-Cuda_Serial-release build_time=202 run_time=138
intel-15.0.2-OpenMP-release build_time=117 run_time=35
intel-15.0.2-OpenMP_Serial-release build_time=146 run_time=97
intel-15.0.2-Pthread-release build_time=92 run_time=56
intel-15.0.2-Pthread_Serial-release build_time=117 run_time=110
intel-15.0.2-Serial-release build_time=86 run_time=56
intel-16.0.1-OpenMP-release build_time=133 run_time=36
intel-16.0.1-OpenMP_Serial-release build_time=170 run_time=107
intel-16.0.1-Pthread-release build_time=100 run_time=65
intel-16.0.1-Pthread_Serial-release build_time=128 run_time=129
intel-16.0.1-Serial-release build_time=97 run_time=65
intel-17.0.1-Pthread-release build_time=111 run_time=58
intel-17.0.1-Pthread_Serial-release build_time=143 run_time=115
intel-17.0.1-Serial-release build_time=107 run_time=58
intel-18.0.5-OpenMP-release build_time=175 run_time=37
intel-18.0.5-OpenMP_Serial-release build_time=224 run_time=98
intel-18.0.5-Pthread-release build_time=125 run_time=59
intel-18.0.5-Pthread_Serial-release build_time=170 run_time=119
intel-18.0.5-Serial-release build_time=113 run_time=57
intel-19.0.5-OpenMP-release build_time=191 run_time=36
intel-19.0.5-OpenMP_Serial-release build_time=229 run_time=98
intel-19.0.5-Pthread-release build_time=202 run_time=57
intel-19.0.5-Pthread_Serial-release build_time=232 run_time=119
intel-19.0.5-Serial-release build_time=190 run_time=59
#######################################################
FAILED TESTS
#######################################################
intel-17.0.1-OpenMP-release (test failed)
#######################################################
[kyukim @kokkos-dev-2] test-all > ../../../kyukim/scripts/cm_test_all_sandia cuda intel 
Running on machine: kokkos-dev-2
KokkosKernels Repository Status:  3f245f02bbdd032ecb02b9be8dc8d8dc5304e5bf Merge branch 'qr-develop' of https://github.com/kyungjoo-kim/kokkos-kernels into qr-develop

Kokkos Repository Status:  7c1ab6cc7066d6b7efecdfd8cfc54482cae65caf Merge pull request #2957 from aprokop/install_trilinos_version

...
5/5 Test #5: common_openmp ....................***Exception: SegFault  0.21 sec
...
The following tests FAILED:
	  5 - common_openmp (SEGFAULT)
Errors while running CTest
  FAILED intel-17.0.1-OpenMP_Serial-release
The following tests FAILED:
	  5 - common_openmp (SEGFAULT)
	 10 - common_serial (SEGFAULT)
Errors while running CTest

@ndellingwood
Contributor

Sometimes it does not fail at all, sometimes only common_openmp fails, and sometimes both common_openmp and common_serial fail. However, all the failures are caught as a segfault in common_sort_crsgraph.

@brian-kelley are you able to take a look and try reproducing on kokkos-dev-2? @kyungjoo-kim is seeing intermittent seg faults with intel/17.0.1 in common_sort_crsgraph added in PR #663

@brian-kelley
Contributor

@ndellingwood @kyungjoo-kim Yes, I was able to replicate this (KokkosKernels_common_openmp crashes consistently for me). The CRS sorting stuff was merged in #663 about a month ago, but I only ran the spot-checks for that so Intel 17 was not used (so I suspect this is a compiler issue). I can still try to debug it though, there might be an actual bug or a workaround.

@brian-kelley
Contributor

It was actually crashing inside std::sort, when sorting a std::pair with a lambda as a custom comparator. I replaced the pair with a struct and operator<, and now it doesn't crash. I'll put in a PR with the workaround but it doesn't block this from being merged.
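
A minimal sketch of that workaround pattern (the names here are illustrative, not the actual kokkos-kernels code):

#include <algorithm>
#include <cstdio>
#include <vector>

// Sort CRS-style (column, value) entries by column index with a struct that
// defines operator<, instead of std::pair plus a lambda comparator.
struct Entry {
  int col;
  double val;
  bool operator<(const Entry &rhs) const { return col < rhs.col; }
};

int main() {
  std::vector<Entry> row = {{3, 0.5}, {1, 2.0}, {2, -1.0}};
  std::sort(row.begin(), row.end());  // uses Entry::operator<, no lambda needed
  for (const Entry &e : row) std::printf("(%d, %g) ", e.col, e.val);
  std::printf("\n");
  return 0;
}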

@srajama1
Contributor

Let us merge Brian's PR first and make sure QR passes all tests before merging. This is the second iteration, so better to be safe.

@brian-kelley
Contributor

@kyungjoo-kim @srajama1 My change #698 did fix all the Intel 17.0.1 builds on kokkos-dev2, but I'm seeing a couple of new issues:

2: [ RUN      ] cuda.batched_scalar_team_vector_qr_double
2: /ascldap/users/bmkelle/StdSortWorkaround-testing-1587158991/kokkos-kernels/unit_test/../test_common/KokkosKernels_TestUtils.hpp:88: Failure
2: The difference between double(AT1::abs(val1)) and double(AT2::abs(val2)) is 2, which exceeds double(AT3::abs(tol)), where 
2: double(AT1::abs(val1)) evaluates to 2, 
2: double(AT2::abs(val2)) evaluates to 0, and
2: double(AT3::abs(tol)) evaluates to 2.2204460492503131e-13.

This is CUDA 9.2.88 CUDA/OpenMP release, so it has UVM enabled. Maybe just a missing fence in the test? I had a bunch of those that I needed to fix. This test passed in all the kokkos-dev2 CUDA builds so far.
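
For reference, a hedged sketch of the missing-fence pattern suspected here (illustrative only; it assumes a CUDA build with UVM available, and is not the actual unit test):

#include <Kokkos_Core.hpp>
#include <cstdio>

int main(int argc, char *argv[]) {
  Kokkos::initialize(argc, argv);
  {
    const int n = 1024;
    // UVM allocation: both host and device can touch the data.
    Kokkos::View<double *, Kokkos::CudaUVMSpace> x("x", n);
    Kokkos::parallel_for(
        "fill", n, KOKKOS_LAMBDA(const int i) { x(i) = 2.0 * i; });
    // Device kernels launch asynchronously; without this fence the host loop
    // below may read x before the kernel has finished writing it.
    Kokkos::fence();
    int errors = 0;
    for (int i = 0; i < n; ++i)
      if (x(i) != 2.0 * i) ++errors;
    std::printf("errors: %d\n", errors);
  }
  Kokkos::finalize();
  return 0;
}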

The GCC 4.8.4 warnings appear in a bunch of places; here are a couple of them:

/ascldap/users/bmkelle/StdSortWorkaround-fulltest-1587158944/kokkos-kernels/perf_test/sparse/KokkosSparse_sptrsv_superlu.cpp:705:14: error: invalid suffix on literal; C++11 requires a space between literal and identifier [-Werror=literal-suffix]
       #error "Invalid type specified in KOKKOSKERNELS_SCALARS, supported types are "double,complex<double>""
              ^
                 from /ascldap/users/bmkelle/StdSortWorkaround-fulltest-1587158944/Testing/TestAll_2020-04-17_14.29.10/gcc/4.8.4/OpenMP-release/src/impl/generated_specializations_cpp/trsm/Blas3_trsm_eti_DOUBLE_LAYOUTLEFT_EXECSPACE_OPENMP_MEMSPACE_HOSTSPACE.cpp:48:
/ascldap/users/bmkelle/StdSortWorkaround-fulltest-1587158944/kokkos-kernels/src/batched/KokkosBatched_Vector_SIMD.hpp:49:0: error: ignoring #pragma omp simd [-Werror=unknown-pragmas]
 #pragma omp simd 

So I can look into fixing those too.

@srajama1
Contributor

@kyungjoo-kim can take care of the QR test as part of the other PR.

@iyamazaki Can you look at the sptrsv warnings, please?

@srajama1
Contributor

@ndellingwood Why was this merged before the spot-checks? There are still failures (see above).

@iyamazaki
Contributor

I wonder how this is getting compiled. For SpTRSV, I thought that part of the code is still protected by

#if defined(KOKKOSKERNELS_ENABLE_TPL_SUPERLU) && \
    defined(KOKKOSKERNELS_ENABLE_SUPERNODAL_SPTRSV)

and should not be compiled in?

We'll try to address this in another PR (e.g., in PR 680). Please let us know if you find more details related to the SpTRSV warnings. Thank you!!

@ndellingwood
Contributor

@ndellingwood Why was this merged before the spot-checks? There are still failures (see above).

This PR fixed the QR errors reported in #691. When I asked about the spot-check, @kyungjoo-kim reported that the PR fixed the #691 failures in his testing, and it was merged a couple of days ago. I'm not sure why Kyungjoo had not encountered the batched_scalar_team_vector_qr_double failure Brian later reported; I merged based on the report that this PR addressed the QR failures, but that test slipped through. Aside from that test, the other errors reported above are not due to any changes in this PR; they are pre-existing errors from past merges and shouldn't block this PR. Separate issues should be opened to track them.

@ndellingwood
Contributor

This is CUDA 9.2.88 CUDA/OpenMP release, so it has UVM enabled. Maybe just a missing fence in the test? I had a bunch of those that I needed to fix. This test passed in all the kokkos-dev2 CUDA builds so far.

@brian-kelley did the batched_scalar_team_vector_qr_double test fail consistently for you on Ride? The White nightlies testing cuda/9.2 with Cuda_OpenMP began passing again after merge of this PR (all the nightlies reported in #691 resumed passing after the merge). Could you post reproducer instructions in a new issue, including the queue and options passed to cm_test_all_sandia (or other instructions if you encountered this in a different build)?

@srajama1
Contributor

@ndellingwood Your merge appears even before Kyungjoo's spot-check output, which had errors that Brian is looking at. Let us not merge anything before the spot-check unless it is really low impact, like fixing a comment. This set of changes is causing so many stability issues that if someone reviews us at the end of the year, this will show up at the top.

@ndellingwood
Contributor

Your merge appears even before Kyungjoo's spot-check output, which had errors that Brian is looking at

Kyungjoo confirmed this PR fixed the issues reported in #691; it wasn't clear that spot-check results were going to be posted. I'll hold off on merges unless there are spot-check results on future PRs, but what Kyungjoo reported in the later test results wasn't caused by this PR, and merging this PR addressed the failing nightly tests reported in #691. We need to find the gaps in the nightlies that aren't detecting what Brian and Kyungjoo reported here.

@brian-kelley
Contributor

brian-kelley commented Apr 20, 2020

@kyungjoo-kim The batched QR failed in the same way on the second run, so I don't think it's random (or at least, it happens often). Here is how to reproduce on RIDE (White should be exactly the same, but I haven't actually tried it there).

module load cmake/3.12.3 cuda/9.2.88 gcc/7.2.0 ibm/xl/16.1.0
# Run on rhel7F queue (Power8 + Kepler K80)
bsub -Is -q rhel7F bash
export KOKKOS_SRC=...
export KOKKOSKERNELS_SRC=...
$KOKKOSKERNELS_SRC/cm_generate_makefile.bash --with-devices=Cuda,Serial --arch=Power8,Kepler37 --compiler=$KOKKOS_SRC/bin/nvcc_wrapper --cxxflags="-O3 -Wall -Wshadow -pedantic -Werror -Wsign-compare -Wtype-limits -Wuninitialized " --cxxstandard="11" --ldflags="" --with-cuda=/home/projects/ppc64le-pwr8-nvidia/cuda/9.2.88 --kokkos-path=$KOKKOS_SRC --kokkoskernels-path=$KOKKOSKERNELS_SRC --with-scalars='double,complex_double' --with-ordinals= --with-offsets= --with-layouts= --with-tpls=    --with-options= --with-cuda-options=force_uvm --no-examples --with-options=enable_large_mem_tests

make -j32

cd unit_test
# make sure CUDA_MANAGED_FORCE_DEVICE_ALLOC and CUDA_LAUNCH_BLOCKING
# are not defined, or are defined to "0"
./KokkosKernels_batched_dla_cuda

EDIT: same failure if $CUDA_LAUNCH_BLOCKING and $CUDA_MANAGED_FORCE_DEVICE_ALLOC both exported to "1", so that's not the issue.

The test that fails is "cuda.batched_scalar_team_vector_qr_double" which shouldn't involve complex at all, but the build does enable double and complex_double as scalars. I'm doing another build with just double enabled, to see if it still happens. That could be the hole in nightlies, not sure.

@brian-kelley
Contributor

@iyamazaki The issue with SPTRSV was just that a string literal in a macro didn't have quotes escaped. I fixed it in #698.
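
For context, a minimal sketch of that kind of fix (the exact wording in #698 may differ): escape the inner quotes so GCC 4.8 sees a single string literal instead of a literal followed by an identifier.

// Before: the string literal ends at `are "` and the following `double...`
// is parsed as a (C++11 user-defined) literal suffix, which triggers
// -Werror=literal-suffix on GCC 4.8:
//   #error "Invalid type specified in KOKKOSKERNELS_SCALARS, supported types are "double,complex<double>""
// After: escaping the inner quotes keeps the whole message in one literal.
#if 0  /* guarded so this translation unit still compiles */
#error "Invalid type specified in KOKKOSKERNELS_SCALARS, supported types are \"double, complex<double>\""
#endif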

@kyungjoo-kim
Contributor Author

@brian-kelley Let me try to reproduce it on white.

@brian-kelley
Contributor

@kyungjoo-kim You can remove the --with-scalars='double,complex_double', since I just checked the double-only build and it was still reproduced.

@kyungjoo-kim
Contributor Author

@brian-kelley White is so busy that I cannot grab a Kepler node. Do you also encounter the error on a Pascal node?

@brian-kelley
Contributor

@kyungjoo-kim I'm not sure; I didn't try on a Pascal node of RIDE. I didn't see the bug happening on my workstation (Skylake + Pascal + force_uvm), though.

@ndellingwood
Contributor

@kyungjoo-kim @brian-kelley the cuda.batched_scalar_team_vector_qr_double had its first failure in the nightlies on kokkos-dev Sat night; it appears to be random, as the same nightly test resumed passing last night.

Here's the failure: https://jenkins-son.sandia.gov/job/KokkosKernels_KokkosDev_CLANG7_CUDA9/47/console

Reproducer:

ssh kokkos-dev

module load sems-env sems-cmake/3.12.2 kokkos-env kokkos-hwloc/1.10.1/base sems-clang/7.0.1 sems-cuda/9.2

$KOKKOSKERNELS_PATH/cm_generate_makefile.bash --with-devices=Cuda,OpenMP --arch=Kepler35 --compiler=clang++ --cxxflags="-O3 -Wall -Wshadow -pedantic -Werror -Wsign-compare -Wtype-limits -Wuninitialized " --cxxstandard="11" --ldflags="" --with-cuda --kokkos-path=$KOKKOS_PATH --kokkoskernels-path=$KOKKOSKERNELS_PATH --with-scalars='' --with-ordinals= --with-offsets= --with-layouts= --with-tpls=    --with-options=disable_deprecated_code --with-cuda-options= --no-examples 

Based on where the test is failing, it seems that cuda/9.2.88, a Cuda_OpenMP build, and a Kepler GPU are the commonalities?

@kyungjoo-kim
Contributor Author

@ndellingwood Thanks. It is good that I may also be able to reproduce it on kokkos-dev-2. I will see whether I can reproduce it there and fix the problem. Sorry for this problem.

@brian-kelley
Contributor

@ndellingwood I also observed it on Cuda_Serial on RIDE, but otherwise yes.

@ndellingwood
Contributor

@kyungjoo-kim it was on kokkos-dev (the older machine, not kokkos-dev-2). It is important to reproduce there with the Kepler GPUs, as it doesn't seem to occur with Volta (or occurs with low enough frequency that we haven't seen it yet).

@kyungjoo-kim
Contributor Author

@ndellingwood Unfortunately, I cannot reproduce the failure. I ran it multiple times and the test passes. I also cannot access the Jenkins failure link. Would you mind posting a more detailed error message from Jenkins?

@kyungjoo-kim
Contributor Author

@brian-kelley How frequently did you see the failure on RIDE? @ndellingwood says that it fails randomly, but you seem to see the error more often. When we test kokkos-kernels, we always test against the kokkos master branch, right? I just want to match the same environment exactly.

@brian-kelley
Contributor

@kyungjoo-kim On RIDE Kepler (rhel7F queue) it happened every run for me. I was testing against kokkos develop branch.

@ndellingwood
Contributor

When we test kokkos-kernels, we always test against the kokkos master branch, right?

@kyungjoo-kim the nightlies test against kokkos' develop branch. As far as the nightlies go, this seems to be random; it has only failed once since last week on kokkos-dev (which tests on Kepler). Hopefully the White rhel7F queue is available, since @brian-kelley sees this every run there, though I think there may be only one Kepler node remaining, so it can get congested. On kokkos-dev, if you test with Brian's reproducer instructions (use --arch=Kepler35), hopefully it shows up consistently there as well?

@ndellingwood
Contributor

ndellingwood commented Apr 20, 2020

Would you mind posting a more detailed error message from Jenkins?

@kyungjoo-kim sure, here's the snip of the test failure:

04:35:09 2: [ RUN      ] cuda.batched_scalar_team_vector_qr_double
04:35:09 2: /home/jenkins/slave/workspace/KokkosKernels_KokkosDev_CLANG7_CUDA9/kokkos-kernels/unit_test/../test_common/KokkosKernels_TestUtils.hpp:88: Failure
04:35:09 2: The difference between double(AT1::abs(val1)) and double(AT2::abs(val2)) is 2, which exceeds double(AT3::abs(tol)), where
04:35:09 2: double(AT1::abs(val1)) evaluates to 2,
04:35:09 2: double(AT2::abs(val2)) evaluates to 0, and
04:35:09 2: double(AT3::abs(tol)) evaluates to 2.2204460492503131e-13.
04:35:09 2: [  FAILED  ] cuda.batched_scalar_team_vector_qr_double (37 ms)
04:35:09 2: [----------] 48 tests from cuda (4911 ms total)
04:35:09 2: 
04:35:09 2: [----------] Global test environment tear-down
04:35:09 2: [==========] 48 tests from 1 test case ran. (4911 ms total)
04:35:09 2: [  PASSED  ] 47 tests.
04:35:09 2: [  FAILED  ] 1 test, listed below:
04:35:09 2: [  FAILED  ] cuda.batched_scalar_team_vector_qr_double
04:35:09 2: 
04:35:09 2:  1 FAILED TEST
04:35:09  2/10 Test  #2: batched_dla_cuda .................***Failed   10.10 sec

Edit: Adding Jenkins link

@brian-kelley
Contributor

@ndellingwood @kyungjoo-kim I actually only ran on RIDE, since its Kepler queue is much less busy than White's.

@ndellingwood
Contributor

Oops, I translated RIDE to White, my mistake.

@kyungjoo-kim
Contributor Author

@brian-kelley Okay. It is tricky. The nightly tests on the White Kepler node do not report the same failure, and kokkos-dev shows random behavior (although I cannot reproduce it myself), but the test on RIDE consistently reproduces the failure. The only way I can fix this problem is to use RIDE, but I do not have access to the machine. Sigh....

Ah... BTW, it only reports a single entry comparison as wrong. The "2" means that there is a sign change. I think this comes from a trivial 1x1 matrix equation. I can take a further look, but at some point I will need help from @brian-kelley or @ndellingwood, who can access RIDE.

@ndellingwood
Contributor

The White nightlies are actually running on the Pascal queue due to the congestion issues; I had to duplicate the Kokkos jobs to run on Pascal as well, because the nightlies on the Kepler queues were aborting.

@kyungjoo-kim
Contributor Author

@ndellingwood Is the failure indeed a single comparison failure, or is that just a copy-paste of one case? My test code tests 1024 samples. If this indeed happens on a single problem, it means that the same code runs fine in most cases and fails on a single sample with a sign flip.

@ndellingwood
Contributor

@kyungjoo-kim I double checked: it is one comparison failure, and the copy+paste is the full failure message.

@kyungjoo-kim
Contributor Author

kyungjoo-kim commented Apr 20, 2020

@ndellingwood Okay. That helps a lot; I can narrow the problem scope considerably.

@brian-kelley Can I ask you to test my branch on RIDE, since RIDE is the machine that consistently reports the failure?

[kyukim @kokkos-dev] master > git remote -v 
origin	https://github.com/kyungjoo-kim/kokkos-kernels.git (fetch)
origin	https://github.com/kyungjoo-kim/kokkos-kernels.git (push)
[kyukim @kokkos-dev] master > git branch 
  master
* qr-develop
commit 03c27cb37c409110d7c49b39b62578e08314bc6c
Author: Kyungjoo Kim <kyukim@kokkos-dev.sandia.gov>
Date:   Mon Apr 20 12:34:47 2020 -0600

    KokkosBatched - remove some meaningless fence

commit df3369c8b6307fb899d6c5ab6e14316f638c9561
Author: Kyungjoo Kim <kyukim@kokkos-dev.sandia.gov>
Date:   Mon Apr 20 12:32:30 2020 -0600

    KokkosBatched - single might be required

@brian-kelley
Contributor

@kyungjoo-kim Yes, I'll let you know what happens.

@brian-kelley
Contributor

@kyungjoo-kim It's not completely done yet, but cuda-9.2.88-Cuda_OpenMP-release and cuda-9.2.88-Cuda_Serial-release have both passed so far, so I think you fixed it. I'll go in and run those unit tests in a loop to make sure.

@kyungjoo-kim
Contributor Author

@brian-kelley Thanks for letting me know. If this fixes the problem, would you include the modifications in your PR?

@brian-kelley
Contributor

@kyungjoo-kim Yes, I'll add it. I'm sure it fixed it because I ran KokkosKernels_batched_dla_cuda 100 times in a loop (RIDE/kepler) and they all passed.

@brian-kelley
Contributor

@srajama1 @ndellingwood I'm just gonna add float and complex_double to the KokkosKernels_KokkosDev2_CUDA10_1 build on SON jenkins, since that build only takes 6 minutes now, and this would have caught trilinos/Trilinos#7206 earlier.

@ndellingwood
Contributor

@brian-kelley sounds good. Based on the status in that issue, the test will keep failing until the kokkos changes go in; does that sound right?

@brian-kelley
Contributor

@ndellingwood That's true, should I just change it back for now and wait for the kokkos fix? I don't want everybody to get spammed with emails.

@ndellingwood
Contributor

should I just change it back for now and wait for the kokkos fix

@brian-kelley how about changing that test back, then creating a duplicate of that test with the float and complex_double coverage (no need to schedule it for a different time since the test is so short), but only adding you and me to the email list for now. We can add others to the list once it begins passing.

@brian-kelley
Contributor

@ndellingwood Sounds good.

@brian-kelley
Contributor

@ndellingwood OK, that build is set up and replicated the errors in trilinos/Trilinos#7206, so we'll know when the Kokkos issue gets fixed.
