Tpetra: Failing tests without UVM #8209
Comments
@cgcgcg can you add any other Tpetra tests that fail on the nightly test you added recently?
Sure, the nightlies will run once PR #8191 allows the Tpetra build to go through.
Here are the results from the nightly test.
@cgcgcg thanks! Noting for the record that there are 103 failing Tpetra tests on the dashboard, while my list only has 92. I'll probably try harder to sort out what's up with that once we have fewer failing tests.
Well, I got us down to 102 :)
Tests failing in CrsGraph functions with UVM disabled, and their stack traces. This list is incomplete, as valgrind hasn't finished running the unit tests.
TpetraCore_BlockCrsMatrix.exe
Stack traces:
TpetraCore_BlankRowBugTest.exe
TpetraCore_CrsGraph_UnitTests0.exe
TpetraCore_CrsGraph_UnitTests1.exe
TpetraCore_CrsGraph_UnitTests_Swap.exe
TpetraCore_CrsGraph_ReindexColumns.exe
TpetraCore_Issue601.exe
TpetraCore_CrsGraph_insertGlobalIndicesFiltered.exe
TpetraCore_CrsGraph_PackUnpack.exe
TpetraCore_CrsGraph_getNumDiags.exe
TpetraCore_CrsGraph_UnpackIntoStaticGraph.exe
TpetraCore_CrsGraph_StaticImportExport.exe
TpetraCore_CrsGraph_UnpackMerge.exe
TpetraCore_CrsMatrix_UnitTests.exe
TpetraCore_CrsMatrix_UnitTests2.exe
TpetraCore_CrsMatrix_UnitTests3.exe
TpetraCore_CrsMatrix_UnitTests4.exe
TpetraCore_CrsMatrix_UnitTests_Swap.exe
TpetraCore_CrsMatrix_NonlocalAfterResume.exe
TpetraCore_CrsMatrix_LeftRightScale.exe
TpetraCore_CrsMatrix_2DRandomDist.exe
TpetraCore_CrsMatrix_WithGraph_Cuda.exe
TpetraCore_CrsMatrix_ReplaceDomainMapAndImporter.exe
TpetraCore_CrsMatrix_NonlocalSumInto.exe
TpetraCore_CrsMatrix_NonlocalSumInto_Ignore.exe
TpetraCore_CrsMatrix_gaussSeidel.exe
TpetraCore_CrsMatrix_Bug5978.exe
TpetraCore_CrsMatrix_Bug6069_1.exe
TpetraCore_CrsMatrix_Bug6069_2.exe
TpetraCore_CrsMatrix_Bug6171.exe
TpetraCore_CrsMatrix_ReplaceLocalValues.exe
TpetraCore_CrsMatrix_ReplaceDiagonal.exe
TpetraCore_CrsMatrix_MultipleFillCompletes.exe
TpetraCore_CrsMatrix_ReindexColumns.exe
TpetraCore_CrsMatrix_TransformValues.exe
TpetraCore_CrsMatrix_GetRowCopy.exe
TpetraCore_CrsMatrix_PackUnpack.exe
TpetraCore_Equilibration.exe
TpetraCore_CrsMatrix_StaticImportExport.exe
TpetraCore_sumIntoStaticProfileExtraSpace.exe
TpetraCore_CrsMatrix_createDeepCopy.exe
TpetraCore_CrsMatrix_UnpackMerge.exe
TpetraCore_CrsMatrix_Bug7745.exe
TpetraCore_CrsMatrix_RemoveEmptyProcesses.exe
TpetraCore_Albany182.exe
The rest of the CrsGraph-related failures, which cause SIGABRTs in the unit tests:
TpetraCore_Distributor_CreateFromSendsAndRecvs.exe
TpetraCore_FECrsGraph_UnitTests.exe
TpetraCore_FEMultiVector_UnitTests.exe
TpetraCore_ImportExport_ImportConstructExpert.exe
TpetraCore_UnpackLongRows.exe
TpetraCore_ExportToStaticGraphCrsMatrix.exe
TpetraCore_MatrixMarket_Tpetra_CrsGraph_InOutTest.exe
TpetraCore_MatrixMarket_Operator_Test.exe
TpetraCore_Map_ExportTest_Bug5882.exe
TpetraCore_MatrixMatrix_UnitTests.exe
TpetraCore_FECrs_MatrixMatrix_UnitTests.exe
TpetraCore_RowMatrixTransposer_test.exe
TpetraCore_CrsMatrix_transpose_sortedRows.exe
Non-CrsGraph-related failures, with stack traces (where applicable) or results (where no stack trace is useful):
TpetraCore_BlockView.exe
TpetraCore_BugTests.exe
TpetraCore_SubmapImportTests.exe
TpetraCore_ReverseCommunication_Issue227.exe
TpetraCore_ImportExport_UnitTests.exe
TpetraCore_Import_Union.exe
TpetraCore_SubmapExportTests.exe
TpetraCore_Bug6170.exe
TpetraCore_Issue_114.exe
TpetraCore_Issue_3968.exe
TpetraCore_MatrixMarket_Tpetra_CrsMatrix_InOutTest.exe
TpetraCore_Bug6288.exe
TpetraCore_MultiVector_UnitTests.exe
  Error, relErr(norms1[1],ans[1]) = relErr(0,4) = 1 <= tol = 0: failed!
  Error, relErr(norms1[2],ans[2]) = relErr(0,4) = 1 <= tol = 0: failed!
  Error, relErr(norms2[1],ans[1]) = relErr(0,4) = 1 <= tol = 0: failed!
  Error, relErr(norms2[2],ans[2]) = relErr(0,4) = 1 <= tol = 0: failed!
  Error, relErr(norms1[1],ans[1]) = relErr(0,8) = 1 <= tol = 0: failed!
  Error, relErr(norms1[2],ans[2]) = relErr(0,8) = 1 <= tol = 0: failed!
  Error, relErr(norms2[1],ans[1]) = relErr(0,8) = 1 <= tol = 0: failed!
  Error, relErr(norms2[2],ans[2]) = relErr(0,8) = 1 <= tol = 0: failed!
TpetraCore_Bug7758.exe
TpetraCore_Bug7745.exe
TpetraCore_rcb.exe
TpetraCore_Issue364.exe
*TpetraCore_MV_reduce_strided.exe
  Error: Kokkos::deep_copy with no available copy mechanism: Z4 to Z4
  [FAILED] (1.34e+03 sec) MultiVector_double_int_longlong_Kokkos_Compat_KokkosCudaWrapperNode_reduce_strided_UnitTest
TpetraCore_RowMatrixTransposer_UnitTests.exe
  p=0: *** Caught standard std::exception of type 'std::invalid_argument' :
  /ascldap/users/gcdanie/Trilinos/packages/tpetra/core/inout/MatrixMarket_Tpetra.hpp:2288:
  Throw number = 2
  Throw test that evaluated to true: bannerIsCorrect == 0
  Attempt to read the Matrix Market file's Banner line threw an exception:
  /ascldap/users/gcdanie/Trilinos/packages/tpetra/core/inout/MatrixMarket_Tpetra.hpp:1034:
  Throw number = 1
  Throw test that evaluated to true: readFailed
  Failed to get Matrix Market banner line from input.
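For anyone reading the relErr lines in the TpetraCore_MultiVector_UnitTests.exe output: each one reports a computed norm of 0 where 4 or 8 was expected, checked against a tolerance of 0. The sketch below reproduces that arithmetic. It assumes the relative error is |a - b| / max(|a|, |b|), which matches the printed value of 1 but is not taken from the test source. The names `rel_err` and `passes` are hypothetical helpers, not Tpetra or Teuchos API.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Hypothetical helper (assumed definition): relative error between a computed
// value and a reference, |a - b| / max(|a|, |b|). A tiny positive floor on the
// denominator makes 0 vs 0 evaluate to 0 instead of NaN.
inline double rel_err(double computed, double expected) {
  const double floor_value = std::numeric_limits<double>::min();
  const double denom =
      std::max({std::abs(computed), std::abs(expected), floor_value});
  return std::abs(computed - expected) / denom;
}

// Mirrors the "relErr(...) <= tol" comparison printed in the log.
inline bool passes(double computed, double expected, double tol) {
  return rel_err(computed, expected) <= tol;
}
```

With these definitions, `rel_err(0.0, 4.0)` and `rel_err(0.0, 8.0)` both evaluate to 1, so `passes(..., 0.0)` is false, as in the log. A relative error of exactly 1 means the computed norm was identically zero rather than slightly off; the log alone does not say why.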
Bug Report
@trilinos/tpetra @cgcgcg
Description
Starting this issue to document failing tests in Tpetra when UVM is not the default Cuda memory space.
The current list on my local platform (V100) is:
(first column indicates someone is working on it, second column indicates that the call reaches down into CrsGraph/CrsMatrix)
27: TpetraCore_BlockMultiVector_MPI_4 (Timothy -- fixed by Tpetra: Add missing syncs/modifies in BlockMultiVector unit tests #8499)
31: TpetraCore_BlockView_MPI_1 (passes on develop as of 3/2/21, not sure which PR to credit)
102: TpetraCore_applyDirichlet_MPI_4 (Chris -- fixed by Tpetra: applyDirichlet BCs helper routine update #8346)
135: TpetraCore_Issue_607_MPI_4 (Timothy, Karen -- fixed by Tpetra: Fix some illegal device accesses in FixedHashTable #8267)
150: TpetraCore_CooMatrix_MPI_2 (Timothy, Karen -- fixed by Tpetra: Fix some illegal device accesses in FixedHashTable #8267)
153: TpetraCore_Map_Bug2431_MPI_4 (Timothy, Karen -- fixed by Tpetra: Fix some illegal device accesses in FixedHashTable #8267)
156: TpetraCore_Map_Bug6051_MPI_2 (Timothy, Karen -- fixed by Tpetra: Fix some illegal device accesses in FixedHashTable #8267)
159: TpetraCore_Map_OneToOne_MPI_2 (Timothy, Karen -- fixed by Tpetra: Fix some illegal device accesses in FixedHashTable #8267)
161: TpetraCore_Map_Bug5822_2_MPI_2 (Timothy, Karen -- fixed by Tpetra: Fix some illegal device accesses in FixedHashTable #8267)
177: TpetraCore_MultiVector_UnitTests_MPI_4 (Timothy -- fixed by Tpetra: Fix modify/sync issues in MultiVector and unit tests #8427)