Zoltan2: memory leak in Zoltan2's AlgScotch with 32-bit IDs #9312

Closed
kddevin opened this issue Jun 18, 2021 · 0 comments

kddevin commented Jun 18, 2021

Bug Report

@trilinos/zoltan2

Description

@tasmith4 reports a memory leak when running Zoltan2 with 32-bit Scotch.

28 bytes in 1 blocks are definitely lost
==150320== 28 bytes in 1 blocks are definitely lost in loss record 8 of 19
==150320==    at 0x4C2AB28: operator new[](unsigned long) (vg_replace_malloc.c:433)
==150320==    by 0x660773: ASSIGN_ARRAY (Zoltan2_TPLTraits.hpp:98)
==150320==    by 0x660773: Zoltan2::AlgPTScotch::partition(Teuchos::RCP<zoltan2::partitioningsolution > const&) (Zoltan2_AlgScotch.hpp:423)
==150320==    by 0x664DB0: Zoltan2::PartitioningProblem::solve(bool) (Zoltan2_PartitioningProblem.hpp:639)
==150320==    by 0x630BFC: stk::balance::internal::get_multicriteria_graph_based_decomp(stk::mesh::BulkData const&, stk::mesh::Selector, ompi_communicator_t* const&, stk::balance::BalanceSettings const&, stk::mesh::impl::LocalIdMapperT const&, Zoltan2ParallelGraph&, Teuchos::ParameterList&, std::vector<std::pair, std::allocator<std::pair > >&) (privateDeclarations.cpp:816)
==150320==    by 0x63134D: stk::balance::internal::fill_decomp_using_graph_based_method(stk::mesh::BulkData&, std::vector > const&, ompi_communicator_t* const&, int, stk::balance::BalanceSettings const&, std::vector<std::pair, std::allocator<std::pair > >&) (privateDeclarations.cpp:1192)
==150320==    by 0x631B11: stk::balance::internal::calculateGeometricOrGraphBasedDecomp(stk::mesh::BulkData&, std::vector > const&, ompi_communicator_t* const&, int, stk::balance::BalanceSettings const&, std::vector<std::pair, std::allocator<std::pair > >&) (privateDeclarations.cpp:1298)
==150320==    by 0x56A89D: balance_mesh (UnitTestStkBalancePartitioning.cpp:53)
==150320==    by 0x56A89D: StkBalancePartitioning::balance_mesh_scotch(ompi_communicator_t* const&, int, std::vector > const&) (UnitTestStkBalancePartitioning.cpp:74)
==150320==    by 0x567872: StkBalancePartitioning_6Elem1ProcMesh_EntireDomain_Scotch_Test::TestBody() (UnitTestStkBalancePartitioning.cpp:310)
==150320==    by 0x1200909: HandleSehExceptionsInMethodIfSupported (gtest.cc:2433)
==150320==    by 0x1200909: void testing::internal::HandleExceptionsInMethodIfSupported(testing::Test*, void (testing::Test::*)(), char const*) (gtest.cc:2469)
==150320==    by 0x11F7730: testing::Test::Run() [clone .part.573] (gtest.cc:2508)
==150320==    by 0x11F7AE1: Run (gtest.cc:2499)
==150320==    by 0x11F7AE1: testing::TestInfo::Run() [clone .part.574] (gtest.cc:2684)
==150320==    by 0x11F7DE4: Run (gtest.cc:2810)
==150320==    by 0x11F7DE4: testing::TestSuite::Run() [clone .part.575] (gtest.cc:2816)
==150320==    by 0x11F8C4A: Run (gtest.cc:5652)
==150320==    by 0x11F8C4A: testing::internal::UnitTestImpl::RunAllTests() (gtest.cc:5338)
==150320==    by 0x11F8DCE: HandleSehExceptionsInMethodIfSupported (gtest.cc:2433)
==150320==    by 0x11F8DCE: HandleExceptionsInMethodIfSupported (gtest.cc:2469)
==150320==    by 0x11F8DCE: testing::UnitTest::Run() (gtest.cc:4925)
==150320==    by 0x431E7D: RUN_ALL_TESTS (gtest.h:2473)
==150320==    by 0x431E7D: run_all_tests (ngp_test.cpp:31)
==150320==    by 0x431E7D: main (UnitTestMain.cpp:76)
==150320==
==150320== LEAK SUMMARY:
==150320==    definitely lost: 28 bytes in 1 blocks
==150320==    indirectly lost: 0 bytes in 0 blocks
==150320==      possibly lost: 0 bytes in 0 blocks
==150320==    still reachable: 17,565 bytes in 7 blocks
==150320==         suppressed: 591 bytes in 18 blocks
==150320== Reachable blocks (those to which a pointer was found) are not shown.
==150320== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==150320==
==150320== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 5 from 5)
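
The trace places the leaked allocation in ASSIGN_ARRAY (Zoltan2_TPLTraits.hpp:98), called from AlgPTScotch::partition, and the commit that later references this issue describes the fix as applying "when sizeof(SCOTCH_Num) == sizeof(lno_t)". The Zoltan2 code itself is not shown here, so the following is only a minimal sketch of one way a leak of this kind can arise, not the actual implementation: a conversion helper allocates whenever the two types differ, while the cleanup decides by size and therefore skips the delete when the types are distinct but equally wide. Every name in it (`scotch_num_t`, `local_id_t`, `assign_array`, `release_by_size`, `release_by_type`, `live_buffers`) is hypothetical. Seven entries are used because 7 x 4 bytes matches the 28 bytes reported above on a platform with a 4-byte `int`.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins: two distinct integer types of the same size.
using scotch_num_t = int;       // plays the role of a 32-bit SCOTCH_Num
using local_id_t   = unsigned;  // plays the role of a same-sized lno_t

// Counts conversion buffers that are still allocated.
static int live_buffers = 0;

// Identical types: alias the caller's storage, nothing to free later.
template <typename To, typename From>
To* assign_array_impl(std::vector<From>& in, std::true_type) {
  return in.data();
}

// Distinct types: allocate a converted copy that the caller must release.
template <typename To, typename From>
To* assign_array_impl(std::vector<From>& in, std::false_type) {
  To* out = new To[in.size()];
  for (std::size_t i = 0; i < in.size(); ++i) out[i] = static_cast<To>(in[i]);
  ++live_buffers;
  return out;
}

template <typename To, typename From>
To* assign_array(std::vector<From>& in) {
  return assign_array_impl<To>(in, std::is_same<To, From>());
}

// Leaky cleanup: frees only when the sizes differ, so a buffer allocated
// for same-sized but distinct types is never deleted.
template <typename To, typename From>
void release_by_size(To* p) {
  if (sizeof(To) != sizeof(From)) { delete[] p; --live_buffers; }
}

// Consistent cleanup: frees under exactly the condition that allocated.
template <typename To, typename From>
void release_by_type(To* p) {
  if (!std::is_same<To, From>::value) { delete[] p; --live_buffers; }
}

// Each function returns how many buffers it left allocated.
int run_with_size_check() {
  const int before = live_buffers;
  std::vector<local_id_t> ids(7, 1u);
  scotch_num_t* buf = assign_array<scotch_num_t>(ids);
  assert(buf != nullptr);
  release_by_size<scotch_num_t, local_id_t>(buf);  // skipped: sizes match
  return live_buffers - before;
}

int run_with_type_check() {
  const int before = live_buffers;
  std::vector<local_id_t> ids(7, 1u);
  scotch_num_t* buf = assign_array<scotch_num_t>(ids);
  assert(buf != nullptr);
  release_by_type<scotch_num_t, local_id_t>(buf);
  return live_buffers - before;
}
```

In this sketch `run_with_size_check()` leaves one buffer allocated and `run_with_type_check()` leaves none. The design point is that the release condition should be the same predicate that triggered the allocation, whichever predicate the real code uses.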

Steps to Reproduce

On ascicgpu030, build with gcc 7.2, OpenMPI 4.0.5, and Scotch with 32-bit IDs (provided by @Tasmit).

The bug is reproducible by running the Zoltan2 test Zoltan2_Partitioning1.cpp with --inputFile=simple --method=scotch.

@kddevin kddevin added type: bug The primary issue is a bug in Trilinos code or tests pkg: Zoltan2 labels Jun 18, 2021
@kddevin kddevin self-assigned this Jun 18, 2021
jmgate pushed a commit to tcad-charon/Trilinos that referenced this issue Jun 20, 2021
…s:develop' (19b4402).

* trilinos-develop:
  Gracefully exist with 0 when build_stats.csv file is empty (trilinos#9314)
  zoltan2:  fix memory leak when sizeof(SCOTCH_Num) == sizeof(lno_t) trilinos#9312
  Xpetra/MueLu testing: change proxy settings
  MueLu adapters: Skip unneeded initialization
  Ifpack2 Chebyshev: Avoid MV initialization
  Tpetra norms: Avoid allocation (intracomm) or avoid initialization (intercomm)
@kddevin kddevin closed this as completed Jun 29, 2021
seamill pushed a commit to seamill/Trilinos that referenced this issue Jul 28, 2021
…develop' (7591b32).

* trilinos/develop: (77 commits)
  zoltan2:  fix memory leak when sizeof(SCOTCH_Num) == sizeof(lno_t) trilinos#9312
  Tpetra: Remove some output from the Bug7758 test
  MueLu Stratimikos adapter: Enable half precision for factory-style PLs
  Tpetra: remove some deprecated usage
  Fixed some deprecated code
  MueLu Thyra adapter: Allow construction of half precision operator
  ROL: implement the apply function for Thyra Vector
  Piro: changes to ROL adapters comply with ROL changes
  Piro: bug-fix in Piro::NOX_Solver
  MueLu: Print Scalar in MG Summary for high and extreme verbosity
  Ifpack2: disabling tests causing build errors with extended scalar types (see issue trilinos#9280).
  Ifpack2: cleaning up unused variables in tests.
  Ctest: Adding Amesos2/Belos tests
  Ctest: Stuff failing on ride that worked on ascicgpu
  Ctest: Enabling non-UVM Ifpack2 tests
  Ifpack2: changing GO to the one in Tpetra_Details_DefaultTypes.hpp.
  Disable support for Makefile.export.* files (trilinos#8498)
  Tpetra: remove unused variable (copied too many times when breaking up a function)
  ats2: Comment out listing of long-broken XL builds (trilinos#9270, trilinos#7376)
  Ifpack2: adding missing logic for new tests.
  ...