HAN leaks memory #13171
Comments
Related to open-mpi#13171 Signed-off-by: Howard Pritchard <howardp@lanl.gov>
valgrind also found memory leaks associated with other components, including btl and ucx osc.
related to open-mpi#13171 Signed-off-by: Howard Pritchard <howardp@lanl.gov>
I noticed many more memory leaks in HAN on the 5.0.x branch, owing to its retaining the previous coll module(s) and not releasing them later in all cases. The result was a memory leak per iteration of the MPI_Win_create/MPI_Win_free calls in the test case. A big refactor of HAN and other collective components in main made these per-iteration memory leaks go away. However, the refactoring was pretty significant, so my recommendation for those using 5.0.x releases is to turn off the han collective component if they are observing significant leaks with MPI_Win_create/MPI_Win_free operations.
If the communicators are correctly cleaned up, there should be no memory leaks due to the internal use of collective modules. However, if the communicators are not correctly freed by the user and we rely on the MPI_Finalize cleanup, bad things can happen.
I think "should" in the previous comment is the most important word. The user test is shown below. With 1000 iterations of MPI_Win_create/MPI_Win_free, valgrind reported that about 2 MB of memory was being leaked when using the 5.0.7 release with HAN allowed to be used. This was because HAN was adding multiple references to TUNED modules, so they weren't getting freed properly in the communicator destructor step of MPI_Win_free. Disabling HAN causes the memory leak to vanish. As I noted above, there was a lot of restructuring of HAN in main, and now the create/free cycle no longer shows a memory leak when HAN is used.
related to open-mpi#13171 Signed-off-by: Howard Pritchard <howardp@lanl.gov>
Resolved via #13172.
There are various cases where HAN leaks memory. At my site, users are complaining about memory leaks with MPI window creation, but the underlying problem has to do with the release of resources retained by other components, which is not being done correctly.
Patch coming imminently.