Skip to content
This repository has been archived by the owner on Oct 23, 2020. It is now read-only.

Fix framework config pool memory leak #1501

Closed
wants to merge 1 commit into from

Conversation

Larofeticus
Copy link
Contributor

This leak was found investigating the cause of the problem with:
E3SM-Project/E3SM#2041

The framework has some functions which manage config_pool storage used during halo exchanges. My halo data structure reuse code needs to directly modify some of these config values between exchanges but the framework currently only supports mpas_pool_add_config() and mpas_pool_remove_config() so that's what was used.

The add config allocates three elements: a mpas_pool_member_type, field of that which is mpas_pool_data_type, and a scalar member of that either int, real, char, or logical.

The remove config only deallocates the mpas_pool_member_type and the payload scalar, but forgets the mpas_pool_data_type. Along with a suspicious TODO comment speculating that there might be more to do.

Under previous circumstances this add_config to remove_config cycle didn't happen enough to drive an OOM, but the heavier usage by my reuse code does.

@jonbob
Copy link
Contributor

jonbob commented Feb 15, 2018

Tagging @douglasjacobsen , who originally developed this code

@douglasjacobsen
Copy link
Member

A few comments. While this does appear to be a memory leak, there was some consideration about this when we wrote this code. Fundamentally, the remove routines within the pool data structure were not intended to destroy data, just remove it from the pool. However, when configs and dimensions are added, they are duplicated (such that they wouldn't be synchronized between multiple pools, like fields would be).

So, this is a memory leak currently, but you are not fixing all of the memory leak. Look at the destroy pool routine for more information on how to fix it properly. However, some thought should be given to the way configs are added to pools instead of just adding this. It might be preferable to add configs and dimensions in a similar way to fields, and make the remove routines only remove them from the pool rather than deallocate data.

I think the comment you add is not needed in this PR as it really doesn't add anything helpful, but I'll leave that to @mgduda to decide on.

Some comments on your notes about the reusable halo exchanges:

Beyond that, your halo buffer reuse routines do not actually need to modify the config value. If you read through those routines, you'll notice that the config isn't actually used anywhere. Additionally, time levels are specified before buffers are built (i.e. time levels need to be specified before the buffer is built, because they determine the size of the buffers).

Finally, the halo exchanges didn't present a memory leak before because they were not continually adding and removing configs from a pool. Instead, after the halo exchange was finished the pool was destroyed. When a pool is destroyed, memory is deallocated properly and doesn't cause a memory leak.

If you're interested more, feel free to write me an email or tag me in an issue about it, but I don't think this PR is the right place to discuss it.

mark-petersen added a commit that referenced this pull request Mar 14, 2018
According to comment in #1501 config_pool entries created in halo
exchanges are no longer used and can be removed. Config_pool entries
when ported to halo data structure reuse also cause a memory leak.

According to comment in #1502 the code builds and passes testing.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants