-
Notifications
You must be signed in to change notification settings - Fork 275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix swig template memory leak #859
Conversation
@@ -164,6 +164,7 @@ | |||
temp.push_back(std::pair< std::string,std::string >()); | |||
PyObject *item = PySequence_GetItem($input, i); | |||
if (!PyTuple_Check(item) || PyTuple_Size(item) != 2) { | |||
Py_DECREF(item); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel we already have a case:
sonic-swss-common/tests/test_redis_ut.py
Line 165 in 0044540
def test_SelectMemoryLeak(): |
Fix the issue sonic-net/sonic-buildimage#17025 about Redis set activity Description The issue reports a memory leak on the Redis set operations Reason Didn't decrease the reference count after PySequence_GetItem Use the inappropriate Swig API and didn't release the string of SWIG_AsPtr_std_string Fix: Refer PR: Fix swig template memory leak #859 from @praveenraja1 Replace the API SWIG_AsPtr_std_string to SWIG_AsVal_std_string Add unit test To monitor there is no dramatic memory increment after a huge amount of Redis set operations. Signed-off-by: Ze Gan <zegan@microsoft.com>
Fix the issue sonic-net/sonic-buildimage#17025 about Redis set activity Description The issue reports a memory leak on the Redis set operations Reason Didn't decrease the reference count after PySequence_GetItem Use the inappropriate Swig API and didn't release the string of SWIG_AsPtr_std_string Fix: Refer PR: Fix swig template memory leak sonic-net#859 from @praveenraja1 Replace the API SWIG_AsPtr_std_string to SWIG_AsVal_std_string Add unit test To monitor there is no dramatic memory increment after a huge amount of Redis set operations. Signed-off-by: Ze Gan <zegan@microsoft.com>
Fix the issue sonic-net/sonic-buildimage#17025 about Redis set activity Description The issue reports a memory leak on the Redis set operations Reason Didn't decrease the reference count after PySequence_GetItem Use the inappropriate Swig API and didn't release the string of SWIG_AsPtr_std_string Fix: Refer PR: Fix swig template memory leak sonic-net#859 from @praveenraja1 Replace the API SWIG_AsPtr_std_string to SWIG_AsVal_std_string Add unit test To monitor there is no dramatic memory increment after a huge amount of Redis set operations. Signed-off-by: Ze Gan <zegan@microsoft.com>
Fix the issue sonic-net/sonic-buildimage#17025 about Redis set activity Description The issue reports a memory leak on the Redis set operations Reason Didn't decrease the reference count after PySequence_GetItem Use the inappropriate Swig API and didn't release the string of SWIG_AsPtr_std_string Fix: Refer PR: Fix swig template memory leak sonic-net#859 from @praveenraja1 Replace the API SWIG_AsPtr_std_string to SWIG_AsVal_std_string Add unit test To monitor there is no dramatic memory increment after a huge amount of Redis set operations. Signed-off-by: Ze Gan <zegan@microsoft.com>
Fix the issue sonic-net/sonic-buildimage#17025 about Redis set activity Description The issue reports a memory leak on the Redis set operations Reason Didn't decrease the reference count after PySequence_GetItem Use the inappropriate Swig API and didn't release the string of SWIG_AsPtr_std_string Fix: Refer PR: Fix swig template memory leak #859 from @praveenraja1 Replace the API SWIG_AsPtr_std_string to SWIG_AsVal_std_string Add unit test To monitor there is no dramatic memory increment after a huge amount of Redis set operations. Signed-off-by: Ze Gan <zegan@microsoft.com>
Fix the issue sonic-net/sonic-buildimage#17025 about Redis set activity Description The issue reports a memory leak on the Redis set operations Reason Didn't decrease the reference count after PySequence_GetItem Use the inappropriate Swig API and didn't release the string of SWIG_AsPtr_std_string Fix: Refer PR: Fix swig template memory leak sonic-net#859 from @praveenraja1 Replace the API SWIG_AsPtr_std_string to SWIG_AsVal_std_string Add unit test To monitor there is no dramatic memory increment after a huge amount of Redis set operations. Signed-off-by: Ze Gan <zegan@microsoft.com>
Issue has been fixed, thanks |
Fix the issue sonic-net/sonic-buildimage#17025 about Redis set activity Description The issue reports a memory leak on the Redis set operations Reason Didn't decrease the reference count after PySequence_GetItem Use the inappropriate Swig API and didn't release the string of SWIG_AsPtr_std_string Fix: Refer PR: Fix swig template memory leak sonic-net#859 from @praveenraja1 Replace the API SWIG_AsPtr_std_string to SWIG_AsVal_std_string Add unit test To monitor there is no dramatic memory increment after a huge amount of Redis set operations. Signed-off-by: Ze Gan <zegan@microsoft.com>
Problem:
Device gets stuck if left switched on for prolonged period of time for more than 30 hrs.
Observation:
Pmon docker seems to be leaking memory over time, leading to system wide out-of-memory condition.
Python process in pmon such as psud, thermalctld were all leaking memory over time.
The issue was traced back to table.set() operations that were done in these python process.
We have swig template files which are used to generate the code mappings for python to cpp conversions.
In these constructor functions, we have allocated new references using PySequence_GetItem(), which these references were not marked with Py_DECREF, when done using them. This was causing stale objects to present which was leaking memory over time.
/usr/lib/python3/dist-packages/swsscommon/swsscommon.py:184: size=2851 B, count=49, average=58 B
/usr/lib/python3.9/tracemalloc.py:505: size=1456 B, count=24, average=61 B
/usr/lib/python3.9/tracemalloc.py:67: size=1344 B, count=21, average=64 B
/usr/lib/python3.9/tracemalloc.py:558: size=1224 B, count=24, average=51 B
/usr/lib/python3.9/tracemalloc.py:498: size=1104 B, count=23, average=48 B
/usr/lib/python3.9/tracemalloc.py:193: size=768 B, count=16, average=48 B
/usr/lib/python3.9/tracemalloc.py:533: size=576 B, count=1, average=576 B
/usr/lib/python3.9/threading.py:574: size=536 B, count=1, average=536 B
Fix:
Used Py_DECREF in constructor functions.
Unit test:
Tested using a custom py script which does table.set in while loop (and checking tracemalloc) to ensure memory leak doesnt happen.
Allowed the system to be up for sometime to ensure memory usage of pmon docker doesnt increase like earlier.
Check show platform fan/psu to ensure that pmon functionality isnt affected due to this change.