
Memory leak #2047

Open · itachaaa opened this issue May 14, 2020 · 12 comments
Assignees: swetashre
Labels: bug This issue is a confirmed bug. · p2 This is a standard priority issue

Comments

@itachaaa

Please fill out the sections below to help us address your issue.

What issue did you see?
When I call AWS APIs at high volume and the client cannot reach the target region due to network problems, an EndpointConnectionError is thrown. Over time the memory used by my process keeps growing; the largest I have observed so far is 6 GB. Inspecting with gc and pyrasite shows that gc.garbage is [], and the data type occupying the most memory is str or unicode. The unicode content is the documentation text for DescribeInstancesRequest from service-2.json.
Library versions: boto3 1.12.24, botocore 1.15.24, urllib3 1.21.1
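
For reference, here is a minimal sketch of this kind of inspection using the standard library's tracemalloc in place of pyrasite (the workload is whatever triggers the errors; the top-10 cutoff is arbitrary):

import gc
import tracemalloc

tracemalloc.start(25)  # keep up to 25 frames per allocation site

# ... run the workload that triggers EndpointConnectionError ...

gc.collect()
print('gc.garbage:', gc.garbage)  # [] here, so no uncollectable cycles

# Rank allocation sites by total size; in this case the largest entries
# were str/unicode objects holding service-2.json documentation text.
snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics('lineno')[:10]:
    print(stat)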

Steps to reproduce
If you have a runnable example, please include it as a snippet or link to a repository/gist for larger code examples.

Debug logs
Full stack trace by adding

import botocore.session
botocore.session.Session().set_debug_logger('')

to your code.
[screenshot: debug log output]

@itachaaa itachaaa added guidance Question that needs advice or information. needs-triage This issue or PR still needs to be triaged. labels May 14, 2020
@itachaaa
Author

itachaaa commented May 14, 2020

The way I get the AWS client:

import boto3

# ACCESS and SECRET are credential placeholders defined elsewhere.

class AwsClient(object):

    def __init__(self, region='eu-central-1', server_name='ec2'):
        self.region = region
        self.server_name = server_name

    @property
    def client(self):
        return boto3.client(self.server_name,
                            region_name=self.region,
                            aws_access_key_id=ACCESS,
                            aws_secret_access_key=SECRET)

I use 2 processes and many green threads to make requests, and there is only one boto3.session.Session instance and one botocore.session.Session per process.

@swetashre
Contributor

@itachaaa - Thank you for your post. It is recommended to create a resource instance for each thread/process in a multithreaded or multiprocess application, rather than sharing a single instance among the threads/processes.
https://boto3.amazonaws.com/v1/documentation/api/latest/guide/resources.html?highlight=multithreading#multithreading-multiprocessing
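
For example, a minimal sketch of per-thread client creation (thread_client is an illustrative helper, not a boto3 API):

import threading

import boto3

# One session per thread; clients created from that session are reused
# within the thread instead of being shared across threads.
_local = threading.local()

def thread_client(service_name):
    if not hasattr(_local, 'session'):
        _local.session = boto3.session.Session()
    return _local.session.client(service_name)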

Are you creating the boto3 session from a botocore session? Can you please provide the exact code sample that is resulting in the memory leak?

@swetashre swetashre self-assigned this May 14, 2020
@swetashre swetashre added response-requested Waiting on additional info and feedback. and removed needs-triage This issue or PR still needs to be triaged. labels May 14, 2020
@itachaaa
Author

Thanks for the reply.
I use boto3.client() to get an instance and make requests.
There is one process in my program, but with many coroutines rather than multiple threads, so there is one session per process.

@itachaaa
Author

The memory only increases when lots of exceptions are being thrown; otherwise it does not.

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. label May 15, 2020
@swetashre
Contributor

@itachaaa - Thanks for the reply. Is it possible for you to provide a code sample so that I can try to reproduce the issue? Without looking at the code it is difficult for me to find the exact cause.
Please also make sure you are triggering garbage collection, since you are using multiple coroutines.
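
For example, a sketch of forcing collection between batches of greenthreads (the batch size of 1000 is illustrative):

import gc

pool = GreenPool(10000)
for i in range(100000):
    pool.spawn_n(get_data)
    if i % 1000 == 999:
        pool.waitall()  # let in-flight greenthreads finish
        gc.collect()    # then force a collection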

@swetashre swetashre added the response-requested Waiting on additional info and feedback. label May 15, 2020
@itachaaa
Author

itachaaa commented May 18, 2020

Here is my test code (four files shown together):

# test.py
import time

from eventlet.greenpool import GreenPool
from memory_profiler import profile  # used by the commented-out @profile decorators
from instance import InstanceResource

manager = InstanceResource()

# @profile
def get_data():
    try:
        # list_resource() is defined on Resource (not shown in the snippets below)
        instances = manager.list_resource()
    except Exception as e:
        print(e)


# @profile
def loop_call():
    pool = GreenPool(10000)
    times = 0
    for i in range(100000):
        pool.spawn_n(get_data)
        times += 1
        print(times)

    time.sleep(10)

if __name__ == '__main__':
    loop_call()

# instance.py
from common import Resource


class InstanceResource(Resource):
    action = 'describe_instances'
    create_action = 'run_instances'

    # entity = 'Instances'

    @staticmethod
    def get_filters():
        params = {
            'Filters': [
                # {'Name': 'instance-id', 'Values': ['i-07a20968066e2ad87', ]}
            ],
        }
        return params

# common.py
from client import AwsClient


class Resource(object):
    action = None  # query by default
    create_action = None
    update_action = None
    delete_action = None
    entity = None

    def __init__(self, region='eu-central-1', server_name='ec2'):
        self._init(region=region, server_name=server_name)

    def _init(self, region='eu-central-1', server_name='ec2'):
        client = AwsClient(region=region, server_name=server_name)
        self.client = client.client
        self.resource = client.resource  # AwsClient also exposes a resource property (not shown)

    @staticmethod
    def get_filters():
        """
        Get the filter parameters for the GET interface.
        :return:
        """

# client.py
import boto3
from botocore.config import Config

# ACCESS and SECRET are credential placeholders defined elsewhere.

class AwsClient(object):

    def __init__(self, region='eu-central-1', server_name='ec2'):
        self.region = region
        self.server_name = server_name
        self.config = Config(retries=dict(max_attempts=2), connect_timeout=5, read_timeout=5)

    @property
    def client(self):
        return boto3.client(self.server_name,
                            region_name=self.region,
                            aws_access_key_id=ACCESS,
                            aws_secret_access_key=SECRET,
                            config=self.config)

Let it throw lots of errors such as EndpointConnectionError, and the memory will continue to increase.

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. label May 18, 2020
@swetashre swetashre added investigating This issue is being investigated and/or work is in progress to resolve the issue. and removed investigating This issue is being investigated and/or work is in progress to resolve the issue. labels May 20, 2020
@swetashre
Contributor

@itachaaa - Thank you for providing the sample code. Marking this as a bug. I am able to reproduce the issue with this script:

import os

import boto3
import psutil
import matplotlib.pyplot as pp
from botocore.config import Config
from eventlet.greenpool import GreenPool

used = []

def get_data():
    client = boto3.client('ec2', config=Config(retries={'max_attempts': 0}, connect_timeout=5, read_timeout=5))
    client.describe_instances()

pool = GreenPool(10000)
process = psutil.Process(os.getpid())
for i in range(100000):
    memory = process.memory_info().rss / 1024 / 1024  # resident set size in MiB
    used.append(memory)
    pool.spawn_n(get_data)

pp.plot(used)
pp.show()

[plot: process memory usage over the run (test_memory_leak_0_retry)]

@swetashre swetashre added bug This issue is a confirmed bug. and removed guidance Question that needs advice or information. labels May 21, 2020
@itachaaa
Author

itachaaa commented May 22, 2020

When you reproduced this, did you also have an unreliable network environment that sometimes threw exceptions? At the time, my logs frequently showed network-related errors such as EndpointConnectionError and ReadTimeoutError, and I don't know whether that is a contributing factor.

@willbengtson

I am also tracking down a memory leak. tracemalloc pointed me to https://github.com/boto/botocore/blob/develop/botocore/client.py#L322

When running a Flask application and looping through gc.garbage after a gc.collect(), I am left with boto docs. I am currently using boto3 and creating clients as client = boto3.client('sts'), for example.
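
A minimal sketch of that kind of check (gc.DEBUG_SAVEALL makes the collector keep unreachable objects in gc.garbage so they can be inspected; the top-10 cutoff is arbitrary):

import gc

gc.set_debug(gc.DEBUG_SAVEALL)  # keep unreachable objects in gc.garbage
gc.collect()

# Print the largest leftover strings; in my case these were boto doc strings.
strings = [o for o in gc.garbage if isinstance(o, str)]
for s in sorted(strings, key=len, reverse=True)[:10]:
    print(len(s), repr(s[:80]))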

Running in Lambda, the following is a graph of memory from a CloudWatch metrics filter:

[graph: Lambda memory usage from CloudWatch]

@rl-ilasic

Is there an update or workaround for this problem?

@mwek

mwek commented Sep 19, 2022

Creating one session per thread and reusing it across the ThreadPool mitigated the issue for us. Snippet for aiobotocore below:

import threading
from aiobotocore.session import AioSession, get_session

# NOTE: botocore has a memory leak in Session objects. Recommended workaround is to cache the session object locally per thread.
# See https://github.com/boto/botocore/issues/2047
_aio_session_cache = threading.local()


def _cached_session() -> AioSession:
    if not hasattr(_aio_session_cache, "session"):
        _aio_session_cache.session = get_session()
    return _aio_session_cache.session
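
A hypothetical usage sketch for the helper above (the S3 call is only an example):

import asyncio

async def list_buckets():
    # Reuse the thread-local session, but create a short-lived client per task.
    async with _cached_session().create_client('s3') as client:
        resp = await client.list_buckets()
        print([b['Name'] for b in resp['Buckets']])

asyncio.run(list_buckets())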

@ericman93

ericman93 commented Dec 8, 2022

I ran my FastAPI app, and tracemalloc showed that python3.8/json/decoder.py is leaking ~100 MB every 15 minutes.
Looking into the tracebacks for that file, I see that they all originate in botocore:

File \"/opt/venv/lib/python3.8/site-packages/botocore/session.py\", line 787
  return self._internal_components.get_component(name)
File \"/opt/venv/lib/python3.8/site-packages/botocore/session.py\", line 1081
  self._components[name] = factory()
File \"/opt/venv/lib/python3.8/site-packages/botocore/session.py\", line 188
  endpoints = loader.load_data('endpoints')
File \"/opt/venv/lib/python3.8/site-packages/botocore/loaders.py\", line 142
  data = func(self, *args, **kwargs)
File \"/opt/venv/lib/python3.8/site-packages/botocore/loaders.py\", line 454
  found = self.file_loader.load_file(possible_path)
File \"/opt/venv/lib/python3.8/site-packages/botocore/loaders.py\", line 194
  data = self._load_file(file_path + ext, open_method)
File \"/opt/venv/lib/python3.8/site-packages/botocore/loaders.py\", line 181
  return json.loads(payload, object_pairs_hook=OrderedDict)
File \"/usr/local/lib/python3.8/json/__init__.py\", line 370
  return cls(**kw).decode(s)
File \"/usr/local/lib/python3.8/json/decoder.py\", line 337
  obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File \"/usr/local/lib/python3.8/json/decoder.py\", line 353
  obj, end = self.scan_once(s, idx)

and

File \"/opt/venv/lib/python3.8/site-packages/botocore/client.py\", line 202
  json_model = self._loader.load_service_model(
File \"/opt/venv/lib/python3.8/site-packages/botocore/loaders.py\", line 142
  data = func(self, *args, **kwargs)
File \"/opt/venv/lib/python3.8/site-packages/botocore/loaders.py\", line 417
  model = self.load_data(full_path)
File \"/opt/venv/lib/python3.8/site-packages/botocore/loaders.py\", line 142
  data = func(self, *args, **kwargs)
File \"/opt/venv/lib/python3.8/site-packages/botocore/loaders.py\", line 454
  found = self.file_loader.load_file(possible_path)
File \"/opt/venv/lib/python3.8/site-packages/botocore/loaders.py\", line 194
  data = self._load_file(file_path + ext, open_method)
File \"/opt/venv/lib/python3.8/site-packages/botocore/loaders.py\", line 181
  return json.loads(payload, object_pairs_hook=OrderedDict)
File \"/usr/local/lib/python3.8/json/__init__.py\", line 370
  return cls(**kw).decode(s)
File \"/usr/local/lib/python3.8/json/decoder.py\", line 337
  obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File \"/usr/local/lib/python3.8/json/decoder.py\", line 353
  obj, end = self.scan_once(s, idx)

I'm using botocore==1.27.59 and can't upgrade it, since aiobotocore is pinned to ^1.27.
I'll try to update my Python version.
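
Until an upgrade is possible, one common mitigation (a sketch, not a botocore API; cached_client is an illustrative name) is to cache clients per service so the service-model JSON is parsed once per process rather than per request:

import functools

import boto3

@functools.lru_cache(maxsize=None)
def cached_client(service_name, region_name='us-east-1'):
    # One client per (service, region): the service-2.json model is loaded
    # and JSON-decoded only on the first call.
    return boto3.client(service_name, region_name=region_name)

# e.g. cached_client('sts').get_caller_identity()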
