-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix memory leak #132
base: master
Are you sure you want to change the base?
Fix memory leak #132
Conversation
Do you know what's triggering the recreation of the storage object? AFAIK it's created once globally, and only changes if the storage backend setting changes |
This comment appears to suggest that the session can be shared between multiple threads: |
This commit (dcf7952) seems to explictely add a per-thread session instead of using the implicit (and presumably shared) session. I don't know if it's needed but I assumed that commit was probably made for a reason. But if not then that's good, it can definitely be changed to either use a module-level variable for the session or just not specify a session at all (as it did before that commit) which is a bit simpler! |
Ugh, this is a mess. The boto documentation and issue tracker make it very unclear what should be considered thread-safe, and what should be shareable. Maybe it's changed since I put in that fix? Maybe not. I've been distinctly unimpressed with boto3 as a library, ever since porting django-s3-storages over to it. The core issue seems to be two things:
We can't do anything about (1). Unfortunately, your solution here isn't useful in an environment that spawns lots of single-use threads, as each thread will create it's own session that'll hang around somehow even when the thread is GC'd and the associated thread locals destroyed. (2) is worth investigating further. My understanding is that the storage class should be created once at the start of the program, and used forever since. The strategy used by django-s3-storages is (I believe) the same as the one used by django-storages, and I'm not aware of any complaints about memory leaks in either library. This (possibly) implies there is something atypical about your code to be creating and throwing away storage backends so frequently. |
Creating a new session each time S3Storage is instantiated creates a memory leak. It seems that S3Storage can get created a bunch of times (I'm seeing it get created again and again as my app runs) and a boto3's Session takes loads of memory (see boto/boto3#1670) so my app eventually runs out of memory. This should fix the issue while still avoiding using the same session across different threads.
Creating a new session each time S3Storage is instantiated creates a memory leak. It seems that S3Storage can get created a bunch of times (I'm seeing it get created again and again as my app runs) and a boto3's Session takes loads of memory (see boto/boto3#1670) so my app eventually runs out of memory. This should fix the issue while still avoiding using the same session across different threads.