Garbage collection using gc.collect() is not showing any effect #3541

Open
sudouser777 opened this issue Dec 24, 2022 · 10 comments
Labels
needs-review, p2 (This is a standard priority issue), sqs

Comments

@sudouser777

sudouser777 commented Dec 24, 2022

Describe the bug

When boto3 client methods are invoked multiple times (e.g., in a loop running n times), memory accumulates with each iteration. Calling gc.collect() has no visible effect.

Expected Behavior

  1. Garbage collection should happen properly and all unused resources should be released

Current Behavior

If boto3 code is run in a loop n times, memory accumulates with each iteration and gc.collect() does not release the unused memory. At the end of the program, gc.collect() returns 0 unreachable objects, but memory usage still does not change.

Reproduction Steps

import gc
import os
import boto3
import psutil

gc.set_debug(gc.DEBUG_UNCOLLECTABLE)

boto3.set_stream_logger('')

def get_memory_usage():
    return psutil.Process(os.getpid()).memory_info().rss // 1024 ** 2


def test():
    queue_url = 'https://us-east-2.queue.amazonaws.com/916470431480/test.fifo'
    sqs = boto3.client('sqs')
    for i in range(10):
        message = sqs.receive_message(QueueUrl=queue_url)
        if message.get('Messages'):
            print(message)
            receipt_handle = message['Messages'][0]['ReceiptHandle']
            sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=receipt_handle)

        print(f'Iteration - {i + 1} Unreachable Objects: {gc.collect()} and length: {len(gc.garbage)}')
        print(f'Memory usage After: {get_memory_usage()}mb')


for _ in range(5):
    print(f'Memory usage Before: {get_memory_usage()}mb')
    test()
    print(f'==================Unreachable Objects: {gc.collect()}==================')
    print(len(gc.garbage))
    print(f'Memory usage After: {get_memory_usage()}mb')

    print('\n' * 5)

The issue can be reproduced by running the sample code above.

Possible Solution

No response

Additional Information/Context

Logs: log.txt

SDK version used

1.26.37

Environment details (OS name and version, etc.)

Linux 5.15.84-1-MANJARO

@sudouser777 sudouser777 added the bug (This issue is a confirmed bug.) and needs-triage (This issue or PR still needs to be triaged.) labels Dec 24, 2022
@tim-finnigan tim-finnigan changed the title from "boto3 resources not grabage collected" to "Garbage collection using gc.collect() is not showing any effect" Dec 27, 2022
@tim-finnigan tim-finnigan self-assigned this Dec 27, 2022
@tim-finnigan
Contributor

Hi @sudouser777 thanks for reaching out. I hope you don't mind that I reworded your issue title a bit so as not to confuse the boto3 resource interface with general resources.

There are a few other issues that mentioned attempts to use gc.collect(). Have you looked through any of these?

Issues involving memory usage can be difficult to fully replicate across different environments. If you could provide your debug logs (with any sensitive info redacted) by adding boto3.set_stream_logger('') to your script, that would help us investigate this further.
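
For anyone reproducing this, a minimal sketch of what that looks like; the region and the list_queues() call are only illustrative stand-ins for whatever code triggers the problem:

import logging

import boto3

# Turn on debug logging for all boto3/botocore loggers before making any calls.
boto3.set_stream_logger('', logging.DEBUG)

# Any client call will now emit request/response details to stderr,
# which can be redirected to a file and attached to the issue.
sqs = boto3.client('sqs', region_name='us-east-2')
sqs.list_queues()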

@tim-finnigan tim-finnigan added the response-requested (Waiting on additional information or feedback.) and sqs labels and removed the bug and needs-triage labels Dec 27, 2022
@sudouser777
Author

@tim-finnigan added the logs

@github-actions github-actions bot removed the response-requested Waiting on additional information or feedback. label Dec 27, 2022
@tim-finnigan
Contributor

Thanks for following up. Are you able to replicate this issue in different environments? And is it consistent across services and commands?

Also please let us know if you had a chance to read through those related issues to see if there is any overlap with what you're running into.

@tim-finnigan tim-finnigan added the response-requested Waiting on additional information or feedback. label Dec 28, 2022
@sudouser777
Author

sudouser777 commented Dec 28, 2022

@tim-finnigan I'm able to reproduce the issue in all of the below-mentioned cases:

  1. different services (S3, SNS, SQS)
  2. different environments (tested on Fedora and Manjaro, also in Docker)
  3. different commands

I checked the issues you mentioned before creating this bug report, but those mostly involve multithreading, so I don't think they are related to this problem.

@github-actions github-actions bot removed the response-requested Waiting on additional information or feedback. label Dec 28, 2022
@tim-finnigan
Contributor

I was doing some research and think I found your corresponding Stack Overflow post: https://stackoverflow.com/questions/74911597/garbage-collection-is-not-happening-properly-when-using-boto3

For reference I'll also link the gc.collect() documentation: https://docs.python.org/3/library/gc.html#gc.collect, and here is some more information on the Garbage Collector Design: https://devguide.python.org/internals/garbage-collector/. Do you have any updates on your end as far as what you've tried?
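
As a quick illustration of what those docs cover, a small standard-library-only sketch of collecting individual generations and reading the collector's counters (nothing here is specific to boto3):

import gc

# Collect only the youngest generation, then run a full collection,
# and print the per-generation statistics the collector maintains.
print('gen0 unreachable:', gc.collect(0))
print('full collection unreachable:', gc.collect())
for generation, stats in enumerate(gc.get_stats()):
    print(f'generation {generation}: {stats}')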

@tim-finnigan tim-finnigan added the response-requested Waiting on additional information or feedback. label Dec 30, 2022
@sudouser777
Author

@tim-finnigan No, I have read multiple articles about gc, and I had already read the guide you mentioned. Do you have any other suggestions on what the problem could be here?

@sudouser777
Author

sudouser777 commented Dec 30, 2022

@tim-finnigan I have tested with this sample code. It measures memory usage in two cases: one that doesn't use boto3, where memory usage looks normal, and one that uses boto3, where memory is not released.

import gc
import os
import boto3
import psutil

def get_memory_usage():
    return psutil.Process(os.getpid()).memory_info().rss // 1024 ** 2

def test_without_boto3():
    print(f'Memory usage Before: {get_memory_usage()}mb')
    lst = [{'a': 'a' * 10000, 'b': 'b' * 10000} for _ in range(10 ** 4)]
    print(f'Memory usage After: {get_memory_usage()}mb')
    # drop the references and force a collection before the final measurement
    del lst
    gc.collect()
    print(f'Memory usage After gc.collect(): {get_memory_usage()}mb')


def test_with_boto3():
    print(f'Memory usage Before: {get_memory_usage()}mb')
    lst = [boto3.client('sqs') for _ in range(10 ** 4)]
    print(f'Memory usage After: {get_memory_usage()}mb')
    # same measurement after dropping the clients and collecting
    del lst
    gc.collect()
    print(f'Memory usage After gc.collect(): {get_memory_usage()}mb')

Output for test_without_boto3()

Memory usage Before: 27mb
Memory usage After: 221mb
Memory usage After gc.collect(): 28mb

Output for test_with_boto3()

Memory usage Before: 27mb
Memory usage After: 1478mb
Memory usage After gc.collect(): 269mb

@github-actions github-actions bot removed the response-requested Waiting on additional information or feedback. label Dec 30, 2022
@tim-finnigan
Contributor

Hi @sudouser777 do you have any updates as far as what you've tried? Have you used any tools to measure the memory usage? I'm still not sure why gc.collect() isn't working as you intended — it might help to narrow down the conditions in which this issue occurs. I'm going to unassign myself and give my colleagues the chance to investigate this as well.
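
One standard-library option for narrowing this down is tracemalloc; a rough sketch under the assumption that the client-creation loop below stands in for whatever code reproduces the growth:

import tracemalloc

import boto3

tracemalloc.start(25)  # keep up to 25 frames per allocation

before = tracemalloc.take_snapshot()
clients = [boto3.client('sqs', region_name='us-east-2') for _ in range(100)]
after = tracemalloc.take_snapshot()

# Show which source lines account for the growth in Python-level allocations.
for stat in after.compare_to(before, 'lineno')[:10]:
    print(stat)

Note that tracemalloc only tracks Python-level allocations, so it can attribute growth to specific lines but will not explain RSS that the process allocator keeps reserved after objects are freed.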

@tim-finnigan tim-finnigan added the response-requested Waiting on additional information or feedback. label Jan 10, 2023
@tim-finnigan tim-finnigan removed their assignment Jan 10, 2023
@github-actions

Greetings! It looks like this issue hasn’t been active in longer than five days. We encourage you to check if this is still an issue in the latest release. In the absence of more information, we will be closing this issue soon. If you find that this is still a problem, please feel free to provide a comment or upvote with a reaction on the initial post to prevent automatic closure. If the issue is already closed, please feel free to open a new one.

@github-actions github-actions bot added the closing-soon This issue will automatically close in 4 days unless further comments are made. label Jan 15, 2023
@sudouser777
Author

The issue is still happening; I haven't found a solution so far.

@github-actions github-actions bot removed the closing-soon (This issue will automatically close in 4 days unless further comments are made.) and response-requested (Waiting on additional information or feedback.) labels Jan 16, 2023