Memory leak issue while publishing message #734

Closed
bowenatbear opened this issue Jul 12, 2022 · 9 comments
Labels
api: pubsub Issues related to the googleapis/python-pubsub API. type: question Request for information or clarification. Not an issue.

Comments

@bowenatbear

Issue details

After running the application for a long time, one process keeps taking up more and more memory and finally brings down the whole system. The problem seems to come from the publish method of pubsub_v1.PublisherClient. I have seen quite a few tickets about this memory leak (e.g. https://bytemeta.vip/repo/googleapis/python-pubsub/issues/395) but could not find out exactly how to fix my problem.

Environment details

  • OS: Ubuntu 16.04
  • google-cloud-pubsub==1.7.0
  • google-auth==1.34.0
  • six==1.16.0
  • Both Python 2.7.12 and Python 3.5.2 are used

We are using code similar to the following sample:

from google.auth import jwt
from google.cloud import pubsub_v1

import time
import ujson as json

class LogPublisher():
    
    def __init__(self, project_id, topic_id, gapp_credential):
        self._gapp_credential = gapp_credential
        # Build the client first; topic_path() is called on it below.
        self._publisher = self._construct_pubsub_publisher()
        self._topic_path = self._publisher.topic_path(project_id, topic_id)

    def _construct_pubsub_publisher(self):
        publisher_batch_settings = pubsub_v1.types.BatchSettings(
            max_messages=10,  # number of messages
            max_bytes=10240,  # bytes (10 KiB)
            max_latency=0.05, # seconds
        )
        # Close the credential file once it has been read.
        with open(self._gapp_credential) as credential_file:
            service_account_info = json.load(credential_file)
        publisher_audience = 'https://pubsub.googleapis.com/google.pubsub.v1.Publisher'
        credentials_pub = jwt.Credentials.from_service_account_info(
            service_account_info, audience=publisher_audience)
        publisher = pubsub_v1.PublisherClient(
            batch_settings=publisher_batch_settings,
            credentials=credentials_pub)
        return publisher

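    def _get_callback(self, future):
        # result() raises if the publish failed, which surfaces the error.
        future.result()

    def publish(self, json_data):
        # Pub/Sub message payloads must be bytes, so serialize the dict first.
        payload = json.dumps(json_data).encode('utf-8')
        future = self._publisher.publish(self._topic_path, payload)
        future.add_done_callback(self._get_callback)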


data = {
    'error_code': '123',
    'error_message': 'something wrong',
    'severity': 'low',
    'timestamp': '2022-01-01|12:34:56.654321'
}

publisher = LogPublisher('xxxx_project_id', 'xxxxx_topic_id', 'xxxxxx_credential_file')
def run():
    while True:
        publisher.publish(data)
        time.sleep(0.25)

The code above is a much simplified version of my code, but it contains the main logic. When it is left running for a long time, memory usage keeps increasing. I need some suggestions on how to fix this memory leak.
Thanks!
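
One way to narrow down where the growth comes from, assuming the code can be run under Python 3 (tracemalloc is not available in Python 2.7), is to compare allocation snapshots taken some time apart. The following is only a minimal sketch: `top_growth` and `leaky_step` are hypothetical names, and the list append stands in for the call to `publisher.publish(data)`.

```python
import tracemalloc


def top_growth(before, after, limit=5):
    """Return up to `limit` source lines whose allocated size grew."""
    stats = after.compare_to(before, "lineno")
    return [stat for stat in stats if stat.size_diff > 0][:limit]


_retained = []  # stand-in for state that is never released


def leaky_step():
    # Placeholder for publisher.publish(data); it simply retains memory.
    _retained.append(bytearray(100000))


tracemalloc.start()
snapshot_before = tracemalloc.take_snapshot()
for _ in range(50):
    leaky_step()
snapshot_after = tracemalloc.take_snapshot()
tracemalloc.stop()

growth = top_growth(snapshot_before, snapshot_after)
for stat in growth:
    print(stat)
```

In a real run, the two snapshots would be taken minutes or hours apart around the publish loop, and the lines at the top of the list would point at the code that keeps allocating.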

@product-auto-label product-auto-label bot added the api: pubsub Issues related to the googleapis/python-pubsub API. label Jul 12, 2022
@bowenatbear
Author

@plamut Hi, I feel this issue might be related to a few other issues (e.g. #395 and #406) that you found previously, but I would like to confirm whether that is the root cause for google-cloud-pubsub==1.7.0.

@plamut
Contributor

plamut commented Jul 13, 2022

@bowenatbear I do not maintain this library anymore so I can't say for certain, I've been out of the loop for more than 6 months now.

The linked issue (#395) was fixed back then, and I now see that the root cause, a bug in CPython, has been fixed as well.

However, I see that you use quite dated Python and Pub/Sub versions. Any chance of upgrading?
Since Python 3.5.x is not maintained anymore, I imagine that it did not receive the above-mentioned CPython fix, which is probably why you observed the leak.

@bowenatbear
Author

@plamut Thanks for the update. I suspect the same, since the Python and Pub/Sub versions we are using are pretty dated. I think we are in the process of upgrading the Python version, but it might take some time (maybe a long time).

The reason I am using google-cloud-pubsub==1.7.0 is that the Google documentation says it is the last version of this library compatible with Python 2.7.

@acocuzzo acocuzzo self-assigned this Jul 14, 2022
@acocuzzo acocuzzo added the type: question Request for information or clarification. Not an issue. label Jul 14, 2022
@acocuzzo
Contributor

@bowenatbear Are you able to test your code with the new version to verify that upgrading fixes the leak for you? Unfortunately, because the fix was not in our library, you would have to upgrade to get it.

@bowenatbear
Author

@acocuzzo I see. So it seems we have to use Python 3.7 or later to get this fix?

@acocuzzo
Contributor

@bowenatbear yes, exactly.

@bowenatbear
Author

The reason most of the code is in Python 2.7 now is that we are using ROS1, which is on Python 2, and there are lots of dependency issues there.

@acocuzzo Thanks. I think we then need to find a way to run this piece of code separately under Python 3.7 or later. Will try.
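
One possible way to run the publishing code separately, sketched here under assumptions rather than taken from this thread, is a small Python 3 worker that reads one JSON object per line from a stream and forwards each as bytes. `forward_lines` and `publish_fn` are hypothetical names; a list stands in for Pub/Sub so the sketch runs without credentials.

```python
import io
import json


def forward_lines(stream, publish_fn):
    """Pass each JSON line in `stream` to `publish_fn` as UTF-8 bytes.

    Blank lines are skipped. Returns the number of messages forwarded.
    """
    count = 0
    for line in stream:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)  # reject malformed input before publishing
        publish_fn(json.dumps(record).encode("utf-8"))
        count += 1
    return count


# Demonstration with an in-memory stream and a list standing in for Pub/Sub.
sent = []
demo_input = io.StringIO(
    '{"error_code": "123", "severity": "low"}\n\n{"error_code": "456"}\n'
)
forwarded = forward_lines(demo_input, sent.append)
print(forwarded, sent)
```

In an actual worker, `stream` would presumably be `sys.stdin` and `publish_fn` would wrap the client's `publish` call with the topic path, while the Python 2.7 process starts the worker with `subprocess.Popen` and writes lines to its stdin.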

@acocuzzo
Contributor

@bowenatbear Correction: it looks like google-cloud-pubsub==2.4.2 is the first version that has this fix, and it requires Python 3.6. I hope that is helpful.
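
The two minimums mentioned above (google-cloud-pubsub 2.4.2 and Python 3.6) can be checked at startup. This is a minimal sketch; `parse_version` and `has_fix` are hypothetical helpers, and the parser assumes plain numeric release strings such as "2.4.2" (no pre-release suffixes).

```python
import sys


def parse_version(text):
    """Convert a dotted release string such as '2.4.2' into a tuple of ints."""
    return tuple(int(part) for part in text.split(".")[:3])


def has_fix(pubsub_version, python_version=sys.version_info[:2]):
    """Return True if both versions meet the minimums named in this thread."""
    new_enough_library = parse_version(pubsub_version) >= (2, 4, 2)
    new_enough_python = tuple(python_version) >= (3, 6)
    return new_enough_library and new_enough_python


print(has_fix("1.7.0", (2, 7)))  # the setup reported in this issue
print(has_fix("2.4.2", (3, 6)))  # the minimums named above
```

On Python 3.8 or later, the installed library version could be read with `importlib.metadata.version("google-cloud-pubsub")` and passed to `has_fix`.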

@acocuzzo acocuzzo reopened this Jul 14, 2022
@bowenatbear
Author

@acocuzzo I see. Thanks for the updates!
