@google-cloud/PubSub@1.x has increased memory usage compared to 0.29.1 #788
A memory issue has been found in the upstream @grpc/grpc-js library; this is most likely the root cause. See: @Legogris, I will update you and close this thread as soon as the fix is rolled out.
@Legogris could you share a snippet of code demonstrating where you're initializing your PubSub client? @alexander-fenster points out to me that one way memory can leak is if the client is created multiple times in an application, rather than once at the entry point of the application. @pmcnr-hx, likewise, I'm wondering if you might be bumping into a similar issue.
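For what it's worth, a minimal sketch of the pattern being suggested (one client created at the entry point and reused; the topic name is a placeholder, not one from this thread):

```js
// Sketch only -- 'my-topic' is a placeholder topic name.
const {PubSub} = require('@google-cloud/pubsub');

// Create the client once, at the application entry point...
const pubsub = new PubSub();
const topic = pubsub.topic('my-topic');

// ...and reuse it everywhere, rather than calling `new PubSub()` inside
// request handlers, which opens a fresh gRPC channel on every call.
async function publishEvent(payload) {
  return topic.publish(Buffer.from(JSON.stringify(payload)));
}

module.exports = {publishEvent};
```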
@bcoe It's definitely only instantiated once (there are two per node process, though, each initialized at the entry point). Just to clarify, our symptoms are outbound network requests made from inside Node, but unrelated to PubSub/gRPC, starting to fail. So if it's related to a memory leak issue, it would be resulting from the use of sockets. I have to leave now, but I'll try to provide more details tomorrow AM. I posted in #789 (comment) that "@grpc/grpc-js": "0.5.4" with "@google-cloud/pubsub": "0.32.1" was working fine, but I see now that we also have sockets to local services acting inconsistently. That one I am not yet certain I can ascribe to grpc/pubsub; I will have to dig deeper. I can tell with certainty that the most current version of pubsub + grpc-js completely breaks, though, that the older versions get fewer HTTP requests failing, and that there are differences in how connections fail between the native grpc bindings and @grpc/grpc-js.
@Legogris if you're able to, running the process with:

process.env.GRPC_TRACE = 'all';
process.env.GRPC_VERBOSITY = 'DEBUG';

could provide some valuable information to share with the grpc folks.
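One caveat worth hedging: @grpc/grpc-js generally reads these variables when it is first loaded, so they should be set before @google-cloud/pubsub is required (or exported in the shell before starting the process). A minimal sketch:

```js
// Sketch: enable gRPC tracing before the transport is loaded.
process.env.GRPC_TRACE = 'all';
process.env.GRPC_VERBOSITY = 'DEBUG';

// Require the library only after the env vars are in place.
const {PubSub} = require('@google-cloud/pubsub');
const pubsub = new PubSub();
// ...trace output is written to the console from this point on.
```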
I am seeing the same increase in memory usage after upgrading a Node.js 8 Cloud Function that forwards POST data to a PubSub queue. The function is running on us-central1 with 512 MB RAM. After upgrading from
@bsato212 what are you running the worker on that is consuming from the PubSub queue? (Is it the worker that is running out of memory periodically?)
I think I have a satisfactory explanation: the increased number of sockets due to the socket leak causes the connection pool to fill up, and since we have a global http agent, I will try to separate out the http agents and see if that performs differently.

EDIT: As maxSockets is per origin, this does not fully explain it after all (it does explain failing requests to PubSub, though). Grasping at straws here, but maybe it has to do with the

For reference, we're using https://www.npmjs.com/package/agentkeepalive with the following settings:
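The exact settings weren't captured above; purely as an illustration (hypothetical values, not the ones from this deployment), an agentkeepalive configuration typically looks like this:

```js
// Illustrative only -- hypothetical values, not the actual settings used in this thread.
const Agent = require('agentkeepalive');
const http = require('http');

const keepaliveAgent = new Agent({
  maxSockets: 100,          // per-origin cap; a socket leak can exhaust this pool
  maxFreeSockets: 10,
  timeout: 60000,           // working socket timeout in ms
  freeSocketTimeout: 30000, // idle keep-alive socket timeout in ms (agentkeepalive 4.x option name)
});

// The agent is then passed to outbound requests so they share the keep-alive pool:
http.get({host: 'example.com', path: '/health', agent: keepaliveAgent}, (res) => {
  res.resume();
});
```

Because maxSockets applies per origin, leaked sockets to one host would block further requests to that host without necessarily affecting other origins, which matches the EDIT above.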
Is there a way to set the http agent explicitly for pubsub? I can't find anything when looking through the docs or source.
Starting yesterday in our staging environment, between two deployments (so we're not convinced yet it's a code change), we started seeing a flood of these errors.

We're seeing it in production now as well and are currently debugging; flagging it here in case anyone's seen something similar and may have suggestions.
@vigandhi are you seeing these errors after an extended period of time, accompanied by an increase in memory consumption? If you're seeing these errors immediately, and it doesn't seem to be tied to a gradual memory leak, I think it would be worth opening a new issue (as what you're seeing seems like different symptoms). We can try to get to the bottom of what's happening to you in that thread.
@bcoe It seems to be due to an increase in memory consumption. As @vigandhi stated, we noticed the flood after shipping changes that added extensive usage of

There are a few more tests we'd like to do on our end in an attempt to narrow down the change. Will report back once we have more info.
@Legogris @vigandhi @bsato212 @ibarsi if anyone has the cycles to test it out, could I bother you to try this dependency:
☝️ I've released the version referenced above. @ibarsi @vigandhi, I have not floated a similar patch for
@bcoe Cool, we will try out the patch, thanks for the information. Can you elaborate on what you mean by not floating a patch for the same issue to bunyan? Is there a common piece of code that affects both products in relation to the memory issues that were being observed? I was also curious whether using the native C++ grpc client mitigates these issues and, if so, whether other libraries can be made to use the native bindings as well. Thank you.
@ericpearson we will soon have the updated version of
The C++
@bcoe Is this fixed?
@Sandhyakripalani there have been some fixes to the
Is this still a problem for everyone with the latest library release (1.7.2)? We've been held up a bit by some dependency timing issues, but I'm hoping we can get 2.0 out the door soonish. There are quite a few dependency updates and generated code changes that other libraries have already received.
I'm going to go ahead and close this for now since it's been idle, but please do re-open/comment if needed.
@pmcnr-hx this memory issue sounds a bit different than what folks were initially seeing in #770 (which was a seizing up of message processing), so I've created this new tracking issue.
Running In Debug Mode
If it's not too much of a bother, could I get you to run your system with debugging enabled:
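Presumably this is the same gRPC debug configuration quoted earlier in the thread:

```js
// Assumed to match the debug settings quoted earlier in the thread.
process.env.GRPC_TRACE = 'all';
process.env.GRPC_VERBOSITY = 'DEBUG';
```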
☝️ I'm curious to see if there is any odd behavior immediately before the message fails to process.
Memory Usage
Do you have any graphs you could share of the memory spiking behavior caused by @grpc/grpc-js vs. grpc?

Environment details
@pmcnr-hx mind providing this information? With regards to OS/runtime, are you on Google App Engine or Kubernetes Engine?
@google-cloud/pubsub version: 1.1.3

Steps to reproduce
@pmcnr-hx it sounds like the memory issue you run into happens after a few hours of processing. What does the workload look like? E.g., how many messages per minute, and what type of work happens as the result of a message?
see #770