HUGE memory leak problem #768
Update: I tried updating to 0.5.4 based on the comments in this thread and it did not help. Here is the new yarn.lock:
If it is of any help, the way I use Firestore is as follows. I have a file (imported elsewhere as `./modules/firestore`) containing:

```js
import Firestore from '@google-cloud/firestore';

export default new Firestore();
```

I then import firestore and use it in other places like this:

```js
import firestore from './modules/firestore';

const updateThing = async (req, res) => {
  try {
    const { docId, updatePayload } = req.body;
    await firestore.doc(`Things/${docId}`).update(updatePayload);
    return res.status(200).send();
  } catch (err) {
    console.error(err);
    return res.status(500).send();
  }
};
```

I can probably have a temporary workaround by just creating a new Firestore instance from within each function body, but fixing the underlying problem is a more sustainable approach. Also, my application doesn't have any event listeners. The only operations it uses are …
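For reference, a minimal sketch of that per-function workaround, reusing the same handler shape as above (whether this actually avoids the leak is exactly what is in question here):

```js
import Firestore from '@google-cloud/firestore';

const updateThing = async (req, res) => {
  // Workaround sketch: construct a fresh client per request instead of
  // sharing a single module-level instance.
  const firestore = new Firestore();
  try {
    const { docId, updatePayload } = req.body;
    await firestore.doc(`Things/${docId}`).update(updatePayload);
    return res.status(200).send();
  } catch (err) {
    console.error(err);
    return res.status(500).send();
  }
};
```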
@jakeleventhal Thanks for the very detailed issue! Can you describe your workload a bit? It would help to know how many requests you are sending and if the memory usage tapers off once those requests are processed.
I have a couple of observations here: The string that is being constructed almost a million times looks like a generated variant of the protobuf.js … The …
@schmidt-sebastian So I have an API server that accepts HTTP requests from another scheduling server. The scheduling server repeatedly sends jobs to the API to save user data. There are many operations that occur but, for the most part, they can all be summarized as this:
There can be anywhere from 1 to 1000 or so Firestore calls per minute per user. Each document that is updated is unique, so there is no contention going on here. I have also used log statements to verify that functions are entering and exiting (and they all more or less follow a structure similar to what I wrote above), so the scope of the Firestore calls should be limited to just the function. Moreover, the Firestore calls aren't throwing errors and are updating documents successfully. Again, there are no event listeners anywhere in my code.

I have also tried not using a single instance of Firestore that I import throughout my code - instead, I created a new instance of Firestore within the body of each function that uses it. I tracked the number of Firestore objects through memory snapshots and was able to see that the Firestore objects were successfully being deleted/deallocated at the end of each function; however, the metadata with all those strings/arrays/objects still persisted outside the scope of the functions. I hope that clears everything up.
@jakeleventhal We've hit these kinds of problems too in Cloud Functions: firebase/firebase-functions#536. The problems originate from grpc-js, which is still experimental. What worked for us in the end was to downgrade our dependencies so they still use the binary grpc, which is much more mature and stable. There's also a mechanism to force Firestore to use binary grpc with the newer client version, described here: firebase/firebase-functions#536 (comment); however, I have not tested that one, as it is unsupported/untested if I understood correctly. That note also explains a bit why we are in this mess. So, after grpc-js matures a bit, we'll hop back to the latest versions.
@swftvsn What functionality will be lost / what differences will there be? Also, what is the syntax with '@google-cloud/firestore'?
@swftvsn @schmidt-sebastian Why is Firestore not using the binary version of grpc until the other version is stable? What's to be gained? This memory leak cost me close to $1k on GCP and is now a bottleneck for launching my product. I'm sure I'm not the only one experiencing issues like this.
@jakeleventhal The reasoning for using grpc-js is explained in this comment: firebase/firebase-functions#536 (comment). We didn't lose any functionality we need when switching to the earlier version, but of course we only checked the things we use. (We use TypeScript, so basically if everything compiles you're good to go. YMMV, of course.)
Providing a different GRPC implementation via the constructor argument (as described in firebase/firebase-functions#536 (comment)) is fully supported. All currently available features in Firestore are compatible with both implementations.

@murgatroid99 Do you have a rough idea of where in our stack this problem might best be addressed?
I'm only really familiar with my end of the stack, so I can only say that it's somewhere above that - probably this Firestore library. Loading the same … Similarly, constructing many clients for the same service and backend is not the intended usage of either grpc library, but that should be a much smaller problem with the new release of grpc-js that gax should be picking up soon.
I spent some more time on this, but I am still not able to pinpoint the leak. What is interesting, though, is that this only seems to happen with set()/update()/delete() and not with a get() call. After 2000 operations, my memory profile looks like this:

The memory usage does not go down even when all operations resolve.
Is this confirmed to be fixed using the regular grpc library? i.e.

```js
import grpc from 'grpc';

const firestore = new Firestore({ grpc });
```
@schmidt-sebastian I don't know if it's related, but there is a similar issue in BigQuery: googleapis/nodejs-bigquery#547
@jakeleventhal I see similar memory profiles with both …
It looks like the memory leak happens in one of our dependency layers, so it might take a bit of work to track down. The memory leak also exists when the Veneer client (an auto-generated SDK that we use for our request logic) is used directly. What is somewhat interesting is that we only seem to be leaking memory once we get responses from the backend - I see a steady increase in memory consumption only after requests start to get fulfilled. With 2000 writes, FirestoreClient_innerApiCalls has a retained size of 99680 bytes that is mostly unaccounted for. There are also 2000 PromiseWrap and Promise objects on the stack that take up roughly this amount of memory.
Great to see the investigation bearing some fruit, @schmidt-sebastian - and sorry that I jumped to conclusions and thus wrote bs above.
@schmidt-sebastian There is definitely a huge improvement using the legacy grpc. You'll notice the memory is still climbing in the second example, but it's very, very minimal compared to the …
@schmidt-sebastian Any update on "next week" or did you have one of these weeks in mind https://www.mentalfloss.com/article/51370/why-our-calendars-skipped-11-days-1752
But in all seriousness, I'm spending a doggone fortune because of this leak.
I am very sorry, we are still working on this. We are making progress every day and I think I can see the finish line.
Is there any update on this? Bleeding cash for me.
Yeah, this has been nearly 3 months... come on...
Version 3.0.0 should be released on Monday morning (Pacific time). With this version, we close the GRPC client when a channel becomes idle. If your request load frequently causes new channels to be spawned and released, this should reduce the memory burden described in this issue. I am sorry and somewhat embarrassed by how long this took. I hope this solves the root issue described here.
Any idea how soon firebase-admin-node will be upgraded from v2.6 to v3.0 of nodejs-firestore? @schmidt-sebastian I'm really glad a solution is in the pipe here; this has caused us a lot of grief. Thank you for your efforts!
Experiencing over 15x reduction in memory footprint with 3.0.0 👍🏼
@bfitec Unfortunately, we may have another breaking change coming up shortly, as Node 8 goes end of life at the end of the year. We still have to figure out what this means for us and whether we have to bump the major version for Firestore again, but we don't want to bump the major version for Firebase Admin twice in short succession, and hence we haven't updated Firebase Admin yet.
@schmidt-sebastian I changed … In my …
In our service we listen to tens of thousands of streams. We noticed that the more streams we listen to, the more memory our process needs. It is currently requiring over 4 GB and increasing as new clients join. This should not happen, as 98% of the streams have no messages being exchanged other than the first couple. As I said above, we attempted to force version 3.0.0 of the Firestore client by editing package-lock.json.
@schmidt-sebastian You advised using shrinkwrap. Is using npm shrinkwrap in this case functionally different from changing package-lock.json?
@jmacedoit I added a branch to …
The bad news is that this might not address your specific memory concerns. Our networking infrastructure only supports 100 concurrent operations over a single connection and, as such, we need to create a new GRPC connection for every 100 operations. The changes in this PR merely address the memory usage of GRPC clients that are no longer needed, but if you have a steady count of thousands of listeners, then some high memory usage is expected (at least for now).
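To put rough numbers on that (an estimate, not a measurement): with the tens of thousands of listeners mentioned above and a hard cap of 100 concurrent operations per connection, the client has to keep on the order of several hundred GRPC connections open at the same time, each with its own channel state and buffers, which goes some way toward explaining a multi-gigabyte footprint.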
@schmidt-sebastian Thanks for the branch! Any tips on how this can be addressed? The project in question is a chatbot gateway. It needs to listen to the streams of thousands of users (1 stream per user) in order to send each message to a core, so that it is processed and the answer is placed back in the client's stream. Naturally, most of the time the streams will have no activity (including people that send 1 message and never interact with the service again). Is there anything that can be done to mitigate the memory needed to maintain these connections? You say …
The memory increases as soon as I start listening to all the streams, even without any incoming messages. Is this expected?
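For context, a minimal sketch of the kind of one-stream-per-user setup being described (the collection layout and names here are hypothetical, since the actual gateway code isn't shown in this thread):

```js
import { Firestore } from '@google-cloud/firestore';

const firestore = new Firestore();
const unsubscribers = new Map();

// One snapshot listener (and therefore one GRPC stream) per user, as described
// above. `Users/{userId}/Messages` is a placeholder layout for illustration.
function listenForUser(userId) {
  const unsubscribe = firestore
    .collection(`Users/${userId}/Messages`)
    .onSnapshot(
      (snapshot) => {
        snapshot.docChanges().forEach((change) => {
          if (change.type === 'added') {
            // Forward the new message to the core for processing.
          }
        });
      },
      (err) => console.error(`Listener for ${userId} failed`, err)
    );
  unsubscribers.set(userId, unsubscribe);
}

// Detaching the listener releases the underlying stream.
function stopListening(userId) {
  const unsubscribe = unsubscribers.get(userId);
  if (unsubscribe) {
    unsubscribe();
    unsubscribers.delete(userId);
  }
}
```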
Two options come to mind: …
If you have engineering resources to spare, we could also add an option to issue multiple listeners over the same stream. This is supported by the backend and currently used by the Android, iOS and Web SDKs. We deliberately decided to use one stream per listener, as this provides isolation for all operations in the server SDKs, but we could optionally bundle listeners together in the Node server SDK. Unfortunately, this is a significant effort and will likely not be staffed by us in the near term. As for your last question - yes, a new operation is spun up once …
@schmidt-sebastian Collection groups would be great; unfortunately, the client applications are already deployed. Edit: Actually, on a closer look, I may be able to use collection groups! Thanks.
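For illustration, a rough sketch of what a collection-group based listener could look like (the 'Messages' collection-group name and the 'processed' field are hypothetical, and the filtered query may require a supporting index):

```js
import { Firestore } from '@google-cloud/firestore';

const firestore = new Firestore();

// A single listener over all 'Messages' subcollections, instead of one
// stream per user. Names and fields here are placeholders for illustration.
const unsubscribe = firestore
  .collectionGroup('Messages')
  .where('processed', '==', false)
  .onSnapshot(
    (snapshot) => {
      snapshot.docChanges().forEach((change) => {
        if (change.type === 'added') {
          // Hand the new message off to the core for processing.
        }
      });
    },
    (err) => console.error(err)
  );
```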
Looks like this issue generated yet another issue 🤦🏻‍♂️
The problem is still not solved. I am using firebase-functions 3.6.1. My package.json file is: … My function file is: …
@siddhant-mohan If you turn on logging (…
@schmidt-sebastian Hey, can you help me turn on logging to check how many GRPC clients my function is using? Sorry, I am quite new to Firebase Functions.
It should be: …
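As a rough sketch of what turning on the client's verbose logging typically looks like (assuming the `setLogFunction` export of `@google-cloud/firestore`; this is an illustration, not necessarily the exact snippet originally posted):

```js
import { Firestore, setLogFunction } from '@google-cloud/firestore';

// Route the Firestore client's internal debug output to the console so you
// can see when GRPC clients are created and released by your function.
setLogFunction(console.log);

const firestore = new Firestore();
```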
Environment details

`@google-cloud/firestore` version: 2.3.0

I'm still experiencing a memory leak potentially related to #661. The following is from my yarn.lock:

Quick note: because of the grpc problems that keep appearing here, we should probably put the grpc version as part of the issue template.

So it is clear that I'm actually using the "correct" grpc. My application makes many calls to Firestore and I'm experiencing massive memory leaks.

I'm not actually seeing the error message described previously (`MaxListenersExceededWarning`); however, the memory usage of my application is slowly and steadily increasing. I've taken several heap snapshots from local tests over the course of several hours, and when comparing snapshots from a few hours apart, I notice that the top culprits for memory allocations are all related to grpc. Here you can clearly see that the top 4 culprits for the memory delta between snapshots 12 and 4 are vastly more than everything else.

Here are the contents of each, from largest to smallest:

`"(function anonymous( ) { return function BeginTransactionRequest(p){ if(p)for(var ks=Object.keys(p),i=0;i<ks.length;++i)if(p[ks[i]]!=null) this[ks[i]]=p[ks[i]] } })"`

And this all comes from `message.js:13`. This is a file from the package `protobufjs`, which is a dependency of grpc.

I also have tens of thousands of objects that look like this (from grpc): `type.js:31` and `namespace.js:95`, both of which are also part of protobufjs. "Field" is also related to the same line numbers and is actually directly linked to grpc.