Memory leak in StringBuilder cache for Point.lineprotocol() #521
Comments
Yes, but do you have an idea for preventing that situation? Having a massive number of measurements is of course memory intensive, but the performance gain we currently get is massive. |
I tried replacing private static final ThreadLocal<Map<String, MeasurementStringBuilder>> with a simple thread-local shared StringBuilder, and it performs about 10% better for a large number of unique measurements (20k+). Performance for less complex line protocol is not degraded. I used inch-java to measure this. The replacement declaration: private static final ThreadLocal<StringBuilder> CACHED_STRINGBUILDERS
Version without thread local map:
2.13 version with thread local map:
|
@rhajek I have some questions here:
I think the whole purpose of having a
|
Answering my own question: after debugging some tests I discovered that internally |
In our use case there are more than 50k unique measurements. The cache stores 50k StringBuilder instances per thread, and each entry holds a whole line protocol string. They cannot be garbage collected while the thread is alive. Reusing an instance via the StringBuilder.setLength(0) hack is faster than creating a new StringBuilder, but StringBuilder.setLength() does not shrink the internal char array the builder has already allocated. I found the following about StringBuilder new instance / setLength on Stack Overflow: Caching a StringBuilder per measurement does not seem reasonable to me, because it only saves writing a few bytes (the length of the measurement name) into that array. I have been unable to find a test scenario in which caching per measurement performs better than a simple StringBuilder per thread. |
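The setLength point above can be checked with a few lines of plain Java. This is a minimal sketch that is not part of the original thread; the class name BuilderCapacityDemo and the method capacities are made up for illustration. It only uses the standard StringBuilder.capacity() and StringBuilder.setLength() calls.

```java
// Hypothetical demo class, not taken from influxdb-java.
final class BuilderCapacityDemo {

    /** Returns {initial capacity, capacity after filling, capacity after setLength(0)}. */
    static int[] capacities(int chars) {
        StringBuilder sb = new StringBuilder();
        int initial = sb.capacity();

        // Grow the internal array by appending `chars` characters.
        for (int i = 0; i < chars; i++) {
            sb.append('x');
        }
        int grown = sb.capacity();

        // Reset the logical length; the backing array is kept as is.
        sb.setLength(0);
        int afterReset = sb.capacity();

        return new int[] {initial, grown, afterReset};
    }

    public static void main(String[] args) {
        int[] c = capacities(100_000);
        System.out.println("initial=" + c[0] + " grown=" + c[1] + " afterReset=" + c[2]);
    }
}
```

If the capacity after the reset matches the grown capacity, a builder that once held a long line keeps that memory for as long as the builder itself is reachable, which is the situation described for the per-thread cache.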
This simple example ends with java.lang.OutOfMemoryError after about 3500 iterations if you run Java with -Xmx64m. With the cache removed, it runs without leaking.
|
@rhajek you are using a scenario that I'm not sure matches any real use case: you are persisting 1 million points in 10k different measurements and limiting the memory allocation pool to 64 MB. Is this correct? Also, did you check the commit responsible for adding the |
Our scenario is from the real world: we have long-living threads that write a large number (50k+) of unique measurements, and during profiling we found that memory leak. I understand that caching a StringBuilder reduces internal array allocation and pressure on the GC, but I don't understand the benefit of caching it per measurement. Here is the commit with the over-optimization: 901bf4f org.influxdb.PerformanceTests#testMaxWritePointsPerformance performs the same with and without that map. |
@rhajek my guess based on what I found with the branch from the dev who wrote this code is that his original intention was to use an internal pool of objects. |
- fixing memory leak and performance #521
Hello,
Point.CACHED_STRINGBUILDERS is a per-thread cache that stores StringBuilder instances for constructing line protocol. The map key is the measurement name and the value is a StringBuilder instance. The StringBuilder instance is reused for all points with the same measurement.
This optimization reduces GC activity when writing a large number of Points, but it increases memory consumption when there is a large number of different measurements. The cache can leak memory, because it is never evicted programmatically and is garbage collected only when the holding thread dies. StringBuilder.setLength(length) is used to reuse the instance, but it does not free the builder's memory.
An OutOfMemoryError can be reproduced by writing a large number of Points with unique measurements and long line protocol in a single thread.
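To make the difference between the two caching strategies concrete, here is a simplified sketch. It is not the actual Point implementation: the class LineProtocolCacheSketch, its method names, and the "measurement fields" line format are all assumptions made for illustration, and no escaping, tags, or timestamps are handled. It only mimics the shape of the problem, a thread-local map keyed by measurement name versus a single thread-local builder.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not code from influxdb-java.
final class LineProtocolCacheSketch {

    // Strategy A: one builder per measurement name, held for the thread's lifetime.
    private static final ThreadLocal<Map<String, StringBuilder>> PER_MEASUREMENT =
            ThreadLocal.withInitial(HashMap::new);

    // Strategy B: a single builder per thread, reused for every point.
    private static final ThreadLocal<StringBuilder> SHARED =
            ThreadLocal.withInitial(() -> new StringBuilder(256));

    static String withPerMeasurementCache(String measurement, String fields) {
        StringBuilder sb = PER_MEASUREMENT.get()
                .computeIfAbsent(measurement, m -> new StringBuilder(m));
        // Keep the cached measurement name, drop the rest of the previous line.
        sb.setLength(measurement.length());
        return sb.append(' ').append(fields).toString();
    }

    static String withSharedBuilder(String measurement, String fields) {
        StringBuilder sb = SHARED.get();
        sb.setLength(0); // reuse the same backing array for every measurement
        return sb.append(measurement).append(' ').append(fields).toString();
    }

    /** Number of builders the current thread keeps alive under strategy A. */
    static int cachedBuilders() {
        return PER_MEASUREMENT.get().size();
    }

    /** Total char capacity retained by those builders. */
    static long retainedChars() {
        long total = 0;
        for (StringBuilder sb : PER_MEASUREMENT.get().values()) {
            total += sb.capacity();
        }
        return total;
    }

    public static void main(String[] args) {
        StringBuilder f = new StringBuilder();
        for (int i = 0; i < 100; i++) {
            f.append("v");
        }
        String fields = f.toString();
        for (int i = 0; i < 10_000; i++) {
            withPerMeasurementCache("measurement_" + i, fields);
            withSharedBuilder("measurement_" + i, fields);
        }
        System.out.println("cached builders: " + cachedBuilders());
        System.out.println("retained chars:  " + retainedChars());
    }
}
```

In this sketch, strategy A retains one full-length builder per unique measurement until the thread ends, while strategy B retains a single builder regardless of how many measurements are written. The only work strategy A saves per point is re-appending the measurement name.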