Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Span finalizer can cause a memory leak with G1 #1074

Closed
JonasKunz opened this issue Apr 3, 2020 · 8 comments
Closed

Span finalizer can cause a memory leak with G1 #1074

JonasKunz opened this issue Apr 3, 2020 · 8 comments

Comments

@JonasKunz
Copy link
Contributor

Hi,

We observed an issue with the finalize method of the span implementation of the SDK. When under high CPU load, the finalizer can prevent the garbage collection of spans, leading to a highly increased heap consumption.

The root cause for the problem is the following finalizer:
https://github.com/open-telemetry/opentelemetry-java/blob/master/sdk/src/main/java/io/opentelemetry/sdk/trace/RecordEventsReadableSpan.java#L522
Finalizers block objects from being GC'd until they are executed. It seems like under high CPU load the JDK 11 G1 Collector with default settings decides to postpone the execution of the span's finalizer, causing the span to block heap memory.

We observed and analyzed the problem when using OpenCensus, which has exactly the same finalizer for spans, therefore we are reporting this issue here. Our JavaAgent internally uses OC to collect trace data, which is where the issue occured. After some heap dump analysis we did an experiment:
We removed the finalizer from the span implementation and changed nothing else. This caused the issue to disappear. For the full details of the problem and the analysis please have a look at the corresponding issue at our repo.

The quickfix for this problem would be to simply remove the finalizer, which however also removes the log messages for non-ended spans.
The "correct" fix would be to use WeakReferences alongside with a ReferenceQueue to detect when non-ended spans are GC'd, as WeakReferences do not block the object they are referring to from being freed.

@Oberon00
Copy link
Member

Oberon00 commented Apr 3, 2020

We could implement this as providing an optional WarnOnLeakedSpan SpanProcessor that would maintain that WeakReference queue and print the warnings. Then the finalizer could be removed. EDIT: #697 may have some code which is reusable for this.

@jkwatson
Copy link
Contributor

jkwatson commented Apr 3, 2020

Decision from the java SIG meeting today: We're going to remove the finalizer ASAP. And, I'll create an issue to track creating a span processor that will do what @Oberon00 suggests that can be used for checking for non-finished spans.

@bogdandrutu
Copy link
Member

@jkwatson let's fix the issue first then do the SpanProcessor because this may affect users.

@carlosalberto
Copy link
Contributor

@bogdandrutu To be clear: implement a solution like the one suggested by @Oberon00 and only after that, remove the finalizer?

@Oberon00
Copy link
Member

Oberon00 commented Apr 6, 2020

@carlosalberto I read it the other way round: Removing the finalizer has priority and the Processor can be implemented as soon as someone has time.

@jkwatson
Copy link
Contributor

jkwatson commented Apr 6, 2020

Issue created to create the leak-warning span processor: #1084

@jkwatson
Copy link
Contributor

jkwatson commented Apr 8, 2020

@JonasKunz we've got the finalizer removed. Are you ok with us closing the issue?

@JonasKunz
Copy link
Contributor Author

Sure! I really like the quick actions taken here :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants