- jcmd is used for multiple tasks, such as taking heap dumps and printing threads along with their stack traces. It was working fine, but GC.heap_dump now fails with Permission denied, while the rest still work. <user-id> is the user the process is running as.
sudo -u <user-id> jcmd <pid> GC.heap_dump heapdump.hprof   # not working
sudo -u <user-id> jcmd <pid> GC.heap_info
sudo -u <user-id> jcmd <pid> Thread.print
- jmap is another tool to take heap dumps
sudo -u <user-id> jmap -dump:file=./heapdump.hprof <pid>
- VisualVM is used to analyze heap dumps
  - VisualVM can be used to analyze the heap dumps taken above
  - VisualVM can also attach to a running process if JMX is enabled
    - create an SSH tunnel
    - connect to the local port using JMX
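The tunnel step can be sketched roughly as below. This is only an illustration: the host name `user@app-host` and port `9010` are placeholders, and it assumes the remote JVM was started with JMX on that port (typically with `-Dcom.sun.management.jmxremote.port` and `-Dcom.sun.management.jmxremote.rmi.port` set to the same value, and `-Djava.rmi.server.hostname=localhost`, so that a single forwarded port is enough).

```shell
# Prints (does not run) an ssh command that forwards a local port
# to the same port on the remote host, where the JVM's JMX agent listens.
# All arguments are placeholders for illustration.
jmx_tunnel_cmd() {
  remote=$1   # e.g. user@app-host (hypothetical)
  port=$2     # JMX port of the remote JVM (hypothetical: 9010)
  printf 'ssh -N -L %s:localhost:%s %s\n' "$port" "$port" "$remote"
}

jmx_tunnel_cmd user@app-host 9010
```

With the tunnel open, VisualVM can be pointed at `localhost:9010` as a JMX connection.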
- With the combination of Akka HTTP 10.0.13 and Akka 2.4.20, Source.queue left some garbage that was not cleaned up
- pmap is a useful command to see how much memory the OS has allocated to a particular process, for example
sudo pmap 7056
sudo pmap -x 7056
sudo pmap -X 7056
sudo pmap -XX 7056
In the output you can see the total memory allocated. It may be worth setting the thread stack size explicitly with -Xss1m (1 MB is the default value).
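To track the total over time it helps to pull just the RSS figure out of the `pmap -x` output. A minimal sketch, assuming the procps-style summary line `total kB <Kbytes> <RSS> <Dirty>`; the helper name `pmap_total_rss_kb` is made up for this example.

```shell
# Reads `pmap -x <pid>` output on stdin and prints the total RSS in kB.
# Assumes the last line looks like: "total kB  <Kbytes>  <RSS>  <Dirty>".
pmap_total_rss_kb() {
  awk '/^total/ { print $4 }'
}

# Usage (7056 is the example PID from above):
#   sudo pmap -x 7056 | pmap_total_rss_kb
```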
- The following code caused a memory leak
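import akka.stream.{Attributes, OverflowStrategy}
import akka.stream.scaladsl.{Keep, Sink, Source}

// Requires an implicit Materializer in scope for run()
val (streamPublisher, pub) = Source
  .actorRef[AkkaStreamerResponse](16, OverflowStrategy.fail)
  .withAttributes(Attributes.name("mystream-publisher"))
  .toMat(Sink.asPublisher(fanout = true))(Keep.both)
  .run()
createActor(streamPublisher) ! StartStreaming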
Whenever I want to tear down the streaming connection from the server side, I send a Success message to streamPublisher, but in some cases I do not receive the Terminated message (I am watching streamPublisher). A message may sometimes get lost on either side (Success or Terminated), so I implemented a retry mechanism.
streamPublisher ! Status.Success(Done)
I am not sure whether the KillSwitch is required.
val ((streamPublisher, killSwitch), pub) = Source
  .actorRef[AkkaStreamerResponse](16, OverflowStrategy.fail)
  .withAttributes(Attributes.name("mystream-publisher"))
  .viaMat(KillSwitches.single)(Keep.both)
  .toMat(Sink.asPublisher(fanout = true))(Keep.both)
  .run()
createActor(streamPublisher, killSwitch) ! StartStreaming
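class MyActor(streamPublisher: ActorRef, killSwitch: KillSwitch) extends Actor {
  // Not that important code here ...
  // (scheduleOnce needs an implicit ExecutionContext, e.g. import context.dispatcher,
  //  and 5.seconds needs scala.concurrent.duration._)

  // Retry state: re-send Success every 5 seconds until the retries run out
  def shutdown(remainingRetries: Int): Receive = handleActorTerminations orElse {
    case Shutdown if remainingRetries <= 0 => cleanUp()
    case Shutdown =>
      context.watch(streamPublisher)
      streamPublisher ! Status.Success(Done)
      context.system.scheduler.scheduleOnce(5.seconds, self, Shutdown)
      context.become(shutdown(remainingRetries - 1))
  }

  // Entry point: ask the stream to complete, then start the retry loop
  private def shutdownStreamerWithRetry(): Unit = {
    streamPublisher ! Status.Success(Done)
    context.become(shutdown(10))
    self ! Shutdown
  }

  // Last resort: stop the stream through the kill switch and stop this actor
  private def cleanUp(): Unit = {
    killSwitch.shutdown()
    self ! PoisonPill
  }
  // Not that important code here ...
}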
- Low-memory alerts were still coming, so I enabled -XX:NativeMemoryTracking=detail and collected memory details every 10 seconds with the ./native-memory script
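The ./native-memory script itself is not shown here, so the following is only a sketch of what such a periodic collector could look like, not the actual script. The function name `collect_nmt`, its arguments, and the `JCMD` override variable are assumptions made for this example; `VM.native_memory summary` is the standard jcmd command for NMT output.

```shell
# Sketch of a periodic NMT collector (not the original ./native-memory script).
# JCMD can be overridden, e.g. with a full path to jcmd.
JCMD="${JCMD:-jcmd}"

# Usage: collect_nmt <pid> <interval-seconds> <samples> <output-file>
collect_nmt() {
  pid=$1; interval=$2; samples=$3; out=$4
  i=0
  while [ "$i" -lt "$samples" ]; do
    # Timestamp each sample so growth can be followed over time
    date '+=== %Y-%m-%dT%H:%M:%S ===' >> "$out"
    "$JCMD" "$pid" VM.native_memory summary >> "$out" 2>&1
    i=$((i + 1))
    if [ "$i" -lt "$samples" ]; then
      sleep "$interval"
    fi
  done
}

# Example: one sample every 10 seconds for an hour
#   collect_nmt 30284 10 360 /tmp/nmt.log
```

The JVM has to be started with -XX:NativeMemoryTracking=summary or detail for the jcmd call to return data.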
jemalloc is a better memory allocator than malloc
- Download the latest version https://github.com/jemalloc/jemalloc/releases/download/5.2.1/jemalloc-5.2.1.tar.bz2
- Unzip, configure and make (if gcc is missing, install it with yum install gcc ... ). The .so can be found in the lib directory
cd jemalloc-5.2.1
./configure --enable-prof && make
cp lib/libjemalloc.so.1 /usr/lib64/
ln -s /usr/lib64/libjemalloc.so.1 /usr/lib64/libjemalloc.so
export LD_PRELOAD=/usr/lib64/libjemalloc.so
export MALLOC_CONF="prof:true,prof_prefix:jeprof.out,lg_prof_interval:30"
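After restarting the JVM with LD_PRELOAD set, it is worth checking that the library was really loaded. One way, sketched below, is to look for it in the memory map of the process; the helper name `uses_jemalloc` is made up for this example, and it takes the path to a maps file so it can be pointed at /proc/<pid>/maps.

```shell
# Succeeds (exit status 0) if the given maps file mentions libjemalloc.
# Defaults to the maps of the current process when no file is given.
uses_jemalloc() {
  maps_file=${1:-/proc/self/maps}
  grep -q 'libjemalloc' "$maps_file"
}

# Usage (replace <pid> with the JVM's PID; may need sudo):
#   uses_jemalloc /proc/<pid>/maps && echo "jemalloc is loaded"
```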
- Take a heap dump. To dump only live objects, use the live option instead of all (a full GC is triggered before a live dump); the command below dumps all objects
sudo jmap -dump:all,format=b,file=/tmp/heapdump.hprof $pid
jcmd <pid> GC.heap_dump with the optional -all flag dumps all objects, including unreachable ones
I am very happy to have fixed this memory leak. There were many suspects along the way, but none of them turned out to be the cause.
In short, the problem was that the OS reported the process as using too much memory (pmap -x), while the JVM reported that it was not (jcmd 30284 VM.native_memory). Then I learned about the allocator issue described in the Stack Overflow answer, which gave me more insight into memory allocation; I had not been aware of it before. The problem seems to be in malloc: it requests big chunks of memory (64 MB), which leads to fragmentation and excessive memory usage. Based on the suggestion, I built jemalloc and exported LD_PRELOAD as an environment variable. Things then started working better, thanks to Andrei Pangin for his Stack Overflow answer and talk.
export LD_PRELOAD=/usr/lib64/libjemalloc.so
export MALLOC_CONF="prof:true,prof_prefix:jeprof.out,lg_prof_interval:30"
In the process I found that some streaming actors were not closed properly, so I introduced a retry mechanism with a kill switch (the code blocks are above on this page). I am not sure whether it is good practice, but it works: I do not see those actors anymore. How would I know whether it is good practice?
I do not think this was the main issue; fixing it was a bonus.
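The mismatch between what the OS and the JVM report can be put into a number by subtracting the NMT committed total from the process RSS. A rough sketch, assuming the NMT summary contains a line of the form `Total: reserved=<n>KB, committed=<n>KB` (the default KB scale); both helper names are made up for this example.

```shell
# Reads `jcmd <pid> VM.native_memory summary` output on stdin and
# prints the committed total in kB (assumes the default KB scale).
nmt_committed_kb() {
  sed -n 's/^Total: reserved=[0-9]*KB, committed=\([0-9]*\)KB.*$/\1/p'
}

# Prints RSS minus NMT committed, both in kB: memory the OS sees
# but the JVM does not account for.
rss_gap_kb() {
  echo $(( $1 - $2 ))
}

# Usage (30284 is the example PID from above):
#   rss=$(ps -o rss= -p 30284)
#   committed=$(jcmd 30284 VM.native_memory summary | nmt_committed_kb)
#   rss_gap_kb "$rss" "$committed"
```

A gap that keeps growing while the NMT numbers stay flat points away from the JVM's own accounting and towards the allocator.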
- Took a heap dump and tried to analyze it with VisualVM; could not find any issue in the heap
- Connected the remote JVM process to VisualVM via port forwarding for better insight, hoping to find something
- Confirmed with pmap that the issue was in this one process; other processes were not leaking memory
- Suspected byte buffers were causing the problem, but they were not
- Used JXRay for better heap analysis; it pointed to a potential memory leak that turned out not to be one. In the process I found a few actors that were not closed properly and fixed them.
- Suspected the JVM arguments and tried tweaking them a bit
- I was sure native memory was causing the issue, so I enabled NMT and took NMT metrics periodically, but I could not see anything fishy
- Finally, with the help of the Stack Overflow answer and Andrei's talk, I found that malloc has an impact on RSS as well, not just on the virtual address space. Used jemalloc to fix the issue
wget https://github.com/jemalloc/jemalloc/releases/download/3.0.0/jemalloc-3.0.0.tar.bz2
tar xjf jemalloc-3.0.0.tar.bz2
cd jemalloc-3.0.0
sudo yum install gcc --assumeyes
./configure --enable-prof
make
cp lib/libjemalloc.so.1 /usr/lib64/
ln -s /usr/lib64/libjemalloc.so.1 /usr/lib64/libjemalloc.so
- jmap documentation
- jcmd documentation
- jcmd dzone blog
- jemalloc
- AkkaHttp Memory leak looks interesting, but it is with 10.0.11 and akka-2.5.7; in my case the Akka HTTP version is 10.0.13 with akka-2.4.20
- Java command line options
- Memory Footprint of a Java Process by Andrei Pangin, a very good talk
- Stack Overflow answer with a nice explanation
- jhiccup