Performance improvement in Category callAppenders() #51
Comments
@lauredogit Which version of reload4j are you using?
Production was using 1.2.20.
It appears the threads were piling up due to slowness induced by a remote diagnostic agent (RDA) running at the time. It is strange that this showed up as a bottleneck in the synchronized block of Category#callAppenders(). Nonetheless, it might be a performance improvement to invoke the appendLoopOnAppenders(...) method as an open call instead of calling it while holding the lock.
TLDR: [as of "update 2"] definitely appears to behave differently than old log4j, but in Apache Spark 3.0.x with
hello. FWIW i'm also running into thread pileup on this lock.
Using reload4j 1.2.25 retrofitted into an Apache Spark 3.0.1 application where Spark is used from
Symptoms are that on a GCP VM with 56 cores the application has long periods (maybe 90% of wall-clock runtime in aggregate) of only using 6 cores, compared to the same tasks on stock Spark 3.0.1, which uses 50+ cores. Overall impact is that elapsed time for the task is ~8 hours vs 1.5 hours.
JDK 8 (but the reload4j scenario was also tested with 11). I'm not aware of any JVM debug agents attached / running.
(there's maybe some excess info below for search engines / anyone else that might walk this way)
based on the stacks look something like this:
In my app's case parquet writing as part of
and in our setup i note that we use just the spark deployment so it says the following at startup:
[update: no change in behavior after loading native library.
The warning disappeared and verified with [update 2:
one or two in "isInfoEnabled()" and several in just in callAppenders().
CPU usage is more variable and more time is spent in iowait; the impression is that the task is doing less overall work. Elapsed time is now as short as or shorter than with stock log4j (though those tests didn't have this init file). Area under the CPU usage graph looks similar, so my impression may be invalid. This is probably where i'm going to leave things.
Hi,
We just got a bottleneck in production with 2000+ threads blocked waiting on the lock in the synchronized block of callAppenders() in Category.
This has never happened to us before, including in stress tests.
However, I think a small performance improvement could be made with a simple open call (if I'm reading the code correctly).
Since aai is final and is a thread-safe data structure, we only want correct visibility of the aai and additive fields and we don't need to block the whole category object while calling appendLoopOnAppenders.
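To make the idea concrete, here is a minimal, self-contained sketch of the open-call pattern. It is not the reload4j source: `OpenCallSketch`, `Node` and `Appender` are hypothetical stand-ins for Category and its appenders, and a `CopyOnWriteArrayList` stands in for the thread-safe aai. It assumes, as described above, that the appender collection is final and thread-safe, so the lock is only needed to read the additive flag with correct visibility.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class OpenCallSketch {

    /** Hypothetical stand-in for a log4j appender. */
    public interface Appender {
        void doAppend(String event);
    }

    /** Hypothetical stand-in for Category: a node in the logger hierarchy. */
    public static class Node {
        public final Node parent;
        // Assumption: final and thread-safe, so it can be iterated without the node lock.
        public final List<Appender> appenders = new CopyOnWriteArrayList<>();
        private boolean additive = true;

        public Node(Node parent) {
            this.parent = parent;
        }

        public synchronized void setAdditive(boolean additive) {
            this.additive = additive;
        }

        // The lock is held only for the duration of this read.
        public synchronized boolean isAdditive() {
            return additive;
        }

        /** Walks up the hierarchy and returns the number of appender calls made. */
        public int callAppenders(String event) {
            int writes = 0;
            for (Node c = this; c != null; c = c.parent) {
                // Snapshot the flag under the lock, then release the lock.
                boolean additiveSnapshot = c.isAdditive();
                // Open call: appenders run without holding the node's monitor,
                // so a slow appender does not block other threads on this node.
                for (Appender a : c.appenders) {
                    a.doAppend(event);
                    writes++;
                }
                if (!additiveSnapshot) {
                    break;
                }
            }
            return writes;
        }
    }

    public static void main(String[] args) {
        Node root = new Node(null);
        Node child = new Node(root);
        root.appenders.add(event -> { });
        child.appenders.add(event -> { });

        System.out.println(child.callAppenders("hello")); // prints 2
        child.setAdditive(false);
        System.out.println(child.callAppenders("hello")); // prints 1
    }
}
```

The trade-off in this sketch is that an appender added or removed concurrently may or may not see the event in flight, which is the usual cost of moving the call outside the lock.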
What do you think?
Best regards,
Dominique