Async log4j2 log events are not counted accurately #2176
Comments
Note that the issue is not that the filter is added multiple times. The same filter is invoked multiple times. You can for example see this in |
So this only applies to async loggers? |
@bol-com-pschmitz thank you for the report. Just so we're explicit about the scenario and that a fix properly addresses it, would you be able to write a test demonstrating it like other tests in our Log4j2MetricsTest? |
I'm not sure, possibly. That's how it occurred in my case. It doesn't matter though; |
I'll take a look. |
Hi all, hello from Belgium! :) I just created a test for this issue, and indeed the filter is called multiple times for the same event. |
Thanks @dennysfredericci ! I was about to start doing that. |
There is a method isEndOfBatch on Log4jLogEvent which can be used to see if this is the "last" call.
I changed the implementation to test this idea, but it breaks all other tests. |
Hi guys, when an async logger is configured, a call to log.info (or any other level) triggers the filter twice. I found a way to use the isEndOfBatch attribute from LogEvent to check whether this is the last call, so that the metrics counter is incremented only once instead of multiple times. I also created a simple spring-boot project to test it locally. You can see it here: https://github.com/dennysfredericci/springboot-micrometer-issue-2183 |
Hi @shakuzen, I just noticed that I opened the PR against the master branch. Should I change it to the 1.1.X branch? |
Resolved by #2183 |
@bol-com-pschmitz would you confirm if the issue is fixed for you? The fix is available in versions 1.1.16, 1.3.11, and it will be in the upcoming 1.5.3 release (you can try the latest snapshot version 1.5.3-SNAPSHOT for now). |
I'm testing it. At first glance it looks like now it might be undercounting. I'm not sure what constitutes a "batch", but if it can result in more than one log entry then this is not going to work. |
@bol-com-pschmitz Thanks for testing. Let us know if there's another issue. If we can get a unit test to reproduce, that'll be best so we can make sure it gets fixed properly and stays fixed. |
Unfortunately this approach does not work. It counts batches of log events, not log events. This small change to the unit test added by @dennysfredericci demonstrates it:
assertThat(registry.get("log4j2.events").tags("level", "info").counter().count()).isEqualTo(0);
logger.info("Hello, world!");
logger.info("Hello, world!");
logger.info("Hello, world!");
assertThat(registry.get("log4j2.events").tags("level", "info").counter().count()).isEqualTo(3); |
This means the issue has not been fixed and should be reopened. |
Indeed I can see it as well 😞 |
I'm no log4j2 expert, so I'm hoping we can get some help on a solution for this from someone more knowledgeable than me on this area. I've marked the issue as |
So, let's become log4j2 experts 😄 I just started a thread on the log4j2 discussion list to see if it is possible to use a filter for this use case. |
Hi @bol-com-pschmitz, this seems to be unexpected behavior. I got the message below on the log4j mailing list
I just included a reconfiguration call and now the filter is called just once, but I am getting the exception below.
In summary, this may be an issue in how we add the filters to the log4j configuration. |
As it is not possible to use log4j filters to count log events properly, I requested a statistics feature from the log4j team. With some luck, we can get these counters directly from the log4j API. |
apache/logging-log4j2#1550 has been resolved and is marked for inclusion in log4j2 2.20.1. Once that is released and we've upgraded, we'll add a test along the lines of #2176 (comment) to ensure the issue no longer affects us. Hopefully then we can close this issue. |
In the case of async loggers, the MetricsFilter#filter method is/was called multiple times. Because of this, a check was introduced in gh-2183, assuming that if the event has the isEndOfBatch flag set to true, that is the last filter method call for that event. Unfortunately, this approach did not work: it filtered out not only the unwanted repeated calls for one event but also every filter method call for events that were not at the end of the async batch. So Log4j2Metrics counted batches of events, not the individual events. Fortunately, the multiple filter invocations were fixed in Log4j2, see apache/logging-log4j2#1550 and apache/logging-log4j2#1552. Since there will now be only one filter method call, the check introduced in gh-2183 can and should be removed (the filter method is called before the isEndOfBatch flag is set, so the flag will always return false). Closes gh-2176 See gh-2183 See gh-4253
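To make the "batches, not events" point concrete, here is a minimal sketch in plain Java that does not use Log4j2 or Micrometer at all. The `Event` class, the `EndOfBatchSketch` class, and the assumption that three events land in a single batch with only the last one flagged are all hypothetical and chosen for illustration; real batch boundaries depend on the async logger at runtime.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical simulation; none of these types are Log4j2 or Micrometer classes.
final class EndOfBatchSketch {

    // Stand-in for a log event that only carries the flag discussed above.
    static final class Event {
        final String message;
        final boolean endOfBatch;

        Event(String message, boolean endOfBatch) {
            this.message = message;
            this.endOfBatch = endOfBatch;
        }
    }

    // Counting strategy in the spirit of gh-2183: count only flagged events.
    static long countEndOfBatchOnly(List<Event> events) {
        long count = 0;
        for (Event event : events) {
            if (event.endOfBatch) {
                count++;
            }
        }
        return count;
    }

    // Counting strategy once the filter is invoked exactly once per event.
    static long countEveryEvent(List<Event> events) {
        return events.size();
    }

    // Assumption: three events processed as one batch, last one flagged.
    static List<Event> oneBatchOfThree() {
        return Arrays.asList(
                new Event("Hello, world!", false),
                new Event("Hello, world!", false),
                new Event("Hello, world!", true));
    }

    public static void main(String[] args) {
        List<Event> batch = oneBatchOfThree();
        System.out.println(countEndOfBatchOnly(batch)); // prints 1
        System.out.println(countEveryEvent(batch));     // prints 3
    }
}
```

Under these assumptions the flag-based count reports one for three logged events, which is the under-counting described in this thread.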
It seems log4j2 Our
|
I think we're in a bit of a hard spot. I was hoping that the revised test would pass on new versions of log4j2 without changes to our main code. It seems that isn't the case. The changes made to main code reintroduce an issue (see #2183) of over-counting async logs for anyone on versions of log4j2 prior to 2.21.0. As you pointed out, the count isn't right either way, but it's somewhat better to under-count than to over-count, I think. And changing the behavior to over-counting in a patch release where the only remedy is to upgrade to a new minor version of a dependency doesn't feel great either. For those reasons, I am leaning toward applying the changes only in I think the commit message and changes look good to me. |
Micrometer provides the Log4j2Metrics meter binder to add metrics for Log4j2 log events. Unfortunately the implementation is flawed. It is implemented as a filter, but filters can be invoked multiple times and must therefore not have any side effects. In practice the filter is invoked multiple times by Log4j2 in many circumstances (e.g. in the presence of multiple and/or async loggers). The result is that the counter is increased multiple times for each single log event and the counter values become too high.
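The over-counting can be illustrated with a small simulation. This is a sketch in plain Java, not Log4j2 or Micrometer code: `SketchFilter`, `CountingFilter`, and `logThroughStages` are hypothetical names, and the number of times the filter is consulted per event is simply passed in as a parameter rather than derived from a real logger configuration.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical simulation; none of these types are Log4j2 or Micrometer classes.
final class OverCountSketch {

    // Stand-in for a filter contract: decide whether an event passes.
    interface SketchFilter {
        boolean accept(String message);
    }

    // A filter with a side effect: every invocation increments a counter.
    static final class CountingFilter implements SketchFilter {
        private final AtomicLong count = new AtomicLong();

        @Override
        public boolean accept(String message) {
            count.incrementAndGet();
            return true;
        }

        long count() {
            return count.get();
        }
    }

    // Logs `events` messages; the same filter is consulted `callsPerEvent`
    // times for each message (assumption standing in for multiple/async loggers).
    static long logThroughStages(int events, int callsPerEvent) {
        CountingFilter filter = new CountingFilter();
        for (int e = 0; e < events; e++) {
            for (int c = 0; c < callsPerEvent; c++) {
                filter.accept("Hello, world!");
            }
        }
        return filter.count();
    }

    public static void main(String[] args) {
        System.out.println(logThroughStages(3, 1)); // prints 3
        System.out.println(logThroughStages(3, 2)); // prints 6
    }
}
```

With one call per event the counter matches the number of log statements; with two calls per event it doubles, which is the kind of inflated value reported here.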