What is the best way to be informed of errors that may have occurred during parsing? #361
Replies: 6 comments 7 replies
-
Short story: currently, log lines don't make it past the parsers. The reasoning is that historically there were a couple of cases that needed to be handled. First, in pre-unified logging there were approximately 60 different flags that would affect the format of the file. Each flag or combination of flags has a different effect on the format of the log file. Accommodating all flags, let alone all combinations of flags, is simply a nightmare. This is especially true when arbitrary changes were frequently introduced with each release. Additionally, GC logs might be collected from stdout and consequently be mixed with output from other sources. With Unified logging, the logs can contain all kinds of data that isn't GC/memory related and, if the output is collected on stdout, it can be mixed with output from other sources. Thus, selecting what we recognize and ignoring/logging everything else was the best solution I could come up with at the time. That the data source (GC log) is separate from the parsing, which is separate from aggregation/views, helped me cope with changes in the logs that had no bearing on the data that was to be viewed. This is why log lines don't make it to the Aggregator.
-
I think the issue, @kcpeppe, is that any log lines not parsed are just swallowed by the parser with some log message. There are also logged lines that are swallowed as "not interesting". I think @jlittle-ptc is saying that there should be a way of passing those un-parsed lines on for further handling.
-
The challenge is that if the parser doesn't recognize some input, it's hard to say whether that input has negatively impacted the analysis or not. All it knows is that it encountered something that it doesn't know how to deal with. Whether that "something" is meaningful is a question that would need to be answered by inspection. My current thinking is that a summary of the error information could end up in the Diary. The Diary is injected into GCLogParser. The JavaVirtualMachine class has access to the Diary, as does the end user client. That would create a path to the information. Another option would be to create a new event (ParsingErrorEvent) that would carry error information with it. One could aggregate that. Of the two, I think I prefer the former option. The error event complicates the event hierarchy, which isn't a reason not to create the new event; it's just something that needs to be considered. Also, if an error is pub'ed, I'm not sure what you'd do with it in an Aggregator, so suggestions are welcome. As an FYI, notYetImplemented() was intended to be used only during development for debugging purposes.
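To make the first option concrete, here is a minimal sketch of what a queryable error summary might look like. The class `ParseErrorSummary` and all of its methods are hypothetical names used for illustration; they are not part of GCToolkit's Diary, and how such a summary would be wired into GCLogParser is left open.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical summary object; NOT an existing GCToolkit class.
// A parser would call recordUnrecognized(...) instead of silently
// dropping a line, and a client would query the summary afterwards.
final class ParseErrorSummary {

    // Keep only a few sample lines so a noisy log cannot exhaust memory.
    private static final int MAX_SAMPLES = 5;

    private long unrecognizedCount;
    private final List<String> samples = new ArrayList<>();

    // Synchronized because parsers may run on a different thread
    // than the client that reads the summary.
    synchronized void recordUnrecognized(String line) {
        unrecognizedCount++;
        if (samples.size() < MAX_SAMPLES) {
            samples.add(line);
        }
    }

    synchronized boolean hasErrors() {
        return unrecognizedCount > 0;
    }

    synchronized long unrecognizedCount() {
        return unrecognizedCount;
    }

    // Returns an immutable snapshot of the first few unrecognized lines.
    synchronized List<String> samples() {
        return List.copyOf(samples);
    }
}
```

A client could then check `hasErrors()` after the analysis completes and inspect `samples()` to decide whether the unrecognized input matters, which is the "answered by inspection" step described above.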
-
Here is an idea. With Unified logging, one could inspect the tags if the tags are being included in the decorator set. If there are no tags/decorators in the line, then it's likely output from something else. If there are tags, they can be checked to see if they are in the supported set. If the tags are in the supported set, then you'd mark the calculations as suspect. The open question is what to do when the decorators don't include the tags.
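The tag check described above could be sketched roughly as follows. Everything here is an assumption made for illustration: the class `UnparsedLineTriage`, the `Verdict` enum, and the contents of `SUPPORTED_TAGS` are hypothetical, and the sketch assumes the tag set is the last bracketed decorator at the start of the line. It does not address the open question of lines whose decorators omit the tags; such a line would have its last decorator (for example the level) misread as a tag.

```java
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical triage of lines the parser did not recognize.
final class UnparsedLineTriage {

    enum Verdict { FOREIGN_OUTPUT, UNSUPPORTED_TAGS, SUSPECT }

    // Assumed supported set; the real set would come from the parsers.
    private static final Set<String> SUPPORTED_TAGS = Set.of("gc", "safepoint");

    // Matches one "[...]" decorator, anchored where the previous match ended,
    // so only the contiguous run of decorators at the line start is consumed.
    private static final Pattern DECORATOR = Pattern.compile("\\G\\[([^\\]]*)\\]");

    static Verdict classify(String line) {
        Matcher m = DECORATOR.matcher(line);
        String lastDecorator = null;
        while (m.find()) {
            lastDecorator = m.group(1);
        }
        if (lastDecorator == null) {
            // No decorators at all: likely output from something else.
            return Verdict.FOREIGN_OUTPUT;
        }
        // Assumption: the last decorator is the comma-separated tag set,
        // possibly padded with spaces.
        for (String tag : lastDecorator.split(",")) {
            if (SUPPORTED_TAGS.contains(tag.trim())) {
                // A supported tag on an unparsed line: results are suspect.
                return Verdict.SUSPECT;
            }
        }
        return Verdict.UNSUPPORTED_TAGS;
    }
}
```

Treating a line as suspect when at least one tag is supported is a design choice in this sketch; requiring all tags to be supported would be an equally plausible reading.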
-
Another option might be to chain another Channel into a parser. If Channel#consume returned a boolean indicating whether the message was "published", then the parser could move on to the next Channel in the chain. Something like that. By "published", I mean that it created an event, or whatever "success" means to the Channel.
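A rough sketch of that chaining idea, under stated assumptions: `LineChannel` is a hypothetical stand-in interface whose `consume` returns a boolean, which is not how the existing GCToolkit Channel is described in this thread. `Chain` and `UnhandledCollector` are likewise illustrative names.

```java
import java.util.ArrayList;
import java.util.List;

final class ChannelChainSketch {

    // Hypothetical interface: returns true if the line was "published",
    // i.e. the channel turned it into an event or otherwise handled it.
    interface LineChannel {
        boolean consume(String line);
    }

    // Offers each line to the links in order and stops at the first success.
    static final class Chain implements LineChannel {
        private final List<LineChannel> links;

        Chain(List<LineChannel> links) {
            this.links = List.copyOf(links);
        }

        @Override
        public boolean consume(String line) {
            for (LineChannel link : links) {
                if (link.consume(line)) {
                    return true;
                }
            }
            return false; // nobody in the chain handled the line
        }
    }

    // Terminal link that accepts everything, so unhandled lines are kept
    // for later inspection instead of being swallowed.
    static final class UnhandledCollector implements LineChannel {
        final List<String> lines = new ArrayList<>();

        @Override
        public boolean consume(String line) {
            lines.add(line);
            return true;
        }
    }
}
```

Placing an `UnhandledCollector` at the end of the chain gives the client a place to look for lines that no earlier channel recognized.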
-
With respect to Vertx, we abstract that away with com.microsoft.gctoolkit.message.* |
-
While GCToolkit has generally handled any file I've thrown at it, there have been a couple of times where I've needed to do a bit of troubleshooting to figure out what was going on.
The first step I've generally taken is to use the -Dmicrosoft.debug flag to capture "not implemented" messages in the log, which usually indicate either a possible problem with the regex (#352) or that the parser doesn't know the pattern in question.
However, I have not figured out how I might be able to capture this information such that my application can react programmatically. For example, having the missed lines/events logged into a data structure that can be queried to determine if there were any issues parsing the file.
Is there a way to capture this sort of information during the Aggregation process, or is it something that would need to be added to the parser?
Some of the use cases I can envision: