Cherry-pick #21240 to 7.x: Fixes for new 7.10 rsa2elk datasets #21379
* Fix bad unicode character used in juniper/netscreen

  Some parsers from netwitness wrongly use the &#x92; XML entity as a quote character. This entity translates to UNICODE codepoint U+0092 (PRIVATE USE 2), which is not printable and can cause problems.

  My understanding is that this is the result of either:
  - Device logs are encoded in the windows-1252 codepage, or
  - Log parsers originally written in windows-1252 codepage.

  In this codepage, \x92 represents a quotation mark similar to the ASCII \x27 single quotation mark ('). I believe someone misunderstood XML's &#xNNN; entity as escaping a byte value, instead of a UNICODE codepoint.

  As it is unclear if the original logs contain this special quote, or it's the result of writing the parsers in a Windows editor, it's better to replace its usage with empty captures that skip over this quote.

* Update pipelines for new 7.10 rsa2elk datasets

  The original pipelines had been generated with some debugging comments in them, which made them much larger than necessary.

(cherry picked from commit 24e972f)
Pinging @elastic/siem (Team:SIEM)
Ignoring build failures: x-pack/filebeat tests passed and this only touches .js pipelines.
Cherry-pick of PR #21240 to 7.x branch. Original message:
What does this PR do?
This updates the JavaScript pipelines in the new rsa2elk datasets for 7.10.
Why is it important?
There were two problems with the original pipelines:
juniper/netscreen:
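This pipeline used the &#x92; XML entity as a quote character. This entity translates to UNICODE codepoint U+0092 (PRIVATE USE 2), which is not printable and can cause problems.
My understanding is that this is the result of either:
- Device logs being encoded in the windows-1252 codepage, or
- Log parsers originally written in the windows-1252 codepage,
and of someone misreading XML's &#xNNN; entity as escaping a byte value instead of a UNICODE codepoint.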
In this codepage, \x92 represents a quotation mark similar to the ASCII \x27 single quotation mark ('). The correct codepoint to use for this character would have been U+2019 (RIGHT SINGLE QUOTATION MARK, ’, written &#x2019; as an XML entity).
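To illustrate the mismatch described above, here is a small Python sketch (not part of the PR, which only touches the JavaScript pipelines). It compares what an XML parser produces for the &#x92; character reference with what byte 0x92 means when decoded as windows-1252. The `describe` helper and the `<q>` element are made up for the example.

```python
import unicodedata
import xml.etree.ElementTree as ET

# An XML numeric character reference names a Unicode code point, not a byte,
# so &#x92; yields U+0092 (a C1 control character).
from_entity = ET.fromstring("<q>&#x92;</q>").text

# The byte 0x92, decoded as windows-1252, is the typographic apostrophe U+2019.
from_cp1252 = b"\x92".decode("cp1252")


def describe(ch):
    """Hypothetical helper: summarize one character for comparison."""
    return (
        f"U+{ord(ch):04X} "
        f"category={unicodedata.category(ch)} "
        f"printable={ch.isprintable()}"
    )


print("entity :", describe(from_entity))
print("cp1252 :", describe(from_cp1252))
print("name   :", unicodedata.name(from_cp1252))
```

The two values differ even though both were written as "92": one is a control character, the other is the quote a windows-1252 author most likely intended.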
As it is unclear if the original logs contain this special quote, or it's the result of writing the parsers in a Windows editor, it's better to replace its usage with empty captures that skip over the quote.
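The idea of skipping over the quote instead of matching one specific character can be sketched with an ordinary regular expression. This is only an analogy: the log line shape, the `PATTERN` name, and `extract_user` are hypothetical, and the real pipelines use the rsa2elk-generated JavaScript parser syntax, not Python regexes.

```python
import re

# Hypothetical message shape; real NetScreen logs differ.
# \W consumes exactly one non-word character on each side of the name without
# capturing it, so the pattern does not care which quote character appears.
PATTERN = re.compile(r"user \W(?P<user>\w+)\W logged in")


def extract_user(line):
    """Return the quoted user name, or None if the line does not match."""
    match = PATTERN.search(line)
    return match.group("user") if match else None


# ASCII apostrophe, U+2019, and the stray U+0092 are all skipped the same way.
for quote in ("'", "\u2019", "\x92"):
    line = f"Admin user {quote}netscreen{quote} logged in"
    print(repr(quote), extract_user(line))
```

Because the quote is never matched literally, the parser behaves the same whether the device emits an ASCII apostrophe, a typographic one, or the mis-encoded control character.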
The original pipelines had been generated with some debugging comments that made them much larger than necessary.
Checklist
- My code follows the style guidelines of this project
- I have commented my code, particularly in hard-to-understand areas
- I have made corresponding changes to the documentation
- I have made corresponding change to the default configuration files
- I have added tests that prove my fix is effective or that my feature works
- I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.
Author's Checklist
It's OK as long as it passes the tests.