Beats fingerprint processor should generate same fingerprints as Logstash fingerprint filter #18542
Comments
Pinging @elastic/integrations-services (Team:Services) |
I checked both implementations for potential bugs and did some testing. Each relies on runtime-specific string/value formatting when streaming the "message" to the hash function (e.g. differences in how special symbols are encoded might give us slightly different results). A simple Ruby test script:
And the Go code (Go playground):
The Ruby script gives this output:
and the Go script gives:
I also verified that the fingerprint processor returns the same result by adding a custom unit test. All in all, it looks like the implementations give us similar results. Potential directions to investigate why we have different hashes:
Without details about the full setup I can't really tell what we are seeing here. I added thread safety and unit tests from the sample script in #18738. |
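The kind of comparison described above can be sketched in Go roughly as follows. This is not the original playground code, and the sample strings are hypothetical. It only illustrates that SHA-256 works on raw bytes, so any difference in how a runtime serializes the message (here, line endings) changes the digest.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hexSHA256 returns the lowercase hex SHA-256 digest of the raw bytes of s.
func hexSHA256(s string) string {
	sum := sha256.Sum256([]byte(s))
	return hex.EncodeToString(sum[:])
}

func main() {
	// Hypothetical sample messages: same visible text, different line endings.
	unix := "line one\nline two"
	windows := "line one\r\nline two"

	fmt.Println(hexSHA256(unix))
	fmt.Println(hexSHA256(windows))
	fmt.Println("equal:", hexSHA256(unix) == hexSHA256(windows))
}
```

Running the same bytes through the Ruby side and comparing the hex strings would show whether the two runtimes agree on the input before any processor or filter logic is involved.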
Thanks for your answer @urso. I use Winlogbeat (on a Windows 7 device), which sends logs to Logstash (running on Debian 10), which in turn sends logs to Elasticsearch (running on the same Debian host). Here is my configuration from winlogbeat.yml:
Here is my configuration from logstash-beat.yml:
Even with these fingerprint plugin configurations, which seem to be the same, I get two different hashes from the two fingerprints. Thanks. |
Maybe we can trim down the test case a little to make it more reproducible (ideally we can construct unit tests with test input). Your winlogbeat configuration has two
A good test to see if we have a threading issue would be to disable all event_logs but one. For example:
I'm not even sure if the In logstash we can print the message field to the console by adding a ruby filter with this script (you need to add "require json" to the
Having these messages allows us to see if the fingerprint processors really get the same message as input. |
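When comparing the printed messages, invisible characters are easy to miss on a console. A minimal Go sketch of one way to make them visible is shown below; the two sample messages are hypothetical and only stand in for what each side might receive.

```go
package main

import (
	"fmt"
	"strconv"
)

// visible returns s as a double-quoted Go string literal, so control
// characters such as \r, \n and \t appear as escape sequences.
func visible(s string) string {
	return strconv.Quote(s)
}

func main() {
	// Hypothetical messages that look identical when printed normally.
	fromBeats := "Service started\r\n"
	fromLogstash := "Service started\n"

	// Print the quoted form and the byte length of each message.
	fmt.Println(visible(fromBeats), len(fromBeats))
	fmt.Println(visible(fromLogstash), len(fromLogstash))
}
```

If the quoted forms or byte lengths differ, the two fingerprint steps are not hashing the same input, regardless of how the hash itself is configured.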
Thanks for your answer @urso. Even though I had two processors sections at the top level, the fingerprint seems to work, I guess. But thanks, I will keep that in mind to avoid invalid YAML syntax in the future. I tried disabling all event_logs except one, and the fingerprints are still different (between Logstash and Winlogbeat). In the Logstash configuration I added a Ruby filter script like this:
But the console prints:
So I put an "add_fields" module in the fingerprint module to see what the message field looks like when it gets fingerprinted, and it looks the same as the input. I then looked at the fingerprint code of Logstash and Winlogbeat. I am wondering about these pieces of code in Beats and Logstash: Thanks. |
Comparing the Beats and Logstash implementations, I think you should be able to get the same fingerprint in Logstash if you set:
If |
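To illustrate why the way fields are concatenated matters, here is a rough Go sketch. It assumes the hash input is built as `|name|value` for each field followed by a closing `|`; that layout is an assumption based on the concatenation code referenced in this issue, not something confirmed in this thread, and the `field` type and sample values are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// field is a hypothetical name/value pair used only for this sketch.
type field struct{ name, value string }

// concatenated builds "|name|value|name|value|" from the given fields.
// ASSUMPTION: this mirrors the concatenated hash input; verify against the real code.
func concatenated(fields []field) string {
	var b strings.Builder
	for _, f := range fields {
		fmt.Fprintf(&b, "|%v|%v", f.name, f.value)
	}
	b.WriteString("|")
	return b.String()
}

// hexSHA256 returns the lowercase hex SHA-256 digest of s.
func hexSHA256(s string) string {
	sum := sha256.Sum256([]byte(s))
	return hex.EncodeToString(sum[:])
}

func main() {
	fields := []field{{"message", "hello"}}

	fmt.Println(concatenated(fields)) // |message|hello|
	// Hashing the bare value and hashing the concatenated form give different digests.
	fmt.Println(hexSHA256("hello") == hexSHA256(concatenated(fields))) // false
}
```

If one side hashes only the field value while the other hashes the concatenated form, the fingerprints cannot match even with the same algorithm and the same message.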
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
They don't produce the same hashes when the fields are nested and a different syntax is employed, e.g. |
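One plausible explanation can be sketched in Go as follows. It assumes that the configured field name is written verbatim into the string that gets hashed, so two spellings of the same nested field produce different inputs. The field name, value, and the `hashInput` helper are hypothetical, and the `|name|value|` layout is an assumption rather than a confirmed detail of either implementation.

```go
package main

import "fmt"

// hashInput mimics a concatenated fingerprint input for a single field.
// ASSUMPTION: the configured field name is included as-is in the hashed string.
func hashInput(name, value string) string {
	return fmt.Sprintf("|%v|%v|", name, value)
}

func main() {
	// Two spellings that refer to the same nested field (hypothetical example).
	dotted := hashInput("host.name", "ws-01")
	bracketed := hashInput("[host][name]", "ws-01")

	fmt.Println(dotted)
	fmt.Println(bracketed)
	fmt.Println("same input:", dotted == bracketed)
}
```

Under that assumption, the digests differ because the inputs differ, not because of the hashing algorithm.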
According to https://discuss.elastic.co/t/integrity-issue-between-winlogbeat-and-logstash-fingerprint/232654, it appears the Beats fingerprint processor does not generate the same fingerprints (for the same fields, with the same hashing algorithm) as the Logstash fingerprint filter.
Looking at the two implementations, specifically the concatenation code (Beats | Logstash), it looks like the intent was for the two fingerprints to be the same.