-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve thread safety of fingerprint processor #18738
Conversation
Pinging @elastic/integrations-services (Team:Services) |
❕ Build Aborted
Expand to view the summary
Build stats
Test stats 🧪
Steps errorsExpand to view the steps failures
Log outputExpand to view the last 100 lines of log output
|
// The fields array must be sorted, to guarantee that we always | ||
// get the same hash for a similar set of configured keys. | ||
// The call `ToSlice` always returns a sorted slice. | ||
fields := common.MakeStringSet(config.Fields...).ToSlice() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, is there a difference in calling .ToSlice()
here vs. how it was before, a few lines below? Either way the value of p.fields
would be sorted, no? I'm just wondering if you are making this change to try and do all the operations on config.Fields
in one place (this line) or if there's some other more fundamental reason that could prevent a bug or something else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just moved it here in order to 'cleanly' add the code comment. Operationally it makes no difference. It's not obvious that ToSlice
already sorts and not immediately obvious that we always need the keys to be sorted, so we should document this (at first I thought we have a bug thanks to StringSet storing keys in randomized order).
Often I prefer to have fully 'initialized' variables that I don't need to modify or execute other operations on when constructing another object. This gives each code block a self-contained purpose (almost like a function).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for the change with not sharing the hash function to allow for concurrent use of this processor.
jenkins please run tests. |
jenkins run tests |
944979d
to
db11ece
Compare
❕ Build Aborted
Expand to view the summary
Build stats
Test stats 🧪
Test errorsExpand to view the tests failures
Steps errorsExpand to view the steps failures
Log outputExpand to view the last 100 lines of log output
|
The fingerprint processor used to reuse state between calls, but not being threadsafe means that the processor might get into an invalid state if it is used by multiple go-routines by accident, creating invalid, non-reproducible hash values. This change does not reuse the hash state between runs. Running the benchmarks seems to suggest that we don't create any extra allocations with this change (tested with go1.13): $ go test -bench . -benchmem goos: darwin goarch: amd64 pkg: github.com/elastic/beats/v7/libbeat/processors/fingerprint BenchmarkHashMethods/sha384-4 1000000000 0.123 ns/op 0 B/op 0 allocs/op BenchmarkHashMethods/sha512-4 1000000000 0.125 ns/op 0 B/op 0 allocs/op BenchmarkHashMethods/xxhash-4 1000000000 0.0539 ns/op 0 B/op 0 allocs/op BenchmarkHashMethods/md5-4 1000000000 0.0925 ns/op 0 B/op 0 allocs/op BenchmarkHashMethods/sha1-4 1000000000 0.112 ns/op 0 B/op 0 allocs/op BenchmarkHashMethods/sha256-4 1000000000 0.134 ns/op 0 B/op 0 allocs/op
db11ece
to
aba010a
Compare
Test failures are unrelated. |
(cherry picked from commit 6c4734b)
(cherry picked from commit 6c4734b)
…t processor (elastic#19653) (cherry picked from commit a73ba61)
What does this PR do?
Remove the reuse of the hash function between runs in the hash processor.
Running the benchmarks seems to suggest that we don't create any extra
allocations with this change (tested with go1.13):
Why is it important?
The fingerprint processor used to reuse state between calls, but not
being threadsafe means that the processor might get into an invalid
state if it is used by multiple go-routines by accident, creating
invalid, non-reproducible hash values. This change does not reuse the
hash state between runs.
Checklist
- [ ] I have made corresponding changes to the documentation- [ ] I have made corresponding change to the default configuration files- [ ] I have added tests that prove my fix is effective or that my feature worksCHANGELOG.next.asciidoc
orCHANGELOG-developer.next.asciidoc
.Related issues