Fix a race condition on add_host_metadata #8653
Conversation
Force-pushed from f4e46ca to 7ae21ca
Can we use an atomic reference to the map and guarantee that it is immutable? I think a mutex is overkill, but I may not be understanding the problem.
I wonder if |
Not sure about using an atomic pointer; we still need to check and update the last-update timestamp under the same mutex 🤔
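A minimal sketch of the point being made here, with assumed names rather than the actual processor code: the expiry timestamp and the cached map are checked and updated together, which is why an atomic pointer to the map alone would not be enough.

```go
package hostmeta

import (
	"sync"
	"time"
)

// mapstr stands in for common.MapStr; purely illustrative.
type mapstr = map[string]interface{}

// hostDataCache guards the cached map and its last-update timestamp with the
// same mutex, so the expiry check and the refresh cannot race with each other.
type hostDataCache struct {
	mu         sync.Mutex
	lastUpdate time.Time
	expiration time.Duration
	data       mapstr // treated as immutable: replaced wholesale, never mutated in place
}

func (c *hostDataCache) load(collect func() mapstr) mapstr {
	c.mu.Lock()
	defer c.mu.Unlock()
	if time.Since(c.lastUpdate) > c.expiration {
		c.data = collect() // build a fresh map instead of mutating the published one
		c.lastUpdate = time.Now()
	}
	return c.data
}
```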
Force-pushed from cf3b552 to 813d222
```diff
 	}
 	return p, nil
 }
 
 // Run enriches the given event with the host meta data
 func (p *addHostMetadata) Run(event *beat.Event) (*beat.Event, error) {
 	p.loadData()
-	event.Fields.DeepUpdate(p.data.Clone())
+	event.Fields.DeepUpdate(p.data.Get())
```
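For context, p.data in the added line above appears to be a pointer-style wrapper around the cached map. A rough sketch of that idea, with names and shape assumed rather than taken from the merged code:

```go
package hostmeta

import "sync/atomic"

// mapstr stands in for common.MapStr; purely illustrative.
type mapstr = map[string]interface{}

// sharedMap swaps the cached map atomically and hands it out as-is, on the
// assumption that the stored map is never mutated in place.
type sharedMap struct {
	v atomic.Value // always holds a mapstr
}

// Set publishes a freshly built map; the previous one is left untouched.
func (s *sharedMap) Set(m mapstr) { s.v.Store(m) }

// Get returns the current snapshot without copying it.
func (s *sharedMap) Get() mapstr {
	m, _ := s.v.Load().(mapstr)
	return m
}
```

As the review below points out, handing out the shared map without cloning it still lets later pipeline stages mutate it.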
This introduces another potential race. Better: p.data.Get().Clone()
Just for my understanding: does that mean the write to data could be coming from some processor in the pipeline? This processor itself only uses Set, while treating the inserted MapStr as immutable.
Right. Processors or even the output can add/modify the event. E.g. users adding more fields to host (this did actually happen):
```yaml
fields:
  host.whatever: x
fields_under_root: true
```
If a user does this for one input only, then the shared data must not be updated at all (so as to guarantee data consistency for other modules/inputs). Plus: modifying shared data can race with the serialisation in our outputs.
Working with events, shared data and custom processors is full of pitfalls and potential race conditions. We will have to think about some 'event type' that can share data (so that we don't create this much garbage) but still prevents developers from accidentally overwriting shared data.
As processors can run in parallel, caching with concurrent updates is another pitfall in processors.
These issues are known to the Core Team. We hope to provide APIs/libs dealing with these potential races/pitfalls for you, so that input/processor developers don't have to deal with these kinds of issues. At times it's hard to get this right, even if you are aware of the issues.
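To make the pitfall concrete, here is a small standalone illustration (not beats code, names made up) of how attaching the shared map to an event leaks later modifications back into the cache:

```go
package main

import "fmt"

// mapstr stands in for common.MapStr; purely illustrative.
type mapstr = map[string]interface{}

func main() {
	// Shared cache, e.g. the host metadata collected by the processor.
	shared := mapstr{"host": mapstr{"name": "web-1"}}

	// The processor attaches the shared sub-map to an event without cloning it.
	eventFields := mapstr{"message": "hello"}
	eventFields["host"] = shared["host"]

	// A later stage "adds a field to this event only"...
	eventFields["host"].(mapstr)["whatever"] = "x"

	// ...but the shared cache was modified as well, so every other event (and
	// any concurrent serialisation of it) now sees or races on the new field.
	fmt.Println(shared) // map[host:map[name:web-1 whatever:x]]
}
```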
Using |
@jsoriano Thanks for making those changes. I didn't think that just cloning wasn't enough, but yes, in that case it makes sense. I think we have to come up with a better story for immutability inside the pipeline and how we deal with events.
With the usage of the pointer and cloning on Get, this sounds like a better strategy.
I have created the following issue #8662 to discuss a core implementation of caching.
add_host_metadata keeps a cache of the host data collected, this cache is now updated atomically. (cherry picked from commit 74b9c6c)
add_host_metadata keeps a cache of the host data collected, this cache is now updated atomically. (cherry picked from commit 6d25fd9)
Not sure if adding a mutex at this level will be too blocking for event processing; I have also thought about other options to reduce contention on reads:
Continues with #8223
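As an illustration of the kind of option mentioned above for reducing contention on reads, here is a hedged sketch (not the merged code, names assumed) using a read/write lock with a double-checked expiry, so the common path only takes the read lock:

```go
package hostmeta

import (
	"sync"
	"time"
)

// mapstr stands in for common.MapStr; purely illustrative.
type mapstr = map[string]interface{}

// rwCache lets readers proceed under the read lock; only the (rare) refresh
// takes the write lock, and it re-checks the expiry once it holds that lock.
type rwCache struct {
	mu         sync.RWMutex
	lastUpdate time.Time
	expiration time.Duration
	data       mapstr
}

func (c *rwCache) load(collect func() mapstr) mapstr {
	c.mu.RLock()
	if time.Since(c.lastUpdate) <= c.expiration {
		d := c.data
		c.mu.RUnlock()
		return d
	}
	c.mu.RUnlock()

	c.mu.Lock()
	defer c.mu.Unlock()
	// Re-check under the write lock: another goroutine may have refreshed
	// the cache while we were waiting for it.
	if time.Since(c.lastUpdate) > c.expiration {
		c.data = collect()
		c.lastUpdate = time.Now()
	}
	return c.data
}
```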