Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix a race condition on add_host_metadata #8653

Merged
merged 8 commits into from
Oct 22, 2018

Conversation

jsoriano
Copy link
Member

@jsoriano jsoriano commented Oct 19, 2018

Not sure if adding a mutex at this level will be too blocking for events processing, I have also though on other options to reduce contention on reads:

  • Using a RWLock, but we'd still need to write-lock first to check expiration and update
  • Using a RWLock and a go-routine to do periodic updates, but afaik we wouldn't have a way to stop it if the processor is removed.

Continues with #8223

@adriansr
Copy link
Contributor

adriansr commented Oct 19, 2018

Can we use an atomic reference to the map and guarantee that is in immutable?
As in, loadData should allocate a new map and update the atomic reference, and don't update it once the reference is set (i.e. don't do p.data.Put(...)).

I think a mutex is overkill but I may not be understanding the problem

@exekias
Copy link
Contributor

exekias commented Oct 19, 2018

I wonder if MapStrPointer would help with these cases: https://github.com/elastic/beats/blob/master/libbeat/common/mapstr_pointer.go

@jsoriano
Copy link
Member Author

jsoriano commented Oct 19, 2018

Not sure about using an atomic pointer, we still need to check and update the last update timestamp under the same mutex zone 🤔

}
return p, nil
}

// Run enriches the given event with the host meta data
func (p *addHostMetadata) Run(event *beat.Event) (*beat.Event, error) {
p.loadData()
event.Fields.DeepUpdate(p.data.Clone())
event.Fields.DeepUpdate(p.data.Get())
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This introduces another potential race. Better: p.data.Get().Clone()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just for my understanding, does that mean that the write on data could be coming from some processor in the pipeline? this processor itself is only using set, while treating the inserted MapStr as immutable.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. Processors or even the output can add/modify the event. E.g. users adding more fields to host (this did actually happen):

fields:
  host.whatever: x
fields_under_root: true

If an user does this to one input only, then the shared data must not be updated at all (so to guarantee data consistency of other modules/inputs). Plus: modifying shared data can race with the serialisation in our outputs.

Working with events and shared data + custom processors is full of pitfalls and potential race conditions. We will have to think about some 'event-type' that can share data (such that we do not create this much garbage), but yet, does prevent developers from accidentally overwriting shared data.

As processors can be run in parallel, caching with concurrent updates is another pitfall in processors.

These issues are known to the Core Team. We hope to provide API/libs dealing with these potential races/pitfalls for you, such that input/processor developers don't have to deal with these kind of issues. At times It's hard to get this right, even if you are aware of these issues.

@jsoriano
Copy link
Member Author

jsoriano commented Oct 19, 2018

Using MapStrPointer for atomic reference of data, not sure about lastUpdate, I guess that at least we can have multiple updates of data if time expired.

@ph
Copy link
Contributor

ph commented Oct 19, 2018

@jsoriano Thanks for making that changes, I didn't thought that just cloning wasn't enought but yes in that case using MapStrPointer sound like a better idea.

I think we have to come up with a better story with immutability inside the pipeline and we deal with events.

Copy link
Contributor

@ph ph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the usage of the pointer and using clone on Get this sound like a better strategy.

Copy link
Contributor

@adriansr adriansr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion

@ph
Copy link
Contributor

ph commented Oct 19, 2018

I have created the following issue #8662 to discuss a core implementation of caching.

@jsoriano jsoriano added needs_backport PR is waiting to be backported to other branches. v6.5.0 v6.4.3 labels Oct 19, 2018
@jsoriano jsoriano merged commit 74b9c6c into elastic:master Oct 22, 2018
@jsoriano jsoriano deleted the fix-race-add-host-metadata branch October 22, 2018 11:59
@jsoriano jsoriano removed the needs_backport PR is waiting to be backported to other branches. label Oct 22, 2018
jsoriano added a commit to jsoriano/beats that referenced this pull request Oct 22, 2018
add_host_metadata keeps a cache of the host data collected, this cache
is now updated atomically.

(cherry picked from commit 74b9c6c)
jsoriano added a commit that referenced this pull request Oct 23, 2018
add_host_metadata keeps a cache of the host data collected, this cache
is now updated atomically.

(cherry picked from commit 74b9c6c)
jsoriano added a commit to jsoriano/beats that referenced this pull request Oct 23, 2018
add_host_metadata keeps a cache of the host data collected, this cache
is now updated atomically.

(cherry picked from commit 74b9c6c)
jsoriano added a commit that referenced this pull request Oct 23, 2018
add_host_metadata keeps a cache of the host data collected, this cache
is now updated atomically.

(cherry picked from commit 74b9c6c)
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023
add_host_metadata keeps a cache of the host data collected, this cache
is now updated atomically.

(cherry picked from commit 6d25fd9)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants