
prometheusremotewrite: release lock when wal empty #29297

Closed

Conversation

frzifus
Member

@frzifus frzifus commented Nov 16, 2023

Description:

Link to tracking Issue:

Closes #19363

Testing:

Documentation:

cc @sh0rez @MovieStoreGuy @Aneurysm9

sh0rez and others added 4 commits November 16, 2023 01:39
Previously, the wal tailing routine held an exclusive lock over the
entire wal, while waiting for new sections to be written.

This led to a deadlock situation, as a locked wal naturally cannot be written to.

To remedy this, the lock is now only held during actual read attempts,
not in between.

Furthermore, inotify watching of the wal dir has been replaced with
channel-based signaling between the producers and the tailing routine,
so that the latter blocks until any new writes have happened.
Initializes the WAL with a default nop logger instead of the previous nil,
which led to panics.
Signed-off-by: Benedikt Bongartz <bongartz@klimlive.de>
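
The following is a minimal, self-contained sketch of the pattern the commit messages describe: producers signal the tailing routine over a channel instead of an inotify watch on the wal dir, and the mutex is held only for the duration of a single read attempt. All type and function names here are illustrative, not the exporter's actual code.

	package main

	import (
		"fmt"
		"sync"
	)

	// tailingWAL sketches the fix: the tailer no longer holds the lock while
	// waiting; it blocks on a signal channel that producers ping on every write.
	type tailingWAL struct {
		mu      sync.Mutex
		entries [][]byte
		rIndex  int
		signal  chan struct{} // buffered(1): "new data is available"
	}

	func newTailingWAL() *tailingWAL {
		return &tailingWAL{signal: make(chan struct{}, 1)}
	}

	// Write appends an entry and signals the tailer without ever blocking.
	func (w *tailingWAL) Write(b []byte) {
		w.mu.Lock()
		w.entries = append(w.entries, b)
		w.mu.Unlock()

		select {
		case w.signal <- struct{}{}:
		default: // a signal is already pending; the tailer will catch up
		}
	}

	// tryRead takes the lock only for the duration of a single read attempt.
	func (w *tailingWAL) tryRead() ([]byte, bool) {
		w.mu.Lock()
		defer w.mu.Unlock()
		if w.rIndex >= len(w.entries) {
			return nil, false
		}
		b := w.entries[w.rIndex]
		w.rIndex++
		return b, true
	}

	// tail drains everything that is readable, then waits (lock released)
	// until a producer signals new writes or done is closed.
	func (w *tailingWAL) tail(done <-chan struct{}, out chan<- []byte) {
		for {
			if b, ok := w.tryRead(); ok {
				out <- b
				continue
			}
			select {
			case <-w.signal:
			case <-done:
				return
			}
		}
	}

	func main() {
		w := newTailingWAL()
		out := make(chan []byte)
		done := make(chan struct{})
		go w.tail(done, out)

		w.Write([]byte("sample-1"))
		w.Write([]byte("sample-2"))
		fmt.Println(string(<-out), string(<-out))
		close(done)
	}

In this sketch producers only ever do a non-blocking send on the signal channel, so writers never wait on a slow tailer, and the tailer parks on the channel rather than on the mutex, so the wal stays writable while the tailer is idle, avoiding the deadlock described above.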
Comment on lines 338 to 355
func (prwe *prweWAL) try(index uint64) (*prompb.WriteRequest, error) {
	prwe.mu.Lock()
	defer prwe.mu.Unlock()
	prwe.log.Debug("read", zap.Uint64("index", index))
	protoBlob, err := prwe.wal.Read(index)
	if err != nil {
		return nil, err
	}

	var req prompb.WriteRequest
	if err := proto.Unmarshal(protoBlob, &req); err != nil {
		return nil, err
	}

	prwe.rWALIndex.Add(1)
	return &req, nil
}

Member

why move this up? isn't this highly specific functionality internal to the readFromWAL method?

}

return &wal, nil
log: logger,
Member

the beginning of the run() method replaces this logger with one from the context – may lead to unexpected inconsistencies

probably best to remove it from run
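
A rough sketch of the nop-logger default mentioned in the commit message; the prweWAL struct and newWAL constructor here are stand-ins for illustration, not the exporter's real types.

	package main

	import "go.uber.org/zap"

	// prweWAL is a stand-in struct that only illustrates defaulting the
	// logger field to a no-op logger instead of leaving it nil.
	type prweWAL struct {
		log *zap.Logger
	}

	func newWAL(logger *zap.Logger) *prweWAL {
		if logger == nil {
			logger = zap.NewNop() // safe default: logging calls become no-ops
		}
		return &prweWAL{log: logger}
	}

	func main() {
		w := newWAL(nil)
		w.log.Debug("read", zap.Uint64("index", 0)) // would panic if log were nil
	}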

github-actions bot commented Dec 1, 2023

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Dec 1, 2023
@frzifus frzifus removed the Stale label Dec 1, 2023

The bot posted the same stale notice and re-added the Stale label on Jan 4, Jan 31, Feb 20, and Mar 6, 2024; the label was removed by @frzifus (Jan 16 and Feb 5, 2024) and @dashpole (Feb 20, 2024).
Closed as inactive. Feel free to reopen if this PR is still being worked on.

@github-actions github-actions bot closed this Mar 20, 2024
Development

Successfully merging this pull request may close these issues.

exporter/prometheusremotewrite: wal leads to oom under high load