[CORE-5632] cluster: Reset cloud storage metric in the cluster::partition::remove_persistent_state
#21581
LGTM from a pointer safety perspective, but I still have a couple of questions.
src/v/cluster/partition.cc
Outdated
@@ -902,6 +902,7 @@ partition::get_cloud_term_last_offset(model::term_id term) const {
}

ss::future<> partition::remove_persistent_state() {
    _cloud_storage_probe = nullptr;
Wondering if this belongs in partition::stop()? Also wondering if we should be calling _cloud_storage_probe->clear_metrics(), as we do with the partition::_probe?
Added a clear metrics method.
... in the partition::remove_persistent_state. The probe has access to global state, so it is not safe to keep it after the partition is stopped and the `remove_persistent_state` method is invoked. If the partition is recreated with a different revision id, a 'double metric registration' error can be triggered. The double metric registration exception prevents normal partition removal, and a subsequent partition creation may trigger an assertion.
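To make the failure mode concrete, here is a minimal, self-contained sketch of how a lingering probe can cause a double registration. It is not the project's actual probe or metrics API: `metric_registry`, `probe`, `recreate_succeeds`, and the metric name are all hypothetical stand-ins, and the only assumption carried over from the commit message is that a probe registers metrics in shared (global) state and that registering the same metric twice raises an exception.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for process-wide metric state shared by all probes.
class metric_registry {
public:
    // Registering a name that is already present is treated as an error.
    void add(const std::string& name) {
        if (!_names.insert(name).second) {
            throw std::runtime_error("double metric registration: " + name);
        }
    }
    void remove(const std::string& name) { _names.erase(name); }
    bool contains(const std::string& name) const {
        return _names.count(name) > 0;
    }

private:
    std::set<std::string> _names;
};

// Hypothetical probe: registers on construction and deregisters either
// explicitly via clear_metrics() or implicitly on destruction.
class probe {
public:
    probe(metric_registry& reg, std::string name)
      : _reg(reg)
      , _name(std::move(name)) {
        _reg.add(_name); // throws if the name is still registered
    }
    ~probe() { clear_metrics(); }
    probe(const probe&) = delete;
    probe& operator=(const probe&) = delete;

    // Idempotent: safe to call more than once.
    void clear_metrics() {
        if (_registered) {
            _reg.remove(_name);
            _registered = false;
        }
    }

private:
    metric_registry& _reg;
    std::string _name;
    bool _registered = true;
};

// Returns true if a second probe with the same metric name can be created.
// clear_before_recreate models resetting the old probe when persistent
// state is removed; false models keeping the old probe alive.
bool recreate_succeeds(bool clear_before_recreate) {
    const std::string name = "cloud_storage/example_topic/0"; // hypothetical
    metric_registry registry;
    auto old_probe = std::make_unique<probe>(registry, name);
    assert(registry.contains(name));

    if (clear_before_recreate) {
        old_probe->clear_metrics();
        old_probe = nullptr;
    }
    try {
        probe new_probe(registry, name);
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}
```

In this sketch, `recreate_succeeds(true)` returns `true` and `recreate_succeeds(false)` returns `false`: the second registration only succeeds once the old probe has released its metrics. Calling `clear_metrics()` before dropping the pointer is redundant here because the destructor also deregisters, but it keeps the cleanup explicit in case another owner still holds the probe.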
52fb95a to 51af1b4
/backport v24.2.x
/backport v24.1.x
The probe has access to global state, so it is not safe to keep it after the partition is stopped and the `remove_persistent_state` method is invoked. If the partition is recreated with a different revision id, the 'double metric registration' error could be triggered. When the exception is triggered, the `partition_manager` is not able to remove the partition from the collection, and subsequent re-creation of the partition with a different ntp produces an assertion because the ntp already exists.
Backports Required
Release Notes