Use remote recovery when topic is configured to use remote data #22908
Conversation
Force-pushed from f461d52 to 837455f
Force-pushed from 837455f to 1321799
new failures in https://buildkite.com/redpanda/redpanda/builds/53044#01915b88-9afa-4011-9e70-1a139c563e0f:
new failures in https://buildkite.com/redpanda/redpanda/builds/53548#01918f2f-9bd9-49ec-b693-5be4f3e5348e:
new failures in https://buildkite.com/redpanda/redpanda/builds/54942#01921e82-34f4-4454-997c-574449687a2a:
Force-pushed from 1321799 to 12c75b9
```python
admin_client = admin_client or self._admin
if tolerate_stopped_nodes:
    started_node_ids = {self.node_id(n) for n in self.started_nodes()}
```
TBH I find having a variable declared/undeclared depending on the condition somewhat error-prone and harder to read too. What do you think of something like the following?
```python
if tolerate_stopped_nodes:
    started_node_ids = {self.node_id(n) for n in self.started_nodes()}
    node_check_predicate = lambda n: n in started_node_ids
else:
    node_check_predicate = lambda n: True
...
ready = all([n['config_version'] >= config_version
             for n in status if node_check_predicate(n)])
```
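For illustration, here is a self-contained sketch of the same pattern. It assumes the status entries are dicts carrying a `node_id` field (that field name and the sample data are assumptions, not taken from the PR); the point is that the predicate is always defined, so the later comprehension never references a possibly-unbound variable.

```python
# Minimal, self-contained sketch of the predicate pattern suggested above.
# The `node_id` status field and the sample data are assumptions for illustration.

def config_ready(status, config_version, tolerate_stopped_nodes, started_node_ids):
    if tolerate_stopped_nodes:
        # Only consider nodes that are currently started.
        node_check_predicate = lambda n: n['node_id'] in started_node_ids
    else:
        # Consider every node in the status report.
        node_check_predicate = lambda n: True
    return all(n['config_version'] >= config_version
               for n in status if node_check_predicate(n))


# Example usage with made-up data:
status = [
    {'node_id': 1, 'config_version': 7},
    {'node_id': 2, 'config_version': 6},
    {'node_id': 3, 'config_version': 7},
]
# Node 2 is stopped, so it is ignored when tolerate_stopped_nodes=True.
print(config_ready(status, 7, tolerate_stopped_nodes=True, started_node_ids={1, 3}))   # True
print(config_ready(status, 7, tolerate_stopped_nodes=False, started_node_ids={1, 3}))  # False
```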
Force-pushed from 12c75b9 to 010e107
only nits left, so approving in case it's really urgent to merge
Force-pushed from 010e107 to 2cadadd
Force-pushed from c766d9d to fef7f59
Added a log entry emitted when a partition instance is created. The entry allows us to quickly identify the partition configuration. Signed-off-by: Michał Maślanka <michal@redpanda.com>
When a topic is configured to use remote data, that data can be used when force-recovering partitions that have lost their majority. In this case, instead of creating an empty partition replica instance, we customize the arguments passed into the `partition_manager::manage` method to enable remote recovery of the replica data. Signed-off-by: Michał Maślanka <michal@redpanda.com>
Force-pushed from fef7f59 to b964da2
/ci-repeat 1
unrelated test failure: https://redpandadata.atlassian.net/issues/CORE-7002
Added replicating some data and waiting for it to be uploaded to the cloud when executing node-wise recovery. This way the test can verify that cloud storage data is used when force-reconfiguring partitions that lost their majority. Signed-off-by: Michał Maślanka <michal@redpanda.com>
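A rough sketch of what such a verification flow could look like in a ducktape-style test is shown below. Only `wait_until` is the real ducktape utility; the helper names (`produce_records`, `cloud_upload_complete`, `trigger_node_wise_recovery`, `partitions_have_data`) are hypothetical placeholders, not the actual helpers used in the PR.

```python
from ducktape.utils.util import wait_until

# Hypothetical sketch of the test flow described above; the helper methods on
# `test` are placeholders, not the actual redpanda test helpers.
def verify_remote_recovery(test):
    # 1. Replicate some data into the topic.
    test.produce_records(topic="topic-a", count=10000)

    # 2. Wait until the data has been uploaded to cloud storage, so that
    #    remote recovery has something to restore from.
    wait_until(lambda: test.cloud_upload_complete(topic="topic-a"),
               timeout_sec=120,
               backoff_sec=5,
               err_msg="data was not uploaded to cloud storage in time")

    # 3. Force node-wise recovery of partitions that lost their majority.
    test.trigger_node_wise_recovery()

    # 4. The recovered replicas should contain the previously uploaded data,
    #    proving that cloud storage data was used rather than empty replicas.
    wait_until(lambda: test.partitions_have_data(topic="topic-a"),
               timeout_sec=120,
               backoff_sec=5,
               err_msg="recovered partitions did not restore cloud data")
```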
Force-pushed from b964da2 to de93ab9
/ci-repeat 1
```cpp
// topic being cloud enabled implies existence of overrides
ntp_config.get_overrides().recovery_enabled
  = storage::topic_recovery_enabled::yes;
rtp.emplace(*initial_rev, cfg->partition_count);
```
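This snippet appears to be the core of the change described in the PR summary: rather than creating an empty replica, the force-recovery path marks the ntp_config overrides as recovery-enabled and fills in the remote topic properties, so that `partition_manager::manage` performs remote recovery of the replica data.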
I'm not too familiar with initial_version; perhaps @ztlpn can take another look.
/ci-repeat 1
Retry command for Build #56133: please wait until all jobs are finished before running the slash command.
known CI failure:
When a topic is configured to use remote data, that data can be used when force-recovering partitions that have lost their majority. In this case, instead of creating an empty partition replica instance, we customize the arguments passed into the `partition_manager::manage` method to enable remote recovery of the replica data.
Backports Required
Release Notes
Improvements