Always use the latest configuration for followers metadata #19964

Merged
mmaslankaprv merged 4 commits into redpanda-data:dev from the fix-assert-snapshot branch
Jun 26, 2024

Member

@mmaslankaprv mmaslankaprv commented Jun 24, 2024

When an install snapshot request is processed by the follower, it may
not always replace the follower log content. If the snapshot's last
included offset is smaller than the follower's dirty offset, the
follower should prefix-truncate all data up to the snapshot's last
included offset but keep all entries whose offset is greater than the
snapshot's last included offset.
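
The truncation rule described above can be illustrated with a minimal, self-contained sketch. This is not Redpanda's implementation: `toy_log`, `entry`, and `apply_snapshot` are hypothetical names, and the log is modelled as a plain in-memory deque purely for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

// Hypothetical toy model of a follower log; not Redpanda's storage API.
struct entry {
    int64_t offset;
};

struct toy_log {
    std::deque<entry> entries; // contiguous, ordered by offset

    // Offset of the last entry in the log, or -1 when the log is empty.
    int64_t dirty_offset() const {
        return entries.empty() ? -1 : entries.back().offset;
    }

    // Drop every entry with offset <= last_included and keep the rest.
    void prefix_truncate(int64_t last_included) {
        while (!entries.empty() && entries.front().offset <= last_included) {
            entries.pop_front();
        }
    }
};

// Apply a snapshot to the toy log.
// Returns true when the snapshot covers the whole log (log fully replaced),
// false when only a prefix was truncated and newer entries were kept.
inline bool apply_snapshot(toy_log& log, int64_t last_included) {
    assert(last_included >= 0);
    if (last_included >= log.dirty_offset()) {
        log.entries.clear();
        return true;
    }
    log.prefix_truncate(last_included);
    return false;
}
```

With a log holding offsets 0 to 9, applying a snapshot whose last included offset is 4 leaves offsets 5 to 9 in place, while a snapshot ending at offset 20 replaces the log entirely.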

Fixed the update of follower metadata state, as it was always using the
snapshot configuration instead of the latest one from the configuration
manager.
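
To show why the choice of configuration matters, here is a small sketch under stated assumptions. `toy_configuration_manager`, `toy_follower_stats`, and `group_configuration` as defined here are hypothetical stand-ins, not Redpanda's classes; only the idea of a `get_latest()` accessor is taken from the diff in this PR.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>

using node_id = int;

// Hypothetical, simplified group configuration: just a set of members.
struct group_configuration {
    std::set<node_id> voters;
};

// Hypothetical configuration manager: configurations keyed by the log
// offset at which they were added.
struct toy_configuration_manager {
    std::map<int64_t, group_configuration> by_offset;

    // The configuration with the highest offset.
    const group_configuration& get_latest() const {
        assert(!by_offset.empty());
        return by_offset.rbegin()->second;
    }
};

// Hypothetical per-follower metadata, keyed by node id.
struct toy_follower_stats {
    std::map<node_id, int64_t> last_known_offset;

    // Track exactly the nodes present in cfg: forget removed nodes and
    // start tracking new ones with an unknown (-1) offset.
    void update(const group_configuration& cfg) {
        for (auto it = last_known_offset.begin();
             it != last_known_offset.end();) {
            if (cfg.voters.count(it->first) == 0) {
                it = last_known_offset.erase(it);
            } else {
                ++it;
            }
        }
        for (node_id n : cfg.voters) {
            last_known_offset.emplace(n, -1);
        }
    }
};
```

In this model, if the snapshot carries a configuration from offset 10 with nodes {1, 2, 3} and the manager also holds a newer one from offset 50 that adds node 4, updating the stats from the snapshot configuration would stop tracking node 4, while updating from `get_latest()` keeps it.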

Fixes: https://github.com/redpanda-data/core-internal/issues/1310

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.1.x
  • v23.3.x
  • v23.2.x

Release Notes

Bug Fixes

  • fixed handling of delayed snapshot requests that might trigger an assertion

Removed a comment saying that we should apply the snapshot to the stm,
as this is already being done

Signed-off-by: Michał Maślanka <michal@redpanda.com>
When an install snapshot request is processed by the follower, it may
not always replace the follower log content. If the snapshot's last
included offset is smaller than the follower's dirty offset, the
follower should prefix-truncate all data up to the snapshot's last
included offset but keep all entries whose offset is greater than the
snapshot's last included offset.

Fixed the update of follower metadata state, as it was always using the
snapshot configuration instead of the latest one from the configuration
manager.

Fixes: #core-internal/issues/1310

Signed-off-by: Michał Maślanka <michal@redpanda.com>
When an install snapshot request is received by the current leader, the
leader must unconditionally step down. This is the same behavior as when
receiving an append entries request.

Signed-off-by: Michał Maślanka <michal@redpanda.com>
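
The step-down rule in the commit message above can be sketched as follows. This is an illustrative model only: `toy_node`, `role`, and `on_install_snapshot` are hypothetical names, and the rejection of requests from an older term before stepping down is an assumption borrowed from the usual Raft request handling, not something stated in this PR.

```cpp
#include <cassert>
#include <cstdint>

enum class role { follower, candidate, leader };

// Hypothetical, minimal node state; not Redpanda's consensus class.
struct toy_node {
    role state = role::leader;
    int64_t term = 0;

    // Become a follower in the given term.
    void step_down(int64_t new_term) {
        assert(new_term >= term);
        term = new_term;
        state = role::follower;
    }

    // Handle an install snapshot request.
    // Returns false when the request is rejected as stale.
    bool on_install_snapshot(int64_t request_term) {
        // Assumption: requests from an older term are rejected first.
        if (request_term < term) {
            return false;
        }
        // Any accepted request makes the receiver step down, whatever
        // role it currently has, mirroring append entries handling.
        step_down(request_term);
        return true;
    }
};
```

A leader in term 5 that accepts a request for term 5 ends up as a follower in term 5, while a request for term 4 is rejected and leaves the leader's state untouched.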
When an install snapshot request is delayed and delivered to a node
after it has already made progress, it must not lead to state
inconsistencies. Added a test validating handling of delayed
`install_snapshot` requests. The test validates the behavior of both
the follower and the leader.

Signed-off-by: Michał Maślanka <michal@redpanda.com>
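
The property that the test in the commit above checks can be pictured with a toy model: a delayed request must never move a node's offsets backwards. The sketch below is not the test added in this PR. `toy_follower` and its fields are hypothetical, and ignoring a snapshot that is not newer than the installed one is an assumption made here to keep the model consistent.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical follower state tracked only through offsets.
struct toy_follower {
    int64_t start_offset = 0;            // first offset still in the log
    int64_t dirty_offset = -1;           // last offset written to the log
    int64_t snapshot_last_included = -1; // last offset covered by snapshot

    // Append `count` new entries to the log.
    void append(int64_t count) {
        assert(count > 0);
        dirty_offset += count;
    }

    void install_snapshot(int64_t last_included) {
        assert(last_included >= 0);
        // Assumption: a snapshot that is not newer than the installed one
        // carries no new information, so a delayed request is ignored.
        if (last_included <= snapshot_last_included) {
            return;
        }
        snapshot_last_included = last_included;
        start_offset = last_included + 1;
        // Entries beyond the snapshot are kept; the log never shrinks
        // at the tail because of a snapshot.
        dirty_offset = std::max(dirty_offset, last_included);
    }
};
```

For example, after appending 30 entries and installing a snapshot ending at offset 19, a delayed snapshot ending at offset 9 leaves the start offset at 20 and the dirty offset at 29.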
@mmaslankaprv
Member Author

unit test failure: #18496
test failure: #16332

@@ -2247,6 +2247,7 @@ ss::future<> consensus::hydrate_snapshot() {
co_await truncate_to_latest_snapshot(truncate_cfg.value());
}
_snapshot_size = co_await _snapshot_mgr.get_snapshot_size();
update_follower_stats(_configuration_manager.get_latest());
Contributor

I was wondering if we should even hydrate a snapshot if the included_index < start offset, just return success and make it idempotent?

Contributor

I looked at this again, I think the fix makes sense but I think the issue though is L2241 already updates follower stats from snapshot config, so we are effectively doing it twice? Perhaps we should just swap L2242 & L2241 and make update_offset_from_snapshot() use latest config?

Member Author

I've removed the stats update from update_offset_from_snapshot. I am not sure the stats should be updated from within that method, as its name suggests it only updates offsets.

@mmaslankaprv mmaslankaprv requested a review from bharathv June 24, 2024 19:23
@mmaslankaprv mmaslankaprv force-pushed the fix-assert-snapshot branch from 4d47efa to e73aa28 Compare June 25, 2024 05:34
@mmaslankaprv
Member Author

/ci-repeat 1

@mmaslankaprv mmaslankaprv merged commit 41a2e05 into redpanda-data:dev Jun 26, 2024
18 checks passed
@vbotbuildovich
Collaborator

/backport v24.1.x

@vbotbuildovich
Collaborator

/backport v23.3.x

@vbotbuildovich
Collaborator

Failed to create a backport PR to v23.3.x branch. I tried:

git remote add upstream https://github.com/redpanda-data/redpanda.git
git fetch --all
git checkout -b backport-pr-19964-v23.3.x-531 remotes/upstream/v23.3.x
git cherry-pick -x 2f287e296ac64465d119e36df261eb1976c4a130 c69709f4ab6a9ce049946610b43b818b3bcbe3e5 571aa19d3182c64275deb584d1d9be3c01d09210 e73aa28ad7fe07293752a7cffaad5b0d67056857

Workflow run logs.

3 participants