[24.3.x][backport] tx/producer eviction: fix a bug with incorrect eviction using stale pids #24878 #24879

Merged
merged 6 commits into from
Jan 27, 2025

bharathv
Contributor

The pid currently captured in the lambda could become stale if the producer
got fenced (with an epoch bump). This change requires a pid to be passed to
the eviction hook, which is the current pid at the time of eviction.
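
A minimal sketch of the stale-capture problem described above, not the actual Redpanda code. The names `producer_identity`, `producer_state`, `eviction_hook`, and the two helper functions are hypothetical and exist only for illustration. The first helper captures the pid when the hook is registered, so it still sees the old epoch after fencing; the second uses the pid handed to the hook at eviction time.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical stand-in for a producer identity: an id plus an epoch.
struct producer_identity {
    std::int64_t id;
    std::int16_t epoch;
};

// Hypothetical hook type: it receives the pid that is current at eviction time.
using eviction_hook = std::function<void(producer_identity)>;

class producer_state {
public:
    producer_state(producer_identity pid, eviction_hook hook)
      : _pid(pid), _hook(std::move(hook)) {}

    // Fencing bumps the epoch; any earlier copy of the pid is now stale.
    void fence() { ++_pid.epoch; }

    // The hook is handed the current pid instead of relying on a capture.
    void evict() { _hook(_pid); }

private:
    producer_identity _pid;
    eviction_hook _hook;
};

// Pattern being fixed: the lambda captures the pid at registration time
// and ignores the argument, so it reports the pre-fencing epoch.
producer_identity evicted_pid_with_capture() {
    producer_identity seen{-1, -1};
    producer_identity initial{42, 0};
    producer_state p(initial, [initial, &seen](producer_identity) {
        seen = initial;
    });
    p.fence();
    p.evict();
    return seen;
}

// Pattern after the fix: the lambda uses the pid passed to the hook.
producer_identity evicted_pid_with_argument() {
    producer_identity seen{-1, -1};
    producer_state p({42, 0}, [&seen](producer_identity current) {
        seen = current;
    });
    p.fence();
    p.evict();
    return seen;
}
```

In this sketch, `evicted_pid_with_capture()` returns epoch 0 (stale) while `evicted_pid_with_argument()` returns epoch 1, which is the kind of mismatch that could lead cleanup to target the wrong pid.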

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.3.x
  • v24.2.x
  • v24.1.x

Release Notes

Bug Fixes

  • Fixes an issue where transactions incorrectly time out due to incorrect cleanup of evicted producers.

.. in the presence of evictions

Cherry picked from 64e47ca
Currently exceptions are thrown and propagated as generic (and
confusing) RPC server errors, which callers are prone to misinterpret.

(cherry picked from commit f264502)
Today this is caught by the rpc_server and propagated as a server error;
instead, it should be retriable from the caller side.

(cherry picked from commit 176241e)
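
The two commit messages above contrast a thrown exception, which the RPC layer reports as a generic server error, with an explicit error the caller can retry on. A rough sketch of that difference follows. All names (`errc`, `reply`, `rpc_dispatch`, the two handlers, `should_retry`) are assumptions made for illustration and do not reflect Redpanda's actual RPC types or error codes.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical error codes; the real ones are not shown in this PR page.
enum class errc { success, retriable, server_error };

struct reply {
    errc ec;
};

// Handler that signals a transient condition by throwing.
reply throwing_handler(bool transient_failure) {
    if (transient_failure) {
        throw std::runtime_error("transient failure");
    }
    return reply{errc::success};
}

// Handler that reports the same condition as an explicit, retriable code.
reply error_code_handler(bool transient_failure) {
    return reply{transient_failure ? errc::retriable : errc::success};
}

// Stand-in for a generic RPC layer: any exception escaping the handler
// is flattened into an opaque server error.
reply rpc_dispatch(reply (*handler)(bool), bool transient_failure) {
    try {
        return handler(transient_failure);
    } catch (const std::exception&) {
        return reply{errc::server_error};
    }
}

// Caller-side policy: only an explicit retriable code triggers a retry.
bool should_retry(const reply& r) { return r.ec == errc::retriable; }
```

With the throwing handler the caller only sees `server_error` and cannot tell that a retry would help; with the error-code handler `should_retry` returns true.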
…cers

After 4ee6b02, cleanup happens asynchronously after eviction. If
there is a request for a new producer_id and the associated producer got
evicted, clean it up to make room for a new producer.

(cherry picked from commit 21781c7)
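
The behaviour described in the commit above (21781c7) can be sketched as follows, assuming a simple map keyed by producer id. `producer_table`, `producer_entry`, and their methods are hypothetical names, and the deferred cleanup is modelled only as an `evicted` flag rather than real asynchronous work.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical per-producer record; only the eviction flag matters here.
struct producer_entry {
    bool evicted = false;
};

class producer_table {
public:
    // Eviction only marks the entry; removal is deferred, standing in for
    // the asynchronous cleanup mentioned in the commit message.
    void mark_evicted(std::int64_t id) {
        auto it = _producers.find(id);
        if (it != _producers.end()) {
            it->second.evicted = true;
        }
    }

    // On a request for a producer id, drop a leftover evicted entry first
    // so the request gets a fresh producer instead of the stale one.
    producer_entry& get_or_create(std::int64_t id) {
        auto it = _producers.find(id);
        if (it != _producers.end() && it->second.evicted) {
            _producers.erase(it);
        }
        return _producers.try_emplace(id).first->second;
    }

    bool is_evicted(std::int64_t id) const {
        auto it = _producers.find(id);
        return it != _producers.end() && it->second.evicted;
    }

    std::size_t size() const { return _producers.size(); }

private:
    std::unordered_map<std::int64_t, producer_entry> _producers;
};
```

Calling `get_or_create` for an id whose entry is still marked evicted returns a fresh, non-evicted entry, and the table size stays at one.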
The pid captured in the lambda could be stale if the pid got fenced
(with an epoch bump). This change requires a pid to be passed to
the eviction hook, which is the current pid at the time of
eviction.

(cherry picked from commit 79362fa)
@bharathv
Contributor Author

/dt

@vbotbuildovich
Collaborator

vbotbuildovich commented Jan 22, 2025

CI test results

test results on build#61016

| test_id | test_kind | job_url | test_status | passed |
|---|---|---|---|---|
| gtest_raft_rpunit.gtest_raft_rpunit | unit | https://buildkite.com/redpanda/redpanda/builds/61016#01948b4c-82c7-4332-93f2-f11ca147aa61 | FLAKY | 1/2 |

test results on build#61033

| test_id | test_kind | job_url | test_status | passed |
|---|---|---|---|---|
| gtest_raft_rpunit.gtest_raft_rpunit | unit | https://buildkite.com/redpanda/redpanda/builds/61033#01948cf4-aeb9-4836-bc76-c712b627518b | FLAKY | 1/2 |

test results on build#61072

| test_id | test_kind | job_url | test_status | passed |
|---|---|---|---|---|
| gtest_raft_rpunit.gtest_raft_rpunit | unit | https://buildkite.com/redpanda/redpanda/builds/61072#01948fd8-2735-4877-90c4-057423f83faa | FAIL | 0/2 |
| rptest.tests.shard_placement_test.ShardPlacementTest.test_node_join.disable_license=True | ducktape | https://buildkite.com/redpanda/redpanda/builds/61072#01949034-022e-4994-ae13-50f26bca6ca7 | FLAKY | 1/2 |

@bharathv bharathv requested a review from mmaslankaprv January 22, 2025 07:37
@bharathv bharathv marked this pull request as ready for review January 22, 2025 07:37
@lf-rep lf-rep merged commit 6e19abc into redpanda-data:v24.3.x Jan 27, 2025
15 of 18 checks passed