Migrations infra, part 3: shard-local worker logic #20840

Merged
merged 8 commits into from
Jul 18, 2024

bashtanov
Contributor

Backend: gathers from the migration definition the information necessary to perform per-partition work and provides it to the worker.
Worker: spawns, retries, and gathers results from partition operations.

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.1.x
  • v23.3.x
  • v23.2.x

Release Notes

  • none

@bashtanov
Contributor Author

/dt

With reconciliation loop, raft0 leadership updates and migration updates
running concurrently and stuffed with yield points, we need to spread them
apart to avoid race conditions.
In addition to the NTP and the sought migration state, a worker may need some
information from the migration definition. Add data types for the pieces
of this information that relate to individual partitions.
… state

To pull topic-specific details from migration definition later
… info

Build info packs to be dispatched to workers on shards when scheduling
partition work.
Spawn, retry and gather results from partition operations on shards.
Since we process RPC replies asynchronously, the reply map may be written to
while we read from it. Move it out before processing.
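
A minimal sketch of the idea in the commit message above, written in plain C++ rather than Seastar. The names `reply_collector`, `on_reply`, and `process_replies` are hypothetical and are not taken from the PR. The sketch only illustrates one way to do it: the processing step takes ownership of the accumulated map (here via `std::exchange`), so a reply that arrives while processing is in progress lands in a fresh map instead of mutating the one being iterated.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the worker's reply bookkeeping; not the PR's types.
class reply_collector {
public:
    // Called whenever a reply arrives, possibly while processing is underway.
    void on_reply(std::string ntp, int error_code) {
        _replies[std::move(ntp)] = error_code;
    }

    // Take ownership of everything collected so far, leaving an empty map
    // behind, and iterate over the local batch only.
    std::vector<std::string> process_replies() {
        auto batch = std::exchange(_replies, {});
        std::vector<std::string> failed;
        for (const auto& [ntp, ec] : batch) {
            if (ec != 0) {
                // Simulate a reply arriving mid-iteration: it is stored in
                // _replies, not in batch, so the loop's iterators stay valid.
                on_reply(ntp + "/retry", 0);
                failed.push_back(ntp);
            }
        }
        return failed;
    }

    // Number of replies waiting for the next processing round.
    std::size_t pending() const { return _replies.size(); }

private:
    std::map<std::string, int> _replies;
};
```

Whether the real code swaps, moves, or copies the map is not visible in this thread; the sketch assumes a move-out-then-iterate approach.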
@bashtanov bashtanov force-pushed the migrations-infra-abw branch from 5f4cccb to 79ec19a Compare July 15, 2024 10:25
@vbotbuildovich
Collaborator

@bashtanov
Contributor Author

test failure: #21376

// this call must only tinker with `it` within the current seastar task,
// it may be invalidated later!
ssx::spawn_with_gate(_gate, [this, it]() {
return do_work(it).then([ntp = it->first,
Member

I think we should use then_wrapped here to handle exceptions thrown from do_work, wdyt?

Contributor Author

Yeah, thanks, we need to make sure we don't lose it. I'd rather wrap everything in do_work into a try-catch so it never throws, since it returns an error code. Do you think it'd be okay?
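
A rough sketch of the second option discussed here (wrapping the body so it never throws and always reports an error code). It is synchronous plain C++, not Seastar, and the names `errc`, `partition_operation_failed`, and `do_work_nothrow` are hypothetical; the PR's actual error enum and the final shape of `do_work` are not shown in this thread.

```cpp
#include <functional>
#include <stdexcept>

// Hypothetical error codes; the PR's real enum is not visible in this thread.
enum class errc { success, partition_operation_failed };

// Run `body` and translate any exception into an error code, so a caller
// that only inspects the returned code cannot lose a failure.
errc do_work_nothrow(const std::function<errc()>& body) noexcept {
    try {
        return body();
    } catch (...) {
        return errc::partition_operation_failed;
    }
}
```

With this shape, a continuation attached with a plain `.then` only ever sees an error code. The alternative raised in the review, `then_wrapped`, would instead let the continuation inspect a failed future directly.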

@mmaslankaprv
Member

overall lgtm, one comment on exception handling

@bashtanov bashtanov requested a review from mmaslankaprv July 17, 2024 13:25
@bashtanov bashtanov merged commit 5658417 into redpanda-data:dev Jul 18, 2024
19 checks passed