rpk: support node-local core assignment API #21573

Merged: 1 commit, Jul 25, 2024

Conversation

@r-vasquez r-vasquez commented Jul 22, 2024

Redpanda clusters are switching to node-local core assignments, which means that the API for dispatching cross-core partition moves will be different.

This is guarded behind the new feature: node_local_core_assignment. Those clusters with the feature enabled will issue the core movements independently from the node change.

If it's not enabled/present, we still do the old API calls.
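As a rough illustration of the gating described above, here is a minimal sketch. The `mover` interface, its method names, and the `nodeLocal` flag are hypothetical stand-ins and are not rpk's actual admin client API; the sketch only shows the shape of "one combined call when the feature is off, node and core requests issued separately when it is on".

```go
package main

import "fmt"

// Replica is a hypothetical (node, core) placement for one partition replica.
type Replica struct {
	NodeID int
	Core   int
}

// mover is a hypothetical stand-in for the admin API client.
// These method names are illustrative only, not rpk's real client API.
type mover interface {
	// Legacy path: node and core travel together in one request.
	MoveReplicas(topic string, partition int, replicas []Replica) error
	// Node-local path: node and core changes are separate requests.
	MoveNodes(topic string, partition int, nodeIDs []int) error
	MoveCore(topic string, partition int, nodeID, core int) error
}

// dispatch picks the API path based on whether the
// node_local_core_assignment feature is enabled (nodeLocal).
func dispatch(m mover, nodeLocal bool, topic string, partition int, replicas []Replica) error {
	if !nodeLocal {
		// Feature disabled or absent: keep using the old combined call.
		return m.MoveReplicas(topic, partition, replicas)
	}
	nodeIDs := make([]int, 0, len(replicas))
	for _, r := range replicas {
		nodeIDs = append(nodeIDs, r.NodeID)
	}
	if err := m.MoveNodes(topic, partition, nodeIDs); err != nil {
		return err
	}
	// Core movements are issued independently from the node change.
	for _, r := range replicas {
		if err := m.MoveCore(topic, partition, r.NodeID, r.Core); err != nil {
			return err
		}
	}
	return nil
}

// recorder is a fake mover that only records which calls were made.
type recorder struct{ calls []string }

func (r *recorder) MoveReplicas(topic string, partition int, replicas []Replica) error {
	r.calls = append(r.calls, fmt.Sprintf("legacy %s/%d %v", topic, partition, replicas))
	return nil
}

func (r *recorder) MoveNodes(topic string, partition int, nodeIDs []int) error {
	r.calls = append(r.calls, fmt.Sprintf("nodes %s/%d %v", topic, partition, nodeIDs))
	return nil
}

func (r *recorder) MoveCore(topic string, partition int, nodeID, core int) error {
	r.calls = append(r.calls, fmt.Sprintf("core %s/%d node=%d core=%d", topic, partition, nodeID, core))
	return nil
}

func main() {
	rec := &recorder{}
	replicas := []Replica{{NodeID: 1, Core: 0}, {NodeID: 2, Core: 3}}
	if err := dispatch(rec, true, "foo", 0, replicas); err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, c := range rec.calls {
		fmt.Println(c)
	}
}
```

With the flag on, the fake client records one node request followed by one core request per replica; with it off, it records a single combined request.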

Fixes #21562

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.2.x
  • v24.1.x
  • v23.3.x

Release Notes

  • none

@r-vasquez r-vasquez requested a review from daisukebe July 22, 2024 23:58
@r-vasquez r-vasquez force-pushed the partition-move-changes branch 2 times, most recently from d1e65a9 to 66a94ee Compare July 23, 2024 23:55
@r-vasquez r-vasquez changed the title WIP: rpk: support node-local core assignment API rpk: support node-local core assignment API Jul 23, 2024
@r-vasquez r-vasquez marked this pull request as ready for review July 23, 2024 23:56
Comment on lines 81 to 84
for i, newa := range newAssignmentList {
i := i
newa := newa
i, newa := i, newa

is there a reason for the reassignment here?

or rather, is there a reason to reassign to a local with the same name?

Contributor Author

@r-vasquez r-vasquez Jul 25, 2024

An old habit to avoid mistakenly sharing the loop variable across goroutines: https://go.dev/blog/loopvar-preview (also, the original code was written before Go 1.22).

This is fixed in Go 1.22, so we should be fine removing it from the codebase (we have other instances of this across rpk). But if it's OK with you, I will leave this one here until we have proper ducktape tests for this particularly dangerous command.
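For readers unfamiliar with the pattern, a minimal sketch of it follows. The `collect` function and its inputs are made up for illustration and are not taken from the rpk code; the only point is the `i, item := i, item` line, which gives each goroutine its own copy of the loop variables. Under Go 1.22 per-iteration loop semantics (module declares `go 1.22` or later) that line is redundant but harmless.

```go
package main

import (
	"fmt"
	"sync"
)

// collect launches one goroutine per item and stores each item at its index.
// It is a hypothetical example, not code from rpk.
func collect(items []string) []string {
	out := make([]string, len(items))
	var wg sync.WaitGroup
	for i, item := range items {
		// Shadow the loop variables so every goroutine captures its own copy.
		// Before Go 1.22 all iterations shared a single i and item.
		i, item := i, item
		wg.Add(1)
		go func() {
			defer wg.Done()
			out[i] = item // each goroutine writes a distinct index
		}()
	}
	wg.Wait()
	return out
}

func main() {
	fmt.Println(collect([]string{"a", "b", "c"})) // prints [a b c]
}
```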

That I'm familiar with. I guess I'm just used to renaming the loop variables when I do it.

gene-redpanda
gene-redpanda previously approved these changes Jul 25, 2024
Contributor

@gene-redpanda gene-redpanda left a comment

My comment is non-blocking; this looks good.

@r-vasquez
Contributor Author

Force Push:

  • Removed the -1 to 0 conversion of core values for new clusters with node_local_core_assignment on.
  • Improved the error message for replication factor changes

@vbotbuildovich
Collaborator

new failures in https://buildkite.com/redpanda/redpanda/builds/52042#0190eadc-2bed-47ed-8292-a3c570c1e8f9:

"rptest.tests.availability_test.AvailabilityTests.test_recovery_after_catastrophic_failure"

Redpanda clusters are switching to node-local
core assignments, which means that the API for
dispatching cross-core partition moves will be
different.

This is guarded behind the new feature:
node_local_core_assignment. Those clusters with
the feature enabled will issue the core movements
independently from the node change.

If it's not enabled/present, we still do the old
API calls.
@r-vasquez
Contributor Author

Force Push:

  • Exit early if there are no movements required.
  • Add debug logs for core movements.
  • Only print the success message if there is at least 1 successful change.

@twmb twmb merged commit 46fc276 into redpanda-data:dev Jul 25, 2024
19 of 22 checks passed
@vbotbuildovich
Collaborator

/backport v24.2.x

Successfully merging this pull request may close these issues.

rpk: support node-local core assignment APIs for dispatching cross-core partition moves
6 participants