Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

partition_balancer: futurize ticks to avoid reactor stalls with large number of topics #11183

Merged
merged 6 commits into from
Jun 13, 2023

Conversation

bharathv
Copy link
Contributor

@bharathv bharathv commented Jun 3, 2023

Fixes https://github.com/redpanda-data/core-internal/issues/550

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.1.x
  • v22.3.x
  • v22.2.x

Release Notes

  • none

@bharathv bharathv force-pushed the async_pb branch 2 times, most recently from b8d558e to 6a62042 Compare June 5, 2023 15:38
@bharathv bharathv changed the title futurize partition balancer partition_balancer: futurize ticks to avoid reactor stalls with large number of topics Jun 5, 2023
@bharathv bharathv marked this pull request as ready for review June 5, 2023 15:46
@bharathv bharathv requested review from ztlpn and mmaslankaprv June 5, 2023 15:58
@bharathv bharathv self-assigned this Jun 5, 2023
@bharathv
Copy link
Contributor Author

bharathv commented Jun 5, 2023

/ci-repeat 2
skip-units
dt-repeat=10
tests/rptest/tests/nodes_decommissioning_test.py
tests/rptest/tests/partition_balancer_test.py

@bharathv
Copy link
Contributor Author

bharathv commented Jun 6, 2023

Repeat runs had a failure in the following test that is interesting.

Module: rptest.tests.partition_balancer_test
Class:  PartitionBalancerTest
Method: test_rack_awareness

That seems like a bug in node_status_backend's RPC connection cache.. the backend uses its own connection cache and not application level cache.. it doesn't invalidate connections to dead brokers . The test restarts (picks a random action one of which is a restart) brokers and from that point on, the transport to the broker is broken and never replaced.. balancer eventually considers them as unavailable.. this is a serious bug, will fix in a separate patch..

INFO  2023-06-05 20:57:27,649 [shard 0] cluster - partition_balancer_planner.cc:195 - node 2 is unresponsive, time since last status reply: 11547 ms
INFO  2023-06-05 20:57:27,649 [shard 0] cluster - partition_balancer_planner.cc:195 - node 1 is unresponsive, time since last status reply: 27901 ms
INFO  2023-06-05 20:57:27,649 [shard 0] cluster - partition_balancer_planner.cc:195 - node 2 is unresponsive, time since last status reply: 11547 ms
INFO  2023-06-05 20:57:27,650 [shard 0] cluster - partition_balancer_planner.cc:195 - node 1 is unresponsive, time since last status reply: 27902 ms
INFO  2023-06-05 20:57:27,650 [shard 0] cluster - partition_balancer_planner.cc:195 - node 2 is unresponsive, time since last status reply: 11548 ms

src/v/cluster/topic_table_stable_iterator.h Outdated Show resolved Hide resolved
src/v/cluster/partition_balancer_planner.cc Outdated Show resolved Hide resolved
src/v/cluster/topic_table.h Outdated Show resolved Hide resolved
src/v/cluster/partition_balancer_planner.cc Outdated Show resolved Hide resolved
src/v/cluster/partition_balancer_planner.cc Outdated Show resolved Hide resolved
src/v/cluster/partition_balancer_planner.cc Show resolved Hide resolved
src/v/cluster/partition_balancer_planner.cc Outdated Show resolved Hide resolved
@bharathv bharathv force-pushed the async_pb branch 3 times, most recently from c86edb9 to 45686ac Compare June 12, 2023 05:10
@bharathv bharathv requested a review from ztlpn June 12, 2023 05:48
for (auto it = topics.topics_iterator_begin();
it != topics.topics_iterator_end();
++it) {
for (const auto& a : it->second.get_assignments()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think topic table revision check on operator dereference is insufficient here, because it will happen only once at the beginning of the loop, and we need to check after each yield (due to create_partition_cmd).

bharathv added 3 commits June 12, 2023 13:59
This adaptor guards against iterator invalidatons across scheduling
points by taking a user provided revision function as input. The
expection is that the revision function returns a new revision anytime
the underlying container changes in a way that can cause iterator
invalidations.
Copy link
Contributor Author

@bharathv bharathv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to rebase / force-push due to conflicts, waiting for CI.

src/v/utils/stable_iterator_adaptor.h Outdated Show resolved Hide resolved
bharathv added 3 commits June 12, 2023 16:43
mostly mechanical change that yields the control to avoid stalls.
.. to make the planner tick abortable.
aborts the balancer tick on node state changes.
@piyushredpanda piyushredpanda merged commit a867d89 into redpanda-data:dev Jun 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants