storcon: revise fill logic to prioritize AZ #10411

Merged: 3 commits merged into main from jcsp/storcon-az-fills on Jan 16, 2025

jcsp (Contributor) commented Jan 15, 2025

Problem

Node fills were limited to moving (total shards / node_count) shards. In systems that aren't perfectly balanced already, that leads us to skip migrating some of the shards that belong on this node, generating work for the optimizer later to gradually move them back.

Summary of changes

  • Where a shard has a preferred AZ and is currently attached outside this AZ, then always promote it during fill, irrespective of target fill count
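The rule above can be sketched as follows. This is an illustrative Python model, not the storage controller's Rust implementation: `Shard`, `plan_fill`, and all field names are hypothetical, the node being filled is assumed to start empty, and AZ-driven moves are assumed to count toward the target.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Shard:
    shard_id: str
    attached_az: str             # AZ of the node the shard is attached to now
    preferred_az: Optional[str]  # None when the shard has no AZ preference


def plan_fill(candidates: List[Shard], node_az: str,
              total_shards: int, node_count: int) -> List[Shard]:
    """Pick which candidate shards to promote onto the node being filled."""
    # Baseline limit from the Problem section: total shards / node_count.
    target = total_shards // node_count

    az_moves = []  # prefer this node's AZ but are attached outside it
    others = []    # everything else, still subject to the target
    for shard in candidates:
        if shard.preferred_az == node_az and shard.attached_az != node_az:
            az_moves.append(shard)
        else:
            others.append(shard)

    # AZ moves are always taken, even if that exceeds the target.
    # Assumption: they count toward the target, so other shards only
    # fill whatever room is left.
    remaining = max(0, target - len(az_moves))
    return az_moves + others[:remaining]
```

With a target of 2 and three shards that prefer the node's AZ but are attached elsewhere, this sketch promotes all three and none of the unconstrained candidates, which is the case the old count-only limit would have cut short.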

jcsp added the t/feature (Issue type: feature, for new features or requests) and c/storage/controller (Component: Storage Controller) labels Jan 15, 2025

github-actions bot commented Jan 15, 2025

7326 tests run: 6947 passed, 0 failed, 379 skipped (full report)


Flaky tests (3)

Postgres 17

  • test_storage_controller_node_deletion[False]: debug-x86-64

Postgres 16

Postgres 14

  • test_physical_replication_config_mismatch_max_locks_per_transaction: release-x86-64

Code coverage* (full report)

  • functions: 33.7% (8429 of 25029 functions)
  • lines: 49.2% (70497 of 143355 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
47d527d at 2025-01-16T17:40:20.275Z

@jcsp jcsp marked this pull request as ready for review January 16, 2025 10:22
@jcsp jcsp requested a review from a team as a code owner January 16, 2025 10:22
@jcsp jcsp requested review from arpad-m and VladLazar January 16, 2025 10:22
VladLazar (Contributor) left a comment

Can you also update test_graceful_cluster_restart such that the nodes have configured AZs?

jcsp (Contributor, Author) commented Jan 16, 2025

Adding the test separately in #10427 because I wanted to use the num_azs test parameter that I added in #10420.

@jcsp jcsp enabled auto-merge January 16, 2025 14:38
@jcsp jcsp added this pull request to the merge queue Jan 16, 2025
Merged via the queue into main with commit da13154 Jan 16, 2025
80 checks passed
@jcsp jcsp deleted the jcsp/storcon-az-fills branch January 16, 2025 17:34
github-merge-queue bot pushed a commit that referenced this pull request Feb 11, 2025
## Problem

#10411 changed the fill logic, so it is worth testing
fills both with and without AZs set up. I didn't extend the test inline
in that PR because overlapping test changes were in flight to add the
`num_azs` parameter.

## Summary of changes

- Parameterise test on AZ count (1 or 2)
- When AZ count is 2, use a different balance check that just asserts
the _tenants_ are balanced (since AZ affinity is chosen on a per-tenant
basis)
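
A rough sketch of the two balance checks described above. The helper names, the `attachments` mapping of `(tenant_id, shard_number)` to node id, and the `tolerance` parameter are all hypothetical; the real test may define "balanced" differently.

```python
from collections import defaultdict


def shard_counts(attachments):
    """Number of attached shards per node."""
    counts = defaultdict(int)
    for node_id in attachments.values():
        counts[node_id] += 1
    return dict(counts)


def tenant_counts(attachments):
    """Number of distinct tenants with at least one shard on each node."""
    tenants = defaultdict(set)
    for (tenant_id, _shard_number), node_id in attachments.items():
        tenants[node_id].add(tenant_id)
    return {node_id: len(ids) for node_id, ids in tenants.items()}


def is_balanced(attachments, node_ids, num_azs, tolerance=1):
    # With a single AZ, compare per-node shard counts. With more than one,
    # compare per-node tenant counts instead, because AZ affinity is chosen
    # per tenant and shard counts can legitimately differ between nodes.
    if num_azs == 1:
        counts = shard_counts(attachments)
    else:
        counts = tenant_counts(attachments)
    per_node = [counts.get(node_id, 0) for node_id in node_ids]
    return max(per_node) - min(per_node) <= tolerance
```

A parameterised test would call `is_balanced` once per AZ count, for example by looping over `(1, 2)` and rebuilding the attachment map for each run.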