Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clamp auto-expand replicas to the closest value #87505

Merged
merged 8 commits into from
Jun 13, 2022

Conversation

pxsalehi
Copy link
Member

@pxsalehi pxsalehi commented Jun 8, 2022

Ensure that the number of replicas chosen for an auto-expand-able index
is within the range of the available data nodes, i.e., excluding those
nodes that cannot be assigned a replica.

Closes #84788

@pxsalehi pxsalehi added >bug :Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) labels Jun 8, 2022
@elasticsearchmachine
Copy link
Collaborator

Hi @pxsalehi, I've created a changelog YAML for you.

@pxsalehi
Copy link
Member Author

pxsalehi commented Jun 9, 2022

@elasticmachine update branch

pxsalehi and others added 3 commits June 9, 2022 19:41
Ensure that the number of replicas chosen for an auto-expand-able shard
is within the range of the available data nodes, i.e., excluding those
nodes that cannot be assigned a replica.

Closes elastic#84788
@pxsalehi pxsalehi force-pushed the ps-auto-expand-issue-84788 branch from 11e398e to 1c1b9ea Compare June 9, 2022 17:44
@pxsalehi pxsalehi marked this pull request as ready for review June 9, 2022 17:53
@elasticmachine elasticmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Jun 9, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@pxsalehi pxsalehi requested a review from henningandersen June 9, 2022 17:53
Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a few comments, otherwise this looks good.

equalTo(String.valueOf(initialReplicas))
);
});
ensureGreen(indexName);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unimportant to the test?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, at least for the sake of making it clear what the test expects, we should leave it here. Removing it didn't seem to really speedup the test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not about test speed. But waiting for green here signals that it is important to the test. But the test will pass without this, since our auto-expand calculation is independent of the status of the index. Hence I think we should simply remove it.

});

ensureGreen(indexName);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also remove the setting and see that it jumps back to the initialReplicas value, thus checking that it clamps to the higher bound too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in ec7d7f9.


// Package private only for testing purposes!
int calculateDesiredNumberOfReplicas(int numMatchingDataNodes) {
final int min = minReplicas();
final int max = getMaxReplicas(numMatchingDataNodes);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Math.min done in getMaxReplicas seems redundant now, I think we should remove it, it is somewhat confusing now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a very good point! It also simplifies the code and the parsing test. Done in ec7d7f9.

@@ -282,4 +282,16 @@ public void testOnlyAutoExpandAllocationFilteringAfterAllNodesUpgraded() {
terminate(threadPool);
}
}

public void testCalculateDesiredNumberOfReplicas() {
String settingValue = randomFrom("0-all", "0-3");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we explore the space a bit more here, for instance by choosing both lower and upper bound randomly like:

int lowerBound = between(0,9);
int upperBound = between(lowerBound, 10);
String settingValue = lowerBound + "-" + upperBound;

Having a dedicated test (or random variation) with "all" also makes sense.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in ec7d7f9.

pxsalehi and others added 2 commits June 13, 2022 09:18
…pandReplicas.java

Co-authored-by: Henning Andersen <33268011+henningandersen@users.noreply.github.com>
…metadata/AutoExpandReplicasIT.java

Co-authored-by: Henning Andersen <33268011+henningandersen@users.noreply.github.com>
@pxsalehi
Copy link
Member Author

@elasticmachine update branch

@pxsalehi pxsalehi requested a review from henningandersen June 13, 2022 12:59
Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, provided tests pass

equalTo(String.valueOf(initialReplicas))
);
});
ensureGreen(indexName);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not about test speed. But waiting for green here signals that it is important to the test. But the test will pass without this, since our auto-expand calculation is independent of the status of the index. Hence I think we should simply remove it.

@pxsalehi
Copy link
Member Author

Thanks Henning! If index status is irrelevant here, then I agree with you. I removed them! :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v8.4.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Auto-expand replicas should clamp to closest value
4 participants