Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pull index routing into strategy object #77211

Merged
merged 6 commits into from
Sep 10, 2021

Conversation

nik9000
Copy link
Member

@nik9000 nik9000 commented Sep 2, 2021

This pulls the calculation of the shard id for an (id, routing) pair
into a little strategy class, IndexRouting. This is easier to test and
should be easier to extend.

My hope is that this is an incremental readability improvement. My
ulterior motive is that this is where I want to land our new
routing-by-dimensions work for tsdb.

This pulls the calculation of the shard id for an (id, routing) pair
into a little strategy class, `IndexRouting`. This is easier to test and
should be easier to extend.

My hope is that this is an incremental readability improvement. My
ulterior motive is that this is where I want to land our new
routing-by-dimensions work for tsdb.
@nik9000 nik9000 added >non-issue :Distributed Indexing/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. v8.0.0 v7.16.0 labels Sep 2, 2021
@elasticmachine elasticmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Sep 2, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

Copy link
Member Author

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel bad adding another per index object, but ultimately tsdb is going to need a larger object in the same place. In my mind its just a different subclass of IndexRouting. And it'll contain a compiled matcher-like-thing. So I think I want to build some way to deduplicate these things sooner rather than later.

}
}

public void testPartitionedIndex() {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and all tests below it duplication OperationRoutingTests. They are slightly lower level in that they don't test the creation of the IndexRouting from information in the IndexMetadata. But they are close. Paranoia drove me to duplicate them rather than move them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I've reworked how this is integrated I've moved these test from OperationRoutingTests .

@henningandersen
Copy link
Contributor

We discussed this on another channel and Nik will work on a different solution, trying to just share the IndexRouting per bulk request at least in the initial iteration.

Copy link
Member Author

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@henningandersen I've updated this based on our conversation this morning.

}
}

public void testPartitionedIndex() {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I've reworked how this is integrated I've moved these test from OperationRoutingTests .

@nik9000
Copy link
Member Author

nik9000 commented Sep 9, 2021

run elasticsearch-ci/part-2

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks Nik.

Comment on lines +479 to +485
IndexRouting indexRouting = indexRoutings.computeIfAbsent(
concreteIndex,
idx -> IndexRouting.fromIndexMetadata(clusterState.metadata().getIndexSafe(idx))
);
ShardId shardId = clusterService.operationRouting()
.indexShards(clusterState, concreteIndex.getName(), indexRouting, docWriteRequest.id(), docWriteRequest.routing())
.shardId();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a future refinement (not necessarily in this PR), I wonder if OperationRouting should create the IndexRouting and the IndexRouting object then should have the indexShards (and more) method(s)? So the interaction here would be something like:

Suggested change
IndexRouting indexRouting = indexRoutings.computeIfAbsent(
concreteIndex,
idx -> IndexRouting.fromIndexMetadata(clusterState.metadata().getIndexSafe(idx))
);
ShardId shardId = clusterService.operationRouting()
.indexShards(clusterState, concreteIndex.getName(), indexRouting, docWriteRequest.id(), docWriteRequest.routing())
.shardId();
ShardId shardId = indexRoutings.computeIfAbsent(
concreteIndex,
idx -> clusterService.operationRouting().indexRouting(clusterState, idx))
).indexShards(docWriteRequest.id(), docWriteRequest.routing())
.shardId();

I do see the problem in doing so for ShardSplittingQuery though. And it may be affected by your future work too, where you may want bulk to only create a new IndexRouting instance when the base parameters are different. So I am OK leaving this as is for now if you prefer to not tackle that now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving the methods like indexShards into makes sense. I liked that the class as it stands now doesn't need to know about stuff like the routing table. I think which was is right will show up more clearly after I add the source based routing stuff so I'll keep it as it is for now. Once I have a proposal for source based routing.

* Build the routing from {@link IndexMetadata}.
*/
public static IndexRouting fromIndexMetadata(IndexMetadata indexMetadata) {
if (indexMetadata.getRoutingPartitionSize() == 1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, why not use the isRoutingPartitionedIndex:

Suggested change
if (indexMetadata.getRoutingPartitionSize() == 1) {
if (indexMetadata.isRoutingPartitionedIndex() == false) {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went back and forth. I suppose I had the effects of the partition size in my head and liked the way I had it better. But I think you are right its more clear that way. I'll push the change.

@nik9000 nik9000 added auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) auto-backport-and-merge labels Sep 10, 2021
@elasticsearchmachine elasticsearchmachine merged commit b0b5cbd into elastic:master Sep 10, 2021
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
7.x Commit could not be cherrypicked due to conflicts

To backport manually run backport --upstream elastic/elasticsearch --pr 77211

nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Sep 10, 2021
This pulls the calculation of the shard id for an (id, routing) pair
into a little strategy class, `IndexRouting`. This is easier to test and
should be easier to extend.

My hope is that this is an incremental readability improvement. My
ulterior motive is that this is where I want to land our new
routing-by-dimensions work for tsdb.
elasticsearchmachine pushed a commit that referenced this pull request Sep 10, 2021
This pulls the calculation of the shard id for an (id, routing) pair
into a little strategy class, `IndexRouting`. This is easier to test and
should be easier to extend.

My hope is that this is an incremental readability improvement. My
ulterior motive is that this is where I want to land our new
routing-by-dimensions work for tsdb.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) :Distributed Indexing/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. >non-issue Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v7.16.0 v8.0.0-alpha2
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants