Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor IndexAbstraction to not use IndexMetadata #79080

Merged
merged 19 commits into from
Oct 15, 2021

Conversation

martijnvg
Copy link
Member

Most users of an IndexAbstraction instance doesn't need to use the
IndexMetadata instances that getIndices() and getWriteIndex() returns.

Cluster state variables/parameters can be used in places that access to
IndexMetadata is required.

By changing the getIndices() and getWriteIndex() methods to return Index
instance, the indices lookup can be reused across different cluster states.
This should be possible in cases that don't change an index hidden status or
open and closes indices or when adding / removing aliases, data streams or indices.

This change should allow for #79004

Most users of an `IndexAbstraction` instance doesn't need to use the
`IndexMetadata` instances that `getIndices()` and `getWriteIndex()` returns.

Cluster state variables/parameters can be used in places that access to
`IndexMetadata` is required.

By changing the `getIndices()` and `getWriteIndex()` methods to return `Index`
instance, the indices lookup can be reused across different cluster states.
This should be possible in cases that don't change an index hidden status or
 open and closes indices or when adding / removing aliases, data streams or indices.

This change should allow for elastic#79004
@martijnvg martijnvg added :Data Management/Indices APIs APIs to create and manage indices and templates >refactoring v7.16.0 labels Oct 14, 2021
@martijnvg martijnvg marked this pull request as ready for review October 14, 2021 09:31
@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Oct 14, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

Copy link
Member

@original-brownbear original-brownbear left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @martijnvg this looks really good. I just have a couple of smaller suggestions/questions on performance. Maybe they're irrelevant but I'm paranoid now :)

AliasMetadata aliasMetadata = index.getAliases().get(indexAbstraction.getName());
for (Index index : indexAbstraction.getIndices()) {
String concreteIndex = index.getName();
AliasMetadata aliasMetadata = state.metadata().index(concreteIndex).getAliases().get(indexAbstraction.getName());
if (norouting.contains(concreteIndex) == false) {
if (aliasMetadata != null && aliasMetadata.searchRoutingValues().isEmpty() == false) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can move:

                    AliasMetadata aliasMetadata = state.metadata().index(concreteIndex).getAliases().get(indexAbstraction.getName());

down here right to not do the newly added map lookup more often than necessary?

}
} else if (indexAbstraction.getType() == IndexAbstraction.Type.DATA_STREAM && context.isResolveToWriteIndex()) {
IndexMetadata writeIndex = indexAbstraction.getWriteIndex();
IndexMetadata writeIndex = metadata.index(indexAbstraction.getWriteIndex());
if (addIndex(writeIndex, context)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should make addIndex just accept Metadata and Index instead of IndexMetadata, then we can do without the additional map lookup altogether except for when we resolve with ignored_throttled==true?

private final List<String> referencedByDataStreamAliases;

public DataStream(org.elasticsearch.cluster.metadata.DataStream dataStream,
List<IndexMetadata> dataStreamIndices,
List<String> aliases) {
this.dataStream = dataStream;
this.dataStreamIndices = List.copyOf(dataStreamIndices);
this.writeIndex = dataStreamIndices.get(dataStreamIndices.size() - 1);
this.dataStreamIndices = dataStreamIndices.stream()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performance does matter a little here if I remember correctly (might just be our benchmarks with a ton of DS though). Might be worth it to just pre-size an array here and copy or wrap that or to adjust the constructor outright because we might be able to just pass the Index list here to begin with?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can just pass down a List here. The place where this gets constructed can just create a List instead of List.

}

public Alias(org.elasticsearch.cluster.metadata.DataStreamAlias dataStreamAlias,
List<IndexMetadata> indicesOfAllDataStreams,
IndexMetadata writeIndexOfWriteDataStream) {
this.aliasName = dataStreamAlias.getName();
this.referenceIndexMetadatas = indicesOfAllDataStreams;
this.writeIndex = writeIndexOfWriteDataStream;
this.referenceIndexMetadatas = indicesOfAllDataStreams.stream()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same reservations as with the DS class, we should be careful not to alter the performance characteristics of this too much shouldn't we?

@martijnvg
Copy link
Member Author

@elasticmachine run elasticsearch-ci/bwc

Copy link
Member

@original-brownbear original-brownbear left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the iterations @martijnvg !

@original-brownbear
Copy link
Member

@elasticmachine run elasticsearch-ci/bwc (unrelated failure that reproduces on master also)

@martijnvg
Copy link
Member Author

@elasticmachine update branch

@martijnvg
Copy link
Member Author

@elasticmachine update branch

@martijnvg
Copy link
Member Author

@elasticmachine run elasticsearch-ci/part-1

@martijnvg martijnvg merged commit bedeb2c into elastic:master Oct 15, 2021
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Oct 15, 2021
Backport of elastic#79080 to 7.x branch.

Most users of an `IndexAbstraction` instance doesn't need to use the
`IndexMetadata` instances that `getIndices()` and `getWriteIndex()` returns.

Cluster state variables/parameters can be used in places that access to
`IndexMetadata` is required.

By changing the `getIndices()` and `getWriteIndex()` methods to return `Index`
instance, the indices lookup can be reused across different cluster states.
This should be possible in cases that don't change an index hidden status or
 open and closes indices or when adding / removing aliases, data streams or indices.

This change should allow for elastic#79004
martijnvg added a commit that referenced this pull request Oct 15, 2021
Backport of #79080 to 7.x branch.

Most users of an `IndexAbstraction` instance doesn't need to use the
`IndexMetadata` instances that `getIndices()` and `getWriteIndex()` returns.

Cluster state variables/parameters can be used in places that access to
`IndexMetadata` is required.

By changing the `getIndices()` and `getWriteIndex()` methods to return `Index`
instance, the indices lookup can be reused across different cluster states.
This should be possible in cases that don't change an index hidden status or
 open and closes indices or when adding / removing aliases, data streams or indices.

This change should allow for #79004
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Data Management/Indices APIs APIs to create and manage indices and templates >refactoring Team:Data Management Meta label for data/management team v7.16.0 v8.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants