Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Metadata Immutability] Change different indices lookup objects from array type to lists #14723
base: main
Are you sure you want to change the base?
[Metadata Immutability] Change different indices lookup objects from array type to lists #14723
Changes from all commits
d2289d2
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
Check warning on line 521 in server/src/main/java/org/opensearch/action/admin/cluster/health/TransportClusterHealthAction.java
Codecov / codecov/patch
server/src/main/java/org/opensearch/action/admin/cluster/health/TransportClusterHealthAction.java#L521
Check warning on line 339 in server/src/main/java/org/opensearch/action/admin/indices/datastream/GetDataStreamAction.java
Codecov / codecov/patch
server/src/main/java/org/opensearch/action/admin/indices/datastream/GetDataStreamAction.java#L339
Check warning on line 709 in server/src/main/java/org/opensearch/cluster/metadata/IndexNameExpressionResolver.java
Codecov / codecov/patch
server/src/main/java/org/opensearch/cluster/metadata/IndexNameExpressionResolver.java#L709
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be expensive operation to copy the set to a list every time for index resolution.
Have run micro benchmark on your changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here ConcreteAllIndices(allIndices) are unmodifiableSet, so List.copyOf() will not really create a copy .
There is an implementation Note in javadoc of this API - "If the given Collection is an unmodifiable List, calling copyOf will generally not create a copy". Also if we look in to the internal implementation of Lisy.copyOf() it's clear that, it's not creating a copy if collection is an AbstractImmutableList.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are doing
Collections.unmodifiableSet(allIndices);
which creates new object ofUnmodifiableSet
.UnmodifiableSet
extendsUnmodifiableCollection
and it is not extendingAbstractImmutableList
. It will end up copying the elements. Can you cross check?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are correct. So what do you suggest here?
concrete indices treated as Set in the existing implementation -
OpenSearch/server/src/main/java/org/opensearch/cluster/metadata/Metadata.java
Line 1636 in 67a2e4c
That's why I kept only concrete indices as Set and all other indices as List. If we keep concrete indices also as List, then this conversion won't be required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes right, lets check the usage in the code if it has to be Set, else i agree keeping it as list would be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank @akolarkunnu for performing the perf runs. Since there is a change in ClusterStateHealth, we might need a similar run for
health
API calls. Could we also capture the memory allocations corresponding to the duration of run and compare them before / after change ?Also, trying to understand as to why 50K aliases were created and not the actual index ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also create indices, its just you need to modify node limits on max indices/shards per node - plus the the change is cluster state is identical (as in same/similar objects are modified).
I don't think we would be able to capture changes (improvements actually) here, the only thing we have omitted here is back and worth list-set conversions for some objects (stored as one type, retrieved as different). [Recalling from my own PR when I had tried to benchmark some changes in cluster state]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Health API iterates over the IndexMetadata to compute the overall cluster health. Hence i was thinking that creating indexes rather than alias would have more entries to iterate for health computation. For cluster health computation, there were less memory allocations seen when it was switched to size based array iteration. Ref PR :- #15492
Please take a look at the memory allocations in flame graph in the referenced PR. Could we verify that switching from array to set iterator does not incur additional memory allocations for health API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@akolarkunnu Can you please take these changes to closure by profiling health API as in #15492
That should bring confidence on whether or not to get these changes in code or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I will resume this work soon.
Check warning on line 168 in server/src/main/java/org/opensearch/index/recovery/RemoteStoreRestoreService.java
Codecov / codecov/patch
server/src/main/java/org/opensearch/index/recovery/RemoteStoreRestoreService.java#L168