Optimize SLM Policy Queries #79341
Same as elastic#79321 but for SLM policies. Enhances RepositoryData accordingly to enable the optimization.
Pinging @elastic/es-distributed (Team:Distributed)
Looks ok but I think we can simplify those nested higher-order generic types a bit.
@@ -156,7 +156,8 @@ public void testEnforcedCooldownPeriod() throws IOException {
     SnapshotState.SUCCESS,
     SnapshotsService.SHARD_GEN_IN_REPO_DATA_VERSION.minimumCompatibilityVersion(),
     0L, // -1 would refresh RepositoryData and find the real version
-    0L // -1 would refresh RepositoryData and find the real version
+    0L, // -1 would refresh RepositoryData and find the real version,
+    null // null would refresh RepositoryData and find the real version
The comment suggests that this shouldn't be `null`.
@@ -666,7 +666,7 @@ private static SnapshotsInRepo sortSnapshots(
         }
     }

-    private static Predicate<SnapshotInfo> filterBySLMPolicies(String[] slmPolicies) {
+    private static Tuple<Predicate<SnapshotInfo>, BiPredicate<SnapshotId, RepositoryData>> filterBySLMPolicies(String[] slmPolicies) {
`Tuple<Predicate<SnapshotInfo>, BiPredicate<SnapshotId, RepositoryData>>` is getting a bit convoluted; really it's just the following, but without the names or anywhere to hang docs:
interface SomeName {
    boolean method1(SnapshotInfo snapshotInfo);
    boolean method2(SnapshotId snapshotId, RepositoryData repositoryData);
}
I think the names and docs would be useful, so let's make this a thing. Maybe we should be subclassing `SnapshotPredicates` rather than having these higher-order fields.
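To make the suggestion concrete, here is a minimal sketch of what a named interface could look like in place of the tuple. It is not the code from this PR: `SnapshotFilterSketch`, `SnapshotFilter`, `Info`, `RepoData`, `matchesPreflight`, `matches` and `bySlmPolicies` are all hypothetical names, `Info` and `RepoData` are simplified stand-ins for `SnapshotInfo` and `RepositoryData`, and the matching is by exact policy name only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these types are the real Elasticsearch classes.
final class SnapshotFilterSketch {

    /** Stand-in for SnapshotInfo: full per-snapshot metadata that must be loaded. */
    static final class Info {
        final String name;
        final String slmPolicy; // null when the snapshot was not created by SLM

        Info(String name, String slmPolicy) {
            this.name = name;
            this.slmPolicy = slmPolicy;
        }
    }

    /** Stand-in for RepositoryData: an already-loaded map of snapshot name to policy name. */
    static final class RepoData {
        final Map<String, String> policyBySnapshot;

        RepoData(Map<String, String> policyBySnapshot) {
            this.policyBySnapshot = policyBySnapshot;
        }
    }

    /** Named replacement for the pair of predicates, with a place to hang docs. */
    interface SnapshotFilter {
        /** Cheap check that only needs repository-level metadata. */
        boolean matchesPreflight(String snapshotName, RepoData repoData);

        /** Full check against the loaded snapshot metadata. */
        boolean matches(Info info);
    }

    /** Builds a filter that accepts snapshots created by any of the given policies. */
    static SnapshotFilter bySlmPolicies(String... policies) {
        final Set<String> wanted = new HashSet<>(Arrays.asList(policies));
        return new SnapshotFilter() {
            @Override
            public boolean matchesPreflight(String snapshotName, RepoData repoData) {
                String policy = repoData.policyBySnapshot.get(snapshotName);
                return policy != null && wanted.contains(policy);
            }

            @Override
            public boolean matches(Info info) {
                return info.slmPolicy != null && wanted.contains(info.slmPolicy);
            }
        };
    }
}
```

A caller would then hold a single `SnapshotFilter` and call the method it needs by name, rather than unpacking `v1()` and `v2()` from a tuple.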
++ I did this for this specific case now, but I agree it's a worthwhile cleanup all over :) I'll do it in a follow-up after FF
LGTM (I'm assuming we're ok with adding an extra field to `RepositoryData` here)
Thanks David!
I think that's just fine. The extra bytes in the serialized objects hardly matter, and on heap we're deduplicating the strings, so they don't matter either. The performance gain when filtering get-snapshots by SLM policy, on the other hand, is quite significant (particularly for the expected case of listing hundreds of backups while also having thousands of ILM-created frozen tier snapshots in the repo).
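As a rough illustration of where the gain comes from, the sketch below filters on a policy name kept in the repository-level index and only loads per-snapshot details for the matches. Everything in it is hypothetical (`PolicyPrefilterSketch`, `Repo`, `policyIndex`, `loadDetails`, `listByPolicy`); it models the idea under simplified assumptions and is not how `RepositoryData` is actually laid out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: policy names live in the repository index, details are loaded per snapshot.
final class PolicyPrefilterSketch {

    static final class Repo {
        // Snapshot name -> SLM policy name (null for snapshots not created by SLM).
        final Map<String, String> policyIndex = new LinkedHashMap<>();
        // Counts how many per-snapshot loads were needed.
        int loads = 0;

        void add(String snapshot, String policy) {
            policyIndex.put(snapshot, policy);
        }

        // Stands in for the expensive read of one snapshot's metadata.
        String loadDetails(String snapshot) {
            loads++;
            return snapshot + " [policy=" + policyIndex.get(snapshot) + "]";
        }
    }

    /** Lists details for snapshots of one policy, loading only the snapshots that match. */
    static List<String> listByPolicy(Repo repo, String policy) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> entry : repo.policyIndex.entrySet()) {
            if (policy.equals(entry.getValue())) {
                result.add(repo.loadDetails(entry.getKey()));
            }
        }
        return result;
    }
}
```

With a few policy-created backups among many other snapshots, the number of loads tracks the number of matches rather than the size of the repository.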
* upstream/master:
  Validate tsdb's routing_path (elastic#79384)
  Adjust the BWC version for the return200ForClusterHealthTimeout field (elastic#79436)
  API for adding and removing indices from a data stream (elastic#79279)
  Exposing the ability to log deprecated settings at non-critical level (elastic#79107)
  Convert operator privilege license object to LicensedFeature (elastic#79407)
  Mute SnapshotBasedIndexRecoveryIT testSeqNoBasedRecoveryIsUsedAfterPrimaryFailOver (elastic#79456)
  Create cache files with CREATE_NEW & SPARSE options (elastic#79371)
  Revert "[ML] Use a new annotations index for future annotations (elastic#79151)"
  [ML] Use a new annotations index for future annotations (elastic#79151)
  [ML] Removing legacy code from ML/transform auditor (elastic#79434)
  Fix rate agg with custom `_doc_count` (elastic#79346)
  Optimize SLM Policy Queries (elastic#79341)
  Fix execution of exists query within nested queries on field with doc_values disabled (elastic#78841)
  Stricter UpdateSettingsRequest parsing on the REST layer (elastic#79227)
  Do not release snapshot file download permit during recovery retries (elastic#79409)
  Preserve request headers in a mixed version cluster (elastic#79412)
  Adjust versions after elastic#79044 backport to 7.x (elastic#79424)
  Mute BulkByScrollUsesAllScrollDocumentsAfterConflictsIntegTests (elastic#79429)
  Fail on SSPL licensed x-pack sources (elastic#79348)
# Conflicts:
#   server/src/test/java/org/elasticsearch/index/TimeSeriesModeTests.java
`SnapshotPredicate` and `SnapshotPredicates` do almost the same thing and the way that the `SnapshotPredicates` is constructed is quite convoluted. This commit combines the two classes into one and simplifies the construction logic. Relates elastic#79341