Filter Unneeded SnapshotInfo Instances Early in TransportGetSnapshotsAction #78032

original-brownbear · 2021-09-20T18:43:40Z

It is better to filter as early as possible, so that memory is released sooner and we never fetch things we don't need in the first place.
There are still several spots where similar optimizations can be added quickly, before we implement the system index for the remaining searching/fetching that we can't easily exclude up-front.
Even so, this already improves performance considerably for many requests, because snapshots that are filtered out up-front never have their SnapshotInfo loaded.

relates #74350

elasticmachine · 2021-09-20T18:43:42Z

Pinging @elastic/es-distributed (Team:Distributed)

tlrx

This looks good, I've left minor comments only.

tlrx · 2021-09-21T10:38:01Z

...n/java/org/elasticsearch/action/admin/cluster/snapshots/get/TransportGetSnapshotsAction.java

                predicate = fromSortValuePredicate.and(predicate);
            }
        }
        return predicate;
    }

+    // Builds a predicate that can be applied to a combination of snapshot id and repository data to filter out snapshots that are not


Regular Javadoc blocks look better in most tools/IDEs, why not use them?

++ moved to regular docs all around :)

tlrx · 2021-09-21T10:42:24Z

...n/java/org/elasticsearch/action/admin/cluster/snapshots/get/TransportGetSnapshotsAction.java

-                        ? snapshotInfo -> after.compareTo(snapshotInfo.snapshotId().getName()) <= 0
-                        : snapshotInfo -> after.compareTo(snapshotInfo.snapshotId().getName()) >= 0;
+                    // already covered by preflight filtering
+                    return null;


Maybe add the Nullable annotation to the returned value of the method.

Refactored this a little following Henning's suggestions and added @Nullable wherever it was appropriate.

tlrx · 2021-09-21T10:51:47Z

...n/java/org/elasticsearch/action/admin/cluster/snapshots/get/TransportGetSnapshotsAction.java

                } else {
-                    isAfter = order == SortOrder.ASC
+                    // TODO: cover via pre-flight predicate


Would it be too complex/noisy to address in this PR too?

Annoyingly enough yes because of #78032 (comment)

henningandersen

I left a few comments, otherwise this looks good.

henningandersen · 2021-09-20T20:02:49Z

...n/java/org/elasticsearch/action/admin/cluster/snapshots/get/TransportGetSnapshotsAction.java

+                    if (details == null) {
+                        return true;
+                    }
+                    return after <= details.getStartTimeMillis();


I think we need to check for -1?

++ added the check here and for the duration logic
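The check discussed in this thread can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the PR: `Details`, `UNKNOWN_MILLIS`, and `mayMatch` are hypothetical stand-ins, and the only facts taken from the review are that the cheap details may be `null` and that a start time of -1 has to be treated as unknown.

```java
/** Illustrative stand-in types only; not the actual Elasticsearch classes. */
final class StartTimePreflightSketch {

    /** Assumed sentinel for "start time not recorded", matching the -1 mentioned in the review. */
    static final long UNKNOWN_MILLIS = -1L;

    /** Minimal stand-in for the per-snapshot details kept in RepositoryData. */
    static final class Details {
        final long startTimeMillis;

        Details(long startTimeMillis) {
            this.startTimeMillis = startTimeMillis;
        }
    }

    /**
     * Returns false only when the cheap details prove the snapshot lies outside the requested range.
     * Missing details (null) or an unknown start time (-1) cannot rule the snapshot out, so it is kept;
     * a later check on the loaded info would still be needed for those cases.
     */
    static boolean mayMatch(long fromSortValue, boolean ascending, Details details) {
        if (details == null || details.startTimeMillis == UNKNOWN_MILLIS) {
            return true;
        }
        return ascending ? fromSortValue <= details.startTimeMillis : fromSortValue >= details.startTimeMillis;
    }
}
```

Without the explicit -1 check, an ascending filter would wrongly drop every snapshot whose start time is unknown, since -1 is smaller than any real timestamp.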

henningandersen · 2021-09-20T20:04:21Z

...n/java/org/elasticsearch/action/admin/cluster/snapshots/get/TransportGetSnapshotsAction.java

+                    if (details == null) {
+                        return true;
+                    }
+                    return after >= details.getStartTimeMillis();


I would also check for -1 explicitly here.

++ added the check here and for the duration logic

henningandersen · 2021-09-21T05:22:46Z

...n/java/org/elasticsearch/action/admin/cluster/snapshots/get/TransportGetSnapshotsAction.java

                } else {
-                    isAfter = order == SortOrder.ASC
+                    // TODO: cover via pre-flight predicate


Did you intend to solve this case in this PR?

No I was going for a follow-up to keep this one shorter.

I should point out that the main reason adding this would blow up the size of the PR is the peculiar behavior we have for after at the moment. The after parameter currently behaves differently from the from_sort_value in the count that it reports. It always reports the same overall count (the number of snapshots matching the query without after) while the from_sort_value simply reports the count of everything matching the request period.

I'm not sure the behavior of after is needed (as that param is just there for iteration anyway and I don't see how the total count even matters in use cases) but I think we should discuss + clean that up in a separate PR :)

I suppose it makes sense to report back the count on every iteration/pagination, since this is a moving target. So you could see 51 the first time for a page of 50, and then after pagination you may see 55. I think it would be nice to have the count in a UI be somewhat consistent with the last page of data shown?
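The difference in reported counts described in this thread can be illustrated with a small sketch. It assumes snapshots sorted by name, treats `after` as an exclusive cursor and `from_sort_value` as an inclusive lower bound, and uses hypothetical names (`Page`, `pageWithAfter`, `pageFromSortValue`); it is not the actual response-building code.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical illustration of the two count semantics; not the actual implementation. */
final class CountSemanticsSketch {

    /** One page of results plus the total count reported alongside it. */
    static final class Page {
        final List<String> items;
        final int total;

        Page(List<String> items, int total) {
            this.items = items;
            this.total = total;
        }
    }

    /** Paginating with a cursor: the total ignores the cursor and counts every match. */
    static Page pageWithAfter(List<String> sortedNames, String after, int size) {
        List<String> items = sortedNames.stream()
            .filter(name -> name.compareTo(after) > 0) // assumed exclusive cursor
            .limit(size)
            .collect(Collectors.toList());
        return new Page(items, sortedNames.size());
    }

    /** Filtering by sort value: the total counts only what passes the filter. */
    static Page pageFromSortValue(List<String> sortedNames, String fromSortValue, int size) {
        List<String> matching = sortedNames.stream()
            .filter(name -> name.compareTo(fromSortValue) >= 0) // inclusive lower bound
            .collect(Collectors.toList());
        return new Page(matching.subList(0, Math.min(size, matching.size())), matching.size());
    }
}
```

Because the cursor variant has to count snapshots that fall before the cursor, those snapshots cannot simply be dropped by an early filter without changing the reported total.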

henningandersen · 2021-09-21T10:54:37Z

...n/java/org/elasticsearch/action/admin/cluster/snapshots/get/TransportGetSnapshotsAction.java

+    // required to answer the request before loading their SnapshotInfo from the repository
+    // TODO: extend this method to cover the pagination after value as well where possible
+    @Nullable
+    private static BiPredicate<SnapshotId, RepositoryData> buildSnapshotPreflightPredicate(


Can we instead have a single method that builds both the early and the late filter? I think that would make it easier to instinctively verify that the return nulls in buildFromSortValuePredicate are correct.

I made it build both predicates in the same code path now to make this easier to follow :)
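A rough sketch of what building both predicates in one code path could look like is given below. All names here (`PredicatePairSketch`, `SortKey`, `Info`, `Predicates`, `fetch`) are hypothetical and the two sort keys are an illustrative subset; the point is only that each `case` decides in one place which of the two filters is `null`, and that the early filter runs before anything is loaded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

/** Hypothetical sketch: one method decides, per sort key, which filter runs early and which runs late. */
final class PredicatePairSketch {

    /** Illustrative subset of sort keys. */
    enum SortKey { NAME, SHARDS }

    /** Stand-in for the expensive-to-load snapshot info. */
    static final class Info {
        final String name;
        final int shards;

        Info(String name, int shards) {
            this.name = name;
            this.shards = shards;
        }
    }

    /** Pair of filters; either side may be null, meaning "nothing to check at this stage". */
    static final class Predicates {
        final Predicate<String> preflight; // tested on the snapshot name, before loading
        final Predicate<Info> late;        // tested on the loaded info

        Predicates(Predicate<String> preflight, Predicate<Info> late) {
            this.preflight = preflight;
            this.late = late;
        }
    }

    static Predicates build(SortKey key, String fromSortValue) {
        switch (key) {
            case NAME: {
                // The name is known up-front, so the late filter is not needed.
                return new Predicates(name -> fromSortValue.compareTo(name) <= 0, null);
            }
            case SHARDS: {
                // The shard count is only known after loading, so there is no early filter.
                final int minShards = Integer.parseInt(fromSortValue);
                return new Predicates(null, info -> info.shards >= minShards);
            }
            default:
                throw new AssertionError("unexpected sort key " + key);
        }
    }

    /** Loads only the snapshots that pass the early filter, then applies the late filter. */
    static List<Info> fetch(List<String> names, Function<String, Info> loader, Predicates predicates) {
        final List<Info> result = new ArrayList<>();
        for (String name : names) {
            if (predicates.preflight != null && predicates.preflight.test(name) == false) {
                continue; // skipped without ever calling the loader
            }
            final Info info = loader.apply(name);
            if (predicates.late == null || predicates.late.test(info)) {
                result.add(info);
            }
        }
        return result;
    }
}
```

Keeping both assignments next to each other makes it easy to see, for every sort key, that a `null` on one side is matched by a real check on the other.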

henningandersen · 2021-09-21T10:55:55Z

...n/java/org/elasticsearch/action/admin/cluster/snapshots/get/TransportGetSnapshotsAction.java

+                final long after = Long.parseLong(fromSortValue);
+                return order == SortOrder.ASC ? (snapshotId, repositoryData) -> {
+                    final RepositoryData.SnapshotDetails details = repositoryData.getSnapshotDetails(snapshotId);
+                    if (details == null) {


Do we have tests generating data with no details as well as with details but -1 timestamp?

We don't unfortunately. I'll try adding some. We should have the infrastructure to do it with reasonable effort.

"We should have the infrastructure to do it with reasonable effort"

Turns out that we do not. We have infrastructure for testing this, but it's broken in a subtle way, causing it to not actually test missing information here (it's built using the test-repo BlobStoreRepositoryTest but doesn't work because the repo never writes RepositoryData without the details at this point).
Adding tests for this case requires fixing that functionality, which is a bigger piece of work I think. Given that the repository regenerates the snapshot details so efficiently these days, it seems like quite the edge case and may be ok to test in a follow-up? :)

original-brownbear · 2021-09-25T09:52:25Z

Thanks @henningandersen and @tlrx I addressed what I could from your comments and tried to explain the open points. This should be good for another round whenever you have some time :)

henningandersen

LGTM.

henningandersen · 2021-09-26T19:11:35Z

...n/java/org/elasticsearch/action/admin/cluster/snapshots/get/TransportGetSnapshotsAction.java

@@ -438,7 +470,11 @@ private void snapshots(
                snapshotIdsToIterate,
                ignoreUnavailable == false,
                task::isCancelled,
-                (context, snapshotInfo) -> snapshotInfos.add(snapshotInfo),
+                predicate == null ? (context, snapshotInfo) -> snapshotInfos.add(snapshotInfo) : (context, snapshotInfo) -> {


I wonder if we could use a special predicate object instead of null to avoid checking for null. We could still have a specific predicate we use to allow the assertions.

Slightly similar to what Henning suggested, I wonder if we could fold the logic of checking the nullity of the predicates and then testing them directly into SnapshotPredicates, which would provide preFilter and filter methods.

Will do in a follow-up. I have a queued up round of improvements here that really want the special case null (where it would be the same code complexity if I used a placeholder predicate and checked for that) :)
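For reference, the suggestion in this thread could look roughly like the sketch below. It is only an illustration of the idea: the class name, the generic parameters, and `isMatchAll` are assumptions, and the `preFilter`/`filter` method names are taken from the review comment rather than from any merged code.

```java
import java.util.function.Predicate;

/** Hypothetical holder that hides the null checks behind preFilter and filter. */
final class FoldedPredicatesSketch<I, S> {

    private final Predicate<I> preflight; // may be null: nothing to check before loading
    private final Predicate<S> late;      // may be null: nothing to check after loading

    FoldedPredicatesSketch(Predicate<I> preflight, Predicate<S> late) {
        this.preflight = preflight;
        this.late = late;
    }

    /** Instance that accepts everything, used instead of passing null around. */
    static <I, S> FoldedPredicatesSketch<I, S> matchAll() {
        return new FoldedPredicatesSketch<>(null, null);
    }

    /** Early check on the cheap identifier. */
    boolean preFilter(I id) {
        return preflight == null || preflight.test(id);
    }

    /** Late check on the loaded info. */
    boolean filter(S info) {
        return late == null || late.test(info);
    }

    /** Lets callers keep a fast path for the "no filtering at all" case. */
    boolean isMatchAll() {
        return preflight == null && late == null;
    }
}
```

Callers then write `predicates.preFilter(id)` and `predicates.filter(info)` unconditionally, while `isMatchAll()` still allows the unfiltered case to be special-cased where that matters.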

henningandersen · 2021-09-26T19:19:00Z

...n/java/org/elasticsearch/action/admin/cluster/snapshots/get/TransportGetSnapshotsAction.java

+                        fromSortValuePredicate = null;
+                        break;
+                    case REPOSITORY:
+                        preflightPredicate = null;


Perhaps add a comment here explaining where this is handled?

henningandersen · 2021-09-26T19:24:30Z

...n/java/org/elasticsearch/action/admin/cluster/snapshots/get/TransportGetSnapshotsAction.java

-                        : (info -> compareName(snapshotName, repoName, info) > 0);
-                }
-                break;
+                // TODO: cover via pre-flight predicate


I am not sure you can unless we change how the count works? Though I guess we could possibly avoid getting the snapshot info for the "before" snapshots if they are not needed.

tlrx

LGTM, I've left a suggestion to move more logic - if possible - into SnapshotPredicates.

tlrx · 2021-09-27T08:17:08Z

...n/java/org/elasticsearch/action/admin/cluster/snapshots/get/TransportGetSnapshotsAction.java

@@ -438,7 +470,11 @@ private void snapshots(
                snapshotIdsToIterate,
                ignoreUnavailable == false,
                task::isCancelled,
-                (context, snapshotInfo) -> snapshotInfos.add(snapshotInfo),
+                predicate == null ? (context, snapshotInfo) -> snapshotInfos.add(snapshotInfo) : (context, snapshotInfo) -> {


Slightly similar to what Henning suggested, I wonder if we could fold the logic of checking the nullity of the predicates and then testing them directly into SnapshotPredicates, which would provide preFilter and filter methods.

original-brownbear · 2021-09-27T09:23:37Z

Thanks Henning & Tanguy!

…Action (elastic#78032) Better to filter as early as possible to release the memory asap and not even fetch things we don't need to fetch to begin with. There's still a bunch of spots remaining where similar optimizations can be added quickly before we implement the system index for the remaining searching/fetching that we can't easily exclude up-front. This already gives vastly improved performance for many requests though for obvious reasons.

…Action (#78032) (#79321) Better to filter as early as possible to release the memory asap and not even fetch things we don't need to fetch to begin with. There's still a bunch of spots remaining where similar optimizations can be added quickly before we implement the system index for the remaining searching/fetching that we can't easily exclude up-front. This already gives vastly improved performance for many requests though for obvious reasons.

Filter Unneeded SnapshotInfo Instances Early in TransportGetSnapshots…

06486ee

…Action Better to filter as early as possible to release the memory asap.

original-brownbear added >enhancement :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v8.0.0 v7.16.0 labels Sep 20, 2021

elasticmachine added the Team:Distributed Meta label for distributed team (obsolete) label Sep 20, 2021

original-brownbear requested review from henningandersen and tlrx September 20, 2021 19:42

tlrx reviewed Sep 21, 2021

henningandersen reviewed Sep 21, 2021

original-brownbear added 14 commits September 21, 2021 13:34

Merge remote-tracking branch 'elastic/master' into optimize-get-snaps…

9ddf33b

…hots-api

comments and handle -1 + nulls

6b1897f

Merge remote-tracking branch 'elastic/master' into optimize-get-snaps…

a49a903

…hots-api

Merge remote-tracking branch 'elastic/master' into optimize-get-snaps…

c351bee

…hots-api

Merge remote-tracking branch 'elastic/master' into optimize-get-snaps…

cb22308

…hots-api

Merge remote-tracking branch 'elastic/master' into optimize-get-snaps…

f693844

…hots-api

Merge remote-tracking branch 'elastic/master' into optimize-get-snaps…

602b927

…hots-api

Merge remote-tracking branch 'elastic/master' into optimize-get-snaps…

5271f2e

…hots-api

merge predicate building

74d10b3

cleaner

b641bce

nicer

014517f

Merge remote-tracking branch 'elastic/master' into optimize-get-snaps…

c78cd2a

…hots-api

Merge remote-tracking branch 'elastic/master' into optimize-get-snaps…

1cbac70

…hots-api

add docs

83f2950

original-brownbear requested review from tlrx and henningandersen September 25, 2021 09:51

henningandersen approved these changes Sep 26, 2021

Merge remote-tracking branch 'elastic/master' into optimize-get-snaps…

af76c5b

…hots-api

comment

1b0006a

tlrx approved these changes Sep 27, 2021

original-brownbear added the backport pending label Sep 27, 2021

original-brownbear merged commit 377f546 into elastic:master Sep 27, 2021

original-brownbear deleted the optimize-get-snapshots-api branch September 27, 2021 09:23

original-brownbear mentioned this pull request Oct 17, 2021

Filter Unneeded SnapshotInfo Instances Early in TransportGetSnapshotsAction (#78032) #79321

Merged

original-brownbear removed the backport pending label Oct 17, 2021

jakelandis added v8.0.0-beta1 and removed v8.0.0 labels Oct 27, 2021
