Standardize snapshot indices parsing so that combinations of included and excluded indices are treated the same regardless of the order they are listed in #5626

stephen-crawford · 2022-12-23T16:00:16Z

Description

Adds a custom comparator-based sort call before parsing indices in the SnapshotUtils class. Previously, because the list was not sorted, mixing negated and normal indices in the "indices" field of the REST calls produced different behavior depending on the order of the indices. The added sorting logic corrects this by always parsing negated indices after all normal indices (when the indices were already listed in that order, the parsing had been acting as expected).

In addition, the sorting logic includes a removal of empty strings from the index list since these strings would break the comparator's sort.
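
As a rough illustration of the approach described above (not the code from this PR), the sort could look something like the sketch below. The class and method names are hypothetical, and the assumption is that the comparator looks at the first character of each entry, which is why empty strings are dropped first.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class NegatedLastSortSketch {

    // Hypothetical helper: drops empty entries, then moves every pattern that
    // starts with '-' behind the normal ones.
    static List<String> negatedLast(String[] selectedIndices) {
        return Arrays.stream(selectedIndices)
            // An empty string has no first character for charAt(0) to inspect.
            .filter(s -> !s.isEmpty())
            // Boolean keys: false (normal) sorts before true (negated).
            // sorted() is stable on an ordered stream, so relative order within each group is kept.
            .sorted(Comparator.comparing((String s) -> s.charAt(0) == '-'))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(negatedLast(new String[] { "-bar*", "", "ba*", "-baz", "foo" }));
    }
}
```

With the input in `main`, the normal patterns `ba*` and `foo` end up ahead of `-bar*` and `-baz`, and the empty entry is gone.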

Three new tests were added to the SnapshotUtilsTests class in order to make sure that the ordering of the negative and normal indices did not change the behavior (and that it was the expected behavior).

The new behavior is as follows:

(Querying a single negated snapshot): If the body of the request reads indices: "-bar", the query will return all indices which are not "bar".
(Querying a single negated snapshot wildcard): If the body of the request reads indices: "-bar*", the query will return all indices which are not a subset of "bar*".
(Querying multiple negated snapshots): If the body of the request reads indices: "-bar, -baz", the query will return all indices which are not "bar" or "baz".
(Querying multiple negated snapshot wildcards): If the body of the request reads indices: "-bar*, -baz*", the query will return all indices which are not a subset of "bar*" or a subset of "baz*".
(Querying a single snapshot): If the body of the request reads indices: "bar", the query will return "bar".
(Querying multiple snapshot wildcards): If the body of the request reads indices: "bar*, baz*", the query will return all indices which are a subset of "bar*" or a subset of "baz*".
(Querying multiple mixed snapshot wildcards): If the body of the request reads indices: "-bar*, ba*", the query will return all indices which are a subset of "ba*" but not a subset of "bar*". This was not the case previously.
(Querying multiple mixed snapshot wildcards): If the body of the request reads indices: "ba*, -bar*", the query will return all indices which are a subset of "ba*" but not a subset of "bar*".

Behavior is now preserved regardless of the order of positive and negated wildcard queries.
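
The cases listed above can be modeled with a small standalone sketch. This is a simplified illustration, not the SnapshotUtils implementation: the class name, the `filter` and `matches` helpers, and the glob handling (only `*` is treated specially) are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class OrderIndependentIndexFilterSketch {

    // Simplified glob match (assumption): '*' matches any run of characters,
    // every other character is taken literally.
    static boolean matches(String pattern, String index) {
        String regex = Arrays.stream(pattern.split("\\*", -1))
            .map(Pattern::quote)
            .collect(Collectors.joining(".*"));
        return index.matches(regex);
    }

    // Hypothetical model of the described behavior: collect inclusions and
    // exclusions separately so that the listing order cannot matter.
    static List<String> filter(List<String> available, String... selected) {
        List<String> includes = new ArrayList<>();
        List<String> excludes = new ArrayList<>();
        for (String s : selected) {
            if (s.isEmpty()) {
                continue; // ignore empty entries in this sketch
            }
            if (s.charAt(0) == '-') {
                excludes.add(s.substring(1));
            } else {
                includes.add(s);
            }
        }
        return available.stream()
            // With no inclusions, every available index is a candidate.
            .filter(i -> includes.isEmpty() || includes.stream().anyMatch(p -> matches(p, i)))
            // Exclusions are applied last, whatever position they were listed in.
            .filter(i -> excludes.stream().noneMatch(p -> matches(p, i)))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> available = List.of("foo", "bar", "baz");
        System.out.println(filter(available, "-bar*", "ba*"));
        System.out.println(filter(available, "ba*", "-bar*"));
        System.out.println(filter(available, "-bar"));
    }
}
```

In this model both mixed orderings select only `baz` from `foo`, `bar`, `baz`, while a lone `-bar` keeps `foo` and `baz`.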

Issues Resolved

Resolves issue
opensearch-project/security#1652

Check List

New functionality includes testing.
- All tests pass
New functionality has been documented.
- ~~New functionality has javadoc added~~
Commits are signed per the DCO using --signoff
Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Stephen Crawford <steecraw@amazon.com>

stephen-crawford · 2022-12-28T14:23:32Z

Hi @andrross, thank you for reviewing. To the best of my knowledge the only discussion of the query syntax is found here. It says "You can use , to create a list of indices, * to specify an index pattern, and - to exclude certain indices. Don’t put spaces between items. Default is all indices."

I have not read any other documentation that mentions the use of the special characters. I cannot speak for everyone using OpenSearch but given this issue, it seems like at least a fair number of users expect the behavior to be as this issue would make it. I also think that in general, unless we added documentation and an explanation for why this behavior exists, a change is necessary for user experience etc.

andrross · 2022-12-28T18:00:30Z

@scrawfor99 We're going to have to be super clear about the behavior change here, and then make a judgement call as to whether this is a bug fix versus a breaking change that must be deferred to the next major version. Can you clearly enumerate how this changes the behavior of the API in the commit message and PR description? Something similar to this comment is what I'm thinking.

stephen-crawford · 2022-12-28T20:14:27Z

Hi @andrross, thank you for your suggestion. I went ahead and added documentation into the body like you requested. Let me know if you need anything else.

Happy Holidays!

andrross

Thanks @scrawfor99! So the upshot here is that any call that started with one or more exclusion patterns and also contained explicit inclusion patterns would in effect ignore those inclusions and instead include everything. Is that right? Assuming that is the only behavior change then I do think that is a bug and something worth fixing in a minor version.

andrross · 2022-12-29T17:58:03Z

server/src/main/java/org/opensearch/snapshots/SnapshotUtils.java

@@ -69,6 +69,20 @@ public static List<String> filterIndices(List<String> availableIndices, String[]
        if (IndexNameExpressionResolver.isAllIndices(Arrays.asList(selectedIndices))) {
            return availableIndices;
        }
+
+        selectedIndices = Arrays.stream(selectedIndices).filter(s -> !s.isEmpty()).toArray(a -> new String[a]); // Remove all empty strings


I really want to rewrite the logic in this method because it is very difficult to follow, but I'm going to resist that impulse for now. I am a bit concerned about possible implications of removing empty strings with the change here. I'm inclined to not sort the list, and instead just move the exclusions to the end. What do you think? That can be done with something like:

    // Move the exclusions to end of list to ensure they are processed
    // after explicitly selected indices are chosen.
    final List<String> excludesAtEndSelectedIndices = Stream.concat(
            Arrays.stream(selectedIndices)
                .filter(s -> s.isEmpty() || s.charAt(0) != '-'),
            Arrays.stream(selectedIndices)
                .filter(s -> !s.isEmpty() && s.charAt(0) == '-'))
        .collect(Collectors.toUnmodifiableList());

Then just tweak the logic below to use this list instead of the selectedIndices array.

If you think this is the better option, that seems fine. It should save a bit of time on larger indices lists (though the indices list probably won't be large enough for the difference to matter). I can swap this over :)

As far as the code logic, I only just started looking at this class and its methods but if you feel there is a way it can be improved that you would like to be done just let me know and I will rewrite it. I have just been trying to make the targeted changes with as little modification as possible so as to be less likely to break anything but I agree that the code could be a bit easier to follow.

I have just been trying to make the targeted changes with as little modification as possible

This is definitely the right instinct and why we probably shouldn't completely rewrite this :)

github-actions · 2022-12-30T15:09:24Z

Gradle Check (Jenkins) Run Completed with:

RESULT: FAILURE ❌
URL: https://build.ci.opensearch.org/job/gradle-check/8628/
CommitID: 6beb70e
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green.
Is the failure a flaky test unrelated to your change?

github-actions · 2022-12-30T15:16:29Z

Gradle Check (Jenkins) Run Completed with:

RESULT: FAILURE ❌
URL: https://build.ci.opensearch.org/job/gradle-check/8629/
CommitID: 39c7679
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green.
Is the failure a flaky test unrelated to your change?

Signed-off-by: Stephen Crawford <steecraw@amazon.com>

github-actions · 2022-12-30T15:46:47Z

Gradle Check (Jenkins) Run Completed with:

RESULT: FAILURE ❌
URL: https://build.ci.opensearch.org/job/gradle-check/8630/
CommitID: 4ea9c63
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green.
Is the failure a flaky test unrelated to your change?

andrross · 2022-12-30T18:47:31Z

server/src/main/java/org/opensearch/snapshots/SnapshotUtils.java

+
+        // Move the exclusions to end of list to ensure they are processed
+        // after explicitly selected indices are chosen.
+        final List<String> excludesAtEndSelectedIndices = Stream.concat(


You need to replace every usage of selectedIndices below with excludesAtEndSelectedIndices

Ooops--clearly I had vacation brain hah.

andrross · 2022-12-30T18:48:15Z

CHANGELOG.md

@@ -79,6 +78,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 - Remove LegacyESVersion.V_7_6_ and V_7_7_ Constants ([#4837](https://github.com/opensearch-project/OpenSearch/pull/4837))
 - Remove LegacyESVersion.V_7_10_ Constants ([#5018](https://github.com/opensearch-project/OpenSearch/pull/5018))
 - Remove Version.V_1_ Constants ([#5021](https://github.com/opensearch-project/OpenSearch/pull/5021))
+- Remove --enable-preview feature flag since Apache Lucene now patches class files ([#5642](https://github.com/opensearch-project/OpenSearch/pull/5642))


This should not be here

Sorry, must have grabbed it from an update pull.

andrross · 2022-12-30T18:56:38Z

CHANGELOG.md

@@ -61,7 +60,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 - Changed http code on create index API with bad input raising NotXContentException from 500 to 400 ([#4773](https://github.com/opensearch-project/OpenSearch/pull/4773))
 - Change http code for DecommissioningFailedException from 500 to 400 ([#5283](https://github.com/opensearch-project/OpenSearch/pull/5283))
 - Pre conditions check before updating weighted routing metadata ([#4955](https://github.com/opensearch-project/OpenSearch/pull/4955))
- Remove --enable-preview feature flag since Apache Lucene now patches class files ([#5642](https://github.com/opensearch-project/OpenSearch/pull/5642))
+- Standardize snapshot indices parsing so that combinations of included and excluded indices are treated the same regardless of the order they are listed in ([#5626](https://github.com/opensearch-project/OpenSearch/pull/5626))


The idea here is that a user can scan this list and quickly determine whether a change might impact them. It should also be as concise as possible, which can obviously be a challenge. At a minimum, it should list what changed from a user's perspective, which in this case I think is the snapshot restore and clone APIs. What do you think of:

"Fix index exclusion behavior in snapshot restore and clone APIs"

Users that don't use those APIs or don't use exclusions can quickly identify this change as not impacting them, otherwise they'll need to dig into the specifics of the linked issue to get more info.

Also, I believe we should backport this change to 2.x, so the changelog entry should go in the "Unreleased 2.x" section.

Gotcha. I had thought that you had requested that the CHANGELOG entry be swapped to the new description alongside the PR title. My misunderstanding. I agree that the new option is much easier to quickly read over.

I will move it to the Unreleased section as well.

andrross · 2022-12-30T18:58:23Z

server/src/main/java/org/opensearch/snapshots/SnapshotUtils.java

@@ -69,6 +69,20 @@ public static List<String> filterIndices(List<String> availableIndices, String[]
        if (IndexNameExpressionResolver.isAllIndices(Arrays.asList(selectedIndices))) {
            return availableIndices;
        }
+
+        selectedIndices = Arrays.stream(selectedIndices).filter(s -> !s.isEmpty()).toArray(a -> new String[a]); // Remove all empty strings


I have just been trying to make the targeted changes with as little modification as possible

This is definitely the right instinct and why we probably shouldn't completely rewrite this :)

andrross · 2022-12-30T19:06:31Z

@reta @dblock

Just want to raise this issue as I believe we should backport this as a bug fix, though it does change the behavior of the snapshot restore (and clone) APIs.

The upshot here is that, given indexes "foo", "bar", "baz" in a snapshot, and the user provides the pattern "-bar*, ba*" in the restore API, the current behavior results in restoring "foo" and "baz". This is a bug. The correct behavior is to restore only "baz" (select indexes that match "ba*" and exclude those that match "bar*").

There is definitely a chance that some users are (intentionally or not) relying on this buggy behavior and the fix will break them. I think we should backport this as it is clearly wrong. What do you think?

dblock · 2023-01-02T21:36:01Z

Looks like a bug fix to me, so yes to backport. When users accidentally or purposely use behavior that is a side effect of a bug, it's still a bug, and not a breaking change. I also think we should release note it well.

Signed-off-by: Stephen Crawford <steecraw@amazon.com>

github-actions · 2023-01-03T14:58:38Z

Gradle Check (Jenkins) Run Completed with:

RESULT: SUCCESS ✅
URL: https://build.ci.opensearch.org/job/gradle-check/8732/
CommitID: c7d07bd

stephen-crawford · 2023-01-03T15:05:48Z

@andrross should be all set!

andrross · 2023-01-03T22:04:19Z

Thanks @scrawfor99, nice work!

cwperks · 2023-01-05T14:10:55Z

Great work @scrawfor99 and thank you for following this through! 🎸

stephen-crawford and others added 5 commits December 22, 2022 17:10

Added test cases

7a127d9

Signed-off-by: Stephen Crawford <steecraw@amazon.com>

reset snapshot utils file

827870c

Signed-off-by: Stephen Crawford <steecraw@amazon.com>

Merge branch 'opensearch-project:main' into restore-snapshot

b1fa1fa

Add comparator

6a63feb

Signed-off-by: Stephen Crawford <steecraw@amazon.com>

Add comparator and sorting method

a9a9c14

Signed-off-by: Stephen Crawford <steecraw@amazon.com>

stephen-crawford added 2 commits December 23, 2022 11:01

Add changelog

36deec5

Signed-off-by: Stephen Crawford <steecraw@amazon.com>

fix import resolving

019a88f

Signed-off-by: Stephen Crawford <steecraw@amazon.com>

stephen-crawford mentioned this pull request Dec 23, 2022

[BUG] Restoring snapshot: indices exclusion triggers security_exception (creating OK, listing OK) opensearch-project/security#1652

Closed

Spotless apply

2a43f9d

Signed-off-by: Stephen Crawford <steecraw@amazon.com>

andrross reviewed Dec 29, 2022

stephen-crawford changed the title ~~Fix index parsing in SnapshotUtils to allow correct reading of Snapshot Restore calls~~ Standardize snapshot indices parsing so that combination of included and excluded indices are treated the same irregardless of the order they are listed in Dec 30, 2022

stephen-crawford and others added 2 commits December 30, 2022 09:46

Standardize snapshot indices parsing so that combinations of included…

6beb70e

… and excluded indices are treated the same regardless of the order they are listed in Signed-off-by: Stephen Crawford <steecraw@amazon.com>

Merge branch 'main' into restore-snapshot

39c7679

Spotless Apply

4ea9c63

Signed-off-by: Stephen Crawford <steecraw@amazon.com>

andrross reviewed Dec 30, 2022

Fix index exclusion behavior in snapshot restore and clone APIs

c7d07bd

Signed-off-by: Stephen Crawford <steecraw@amazon.com>

andrross approved these changes Jan 3, 2023

andrross merged commit 57d4485 into opensearch-project:main Jan 3, 2023

andrross added the backport 2.x Backport to 2.x branch label Jan 3, 2023

opensearch-trigger-bot bot mentioned this pull request Jan 3, 2023

[Backport 2.x] Standardize snapshot indices parsing so that combinations of included and excluded indices are treated the same regardless of the order they are listed in #5683

Merged

stephen-crawford mentioned this pull request Jan 3, 2023

[BUG] SnapshotUtils has inconsistent behavior when selectedIndices has negative patterns #5627

Closed

stephen-crawford deleted the restore-snapshot branch March 29, 2023 15:16

stephen-crawford restored the restore-snapshot branch March 29, 2023 15:16

stephen-crawford deleted the restore-snapshot branch March 30, 2023 16:00

stephen-crawford restored the restore-snapshot branch March 30, 2023 16:00

stephen-crawford deleted the restore-snapshot branch June 9, 2023 15:40
