[Enhancement] Implement pruning for neural sparse search #988
This PR is ready for review now.
Could you provide an overview of how the overall API will look? I initially thought this change would only affect the query side, but it seems it will also modify the parameters for neural_sparse_two_phase_processor.
Additionally, the current implementation appears to be focused on two-phase processing with different strategies for splitting vectors, rather than a combination of pruning and two-phase processing?
src/main/java/org/opensearch/neuralsearch/processor/factory/SparseEncodingProcessorFactory.java
src/main/java/org/opensearch/neuralsearch/util/prune/PruneUtils.java
Based on our benchmark results in #946, applying pruning in 2-phase search outperforms applying it in the neural sparse query body at search time, on both precision and latency. Therefore, enhancing the existing 2-phase search pipeline makes more sense.
The existing two-phase processor uses the max_ratio prune criterion. This PR adds support for other criteria as well.
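For readers unfamiliar with the criteria, here is a minimal sketch of how the four prune types named in this PR could act on a sparse vector held as a token-to-weight map. The class and method names are hypothetical, weights are assumed non-negative, and the inclusive boundary handling (`>=`) is an assumption rather than the PR's confirmed behavior.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative only: class and method names are hypothetical, not the PR's PruneUtils API.
final class PruneCriteriaSketch {

    // max_ratio: keep tokens whose weight is at least ratio * (largest weight in the vector).
    static Map<String, Float> pruneByMaxRatio(Map<String, Float> vector, float ratio) {
        float max = 0f;
        for (float weight : vector.values()) {
            max = Math.max(max, weight);
        }
        return keepAtLeast(vector, max * ratio);
    }

    // abs_value: keep tokens whose weight is at least a fixed threshold.
    static Map<String, Float> pruneByAbsValue(Map<String, Float> vector, float threshold) {
        return keepAtLeast(vector, threshold);
    }

    // top_k: keep the k tokens with the largest weights.
    static Map<String, Float> pruneByTopK(Map<String, Float> vector, int k) {
        Map<String, Float> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Float> entry : sortedByWeightDesc(vector)) {
            if (kept.size() >= k) {
                break;
            }
            kept.put(entry.getKey(), entry.getValue());
        }
        return kept;
    }

    // alpha_mass: keep the largest tokens until their weights sum to at least ratio * (total weight).
    static Map<String, Float> pruneByAlphaMass(Map<String, Float> vector, float ratio) {
        float total = 0f;
        for (float weight : vector.values()) {
            total += weight;
        }
        Map<String, Float> kept = new LinkedHashMap<>();
        float running = 0f;
        for (Map.Entry<String, Float> entry : sortedByWeightDesc(vector)) {
            if (running >= ratio * total) {
                break;
            }
            kept.put(entry.getKey(), entry.getValue());
            running += entry.getValue();
        }
        return kept;
    }

    // Shared helper: keep every token whose weight is >= threshold.
    private static Map<String, Float> keepAtLeast(Map<String, Float> vector, float threshold) {
        Map<String, Float> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Float> entry : vector.entrySet()) {
            if (entry.getValue() >= threshold) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }

    // Shared helper: entries ordered from largest to smallest weight.
    private static List<Map.Entry<String, Float>> sortedByWeightDesc(Map<String, Float> vector) {
        return vector.entrySet()
            .stream()
            .sorted(Map.Entry.<String, Float>comparingByValue(Comparator.reverseOrder()))
            .collect(Collectors.toList());
    }
}
```

The dropped tokens are the low-weight tail in every case; the criteria differ only in how the cut-off is chosen.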
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@             Coverage Diff              @@
##               main     #988      +/-   ##
============================================
+ Coverage     80.47%   81.27%   +0.79%
- Complexity     1000     1054      +54
============================================
  Files            78       80       +2
  Lines          3411     3535     +124
  Branches        578      611      +33
============================================
+ Hits           2745     2873     +128
+ Misses          425      423       -2
+ Partials        241      239       -2
LGTM. Thanks!
Apart from the minor comments, why is this PR trying to merge into main?
If this changes the API used to define the processor, it should be checked with application security. For that we need to merge to a feature branch in the main repo first, and only after it is cleared there merge from the feature branch to main.
src/main/java/org/opensearch/neuralsearch/processor/NeuralSparseTwoPhaseProcessor.java
src/main/java/org/opensearch/neuralsearch/processor/SparseEncodingProcessor.java
src/main/java/org/opensearch/neuralsearch/processor/factory/SparseEncodingProcessorFactory.java
);
} else {
    // if we don't have prune type, then prune ratio field must not have value
    if (config.containsKey(PruneUtils.PRUNE_RATIO_FIELD)) {
we can merge this if with a previous else and have one single else if block
This else means PruneType is NONE, right? It seems this check can be moved to https://github.com/opensearch-project/neural-search/pull/988/files#diff-8453ea75f8259ba96c246d483b2de9e21601fb9c3d033e8902756f5d101f2238R262 when validating the input ratio.
we can merge this if with a previous else and have one single else if block
ack
This else means PruneType is NONE, right? It seems this check can be moved to https://github.com/opensearch-project/neural-search/pull/988/files#diff-8453ea75f8259ba96c246d483b2de9e21601fb9c3d033e8902756f5d101f2238R262 when validating the input ratio.
We want to validate that the PRUNE_RATIO field is not provided at all. Any value would be illegal here.
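A minimal sketch of the check discussed in this thread, written with the single else-if suggested above. The class name, the `prune_type` key, the string values of both keys, and the error messages are assumptions for illustration; only the `PRUNE_RATIO_FIELD` constant name comes from the snippet under review.

```java
import java.util.Map;

// Illustrative only: a hypothetical stand-in for the factory's config validation.
final class PruneConfigValidationSketch {
    static final String PRUNE_TYPE_FIELD = "prune_type";   // assumed key name
    static final String PRUNE_RATIO_FIELD = "prune_ratio"; // assumed key value

    static void validate(Map<String, Object> config) {
        boolean hasType = config.containsKey(PRUNE_TYPE_FIELD);
        boolean hasRatio = config.containsKey(PRUNE_RATIO_FIELD);

        if (hasType && !hasRatio) {
            // A prune type without a ratio cannot be applied.
            throw new IllegalArgumentException(PRUNE_RATIO_FIELD + " is required when " + PRUNE_TYPE_FIELD + " is set");
        } else if (!hasType && hasRatio) {
            // No prune type: the mere presence of a ratio is illegal, whatever its value.
            throw new IllegalArgumentException(PRUNE_RATIO_FIELD + " must not be set without " + PRUNE_TYPE_FIELD);
        }
    }
}
```

The point of checking `containsKey` rather than the value is that the error fires on presence alone, which a value-range validator would not express.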
src/main/java/org/opensearch/neuralsearch/util/prune/PruneType.java
src/main/java/org/opensearch/neuralsearch/util/prune/PruneUtils.java
    }
}

switch (pruneType) {
Can you think of modifying this into a map of <prune_type> -> <functional_interface>, so that we use map.get() instead of a switch structure?
Technically we can, but what's the advantage of doing this?
From a readability perspective, the switch-based method is more straightforward.
From a performance perspective, a switch on an enum is optimized into a lookup-table operation and executes in O(1). I ran both methods 100k times, and the switch-based approach took less time than the map-based approach (0.18 ms vs 0.63 ms).
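For reference, a minimal sketch of what the map-based alternative could look like for ratio validation. The nested enum is a stand-in that only mirrors the constant names used in this PR. The TOP_K rule is taken from the snippet under review, while the ranges for the other types and the rejection of NONE are assumptions made for illustration.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.function.Predicate;

// Illustrative only: a hypothetical map-based variant of the ratio validation.
final class PruneRatioValidators {
    // Stand-in enum mirroring the constant names used in the PR.
    enum PruneType { NONE, TOP_K, ALPHA_MASS, MAX_RATIO, ABS_VALUE }

    private static final Map<PruneType, Predicate<Float>> VALIDATORS = new EnumMap<>(PruneType.class);

    static {
        // Rule taken from the reviewed snippet: a positive whole number.
        VALIDATORS.put(PruneType.TOP_K, r -> r > 0 && r == Math.floor(r));
        // Assumed ranges, for illustration only.
        VALIDATORS.put(PruneType.ALPHA_MASS, r -> r >= 0 && r < 1);
        VALIDATORS.put(PruneType.MAX_RATIO, r -> r >= 0 && r < 1);
        VALIDATORS.put(PruneType.ABS_VALUE, r -> r >= 0);
        // NONE is deliberately unmapped, so any ratio is rejected (assumption).
    }

    static boolean isValidPruneRatio(PruneType type, float ratio) {
        Predicate<Float> validator = VALIDATORS.get(type);
        return validator != null && validator.test(ratio);
    }
}
```

Compared with the switch, the map variant moves each rule into data, at the cost of boxing the ratio and an extra indirection per call.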
Test code:
/*
 * Copyright OpenSearch Contributors
 * SPDX-License-Identifier: Apache-2.0
 */
package org.opensearch.neuralsearch.util.prune;

import org.opensearch.test.OpenSearchTestCase;

import java.util.HashMap;
import java.util.Map;

public class PrunePerfTests extends OpenSearchTestCase {
    private static final int ITERATIONS = 100_000;

    interface PruneHandler {
        void handle(PruneType type);
    }

    private static final Map<PruneType, PruneHandler> handlerMap = new HashMap<>();

    static {
        handlerMap.put(PruneType.NONE, type -> handleNone());
        handlerMap.put(PruneType.TOP_K, type -> handleTopK());
        handlerMap.put(PruneType.ALPHA_MASS, type -> handleAlphaMass());
        handlerMap.put(PruneType.MAX_RATIO, type -> handleMaxRatio());
        handlerMap.put(PruneType.ABS_VALUE, type -> handleAbsValue());
    }

    public void testPerf() {
        warmup();
        long switchStart = System.nanoTime();
        testSwitch();
        long switchEnd = System.nanoTime();
        long mapStart = System.nanoTime();
        testMap();
        long mapEnd = System.nanoTime();
        System.out.printf("Switch method took: %.2f ms%n", (switchEnd - switchStart) / 1_000_000.0);
        System.out.printf("Map method took: %.2f ms%n", (mapEnd - mapStart) / 1_000_000.0);
    }

    private static void warmup() {
        for (int i = 0; i < 1000; i++) {
            testSwitch();
            testMap();
        }
    }

    private static void testSwitch() {
        PruneType[] types = PruneType.values();
        for (int i = 0; i < ITERATIONS; i++) {
            PruneType type = types[i % types.length];
            switch (type) {
                case NONE:
                    handleNone();
                    break;
                case TOP_K:
                    handleTopK();
                    break;
                case ALPHA_MASS:
                    handleAlphaMass();
                    break;
                case MAX_RATIO:
                    handleMaxRatio();
                    break;
                case ABS_VALUE:
                    handleAbsValue();
                    break;
            }
        }
    }

    private static void testMap() {
        PruneType[] types = PruneType.values();
        for (int i = 0; i < ITERATIONS; i++) {
            PruneType type = types[i % types.length];
            handlerMap.get(type).handle(type);
        }
    }

    private static void handleNone() {
    }

    private static void handleTopK() {
    }

    private static void handleAlphaMass() {
    }

    private static void handleMaxRatio() {
    }

    private static void handleAbsValue() {
    }
}
switch (pruneType) {
    case TOP_K:
        return pruneRatio > 0 && pruneRatio == Math.floor(pruneRatio);
Suggested change:
- return pruneRatio > 0 && pruneRatio == Math.floor(pruneRatio);
+ return pruneRatio > 0 && pruneRatio == Math.rint(pruneRatio);
This is more reliable for float numbers; otherwise there is a chance of a false positive.
It doesn't seem correct to replace floor with rint. By definition, rint returns the even integer when two integers are equally close to the input. I tested with input 3.5: the floor result is 3 but the rint result is 4.
Could you please give an example of a false positive?
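To make the two candidates concrete, here is a small sketch that compares them directly. The class and method names are hypothetical. It relies only on the documented behavior of Math.floor and Math.rint: the two functions disagree on inputs such as 3.5, yet comparing either result back to the input is true only when the input is already a whole number, so the two predicates return the same answer.

```java
// Illustrative only: compares the two whole-number checks discussed in this thread.
final class IntegralCheckSketch {

    // Check used in the reviewed code.
    static boolean isWholeViaFloor(float x) {
        return x == Math.floor(x);
    }

    // Check proposed in the suggestion.
    static boolean isWholeViaRint(float x) {
        return x == Math.rint(x);
    }

    public static void main(String[] args) {
        float[] samples = { 3.0f, 3.5f, 2.5f, 0.1f, 1e10f };
        for (float x : samples) {
            // floor and rint may differ, but the equality checks agree for every sample.
            System.out.printf(
                "x=%s floor=%s rint=%s viaFloor=%b viaRint=%b%n",
                x,
                Math.floor(x),
                Math.rint(x),
                isWholeViaFloor(x),
                isWholeViaRint(x)
            );
        }
    }
}
```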
    }
}

switch (pruneType) {
Same as above: can we use a map instead of a switch?
@martin-gaievski Thanks for the comments. We didn't create a feature branch because no other contributors are working on this, so we treat the PR branch as the feature branch. I'm on PTO this week; I will follow up on the app sec issue and address the comments next week.
src/main/java/org/opensearch/neuralsearch/util/prune/PruneUtils.java
 * @param pruneType The type of prune strategy
 * @throws IllegalArgumentException if prune type is null
 */
public static String getValidPruneRatioDescription(PruneType pruneType) {
[nit] this can be refactored to a static map.
Please refer to the discussion with Martin above.
src/main/java/org/opensearch/neuralsearch/util/prune/PruneUtils.java
Signed-off-by: zhichao-aws <zhichaog@amazon.com>
Description
Implement pruning for sparse vectors to save disk space and speed up search, at a small cost in search relevance. See #946.
Related Issues
#946
Check List
--signoff.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.