Commit

Merge branch '2.x' into backport/backport-852-to-2.x

Signed-off-by: Martin Gaievski <gaievski@amazon.com>
martin-gaievski authored Oct 14, 2024
2 parents 7bf1d16 + 69545ac commit 3e00c25
Showing 47 changed files with 1,348 additions and 139 deletions.
8 changes: 4 additions & 4 deletions .github/workflows/backwards_compatibility_tests_workflow.yml
@@ -15,8 +15,8 @@ jobs:
matrix:
java: [ 11, 17, 21 ]
os: [ubuntu-latest,windows-latest]
bwc_version : ["2.9.0","2.10.0","2.11.0","2.12.0","2.13.0","2.14.0","2.15.0","2.16.0"]
opensearch_version : [ "2.17.0-SNAPSHOT" ]
bwc_version : ["2.9.0","2.10.0","2.11.0","2.12.0","2.13.0","2.14.0","2.15.0","2.16.0","2.17.0"]
opensearch_version : [ "2.18.0-SNAPSHOT" ]

name: NeuralSearch Restart-Upgrade BWC Tests
runs-on: ${{ matrix.os }}
@@ -42,8 +42,8 @@ jobs:
matrix:
java: [ 11, 17, 21 ]
os: [ubuntu-latest,windows-latest]
bwc_version: [ "2.11.0","2.12.0","2.13.0","2.14.0","2.15.0", "2.16.0" ]
opensearch_version: [ "2.17.0-SNAPSHOT" ]
bwc_version: [ "2.11.0","2.12.0","2.13.0","2.14.0","2.15.0", "2.16.0", "2.17.0" ]
opensearch_version: [ "2.18.0-SNAPSHOT" ]

name: NeuralSearch Rolling-Upgrade BWC Tests
runs-on: ${{ matrix.os }}
2 changes: 1 addition & 1 deletion .github/workflows/changelog_verifier.yml
@@ -1,7 +1,7 @@
name: "Changelog Verifier"
on:
pull_request:
types: [opened, edited, review_requested, synchronize, reopened, ready_for_review, labeled, unlabeled]
types: [opened, synchronize, reopened, ready_for_review, labeled, unlabeled]

jobs:
# Enforces the update of a changelog file on every pull request
31 changes: 24 additions & 7 deletions .gitignore
@@ -1,10 +1,27 @@
# Ignore Gradle project-specific cache directory
.gradle
# intellij files
.idea/
*.iml
*.ipr
*.iws
*.log
build-idea/
out/

# Ignore Gradle build output directory
build
.idea
.DS_Store
.gitattributes
# eclipse files
.classpath
.project
.settings

# gradle stuff
.gradle/
build/
bin/

# vscode stuff
.vscode/

# osx stuff
.DS_Store

# git stuff
.gitattributes
5 changes: 4 additions & 1 deletion CHANGELOG.md
@@ -7,14 +7,17 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
### Features
### Enhancements
### Bug Fixes
- Fix for nested field missing sub embedding field in text embedding processor ([#913](https://github.com/opensearch-project/neural-search/pull/913))
### Infrastructure
### Documentation
### Maintenance
### Refactoring

## [Unreleased 2.x](https://github.com/opensearch-project/neural-search/compare/2.16...2.x)
## [Unreleased 2.x](https://github.com/opensearch-project/neural-search/compare/2.17...2.x)
### Features
### Enhancements
- Implement `ignore_missing` field in text chunking processors ([#907](https://github.com/opensearch-project/neural-search/pull/907))
- Added rescorer in hybrid query ([#917](https://github.com/opensearch-project/neural-search/pull/917))
### Bug Fixes
### Infrastructure
- Update batch related tests to use batch_size in processor & refactor BWC version check ([#852](https://github.com/opensearch-project/neural-search/pull/852))
48 changes: 47 additions & 1 deletion DEVELOPER_GUIDE.md
@@ -11,10 +11,12 @@
- [Run Single-node Cluster Locally](#run-single-node-cluster-locally)
- [Run Multi-node Cluster Locally](#run-multi-node-cluster-locally)
- [Debugging](#debugging)
- [Major Dependencies](#major-dependencies)
- [Backwards Compatibility Testing](#backwards-compatibility-testing)
- [Adding new tests](#adding-new-tests)
- [Supported configurations](#supported-configurations)
- [Submitting Changes](#submitting-changes)
- [Building On Lucene Version Updates](#building-on-lucene-version-updates)

# Developer Guide

@@ -88,12 +90,22 @@ Please follow these formatting guidelines:
OpenSearch neural-search uses a [Gradle](https://docs.gradle.org/6.6.1/userguide/userguide.html) wrapper for its build.
Run `gradlew` on Unix systems.

Build OpenSearch neural-search using `gradlew build`
Build OpenSearch neural-search using `gradlew build`. This command will
also run Integration Tests and Unit Tests.

```
./gradlew build
```

## Run Unit Tests
If you only want to verify that the unit tests pass, you can run the following.

```
./gradlew test
```


## Run OpenSearch neural-search

### Run Single-node Cluster Locally
@@ -227,6 +239,12 @@ Additionally, it is possible to attach one debugger to the cluster JVM and anoth
./gradlew :integTest -Dtest.debug=1 -Dcluster.debug=1
```

#### Major Dependencies
Currently, the major dependencies that Neural Search depends on are [ML-Commons](https://github.com/opensearch-project/ml-commons) and [K-NN](https://github.com/opensearch-project/k-NN).
Make sure to check on them when you observe a failure that affects Neural Search.
See [Building on Lucene Version updates](#building-on-lucene-version-updates) for an example where K-NN caused a build failure.
Also, please note that it may take time for the developers of those dependencies to fix the issue.

## Backwards Compatibility Testing

The purpose of Backwards Compatibility Testing and different types of BWC tests are explained [here](https://github.com/opensearch-project/opensearch-plugins/blob/main/TESTING.md#backwards-compatibility-testing). The BWC tests (i.e. Restart-Upgrade, Mixed-Cluster and Rolling-Upgrade scenarios) should be added with any new feature being added to Neural Search.
@@ -292,3 +310,31 @@ original PR with an appropriate label `backport <backport-branch-name>` is merge
run successfully on the PR. For example, if a PR on main needs to be backported to `2.x` branch, add a label
`backport 2.x` to the PR and make sure the backport workflow runs on the PR along with other checks. Once this PR is
merged to main, the workflow will create a backport PR to the `2.x` branch.
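For instance, one way to apply that label from the command line is with the GitHub CLI; the PR number below is a placeholder, and this assumes `gh` is installed and authenticated:

```
# hypothetical example: label PR #123 so the backport workflow targets the 2.x branch
gh pr edit 123 --add-label "backport 2.x"
```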

## Building On Lucene Version Updates
A Lucene version update can break your workflow, causing errors like
`java.lang.NoClassDefFoundError: org/apache/lucene/codecs/lucene99/Lucene99Codec` or
`Provider org.opensearch.knn.index.codec.KNN910Codec.KNN910Codec could not be instantiated`. In that case
there is likely an issue with the [K-NN](https://github.com/opensearch-project/k-NN) dependency,
which prevents `./gradlew run` or `./gradlew build` from succeeding.

See this [K-NN PR](https://github.com/opensearch-project/k-NN/pull/2195) for an example of this happening, or this [Neural Search PR](https://github.com/opensearch-project/neural-search/pull/913#issuecomment-2400189329) in which a developer works
through the same build issue.

**Follow these steps to remedy the `./gradlew run` issue** (the same commands are collected in a single snippet after the list):
1. From your cloned neural-search repo root directory, run `rm -rf build .gradle`
2. Clear the following directories from the Gradle folder in your home directory
   1. `cd ~/.gradle`
   2. `rm -rf caches workers wrapper daemon`
   3. `cd -` to switch back to the previous directory (i.e. the neural-search repo root directory)
3. Finally, run `./gradlew run`
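A minimal consolidated sketch of the same commands (this assumes the default Gradle home at `~/.gradle`):

```
# from the neural-search repo root
rm -rf build .gradle
# clear the Gradle home caches (default ~/.gradle location assumed)
rm -rf ~/.gradle/caches ~/.gradle/workers ~/.gradle/wrapper ~/.gradle/daemon
./gradlew run
```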

**Follow these steps to remedy the `./gradlew build` issue** (the same commands are collected in a single snippet after the list):

**PREREQ:** Make sure you have the OpenSearch repo cloned locally.

1. From your cloned neural-search repo root directory, run `rm -rf build .gradle`
2. Delete the `.gradle` folder and `.m2` folder from your home directory: `rm -rf ~/.gradle ~/.m2`
3. Head over to your cloned OpenSearch repo root directory
   1. Run `./gradlew publishToMavenLocal`
4. Finally, run `./gradlew build` from the neural-search repo root directory
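A minimal consolidated sketch of the same sequence; the `../OpenSearch` path below is a placeholder for wherever your OpenSearch clone lives:

```
# from the neural-search repo root
rm -rf build .gradle
rm -rf ~/.gradle ~/.m2
# publish OpenSearch artifacts to the local Maven repository, then return
cd ../OpenSearch && ./gradlew publishToMavenLocal && cd -
./gradlew build
```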
2 changes: 1 addition & 1 deletion build.gradle
@@ -14,7 +14,7 @@ import java.util.concurrent.Callable

buildscript {
ext {
opensearch_version = System.getProperty("opensearch.version", "2.17.0-SNAPSHOT")
opensearch_version = System.getProperty("opensearch.version", "2.18.0-SNAPSHOT")
buildVersionQualifier = System.getProperty("build.version_qualifier", "")
isSnapshot = "true" == System.getProperty("build.snapshot", "true")
version_tokens = opensearch_version.tokenize('-')
4 changes: 2 additions & 2 deletions gradle.properties
@@ -7,8 +7,8 @@
# https://github.com/opensearch-project/OpenSearch/blob/main/libs/core/src/main/java/org/opensearch/Version.java .
# Wired compatibility of OpenSearch works like 3.x version is compatible with 2.(latest-major) version.
# Therefore, to run rolling-upgrade BWC Test on local machine the BWC version here should be set 2.(latest-major).
systemProp.bwc.version=2.17.0-SNAPSHOT
systemProp.bwc.bundle.version=2.16.0
systemProp.bwc.version=2.18.0-SNAPSHOT
systemProp.bwc.bundle.version=2.17.0

# For fixing Spotless check with Java 17
org.gradle.jvmargs=--add-exports jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED \
@@ -18,6 +18,8 @@
import static org.opensearch.neuralsearch.util.TestUtils.TEXT_EMBEDDING_PROCESSOR;
import static org.opensearch.neuralsearch.util.TestUtils.DEFAULT_NORMALIZATION_METHOD;
import static org.opensearch.neuralsearch.util.TestUtils.DEFAULT_COMBINATION_METHOD;

import org.opensearch.knn.index.query.rescore.RescoreContext;
import org.opensearch.neuralsearch.query.HybridQueryBuilder;
import org.opensearch.neuralsearch.query.NeuralQueryBuilder;

@@ -69,8 +71,10 @@ private void validateNormalizationProcessor(final String fileName, final String
modelId = getModelId(getIngestionPipeline(pipelineName), TEXT_EMBEDDING_PROCESSOR);
loadModel(modelId);
addDocuments(getIndexNameForTest(), false);
validateTestIndex(modelId, getIndexNameForTest(), searchPipelineName);
validateTestIndex(modelId, getIndexNameForTest(), searchPipelineName, Map.of("ef_search", 100));
HybridQueryBuilder hybridQueryBuilder = getQueryBuilder(modelId, null, null);
validateTestIndex(getIndexNameForTest(), searchPipelineName, hybridQueryBuilder);
hybridQueryBuilder = getQueryBuilder(modelId, Map.of("ef_search", 100), RescoreContext.getDefault());
validateTestIndex(getIndexNameForTest(), searchPipelineName, hybridQueryBuilder);
} finally {
wipeOfTestResources(getIndexNameForTest(), pipelineName, modelId, searchPipelineName);
}
Expand Down Expand Up @@ -98,15 +102,10 @@ private void createSearchPipeline(final String pipelineName) {
);
}

private void validateTestIndex(final String modelId, final String index, final String searchPipeline) {
validateTestIndex(modelId, index, searchPipeline, null);
}

private void validateTestIndex(final String modelId, final String index, final String searchPipeline, Map<String, ?> methodParameters) {
private void validateTestIndex(final String index, final String searchPipeline, HybridQueryBuilder queryBuilder) {
int docCount = getDocCount(index);
assertEquals(6, docCount);
HybridQueryBuilder hybridQueryBuilder = getQueryBuilder(modelId, methodParameters);
Map<String, Object> searchResponseAsMap = search(index, hybridQueryBuilder, null, 1, Map.of("search_pipeline", searchPipeline));
Map<String, Object> searchResponseAsMap = search(index, queryBuilder, null, 1, Map.of("search_pipeline", searchPipeline));
assertNotNull(searchResponseAsMap);
int hits = getHitCount(searchResponseAsMap);
assertEquals(1, hits);
@@ -116,7 +115,7 @@ private void validateTestIndex(final String modelId, final String index, final S
}
}

private HybridQueryBuilder getQueryBuilder(final String modelId, Map<String, ?> methodParameters) {
private HybridQueryBuilder getQueryBuilder(final String modelId, Map<String, ?> methodParameters, RescoreContext rescoreContext) {
NeuralQueryBuilder neuralQueryBuilder = new NeuralQueryBuilder();
neuralQueryBuilder.fieldName("passage_embedding");
neuralQueryBuilder.modelId(modelId);
@@ -125,6 +124,9 @@ private HybridQueryBuilder getQueryBuilder(final String modelId, Map<String, ?>
if (methodParameters != null) {
neuralQueryBuilder.methodParameters(methodParameters);
}
if (rescoreContext != null) {
neuralQueryBuilder.rescoreContext(rescoreContext);
}

MatchQueryBuilder matchQueryBuilder = new MatchQueryBuilder("text", QUERY);

@@ -61,6 +61,7 @@ private void validateIndexQuery(final String modelId) {
0.01f,
null,
null,
null,
null
);
Map<String, Object> responseWithMinScoreQuery = search(getIndexNameForTest(), neuralQueryBuilderWithMinScoreQuery, 1);
@@ -76,6 +77,7 @@
null,
null,
null,
null,
null
);
Map<String, Object> responseWithMaxDistanceQuery = search(getIndexNameForTest(), neuralQueryBuilderWithMaxDistanceQuery, 1);
@@ -63,6 +63,7 @@ private void validateTestIndex(final String modelId) throws Exception {
null,
null,
null,
null,
null
);
Map<String, Object> response = search(getIndexNameForTest(), neuralQueryBuilder, 1);
30 changes: 29 additions & 1 deletion qa/rolling-upgrade/build.gradle
@@ -84,12 +84,19 @@ task testAgainstOldCluster(type: StandaloneRestIntegTestTask) {

// Excluding the tests because we introduce these features in 2.13
if (versionsBelow2_13.any { ext.neural_search_bwc_version.startsWith(it) }){
filter {
filter {
excludeTestsMatching "org.opensearch.neuralsearch.bwc.NeuralQueryEnricherProcessorIT.testNeuralQueryEnricherProcessor_NeuralSparseSearch_E2EFlow"
excludeTestsMatching "org.opensearch.neuralsearch.bwc.TextChunkingProcessorIT.*"
}
}

// Excluding the test because hybrid query with rescore is not compatible with 2.14 and lower
if (versionsBelow2_15.any { ext.neural_search_bwc_version.startsWith(it) }){
filter {
excludeTestsMatching "org.opensearch.neuralsearch.bwc.HybridSearchWithRescoreIT.*"
}
}

// Excluding the k-NN radial search and batch ingestion tests because we introduce these features in 2.14
if (versionsBelow2_14.any { ext.neural_search_bwc_version.startsWith(it) }){
filter {
@@ -154,6 +161,13 @@ task testAgainstOneThirdUpgradedCluster(type: StandaloneRestIntegTestTask) {
excludeTestsMatching "org.opensearch.neuralsearch.bwc.TextChunkingProcessorIT.*"
}
}

// Excluding the test because hybrid query with rescore is not compatible with 2.14 and lower
if (versionsBelow2_15.any { ext.neural_search_bwc_version.startsWith(it) }){
filter {
excludeTestsMatching "org.opensearch.neuralsearch.bwc.HybridSearchWithRescoreIT.*"
}
}

// Excluding the k-NN radial search and batch ingestion tests because we introduce these features in 2.14
if (versionsBelow2_14.any { ext.neural_search_bwc_version.startsWith(it) }){
@@ -219,6 +233,13 @@ task testAgainstTwoThirdsUpgradedCluster(type: StandaloneRestIntegTestTask) {
excludeTestsMatching "org.opensearch.neuralsearch.bwc.TextChunkingProcessorIT.*"
}
}

// Excluding the test because hybrid query with rescore is not compatible with 2.14 and lower
if (versionsBelow2_15.any { ext.neural_search_bwc_version.startsWith(it) }){
filter {
excludeTestsMatching "org.opensearch.neuralsearch.bwc.HybridSearchWithRescoreIT.*"
}
}

// Excluding the k-NN radial search and batch ingestion tests because we introduce these features in 2.14
if (versionsBelow2_14.any { ext.neural_search_bwc_version.startsWith(it) }){
@@ -284,6 +305,13 @@ task testRollingUpgrade(type: StandaloneRestIntegTestTask) {
excludeTestsMatching "org.opensearch.neuralsearch.bwc.TextChunkingProcessorIT.*"
}
}

// Excluding the test because hybrid query with rescore is not compatible with 2.14 and lower
if (versionsBelow2_15.any { ext.neural_search_bwc_version.startsWith(it) }){
filter {
excludeTestsMatching "org.opensearch.neuralsearch.bwc.HybridSearchWithRescoreIT.*"
}
}

// Excluding the k-NN radial search and batch ingestion tests because we introduce these features in 2.14
if (versionsBelow2_14.any { ext.neural_search_bwc_version.startsWith(it) }){