[segment replication]Introducing common Replication interfaces for segment replication and recovery code paths #3234

Merged

@Poojita-Raj Poojita-Raj commented May 6, 2022

Signed-off-by: Poojita Raj poojiraj@amazon.com

Description

While building the segment replication feature, we found code paths that overlap between segment replication and recovery. To avoid that redundancy, we have extracted marker classes that define the common interface for such replication processes; they are listed below. This PR covers the recovery code paths, while the segment replication paths will be introduced in a separate PR.

This refactoring, which prepares for the segment replication changes, includes the following:

  • Introduces the ReplicationState interface, which represents a state object used to track the copying of segments from an external source. Implemented by RecoveryState as part of this PR.
  • Introduces the ReplicationListener interface for listeners that run when there is a change in ReplicationState. Implemented by RecoveryListener as part of this PR.
  • Introduces the ReplicationTarget abstract class, which represents the target of an operation performed on a shard. Extended by RecoveryTarget as part of this PR, where recovery is the operation.
  • Introduces the ReplicationCollection generic class, which holds all ongoing events on the current node; the events are bounded by type <T extends ReplicationTarget>.
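
The relationship between these four types can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the PR: the wrapper class `ReplicationSketch`, the `start`/`get`/`markAsDone` methods, the `long` ids and the empty `RecoveryState` body are assumptions made for the example, and `RuntimeException` stands in for the real exception type.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch only: simplified stand-ins, not the classes from the PR.
class ReplicationSketch {

    /** Marker interface for a state object that tracks segment copying. */
    interface ReplicationState {}

    /** Callbacks that run when the tracked state changes. */
    interface ReplicationListener {
        void onDone(ReplicationState state);

        void onFailure(ReplicationState state, RuntimeException e, boolean sendShardFailure);
    }

    /** Target of an operation performed on a shard. */
    abstract static class ReplicationTarget {
        private static final AtomicLong ID_GENERATOR = new AtomicLong();
        private final long id = ID_GENERATOR.incrementAndGet();
        private final ReplicationListener listener;

        ReplicationTarget(ReplicationListener listener) {
            this.listener = listener;
        }

        long getId() {
            return id;
        }

        abstract ReplicationState state();

        void markAsDone() {
            listener.onDone(state());
        }
    }

    /** Recovery is one concrete kind of state... */
    static final class RecoveryState implements ReplicationState {}

    /** ...and one concrete kind of target. */
    static final class RecoveryTarget extends ReplicationTarget {
        private final RecoveryState state = new RecoveryState();

        RecoveryTarget(ReplicationListener listener) {
            super(listener);
        }

        @Override
        RecoveryState state() {
            return state;
        }
    }

    /** Holds ongoing events; the type bound restricts each collection to one kind of target. */
    static final class ReplicationCollection<T extends ReplicationTarget> {
        private final Map<Long, T> onGoing = new ConcurrentHashMap<>();

        long start(T target) {
            T existing = onGoing.putIfAbsent(target.getId(), target);
            if (existing != null) {
                throw new IllegalStateException("found two targets with the same id");
            }
            return target.getId();
        }

        T get(long id) {
            return onGoing.get(id);
        }
    }

    public static void main(String[] args) {
        ReplicationListener listener = new ReplicationListener() {
            @Override
            public void onDone(ReplicationState state) {
                System.out.println("done: " + state.getClass().getSimpleName());
            }

            @Override
            public void onFailure(ReplicationState state, RuntimeException e, boolean sendShardFailure) {
                System.out.println("failed: " + e.getMessage());
            }
        };
        ReplicationCollection<RecoveryTarget> recoveries = new ReplicationCollection<>();
        long id = recoveries.start(new RecoveryTarget(listener));
        recoveries.get(id).markAsDone();
    }
}
```

In this sketch a segment replication target would be a second subclass of `ReplicationTarget`, held in its own `ReplicationCollection` instance.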

This is a part of the process of merging our feature branch - feature/segment-replication - back into main by re-PRing our changes from the feature branch.
The breakdown of the plan to merge to main is detailed here: #2355
For added context on segment replication - here's the design proposal #2229

Issues Resolved

Resolves the third issue in #2926

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@Poojita-Raj Poojita-Raj requested review from a team and reta as code owners May 6, 2022 20:01
@opensearch-ci-bot
Collaborator

❌   Gradle Check failure 74ff68895144a5683ac41e51eceb05842ff7f9db
Log 5102

Reports 5102

@peterzhuamazon
Member

start gradle check

@opensearch-ci-bot
Collaborator

❌   Gradle Check failure 74ff68895144a5683ac41e51eceb05842ff7f9db
Log 5118

Reports 5118

@peterzhuamazon
Member

start gradle check

@opensearch-ci-bot
Collaborator

❌   Gradle Check failure 74ff68895144a5683ac41e51eceb05842ff7f9db
Log 5129

Reports 5129

@dblock dblock requested a review from kartg May 10, 2022 15:32
@opensearch-ci-bot
Collaborator

✅   Gradle Check success 52b02c624e201fdc470d7887e55d3bf871c7ef3f
Log 5202

Reports 5202


// last time the target/status was accessed
private volatile long lastAccessTime = System.nanoTime();
private final RecoveryRequestTracker requestTracker = new RecoveryRequestTracker();
Member

nit - Should RecoveryRequestTracker be renamed here to ReplicationRequestTracker or something more generic ?

I think we will want to use this tracker while copying for segrep as well, to ensure chunk requests to replicas are idempotent.
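
To illustrate what idempotent chunk handling means here, the sketch below drops a chunk request whose sequence number has already been seen, so a retried request is applied only once. It is a simplified assumption for illustration, not the tracker class under review: `RequestTrackerSketch`, `RequestTracker.markReceived`, `ChunkWriter` and the string "file" are all hypothetical.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch only: a hypothetical tracker, not the class from the PR.
class RequestTrackerSketch {

    /** Remembers which request sequence numbers have already been received. */
    static final class RequestTracker {
        private final Set<Long> seen = ConcurrentHashMap.newKeySet();

        /** Returns true only the first time a given sequence number is seen. */
        boolean markReceived(long requestSeqNo) {
            return seen.add(requestSeqNo);
        }
    }

    /** Applies each chunk at most once, even if the sender retries a request. */
    static final class ChunkWriter {
        private final RequestTracker tracker = new RequestTracker();
        private final StringBuilder file = new StringBuilder();

        synchronized void onChunk(long requestSeqNo, String chunk) {
            if (tracker.markReceived(requestSeqNo)) {
                file.append(chunk);
            }
            // A duplicate request is ignored, so the retry has no further effect.
        }

        synchronized String contents() {
            return file.toString();
        }
    }

    public static void main(String[] args) {
        ChunkWriter writer = new ChunkWriter();
        writer.onChunk(0, "ab");
        writer.onChunk(1, "cd");
        writer.onChunk(1, "cd"); // retried request
        System.out.println(writer.contents()); // prints "abcd"
    }
}
```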


package org.opensearch.indices.replication.common;

public class ReplicationState {
Member

I think this should be declared as abstract?

@@ -2871,7 +2872,7 @@ protected Engine getEngineOrNull() {
public void startRecovery(
RecoveryState recoveryState,
PeerRecoveryTargetService recoveryTargetService,
- PeerRecoveryTargetService.RecoveryListener recoveryListener,
+ ReplicationListener recoveryListener,
Member

Having this entire chain of recovery code operate on a ReplicationListener doesn't seem right to me. How about this?

  1. Extract the private RecoveryListener class in IndicesClusterStateService to a top level class, and
  2. Have the entire recovery chain use the RecoveryListener class instead

Wdyt?

}

@Override
- public void onRecoveryFailure(RecoveryState state, RecoveryFailedException e, boolean sendShardFailure) {
+ public void onFailure(ReplicationState state, OpenSearchException e, boolean sendShardFailure) {
Member

Non-blocker - OpenSearchException seems too broad here. We should either move RecoveryFailedException to ReplicationFailedException or create parent/child exception classes

Contributor Author

RecoveryFailedException is a subclass of OpenSearchException. I can add an assert to ensure that a RecoveryFailedException is passed in. The parameter is typed as OpenSearchException so that RecoveryListener can inherit from ReplicationListener and the abstract class stays more generic.
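
The trade-off discussed in this exchange can be sketched as below: the generic listener accepts the broad exception type, and the recovery-specific implementation checks that it received the narrower one. All types are simplified stand-ins declared inside the hypothetical `ListenerSketch` class, and the sketch uses an explicit `instanceof` check rather than `assert` so that it behaves the same whether or not assertions are enabled.

```java
// Illustrative sketch only: the exception and listener types are simplified stand-ins.
class ListenerSketch {

    static class OpenSearchException extends RuntimeException {
        OpenSearchException(String message) {
            super(message);
        }
    }

    static class RecoveryFailedException extends OpenSearchException {
        RecoveryFailedException(String message) {
            super(message);
        }
    }

    interface ReplicationState {}

    static final class RecoveryState implements ReplicationState {}

    /** Generic listener: typed against the broad exception so any replication event can use it. */
    interface ReplicationListener {
        void onDone(ReplicationState state);

        void onFailure(ReplicationState state, OpenSearchException e, boolean sendShardFailure);
    }

    /** Recovery-specific listener: narrows the exception type at runtime. */
    static final class RecoveryListener implements ReplicationListener {
        String lastFailure;

        @Override
        public void onDone(ReplicationState state) {}

        @Override
        public void onFailure(ReplicationState state, OpenSearchException e, boolean sendShardFailure) {
            if (!(e instanceof RecoveryFailedException)) {
                throw new IllegalArgumentException("expected a RecoveryFailedException", e);
            }
            lastFailure = e.getMessage();
        }
    }

    public static void main(String[] args) {
        RecoveryListener listener = new RecoveryListener();
        listener.onFailure(new RecoveryState(), new RecoveryFailedException("source node left"), true);
        System.out.println(listener.lastFailure);
    }
}
```

The alternative raised in the review, a `ReplicationFailedException` parent with recovery-specific children, would move this check from runtime to the type system.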

}

/**
* return the last time this RecoveryStatus was used (based on System.nanoTime()
Member

Nit - There are several mentions of "recovery" in this file which should be replaced by "replication"

@opensearch-ci-bot
Collaborator

❌   Gradle Check failure 53bb04e17a815c9d41a1d627029c2b5dc96676fa
Log 5264

Reports 5264

…s from ReplicationTarget

Signed-off-by: Poojita Raj <poojiraj@amazon.com>
…nState

Signed-off-by: Poojita Raj <poojiraj@amazon.com>
@opensearch-ci-bot
Collaborator

❌   Gradle Check failure b48f58fed1a701cf736901f32e864f576960e57c
Log 5266

Reports 5266

@opensearch-ci-bot
Collaborator

❌   Gradle Check failure 8f2c3d51c06f3b6a70ef60afe58b4dfac07e8d3e
Log 5268

Reports 5268

…ess review comments

Signed-off-by: Poojita Raj <poojiraj@amazon.com>
@opensearch-ci-bot
Collaborator

❌   Gradle Check failure ba51c17
Log 5273

Reports 5273

Signed-off-by: Poojita Raj <poojiraj@amazon.com>
@opensearch-ci-bot
Collaborator

✅   Gradle Check success d7fc756
Log 5287

Reports 5287

return shardStateAction;
}

public ClusterService getClusterService() {
Collaborator

Not used, please remove

@Override
public void onDone(ReplicationState state) {
RecoveryState RecState = (RecoveryState) state;
indicesClusterStateService.getShardStateAction()
Collaborator

@reta reta May 13, 2022

I think it is a better option to introduce a handleRecoveryDone (or handleRecoverySuccess) method in indicesClusterStateService and encapsulate this logic inside it. That will also let us drop the unnecessary getShardStateActionListener() and getShardStateAction() methods, and it will nicely complement the handleRecoveryFailure method.
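
A rough sketch of the encapsulation suggested here: the listener calls a single method on the service, and the service keeps its collaborators private instead of exposing them through getters. Everything below is a hypothetical stand-in (`EncapsulationSketch`, `ClusterStateServiceSketch`, `RecoveryListenerSketch`, string shard ids); the list of reported events stands in for whatever the real service does with its shard state action.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: all types here are hypothetical stand-ins.
class EncapsulationSketch {

    /** Stand-in for the service that reports shard state to the cluster. */
    static final class ClusterStateServiceSketch {
        // Stands in for the shard state action; deliberately not exposed through a getter.
        private final List<String> reported = new ArrayList<>();

        void handleRecoveryDone(String shardId, long primaryTerm) {
            reported.add("started " + shardId + " term=" + primaryTerm);
        }

        void handleRecoveryFailure(String shardId, Exception failure) {
            reported.add("failed " + shardId + ": " + failure.getMessage());
        }

        /** Returns a copy so callers cannot modify the internal list. */
        List<String> reported() {
            return new ArrayList<>(reported);
        }
    }

    /** The listener only knows the two handle* entry points. */
    static final class RecoveryListenerSketch {
        private final ClusterStateServiceSketch service;
        private final String shardId;
        private final long primaryTerm;

        RecoveryListenerSketch(ClusterStateServiceSketch service, String shardId, long primaryTerm) {
            this.service = service;
            this.shardId = shardId;
            this.primaryTerm = primaryTerm;
        }

        void onDone() {
            service.handleRecoveryDone(shardId, primaryTerm);
        }

        void onFailure(Exception e) {
            service.handleRecoveryFailure(shardId, e);
        }
    }

    public static void main(String[] args) {
        ClusterStateServiceSketch service = new ClusterStateServiceSketch();
        RecoveryListenerSketch listener = new RecoveryListenerSketch(service, "[index][0]", 3L);
        listener.onDone();
        System.out.println(service.reported());
    }
}
```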

return recoveryId;
}

public ShardId shardId() {
Collaborator

Not particularly biased, but supporting public ShardId shardId() as a shortcut to indexShard().getId() could be handy (too many recoveryTarget.indexShard().getId() calls)

Member

+1, I think we should add this to ReplicationTarget.

return STAGES[id];
}
}
public class RecoveryState extends ReplicationState implements ToXContentFragment, Writeable {
Collaborator

@Bukhtawar Bukhtawar May 13, 2022

Why can't we have two separate Recovery and Replication states, without RecoveryState inheriting from ReplicationState? For instance, there are other modes of recovery, such as snapshot recovery, that probably would not be termed replication (which is what "Recovery inherits Replication" implies).
The current inheritance model sounds confusing.

Member

I agree the inheritance model is a bit confusing. I know this comes from trying to reduce duplication between segrep and peer recovery paths, but replication will not go through all of these stages.

I'm thinking that we can conceptually remove the inheritance of a "Recovery" target from "ReplicationTarget" in addition to state. The pieces segrep will reuse here are the lifecycle steps (cancel, done, fail, retry & managing cancellableThreads) and not the methods from RecoveryTargetHandler. This would make it something like "ShardEventTarget"? I think this would open the door to refactoring RecoveriesCollection into a reusable component as well, which segrep will need a version of.

Member

My $0.02 - the inheritance model seems confusing due to how some of these classes are named and the overloaded meaning behind some terms. What we're trying to express with this structure is that there is a common underlying mechanic to these two situations - that of copying segment files. We're choosing to call this common functionality "replication" (since segment files are being replicated) but I can see how that could be confused with "shard replication" between primaries and replicas.

Further, this file-copying mechanic is reused by peer recovery, but its related classes (like RecoverySource and RecoveryTarget) don't clarify that they are only used for peer recovery.

I don't have any suggestions on how to express this better, so I'm completely open to feedback :)


protected void ensureRefCount() {
if (refCount() <= 0) {
throw new OpenSearchException("RecoveryStatus is used but it's refcount is 0. Probably a mismatch between incRef/decRef calls");
Collaborator

Suggested change
- throw new OpenSearchException("RecoveryStatus is used but it's refcount is 0. Probably a mismatch between incRef/decRef calls");
+ throw new OpenSearchException("ReplicationTarget is used but it's refcount is 0. Probably a mismatch between incRef/decRef calls");
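
For context, the guard in this snippet protects against using a target after its last reference has been released. The sketch below shows the general reference-counting pattern under simplified assumptions; `RefCountSketch`, `RefCountedTarget`, `store()` and the use of `IllegalStateException` are hypothetical and are not the base class the PR relies on.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only: a hypothetical ref-counted target, not the class from the PR.
class RefCountSketch {

    static class RefCountedTarget {
        // The creator holds the initial reference.
        private final AtomicInteger refCount = new AtomicInteger(1);
        private volatile boolean closed;

        /** Takes an extra reference; fails once the count has already dropped to zero. */
        void incRef() {
            int current;
            do {
                current = refCount.get();
                if (current <= 0) {
                    throw new IllegalStateException("target is already closed");
                }
            } while (!refCount.compareAndSet(current, current + 1));
        }

        /** Releases one reference and cleans up when the last one is gone. */
        void decRef() {
            int remaining = refCount.decrementAndGet();
            if (remaining == 0) {
                closeInternal();
            } else if (remaining < 0) {
                throw new IllegalStateException("mismatch between incRef/decRef calls");
            }
        }

        int refCount() {
            return refCount.get();
        }

        boolean isClosed() {
            return closed;
        }

        protected void closeInternal() {
            closed = true;
        }

        /** Guard used by accessors so that a released target is never touched. */
        protected void ensureRefCount() {
            if (refCount() <= 0) {
                throw new IllegalStateException("target is used but its refcount is 0");
            }
        }

        String store() {
            ensureRefCount();
            return "store";
        }
    }

    public static void main(String[] args) {
        RefCountedTarget target = new RefCountedTarget();
        target.incRef();
        System.out.println(target.store());
        target.decRef();
        target.decRef(); // last reference released; store() would now throw
        System.out.println(target.isClosed());
    }
}
```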

}

private void startRecoveryInternal(RecoveryTarget recoveryTarget, TimeValue activityTimeout) {
- RecoveryTarget existingTarget = onGoingRecoveries.putIfAbsent(recoveryTarget.recoveryId(), recoveryTarget);
+ RecoveryTarget existingTarget = onGoingRecoveries.putIfAbsent(recoveryTarget.getId(), recoveryTarget);
assert existingTarget == null : "found two RecoveryStatus instances with the same id";
Collaborator

Suggested change
- assert existingTarget == null : "found two RecoveryStatus instances with the same id";
+ assert existingTarget == null : "found two RecoveryTarget instances with the same id";

*
* @opensearch.internal
*/
- public class RecoveryRequestTracker {
+ public class ReplicationRequestTracker {
Member

This class has use cases outside of Recovery or Replication. What about TransportRequestTracker?

Member

Or just RequestTracker ? Also @mch2 I only see this used in RecoveryTarget. What other use cases did you spot?

Member

Sorry, I didn't articulate this well; "use case" was the wrong word. I meant that the functionality this component provides is not tied to "replication" or "recovery"; it's simply tracking incoming requests. I think it's OK as is, it's a nit-pick.

Signed-off-by: Poojita Raj <poojiraj@amazon.com>
@opensearch-ci-bot
Collaborator

✅   Gradle Check success b84c184
Log 5404

Reports 5404

Signed-off-by: Poojita Raj <poojiraj@amazon.com>
Signed-off-by: Poojita Raj <poojiraj@amazon.com>
Signed-off-by: Poojita Raj <poojiraj@amazon.com>
@opensearch-ci-bot
Collaborator

❌   Gradle Check failure 8447624
Log 5427

Reports 5427

@opensearch-ci-bot
Collaborator

✅   Gradle Check success 5edeae3
Log 5428

Reports 5428

@opensearch-ci-bot
Collaborator

✅   Gradle Check success 47379df
Log 5430

Reports 5430

@dblock dblock requested review from mch2 and Bukhtawar May 18, 2022 21:03
@Poojita-Raj Poojita-Raj changed the title RecoveryState inherits from ReplicationState + RecoveryTarget inherit… [segment replication]Introducing common Replication interfaces for segment replication and recovery code paths May 18, 2022
@Poojita-Raj Poojita-Raj requested a review from kartg May 18, 2022 21:39
Signed-off-by: Poojita Raj <poojiraj@amazon.com>
@opensearch-ci-bot
Collaborator

✅   Gradle Check success 233acbc
Log 5461

Reports 5461

Member

@mch2 mch2 left a comment

I like this more than the first revision, particularly with the change to RecoveriesCollection so it can be reused.

It's still a bit weird to me to have RecoveryState implement a "ReplicationState" interface, but I like this more now that Segment Replication and Recovery have completely separate stages. I can't really think of a better name for that interface that isn't too generic.

@Bukhtawar thoughts?

failAndRemoveShard(shardRouting, sendShardFailure, "failed recovery", failure, clusterService.state());
}

public void handleRecoveryDone(ReplicationState state, ShardRouting shardRouting, long primaryTerm) {
RecoveryState RecState = (RecoveryState) state;
Member

I think you can change the method parameter type to RecoveryState, and this casting can be removed

@@ -109,27 +90,15 @@ public class RecoveryTarget extends AbstractRefCounted implements RecoveryTarget
* @param sourceNode source node of the recovery where we recover from
* @param listener called when recovery is completed/failed
*/
- public RecoveryTarget(IndexShard indexShard, DiscoveryNode sourceNode, PeerRecoveryTargetService.RecoveryListener listener) {
- super("recovery_status");
+ public RecoveryTarget(IndexShard indexShard, DiscoveryNode sourceNode, ReplicationListener listener) {
Member

If this is a "Recovery" target class, shouldn't this be a RecoveryListener ?

Contributor Author

Since RecoveryTarget extends ReplicationTarget, its constructor takes a ReplicationListener. Callers of RecoveryTarget pass in a RecoveryListener, which implements the ReplicationListener interface.

Comment on lines 96 to 102
* Resets the recovery and performs a recovery restart on the currently recovering index shard
*
* @see IndexShard#performRecoveryRestart()
* @return newly created RecoveryTarget
*/
@SuppressWarnings(value = "unchecked")
public T resetRecovery(final long recoveryId, final TimeValue activityTimeout) {
Member

Nitpick - this javadoc and code shouldn't use "recovery"

Comment on lines 74 to 78
/**
* Starts are new recovery for the given shard, source node and state
*
* @return the id of the new recovery.
*/
Member

Nitpick - "replication", not recovery

*
* @opensearch.internal
*/
public class ReplicationCollection<T extends ReplicationTarget> {
Member

Nitpick - I think we can use a better name than "Collection" because it doesn't do a good job of communicating the purpose of this class. And, as a minor pet peeve, the repetition of "Replication" isn't great either.

How about either ConcurrencyManager or ReplicationManager ?

}
}

private class ReplicationMonitor extends AbstractRunnable {
Member

Nitpick - javadoc?

package org.opensearch.indices.replication.common;

/**
* Represents a state object used to track copying of segments from an external source
Member

Nitpick - probably worth it to include javadocs on why this interface is necessary if it's empty/marker

@@ -52,80 +52,74 @@
import static org.hamcrest.Matchers.lessThan;

public class RecoveriesCollectionTests extends OpenSearchIndexLevelReplicationTestCase {
Member

Nitpick - should be renamed to match the new class

Signed-off-by: Poojita Raj <poojiraj@amazon.com>
@opensearch-ci-bot
Collaborator

✅   Gradle Check success 50c4e6e
Log 5492

Reports 5492

Member

@kartg kartg left a comment

Forgot to add an approval, since all of my PR comments were nitpicks

@Poojita-Raj Poojita-Raj merged commit a023ad9 into opensearch-project:main May 23, 2022
Bukhtawar pushed a commit that referenced this pull request Jun 20, 2022
…gment replication and recovery code paths (#3234)

* RecoveryState inherits from ReplicationState + RecoveryTarget inherits from ReplicationTarget

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* Refactoring: mixedClusterVersion error fix + move Stage to ReplicationState

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* pull ReplicationListener into a top level class + add javadocs + address review comments

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* fix javadoc

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* review changes

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* Refactoring the hierarchy relationship between repl and recovery

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* style fix

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* move package common under replication

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* rename to replication

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* rename and doc changes

Signed-off-by: Poojita Raj <poojiraj@amazon.com>
Bukhtawar added a commit that referenced this pull request Jun 27, 2022
* Bump reactor-netty-core from 1.0.16 to 1.0.19 in /plugins/repository-azure (#3360)

* Bump reactor-netty-core in /plugins/repository-azure

Bumps [reactor-netty-core](https://github.com/reactor/reactor-netty) from 1.0.16 to 1.0.19.
- [Release notes](https://github.com/reactor/reactor-netty/releases)
- [Commits](reactor/reactor-netty@v1.0.16...v1.0.19)

---
updated-dependencies:
- dependency-name: io.projectreactor.netty:reactor-netty-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* Updating SHAs

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

* [Type removal] _type removal from mocked responses of scroll hit tests (#3377)

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* [Type removal] Remove _type deprecation from script and conditional processor (#3239)

* [Type removal] Remove _type deprecation from script and conditional processor

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Spotless check apply

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* [Type removal] Remove _type from _bulk yaml test, scripts, unused constants (#3372)

* [Type removal] Remove redundant _type deprecation checks in bulk request

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* [Type removal] bulk yaml tests validating deprecation on _type and removal from scripts

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Fix Lucene-snapshots repo for jdk 17. (#3396)

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Replace internal usages of 'master' term in 'server/src/internalClusterTest' directory (#2521)

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* [REMOVE] Cleanup deprecated thread pool types (FIXED_AUTO_QUEUE_SIZE) (#3369)

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

* [Type removal] _type removal from tests of yaml tests (#3406)

* [Type removal] _type removal from tests of yaml tests

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Fix spotless failures

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Fix assertion failures

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Fix assertion failures in DoSectionTests

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Add release notes for version 2.0.0 (#3410)


Signed-off-by: Rabi Panda <adnapibar@gmail.com>

* [Upgrade] Lucene-9.2.0-snapshot-ba8c3a8 (#3416)

Upgrades to latest snapshot of lucene 9.2.0 in preparation for GA release.

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>

* Fix release notes for 2.0.0-rc1 version (#3418)

This change removes some old commits from the 2.0.0-rc1 release notes. These commits were already released as part of 1.x releases.

Add back some missing type removal commits to the 2.0.0 release notes

Signed-off-by: Rabi Panda <adnapibar@gmail.com>

* Bump version 2.1 to Lucene 9.2 after upgrade (#3424)

Bumps Version.V_2_1_0 lucene version to 9.2 after backporting upgrade.

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>

* Bump com.gradle.enterprise from 3.10 to 3.10.1 (#3425)

Bumps com.gradle.enterprise from 3.10 to 3.10.1.

---
updated-dependencies:
- dependency-name: com.gradle.enterprise
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump reactor-core from 3.4.17 to 3.4.18 in /plugins/repository-azure (#3427)

Bumps [reactor-core](https://github.com/reactor/reactor-core) from 3.4.17 to 3.4.18.
- [Release notes](https://github.com/reactor/reactor-core/releases)
- [Commits](reactor/reactor-core@v3.4.17...v3.4.18)

---
updated-dependencies:
- dependency-name: io.projectreactor:reactor-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

* Bump gax-httpjson from 0.101.0 to 0.103.1 in /plugins/repository-gcs (#3426)

Bumps [gax-httpjson](https://github.com/googleapis/gax-java) from 0.101.0 to 0.103.1.
- [Release notes](https://github.com/googleapis/gax-java/releases)
- [Changelog](https://github.com/googleapis/gax-java/blob/main/CHANGELOG.md)
- [Commits](https://github.com/googleapis/gax-java/commits)

---
updated-dependencies:
- dependency-name: com.google.api:gax-httpjson
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

* [segment replication]Introducing common Replication interfaces for segment replication and recovery code paths (#3234)

* RecoveryState inherits from ReplicationState + RecoveryTarget inherits from ReplicationTarget

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* Refactoring: mixedClusterVersion error fix + move Stage to ReplicationState

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* pull ReplicationListener into a top level class + add javadocs + address review comments

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* fix javadoc

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* review changes

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* Refactoring the hierarchy relationship between repl and recovery

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* style fix

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* move package common under replication

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* rename to replication

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* rename and doc changes

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* [Type removal] Remove type from BulkRequestParser (#3423)

* [Type removal] Remove type handling in bulk request parser

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* [Type removal] Remove testTypesStillParsedForBulkMonitoring as it is no longer present in codebase

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Adding CheckpointRefreshListener to trigger when Segment replication is turned on and Primary shard refreshes (#3108)

* Initial PR adding classes and tests related to checkpoint publishing

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* Putting a Draft PR with all changes in classes. Testing is still not included in this commit.

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* Wiring up index shard to new engine, spotless apply and removing unnecessary tests and logs

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* Adding Unit test for checkpointRefreshListener

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* Applying spotless check

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* Fixing import statements *

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* removing unused constructor in index shard

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* Addressing comments from last commit

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* Adding package-info.java files for two new packages

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* Adding test for null checkpoint publisher and addressing PR comments

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* Add docs for indexshardtests and remove shard.refresh

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* Add a new Engine implementation for replicas with segment replication enabled. (#3240)

* Change fastForwardProcessedSeqNo method in LocalCheckpointTracker to persisted checkpoint.

This change inverts fastForwardProcessedSeqNo to fastForwardPersistedSeqNo for use in
Segment Replication.  This is so that a Segrep Engine can match the logic of InternalEngine
where the seqNo is incremented with each operation, but only persisted in the tracker on a flush.
With Segment Replication we bump the processed number with each operation received index/delete/noOp, and
invoke this method when we receive a new set of segments to bump the persisted seqNo.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Extract Translog specific engine methods into an abstract class.

This change extracts translog specific methods to an abstract engine class so that other engine
implementations can reuse translog logic.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Add a separate Engine implementation for replicas with segment replication enabled.

This change adds a new engine intended to be used on replicas with segment replication enabled.
This engine does not wire up an IndexWriter, but still writes all operations to a translog.
The engine uses a new ReaderManager that refreshes from an externally provided SegmentInfos.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Fix spotless checks.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Fix :server:compileInternalClusterTestJava compilation.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Fix failing test naming convention check.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* PR feedback.

- Removed isReadOnlyReplica from overloaded constructor and added feature flag checks.
- Updated log msg in NRTReplicationReaderManager
- cleaned up store ref counting in NRTReplicationEngine.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Fix spotless check.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Remove TranslogAwareEngine and build translog in NRTReplicationEngine.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Fix formatting

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Add missing translog methods to NRTEngine.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Remove persistent seqNo check from fastForwardProcessedSeqNo.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* PR feedback.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Add test specific to translog trimming.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Javadoc check.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Add failEngine calls to translog methods in NRTReplicationEngine.
Roll xlog generation on replica when a new commit point is received.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Rename master to cluster_manager in the XContent Parser of ClusterHealthResponse (#3432)

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Bump hadoop-minicluster in /test/fixtures/hdfs-fixture (#3359)

Bumps hadoop-minicluster from 3.3.2 to 3.3.3.

---
updated-dependencies:
- dependency-name: org.apache.hadoop:hadoop-minicluster
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump avro from 1.10.2 to 1.11.0 in /plugins/repository-hdfs (#3358)

* Bump avro from 1.10.2 to 1.11.0 in /plugins/repository-hdfs

Bumps avro from 1.10.2 to 1.11.0.

---
updated-dependencies:
- dependency-name: org.apache.avro:avro
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Updating SHAs

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

* Fix testSetAdditionalRolesCanAddDeprecatedMasterRole() by removing the initial assertion (#3441)

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Replace internal usages of 'master' term in 'server/src/test' directory (#2520)

* Replace the non-inclusive terminology "master" with "cluster manager" in code comments, internal variable/method/class names, in `server/src/test` directory.
* Backwards compatibility is not impacted.
* Add a new unit test `testDeprecatedMasterNodeFilter()` to validate using `master:true` or `master:false` can filter the node in [Cluster Stats](https://opensearch.org/docs/latest/opensearch/rest-api/cluster-stats/) API, after the `master` role is deprecated in PR #2424

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Removing unused method from TransportSearchAction (#3437)

* Removing unused method from TransportSearchAction

Signed-off-by: Ankit Jain <jain.ankitk@gmail.com>

* Set term vector flags to false for ._index_prefix field (#1901). (#3119)

* Set term vector flags to false for ._index_prefix field (#1901).

Signed-off-by: Vesa Pehkonen <vesa.pehkonen@intel.com>

* Replaced the FieldType copy ctor with ctor for the prefix field and replaced
setting the field type parameters with setIndexOptions(). (#1901)

Signed-off-by: Vesa Pehkonen <vesa.pehkonen@intel.com>

* Added tests for term vectors. (#1901)

Signed-off-by: Vesa Pehkonen <vesa.pehkonen@intel.com>

* Fixed code formatting error.

Signed-off-by: Vesa Pehkonen <vesa.pehkonen@intel.com>

Co-authored-by: sdp <sdp@9049fa06826d.jf.intel.com>

* [BUG] Fixing org.opensearch.monitor.os.OsProbeTests > testLogWarnCpuMessageOnlyOnes when cgroups are available but cgroup stats is not (#3448)

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

* [Segment Replication] Add SegmentReplicationTargetService to orchestrate replication events. (#3439)

* Add SegmentReplicationTargetService to orchestrate replication events.

This change introduces  boilerplate classes for Segment Replication and a target service
to orchestrate replication events.

It also includes two refactors of peer recovery components for reuse.
1. Rename RecoveryFileChunkRequest to FileChunkRequest and extract code to handle throttling into
ReplicationTarget.
2. Extracts a component to execute retryable requests over the transport layer.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Code cleanup.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Make SegmentReplicationTargetService component final so that it can not
be extended by plugins.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Bump azure-core-http-netty from 1.11.9 to 1.12.0 in /plugins/repository-azure (#3474)

Bumps [azure-core-http-netty](https://github.com/Azure/azure-sdk-for-java) from 1.11.9 to 1.12.0.
- [Release notes](https://github.com/Azure/azure-sdk-for-java/releases)
- [Commits](Azure/azure-sdk-for-java@azure-core-http-netty_1.11.9...azure-core_1.12.0)

---
updated-dependencies:
- dependency-name: com.azure:azure-core-http-netty
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update to Apache Lucene 9.2 (#3477)

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

* Bump protobuf-java from 3.20.1 to 3.21.1 in /plugins/repository-hdfs (#3472)

Signed-off-by: dependabot[bot] <support@github.com>

* [Upgrade] Lucene-9.3.0-snapshot-823df23 (#3478)

Upgrades to latest snapshot of lucene 9.3.0.

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>

* Filter out invalid URI and HTTP method in the error message of no handler found for a REST request (#3459)

Filter the invalid URI and HTTP method out of the error message that is shown when no handler is found for a REST request sent by the user, so that the HTML special characters <>&"' will not appear in the error message.

The error message is returned with MIME type `application/json`, which can't contain active (script) content, so this is not a vulnerability. Besides, no browser will render the response as HTML when the MIME type is `application/json`.
However, common security scanners raise a false-positive alarm when a response contains HTML tags without escaping the HTML special characters, so this solution only aims to satisfy the code security scanners.
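
A minimal sketch of this kind of filtering is shown below. It is an illustration only: the class `ErrorMessageSanitizer`, both method names, and the exact message wording are assumptions, not the code merged in #3459.

```java
// Hypothetical helper (assumption: not the actual OpenSearch implementation).
final class ErrorMessageSanitizer {
    private ErrorMessageSanitizer() {}

    // Removes the five HTML special characters named in the commit message: < > & " '
    static String stripHtmlSpecialChars(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c != '<' && c != '>' && c != '&' && c != '"' && c != '\'') {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Builds an error message from the filtered values; the wording here is illustrative.
    static String noHandlerMessage(String method, String uri) {
        return "no handler found for uri [" + stripHtmlSpecialChars(uri)
            + "] and method [" + stripHtmlSpecialChars(method) + "]";
    }
}
```

Dropping the characters rather than escaping them keeps the JSON body free of anything a scanner could read as markup, at the cost of not echoing the request verbatim.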

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Support use of IRSA for repository-s3 plugin credentials (#3475)

* Support use of IRSA for repository-s3 plugin credentials

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

* Address code review comments

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

* Address code review comments

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

* Bump google-auth-library-oauth2-http from 0.20.0 to 1.7.0 in /plugins/repository-gcs (#3473)

* Bump google-auth-library-oauth2-http in /plugins/repository-gcs

Bumps google-auth-library-oauth2-http from 0.20.0 to 1.7.0.

---
updated-dependencies:
- dependency-name: com.google.auth:google-auth-library-oauth2-http
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

* Updating SHAs

Signed-off-by: dependabot[bot] <support@github.com>

* Use variable to define the version of dependency google-auth-library-java

Signed-off-by: Tianli Feng <ftianli@amazon.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>
Co-authored-by: Tianli Feng <ftianli@amazon.com>

* [Segment Replication] Added source-side classes for orchestrating replication events (#3470)

This change expands on the existing SegmentReplicationSource interface and its corresponding Factory class by introducing an implementation where the replication source is a primary shard (PrimaryShardReplicationSource). These code paths execute on the target. The primary shard implementation creates the requests to be sent to the source/primary shard.

Correspondingly, this change also defines two request classes for the GET_CHECKPOINT_INFO and GET_SEGMENT_FILES requests as well as an abstract superclass.

A CopyState class has been introduced that captures point-in-time, file-level details from an IndexShard. This implementation mirrors Lucene's NRT CopyState implementation.

Finally, a service class has been introduced for segment replication that runs on the source side (SegmentReplicationSourceService), which handles these two types of incoming requests. This includes private handler classes that house the logic to respond to these requests, with some functionality stubbed for now. The service class also uses a simple map to cache CopyState objects that would be needed by replication targets.
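
The "simple map" cache mentioned above could look roughly like the following sketch. The class `CopyStateCacheSketch`, its generic key and value types, and the loader function are all hypothetical; the real service keys its cache on its own checkpoint type and builds real CopyState objects.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical cache sketch (assumption: names and types are illustrative only).
final class CopyStateCacheSketch<K, V> {
    private final Map<K, V> cache = new HashMap<>();
    private final Function<K, V> loader;

    CopyStateCacheSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    // Returns the cached state for a checkpoint, building it once on first request.
    synchronized V getOrCreate(K checkpoint) {
        return cache.computeIfAbsent(checkpoint, loader);
    }

    // Drops the entry once no replication target needs it any more.
    synchronized V remove(K checkpoint) {
        return cache.remove(checkpoint);
    }

    synchronized int size() {
        return cache.size();
    }
}
```

With such a cache, several targets asking for the same checkpoint would share one point-in-time snapshot instead of each triggering a fresh one.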

Unit tests have been added/updated for all new functionality.

Signed-off-by: Kartik Ganesh <gkart@amazon.com>

* [Dependency upgrade] google-oauth-client to 1.33.3 (#3500)

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* move bash flag to set statement (#3494)

Passing bash with flags to the first argument of /usr/bin/env requires
its own flag to interpret it correctly.  Rather than use `env -S` to
split the argument, have the script `set -e` to enable the same behavior
explicitly in preinst and postinst scripts.

Also set `-o pipefail` for consistency.

Closes: #3492

Signed-off-by: Cole White <cwhite@wikimedia.org>

* Support use of IRSA for repository-s3 plugin credentials: added YAML Rest test case (#3499)

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

* Bump azure-storage-common from 12.15.0 to 12.16.0 in /plugins/repository-azure (#3517)

* Bump azure-storage-common in /plugins/repository-azure

Bumps [azure-storage-common](https://github.com/Azure/azure-sdk-for-java) from 12.15.0 to 12.16.0.
- [Release notes](https://github.com/Azure/azure-sdk-for-java/releases)
- [Commits](Azure/azure-sdk-for-java@azure-storage-blob_12.15.0...azure-storage-blob_12.16.0)

---
updated-dependencies:
- dependency-name: com.azure:azure-storage-common
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Updating SHAs

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

* Bump google-oauth-client from 1.33.3 to 1.34.0 in /plugins/discovery-gce (#3516)

* Bump google-oauth-client from 1.33.3 to 1.34.0 in /plugins/discovery-gce

Bumps [google-oauth-client](https://github.com/googleapis/google-oauth-java-client) from 1.33.3 to 1.34.0.
- [Release notes](https://github.com/googleapis/google-oauth-java-client/releases)
- [Changelog](https://github.com/googleapis/google-oauth-java-client/blob/main/CHANGELOG.md)
- [Commits](googleapis/google-oauth-java-client@v1.33.3...v1.34.0)

---
updated-dependencies:
- dependency-name: com.google.oauth-client:google-oauth-client
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Updating SHAs

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

* Fix the support of RestClient Node Sniffer for version 2.x and update tests (#3487)

Fix the support of RestClient Node Sniffer for OpenSearch 2.x, and update unit tests for OpenSearch.
The current code contains logic for compatibility with Elasticsearch 2.x, which conflicts with OpenSearch 2.x, so that part of the legacy code is removed.

* Update the script create_test_nodes_info.bash to dump the response of the Nodes Info API GET _nodes/http for OpenSearch 1.0 and 2.0, which is used for unit tests.
* Remove the support of Elasticsearch version 2.x for the Sniffer
* Update unit test to validate that the Sniffer is compatible with OpenSearch 1.x and 2.x
* Update the API response parser to meet the array notation (in ES 6.1 and above) for the node attributes setting. As a result, the value of the `node.attr` setting will not be parsed as an array in the Sniffer when the Sniffer is used on a cluster running Elasticsearch 6.0 and above.
* Replace "master" node role with "cluster_manager" in unit test

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Bump com.diffplug.spotless from 6.6.1 to 6.7.0 (#3513)

Bumps com.diffplug.spotless from 6.6.1 to 6.7.0.

---
updated-dependencies:
- dependency-name: com.diffplug.spotless
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump guava from 18.0 to 23.0 in /plugins/ingest-attachment (#3357)

* Bump guava from 18.0 to 23.0 in /plugins/ingest-attachment

Bumps [guava](https://github.com/google/guava) from 18.0 to 23.0.
- [Release notes](https://github.com/google/guava/releases)
- [Commits](google/guava@v18.0...v23.0)

---
updated-dependencies:
- dependency-name: com.google.guava:guava
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

* Updating SHAs

Signed-off-by: dependabot[bot] <support@github.com>

* Add more ignores for use of the internal Java API sun.misc.Unsafe

Signed-off-by: Tianli Feng <ftianli@amazon.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>
Co-authored-by: Tianli Feng <ftianli@amazon.com>

* Added bwc version 2.0.1 (#3452)

Signed-off-by: Kunal Kotwani <kkotwani@amazon.com>

Co-authored-by: opensearch-ci-bot <opensearch-ci-bot@users.noreply.github.com>

* Add release notes for 1.3.3 (#3549)

Signed-off-by: Xue Zhou <xuezhou@amazon.com>

* [Upgrade] Lucene-9.3.0-snapshot-b7231bb (#3537)

Upgrades to latest snapshot of lucene 9.3; including reducing maxFullFlushMergeWaitMillis 
in LuceneTest.testWrapLiveDocsNotExposeAbortedDocuments to 0 ms to ensure aborted 
docs are not merged away in the test with the new mergeOnRefresh default policy.

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>

* [Remote Store] Upload segments to remote store post refresh (#3460)

* Add RemoteDirectory interface to copy segment files to/from remote store

Signed-off-by: Sachin Kale <kalsac@amazon.com>

Co-authored-by: Sachin Kale <kalsac@amazon.com>

* Add index level setting for remote store

Signed-off-by: Sachin Kale <kalsac@amazon.com>

Co-authored-by: Sachin Kale <kalsac@amazon.com>

* Add RemoteDirectoryFactory and use RemoteDirectory instance in RefreshListener

Co-authored-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>

* Upload segment to remote store post refresh

Signed-off-by: Sachin Kale <kalsac@amazon.com>

Co-authored-by: Sachin Kale <kalsac@amazon.com>

* Fixing VerifyVersionConstantsIT test failure (#3574)

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

* Bump jettison from 1.4.1 to 1.5.0 in /plugins/discovery-azure-classic (#3571)

* Bump jettison from 1.4.1 to 1.5.0 in /plugins/discovery-azure-classic

Bumps [jettison](https://github.com/jettison-json/jettison) from 1.4.1 to 1.5.0.
- [Release notes](https://github.com/jettison-json/jettison/releases)
- [Commits](jettison-json/jettison@jettison-1.4.1...jettison-1.5.0)

---
updated-dependencies:
- dependency-name: org.codehaus.jettison:jettison
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Updating SHAs

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

* Bump google-api-services-storage from v1-rev20200814-1.30.10 to v1-rev20220608-1.32.1 in /plugins/repository-gcs (#3573)

* Bump google-api-services-storage in /plugins/repository-gcs

Bumps google-api-services-storage from v1-rev20200814-1.30.10 to v1-rev20220608-1.32.1.

---
updated-dependencies:
- dependency-name: com.google.apis:google-api-services-storage
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* Updating SHAs

Signed-off-by: dependabot[bot] <support@github.com>

* Upgrade Google HTTP Client to 1.42.0

Signed-off-by: Xue Zhou <xuezhou@amazon.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>
Co-authored-by: Xue Zhou <xuezhou@amazon.com>

* Add flat_skew setting to node overload decider (#3563)

* Add flat_skew setting to node overload decider

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>

* Bump xmlbeans from 5.0.3 to 5.1.0 in /plugins/ingest-attachment (#3572)

* Bump xmlbeans from 5.0.3 to 5.1.0 in /plugins/ingest-attachment

Bumps xmlbeans from 5.0.3 to 5.1.0.

---
updated-dependencies:
- dependency-name: org.apache.xmlbeans:xmlbeans
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Updating SHAs

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

* Bump google-oauth-client from 1.34.0 to 1.34.1 in /plugins/discovery-gce (#3570)

* Bump google-oauth-client from 1.34.0 to 1.34.1 in /plugins/discovery-gce

Bumps [google-oauth-client](https://github.com/googleapis/google-oauth-java-client) from 1.34.0 to 1.34.1.
- [Release notes](https://github.com/googleapis/google-oauth-java-client/releases)
- [Changelog](https://github.com/googleapis/google-oauth-java-client/blob/main/CHANGELOG.md)
- [Commits](googleapis/google-oauth-java-client@v1.34.0...v1.34.1)

---
updated-dependencies:
- dependency-name: com.google.oauth-client:google-oauth-client
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* Updating SHAs

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

* Fix for bug showing incorrect awareness attributes count in AwarenessAllocationDecider (#3428)

* Fix for bug showing incorrect awareness attributes count in AwarenessAllocationDecider

Signed-off-by: Anshu Agarwal <anshukag@amazon.com>

* Added bwc version 1.3.4 (#3552)

Signed-off-by: GitHub <noreply@github.com>

Co-authored-by: opensearch-ci-bot <opensearch-ci-bot@users.noreply.github.com>

* Support dynamic node role (#3436)

* Support unknown node role

Currently OpenSearch only supports several built-in node roles, such as
the data node role. If an unknown node role is specified, the OpenSearch
node fails to start. This limits how OpenSearch can be extended to
support extension functions. For example, a user may prefer to run ML
tasks on dedicated nodes that don't serve any built-in node role, so
that the ML tasks won't impact OpenSearch core functions. This PR
removes the limitation: a user can specify any node role and OpenSearch
will start the node correctly with that unknown role. This opens the
door for plugin developers to run specific tasks on dedicated nodes.
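
The fallback behaviour described here can be sketched as below. This is an assumption-laden illustration: `NodeRoleSketch`, its fields, and the two built-in entries are hypothetical, and lower-casing every role name is inferred from the later "transform role name to lower case" commit rather than taken from the actual code.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical role type (assumption: not the OpenSearch node role class).
final class NodeRoleSketch {
    final String name;
    final boolean builtIn;

    private NodeRoleSketch(String name, boolean builtIn) {
        this.name = name;
        this.builtIn = builtIn;
    }

    // A small, illustrative subset of built-in roles.
    private static final Map<String, NodeRoleSketch> BUILT_IN = Map.of(
        "data", new NodeRoleSketch("data", true),
        "ingest", new NodeRoleSketch("ingest", true)
    );

    // Resolves a configured role name: known names map to built-in roles,
    // anything else becomes a dynamic role instead of failing node startup.
    static NodeRoleSketch resolve(String rawName) {
        String name = rawName.toLowerCase(Locale.ROOT);
        NodeRoleSketch known = BUILT_IN.get(name);
        return known != null ? known : new NodeRoleSketch(name, false);
    }
}
```

For example, `NodeRoleSketch.resolve("ML")` yields a dynamic role named `ml`, while `NodeRoleSketch.resolve("data")` returns the built-in entry.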

Issue: #2877

Signed-off-by: Yaliang Wu <ylwu@amazon.com>

* fix cat nodes rest API spec

Signed-off-by: Yaliang Wu <ylwu@amazon.com>

* fix mixed cluster IT failure

Signed-off-by: Yaliang Wu <ylwu@amazon.com>

* add DynamicRole

Signed-off-by: Yaliang Wu <ylwu@amazon.com>

* change generator method name

Signed-off-by: Yaliang Wu <ylwu@amazon.com>

* fix failed docker test

Signed-off-by: Yaliang Wu <ylwu@amazon.com>

* transform role name to lower case to avoid confusion

Signed-off-by: Yaliang Wu <ylwu@amazon.com>

* transform the node role abbreviation to lower case

Signed-off-by: Yaliang Wu <ylwu@amazon.com>

* fix checkstyle

Signed-off-by: Yaliang Wu <ylwu@amazon.com>

* add test for case-insensitive role name change

Signed-off-by: Yaliang Wu <ylwu@amazon.com>

* Rename package 'o.o.action.support.master' to 'o.o.action.support.clustermanager' (#3556)

* Rename package org.opensearch.action.support.master to org.opensearch.action.support.clustermanager

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Rename classes with master term in the package org.opensearch.action.support.master

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Deprecate classes in org.opensearch.action.support.master

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Remove package o.o.action.support.master

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Move package-info back

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Move package-info to new folder

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Correct the package-info

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Fixing flakiness of ShuffleForcedMergePolicyTests (#3591)

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

* Deprecate classes in org.opensearch.action.support.master (#3593)

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Add release notes for version 2.0.1 (#3595)

Signed-off-by: Kunal Kotwani <kkotwani@amazon.com>

* Fix NPE when minBound/maxBound is not set before being called. (#3605)

Signed-off-by: George Apaaboah <george.apaaboah@gmail.com>

* Added bwc version 2.0.2 (#3613)

Co-authored-by: opensearch-ci-bot <opensearch-ci-bot@users.noreply.github.com>

* Fix false positive query timeouts due to using cached time (#3454)

* Fix false positive query timeouts due to using cached time

Signed-off-by: Ahmad AbuKhalil <abukhali@amazon.com>

* delegate nanoTime call to SearchContext

Signed-off-by: Ahmad AbuKhalil <abukhali@amazon.com>

* add override to SearchContext getRelativeTimeInMillis to force non cached time

Signed-off-by: Ahmad AbuKhalil <abukhali@amazon.com>

* Fix random gradle check failure issue 3584. (#3627)

* [Segment Replication] Add components for segment replication to perform file copy. (#3525)

* Add components for segment replication to perform file copy.

This change adds the required components to SegmentReplicationSourceService to initiate copy and react to lifecycle events.
Along with new components it refactors common file copy code from RecoverySourceHandler into reusable pieces.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Deprecate public methods and variables with master term in package 'org.opensearch.action.support.master' (#3617)

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Add replication orchestration for a single shard (#3533)

* implement segment replication target

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* test added

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* changes to tests + finalizeReplication

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* fix style check

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* addressing comments + fix gradle check

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* added test + addressed review comments

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* [BUG] opensearch crashes on closed client connection before search reply (#3626)

* [BUG] opensearch crashes on closed client connection before search reply

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

* Addressing code review comments

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

* Add all deprecated method in the package with new name 'org.opensearch.action.support.clustermanager' (#3644)

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Introduce TranslogManager implementations decoupled from the Engine (#3638)

* Introduce decoupled translog manager interfaces

Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>

* Adding onNewCheckpoint to Start Replication on Replica Shard when Segment Replication is turned on (#3540)

* Adding onNewCheckpoint and its test to start replication. The check for the latest checkpoint and the replaying logic are removed from this commit and will be added in a different PR

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* Changing binding/inject logic and addressing comments from PR

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* Applying spotless check

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* Moving shouldProcessCheckpoint() to IndexShard, and removing some trace logs

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* applying spotlessApply

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* Adding more info to log statement in targetservice class

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* applying spotlessApply

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* Addressing comments on PR

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* Adding teardown() in SegmentReplicationTargetServiceTests.

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* fixing testShouldProcessCheckpoint() in SegmentReplicationTargetServiceTests

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* Removing CheckpointPublisherProvider in IndicesModule

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* spotless check apply

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* Remove class org.opensearch.action.support.master.AcknowledgedResponse (#3662)

* Remove class org.opensearch.action.support.master.AcknowledgedResponse

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Remove class org.opensearch.action.support.master.AcknowledgedRequest RequestBuilder ShardsAcknowledgedResponse

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Restore AcknowledgedResponse and AcknowledgedRequest to package org.opensearch.action.support.master (#3669)

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* [BUG] Custom POM configuration for ZIP publication produces duplicate tags (url, scm) (#3656)

* [BUG] Custom POM configuration for ZIP publication produces duplicate tags (url, scm)

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

* Added test case for pluginZip with POM

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

* Support both Gradle 6.8.x and Gradle 7.4.x

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

* Adding 2.2.0 Bwc version to main (#3673)

* Upgraded to t-digest 3.3. (#3634)

* Revert renaming method onMaster() and offMaster() in interface LocalNodeMasterListener (#3686)

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Upgrading AWS SDK dependency for native plugins (#3694)

* Merge branch 'feature/point_in_time' of https://github.com/opensearch-project/OpenSearch into fb

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>
Co-authored-by: Suraj Singh <surajrider@gmail.com>
Co-authored-by: Marc Handalian <handalm@amazon.com>
Co-authored-by: Tianli Feng <ftianli@amazon.com>
Co-authored-by: Andriy Redko <andriy.redko@aiven.io>
Co-authored-by: Rabi Panda <adnapibar@gmail.com>
Co-authored-by: Nick Knize <nknize@apache.org>
Co-authored-by: Poojita Raj <poojiraj@amazon.com>
Co-authored-by: Rishikesh Pasham <62345295+Rishikesh1159@users.noreply.github.com>
Co-authored-by: Ankit Jain <jain.ankitk@gmail.com>
Co-authored-by: vpehkone <101240162+vpehkone@users.noreply.github.com>
Co-authored-by: sdp <sdp@9049fa06826d.jf.intel.com>
Co-authored-by: Kartik Ganesh <gkart@amazon.com>
Co-authored-by: Cole White <42356806+shdubsh@users.noreply.github.com>
Co-authored-by: opensearch-trigger-bot[bot] <98922864+opensearch-trigger-bot[bot]@users.noreply.github.com>
Co-authored-by: opensearch-ci-bot <opensearch-ci-bot@users.noreply.github.com>
Co-authored-by: Xue Zhou <85715413+xuezhou25@users.noreply.github.com>
Co-authored-by: Sachin Kale <sachinpkale@gmail.com>
Co-authored-by: Sachin Kale <kalsac@amazon.com>
Co-authored-by: Xue Zhou <xuezhou@amazon.com>
Co-authored-by: Rishab Nahata <rishabnahata07@gmail.com>
Co-authored-by: Anshu Agarwal <anshuagarwal11@gmail.com>
Co-authored-by: Yaliang Wu <ylwu@amazon.com>
Co-authored-by: Kunal Kotwani <kkotwani@amazon.com>
Co-authored-by: George Apaaboah <35894485+GeorgeAp@users.noreply.github.com>
Co-authored-by: Ahmad AbuKhalil <105249973+aabukhalil@users.noreply.github.com>
Co-authored-by: Bukhtawar Khan <bukhtawa@amazon.com>
Co-authored-by: Sarat Vemulapalli <vemulapallisarat@gmail.com>
Co-authored-by: Daniel (dB.) Doubrovkine <dblock@dblock.org>
@Rishikesh1159 Rishikesh1159 added the backport 2.x Backport to 2.x branch label Jul 22, 2022
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jul 22, 2022
…gment replication and recovery code paths (#3234)

* RecoveryState inherits from ReplicationState + RecoveryTarget inherits from ReplicationTarget

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* Refactoring: mixedClusterVersion error fix + move Stage to ReplicationState

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* pull ReplicationListener into a top level class + add javadocs + address review comments

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* fix javadoc

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* review changes

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* Refactoring the hierarchy relationship between repl and recovery

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* style fix

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* move package common under replication

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* rename to replication

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* rename and doc changes

Signed-off-by: Poojita Raj <poojiraj@amazon.com>
(cherry picked from commit a023ad9)
dreamer-89 pushed a commit that referenced this pull request Jul 24, 2022
…gment replication and recovery code paths (#3234) (#3984)

* RecoveryState inherits from ReplicationState + RecoveryTarget inherits from ReplicationTarget

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* Refactoring: mixedClusterVersion error fix + move Stage to ReplicationState

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* pull ReplicationListener into a top level class + add javadocs + address review comments

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* fix javadoc

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* review changes

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* Refactoring the hierarchy relationship between repl and recovery

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* style fix

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* move package common under replication

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* rename to replication

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* rename and doc changes

Signed-off-by: Poojita Raj <poojiraj@amazon.com>
(cherry picked from commit a023ad9)

Co-authored-by: Poojita Raj <poojiraj@amazon.com>
Labels
backport 2.x Backport to 2.x branch
8 participants