Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Refactor] Use local opensearch.common.SetOnce instead of lucene's utility class #5947

Merged
merged 5 commits into from
Jan 20, 2023

Conversation

nknize
Copy link
Collaborator

@nknize nknize commented Jan 20, 2023

This copies Apache Lucene's SetOnce utility class to o.opensearch.common.SetOnce in the opensearch-core library and refactors all opensearch classes from the lucene implementation to the local opensearch implementation.

This accomplishes three objectives:

  1. remove the dependency on lucene-core library for classes that only import this one lucene utility class,
  2. shield opensearch core dependency on this class from any upstream breaking changes,
  3. further decouples other foundation classes from third party libraries so they can be refactored from the server module into opensearch support libraries.

relates #5910

@nknize nknize added enhancement Enhancement or improvement to existing feature or request v3.0.0 Issues and PRs related to version 3.0.0 labels Jan 20, 2023
…ility class

This copies Apache Lucene's o.a.l.util.SetOnce to o.opensearch.common.SetOnce in
the opensearch-core library and refactors all opensearch classes from the lucene
implementation to the local opensearch implementation. This accomplishes three
objectives: 1. remove the dependency on lucene-core library for classes that
only import this one lucene utility class, 2. shield opensearch core dependency
on this class from any upstream breaking changes, 3. further decouples other
foundation classes from third party libraries so they can be refactored from the
server module into opensearch support libraries.

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.indices.replication.SegmentReplicationRelocationIT.testStartReplicaAfterPrimaryIndexesDocs

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@reta
Copy link
Collaborator

reta commented Jan 20, 2023

@nknize I am generally against copying from Apache Lucene (there are so many classes copied already), anyway regarding your point:

  1. remove the dependency on lucene-core library for classes that only import this one lucene utility class

Is it realistic to break dependency on lucene-core from opensearch-core? (sorry I have not looked yet) If yes - +1 to this change, if not - we are not really solving the problem (I agree with your arguments but if lucene-core has to stay as dependency - what is the point?).

@nknize
Copy link
Collaborator Author

nknize commented Jan 20, 2023

I am generally against copying from Apache Lucene

In general I am as well. But I do have my acceptable boundaries, such as common classes like SetOnce that do not provide Lucene unique functionality.

Is it realistic to break dependency on lucene-core from opensearch-core?

There is no dependency on lucene-core from opensearch-core library.

This is to break dependency on classes in the :server module that only import the SetOnce class from lucene-core (e.g., Settings.java, Aggregations.java) and trim down the dependency in others (e.g., CancellableThreads.java) so they can be incrementally refactored out of :server module and into libraries without requiring the lucene-core dependency. It relates to the effort of trimming down server into more separable parts such that the interdependencies are untangled.

Other positive side effects include shielding OpenSearch from any undesireable changes, bugs, or refactoring that may happen upstream (it is still marked @lucene.experimental). Of course the counter argument is not benefiting from any improvements but the class hasn't been functionally updated in the last 4 years anyway so the risk is minimal.

An alternative would be to open an upstream lucene change that teases these common utility classes (and others like Accountable, BitUtil, etc) into a new lucene-commons lib. I'm sure that would be welcomed but it can also happen independent of this change.

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT.testIndexCreateBlockWhenAllNodesExceededHighWatermark
      1 org.opensearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT.testIndexCreateBlockIsRemovedWhenAnyNodesNotExceedHighWatermark

@codecov-commenter
Copy link

codecov-commenter commented Jan 20, 2023

Codecov Report

Merging #5947 (18dcf82) into main (f7dc32e) will decrease coverage by 0.01%.
The diff coverage is 100.00%.

@@             Coverage Diff              @@
##               main    #5947      +/-   ##
============================================
- Coverage     70.85%   70.84%   -0.01%     
- Complexity    58711    58794      +83     
============================================
  Files          4770     4771       +1     
  Lines        280744   280761      +17     
  Branches      40539    40540       +1     
============================================
- Hits         198911   198897      -14     
- Misses        65455    65566     +111     
+ Partials      16378    16298      -80     
Impacted Files Coverage Δ
...ch/analysis/common/CommonAnalysisModulePlugin.java 92.33% <ø> (ø)
...nsearch/ingest/geoip/DatabaseReaderLazyLoader.java 75.00% <ø> (ø)
.../org/opensearch/painless/PainlessModulePlugin.java 80.95% <ø> (ø)
...g/opensearch/percolator/PercolateQueryBuilder.java 83.00% <ø> (ø)
...va/org/opensearch/systemd/SystemdModulePlugin.java 74.41% <ø> (ø)
...a/org/opensearch/transport/Netty4ModulePlugin.java 73.68% <ø> (ø)
...earch/plugin/discovery/gce/GceDiscoveryPlugin.java 20.83% <ø> (ø)
...rg/opensearch/repositories/s3/S3BlobContainer.java 38.81% <ø> (+2.95%) ⬆️
...g/opensearch/transport/nio/NioTransportPlugin.java 25.00% <ø> (ø)
...earch/action/search/AbstractSearchAsyncAction.java 78.57% <ø> (-0.30%) ⬇️
... and 480 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@reta
Copy link
Collaborator

reta commented Jan 20, 2023

@nknize I would suggest to add Apache Lucene's SetOnce utility class to forbidden APIs list [1] so when someone accidentally imports it in core, we could catch that and point out for replacement.

[1] https://github.com/opensearch-project/OpenSearch/blob/main/buildSrc/src/main/resources/forbidden/

@nknize
Copy link
Collaborator Author

nknize commented Jan 20, 2023

I would suggest to add Apache Lucene's SetOnce utility class to forbidden APIs list

That's a great idea. I'll do that here.

* This is borrowed from lucene's experimental API. It is not reused to eliminate the dependency
* on lucene core for such a simple (standalone) utility class that may change beyond OpenSearch needs.
*
* @opensearch.internal
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't these public utilities in the core library fair game to be used by plugins? Several of the plugins packaged in this repo are using this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes! Good call out. I'll change the label to @opensearch.api

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing in libs/core currently has any of these javadoc annotations. Is it fair to say that everything Java public in a library module is part of the API, or do we want to start tagging things with annotations?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing in libs/core currently has any of these javadoc annotations.

I added them in #5902 (along with enabling missingJavadocs task for lib-core) but started teasing that change into a separate PR.

Is it fair to say that everything Java public in a library module is part of the API

That would be a reasonable assumption but alas, the inherited code did not thoughtfully apply java modifiers. The default behavior of the predecessor was public. I'm not sure what the best course would be for this beyond "proceed at your own risk" at which point maybe start conservatively @opensearch.internal (like my change did) and address on a per class basis? I'm open to majority opinion on this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I'm being naive, but I would think that a library like :core would contain few to none of the cross package dependencies of logically internal pieces like is common in :server, which means we could rely on the Java access qualifiers. If there is anything currently public that should not be, then we could @Deprecate it now and remove/change qualifier in 3.0. Is that tenable?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would think that a library like :core would contain few to none of the cross package dependencies of logically internal pieces like is common in :server

Correct. There shouldn't be any cross package dependencies. :server should depend on opensearch-core and no dependency the other way around.

Furthermore, IIRC, the original intent was to have zero dependencies in opensearch-core. I'm not sure what the benefit of that really is (if anything it restricts what can be refactored to core) so #5902 breaks that intent by introducing a dependency on the jackson library (which used to be a dependency in x-content). We could continue to discuss whether we want/need a dependency free commons library in the #5902 PR and if so consider a new dependency free library called opensearch-commons and let opensearch-core bring in dependencies.

@@ -50,6 +50,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
- Migrate client transports to Apache HttpClient / Core 5.x ([#4459](https://github.com/opensearch-project/OpenSearch/pull/4459))
- Changed http code on create index API with bad input raising NotXContentException from 500 to 400 ([#4773](https://github.com/opensearch-project/OpenSearch/pull/4773))
- Change http code for DecommissioningFailedException from 500 to 400 ([#5283](https://github.com/opensearch-project/OpenSearch/pull/5283))
- [Refactor] Use local opensearch.common.SetOnce instead of lucene's utility class ([#5947](https://github.com/opensearch-project/OpenSearch/pull/5947))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the compatibility concern with backporting this to 2.x? It looks to me like it would be fine since the existing Lucene class remains available for anything that is using it. (If it is possible to backport this change, then the changelog entry should go in the [Unreleased 2.x] section).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are no compatibility concerns for backport: the dependency graph is not changing so the SetOnce is still available at the same place as before (lucene-core).

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
@@ -50,6 +50,7 @@ java.nio.channels.SocketChannel#connect(java.net.SocketAddress)
java.lang.Boolean#getBoolean(java.lang.String)

org.apache.lucene.util.IOUtils @ use @org.opensearch.core.internal.io instead
org.apache.lucene.util.SetOnce @ use @org.opensearch.common.SetOnce instead
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@nknize nknize merged commit 963bbe0 into opensearch-project:main Jan 20, 2023
@nknize nknize added the backport 2.x Backport to 2.x branch label Jan 20, 2023
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-5947-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 963bbe0862f9be5e56617430d1ae38c676855089
# Push it to GitHub
git push --set-upstream origin backport/backport-5947-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-5947-to-2.x.

nknize added a commit to nknize/OpenSearch that referenced this pull request Jan 20, 2023
…ility class (opensearch-project#5947)

This copies Apache Lucene's o.a.l.util.SetOnce to o.opensearch.common.SetOnce in
the opensearch-core library and refactors all opensearch classes from the lucene
implementation to the local opensearch implementation. This accomplishes three
objectives: 1. remove the dependency on lucene-core library for classes that
only import this one lucene utility class, 2. shield opensearch core dependency
on this class from any upstream breaking changes, 3. further decouples other
foundation classes from third party libraries so they can be refactored from the
server module into opensearch support libraries.

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
andrross pushed a commit that referenced this pull request Jan 20, 2023
…ility class (#5947) (#5961)

This copies Apache Lucene's o.a.l.util.SetOnce to o.opensearch.common.SetOnce in
the opensearch-core library and refactors all opensearch classes from the lucene
implementation to the local opensearch implementation. This accomplishes three
objectives: 1. remove the dependency on lucene-core library for classes that
only import this one lucene utility class, 2. shield opensearch core dependency
on this class from any upstream breaking changes, 3. further decouples other
foundation classes from third party libraries so they can be refactored from the
server module into opensearch support libraries.

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
kotwanikunal pushed a commit that referenced this pull request Jan 25, 2023
…ility class (#5947) (#5961)

This copies Apache Lucene's o.a.l.util.SetOnce to o.opensearch.common.SetOnce in
the opensearch-core library and refactors all opensearch classes from the lucene
implementation to the local opensearch implementation. This accomplishes three
objectives: 1. remove the dependency on lucene-core library for classes that
only import this one lucene utility class, 2. shield opensearch core dependency
on this class from any upstream breaking changes, 3. further decouples other
foundation classes from third party libraries so they can be refactored from the
server module into opensearch support libraries.

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch enhancement Enhancement or improvement to existing feature or request v3.0.0 Issues and PRs related to version 3.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants