Abstract service for accessing Flint index metadata #495

Merged
merged 22 commits on Aug 17, 2024

Collaborator

@seankao-az seankao-az commented Jul 29, 2024

Description

  • add FlintIndexMetadataService
    • move FlintMetadata and FlintVersion to flint-commons
    • remove storage-specific operation from FlintMetadata
      • remove OpenSearch json (de)serialization from FlintMetadata
      • move to OpenSearch implementation for FlintIndexMetadataService
    • remove schema parser in FlintMetadata builder to remove dependency on flint-core
      • FlintSparkIndex generates schema map from json
  • operations
    • move get(All)IndexMetadata from FlintClient to FlintIndexMetadataService
    • remove updateIndex from FlintClient; add updateIndexMetadata to FlintIndexMetadataService
    • add empty implementation for OpenSearch deleteIndexMetadata, as _meta gets deleted along with the index by FlintOpenSearchClient.deleteIndex
  • add builder for FlintIndexMetadataService and options
    • fix FlintOptions to read spark properties for custom service class
    • remove SparkConf from service builder argument
  • misc
    • rename for log entry properties
    • correct typo planTransformer for ppl
    • move OpenSearch index name sanitization to utils class
    • fix bug where the OpenSearch index name was not sanitized for the metadata log
    • remove OpenSearchIndexTable's dependency on Flint client/services
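The service described above can be pictured with a minimal sketch. The method names (getIndexMetadata, getAllIndexMetadata, updateIndexMetadata, deleteIndexMetadata) come from the description; the signatures, the SketchMetadata class, and the in-memory backend are assumptions made for illustration and are not the code in this PR.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/** Hypothetical stand-in for FlintMetadata: plain data, no storage-specific (de)serialization. */
final class SketchMetadata {
    final String kind;
    final Map<String, Object> schema;

    SketchMetadata(String kind, Map<String, Object> schema) {
        this.kind = kind;
        this.schema = schema;
    }
}

/** Assumed service shape: method names follow the description, signatures are guesses. */
interface SketchIndexMetadataService {
    Optional<SketchMetadata> getIndexMetadata(String indexName);
    Map<String, SketchMetadata> getAllIndexMetadata(String... indexNamePatterns);
    void updateIndexMetadata(String indexName, SketchMetadata metadata);
    void deleteIndexMetadata(String indexName);
}

/** Toy in-memory backend, standing in for a custom (non-OpenSearch) metadata store. */
final class InMemoryIndexMetadataService implements SketchIndexMetadataService {
    private final Map<String, SketchMetadata> store = new LinkedHashMap<>();

    @Override
    public Optional<SketchMetadata> getIndexMetadata(String indexName) {
        return Optional.ofNullable(store.get(indexName));
    }

    @Override
    public Map<String, SketchMetadata> getAllIndexMetadata(String... indexNamePatterns) {
        // Keep every entry whose name matches at least one of the given patterns.
        return store.entrySet().stream()
            .filter(e -> Arrays.stream(indexNamePatterns).anyMatch(p -> matches(p, e.getKey())))
            .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    @Override
    public void updateIndexMetadata(String indexName, SketchMetadata metadata) {
        store.put(indexName, metadata);
    }

    @Override
    public void deleteIndexMetadata(String indexName) {
        // A separate store has to delete explicitly; the OpenSearch case can be a no-op
        // because _meta disappears together with the index.
        store.remove(indexName);
    }

    /** Treats '*' as a wildcard and every other character literally. */
    private static boolean matches(String pattern, String name) {
        String regex = Arrays.stream(pattern.split("\\*", -1))
            .map(Pattern::quote)
            .collect(Collectors.joining(".*"));
        return name.matches(regex);
    }
}
```

Keeping the metadata class free of JSON handling is what lets a backend like this one exist without depending on OpenSearch classes.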

Issues Resolved

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Collaborator Author

seankao-az commented Jul 30, 2024

Scrapping the initial design: creating yet another class for metadata isn't ideal.

@seankao-az seankao-az changed the title from "Abstract service for accessing Flint index specification" to "Abstract service for accessing Flint index metadata" on Aug 2, 2024
@seankao-az seankao-az force-pushed the index-metadata-service branch 6 times, most recently from b0b1206 to 55f3eb5, on August 5, 2024 23:51
* rename for log entry properties
* correct typo planTransformer

Signed-off-by: Sean Kao <seankao@amazon.com>
* interface class for FlintIndexMetadataService
  * move FlintMetadata and FlintVersion to flint-commons
* remove ser/de from FlintMetadata; move to OS impl for
  FlintIndexMetadataService
* remove schema parser in FlintMetadata builder to remove dependency to
  opensearch
* FlintSparkIndex generate not only schema json but also map
* FlintMetadataSuite divided into two: one for builder and one for
  ser/de, which is merged to FlintOpenSearchIndexMetadataServiceSuite

Signed-off-by: Sean Kao <seankao@amazon.com>
* Remove getIndexMetadata and getAllIndexMetadata from FlintClient
* Implement the two for OpenSearch
  * TODO: sanitize index name
* Add builder for FlintIndexMetadataService and options
* Refactor caller of FlintClient.get(All)IndexMetadata with
  FlintIndexMetadataService
* TODO: test suite for getIndexMetadata and getAllIndexMetadata (might
  overlap with FlintOpenSearchClientSuite)

Signed-off-by: Sean Kao <seankao@amazon.com>
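The index name sanitization left as a TODO here (and moved to a utils class per the description) can be sketched roughly as follows. This is an illustration only: the class name, the set of escaped characters, and the encode-then-lowercase order are assumptions, so check the project's utils class for the actual rules.

```java
import java.util.Locale;

/** Hypothetical helper; not the utils class from this PR. */
final class IndexNameSanitizerSketch {
    // Assumed set of characters to escape in an index name.
    private static final String INVALID_CHARS = " ,:\"+/\\|?#><";

    private IndexNameSanitizerSketch() {}

    /** Percent-encodes the assumed invalid characters, then lowercases the result. */
    static String sanitize(String indexName) {
        StringBuilder encoded = new StringBuilder();
        for (char c : indexName.toCharArray()) {
            if (INVALID_CHARS.indexOf(c) >= 0) {
                encoded.append(String.format("%%%02X", (int) c));
            } else {
                encoded.append(c);
            }
        }
        return encoded.toString().toLowerCase(Locale.ROOT);
    }
}
```

Putting the logic in one shared helper means the index itself and its metadata log can apply the same rule, which is the kind of mismatch the "not sanitized for metadata log" fix addresses.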
* remove updateIndex from FlintClient
* implement updateIndexMetadata for FlintOpenSearchIndexMetadataService
* updateIndexMetadata upon create index in FlintSpark
  * for OS client + OS index metadata service, the call for update is
    redundant
  * it's for when some other index metadata service implementation is
    provided
* TODO: Suite for updateIndexMetadata (now shared with
  FlintOpenSearchClientSuite)

Signed-off-by: Sean Kao <seankao@amazon.com>
@seankao-az seankao-az force-pushed the index-metadata-service branch from a018703 to 9fdac45 on August 6, 2024 01:00
@seankao-az seankao-az self-assigned this Aug 6, 2024
@seankao-az seankao-az added maintenance Code refactoring 0.5 labels Aug 6, 2024
* fix FlintOptions for custom class spark properties
* remove SparkConf from builder argument

Signed-off-by: Sean Kao <seankao@amazon.com>
@@ -160,8 +161,8 @@ object OpenSearchCluster {
* An list of OpenSearchIndexTable instance.
*/
def apply(indexName: String, options: FlintOptions): Seq[OpenSearchIndexTable] = {
val client = FlintClientBuilder.build(options)
client
val indexMetadataService = FlintIndexMetadataServiceBuilder.build(options)
Collaborator

OpenSearchIndexTable should not depend on FlintIndexMetadataServiceBuilder. If the reason is that getAllIndexMetadata was moved away from FlintClient, can we add the following to FlintClient instead?

def getAllIndexMetadata(): IndexMetadata

Collaborator Author

@seankao-az seankao-az Aug 13, 2024

Any reason why that's the case? I moved getAllIndexMetadata away from FlintClient because we might use FlintOpenSearchClient while the index metadata is stored in some other custom storage (not OpenSearch). In that case we would want to fetch index metadata with FlintIndexMetadataService, not with FlintClient.
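To make the custom storage scenario concrete, here is a minimal sketch of a builder that picks the metadata service implementation from options, falling back to a default. The option key, the MetadataBackend interface, and both backend classes are hypothetical names for illustration; only the idea of a builder driven by options (without SparkConf) comes from the PR description.

```java
import java.util.Map;

/** Minimal stand-in for the metadata service interface (hypothetical). */
interface MetadataBackend {
    String name();
}

/** Default backend, playing the role of the OpenSearch implementation. */
final class DefaultBackend implements MetadataBackend {
    @Override
    public String name() {
        return "default";
    }
}

/** A custom backend that a deployment might plug in. */
final class CustomBackend implements MetadataBackend {
    @Override
    public String name() {
        return "custom";
    }
}

final class MetadataBackendBuilder {
    // Hypothetical option key; the real property name is not given in this thread.
    static final String CUSTOM_CLASS_KEY = "custom.metadata.service.class";

    private MetadataBackendBuilder() {}

    /** Builds from plain options only, so callers do not need a SparkConf. */
    static MetadataBackend build(Map<String, String> options) {
        String className = options.getOrDefault(CUSTOM_CLASS_KEY, "");
        if (className.isEmpty()) {
            return new DefaultBackend();
        }
        try {
            // Instantiate the configured class through its no-argument constructor.
            Class<?> clazz = Class.forName(className);
            return (MetadataBackend) clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }
}
```

With this shape, the client used for data operations and the service used for metadata are chosen independently, which is the separation argued for in this comment.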

Collaborator

@penghuo penghuo Aug 13, 2024

OpenSearchIndexTable only refers to data stored in the OpenSearch cluster. FlintIndexMetadata relates to Flint index metadata (skipping, covering), but OpenSearch index metadata has nothing to do with Flint indexes, right?

Collaborator Author

I see. So this deals with all OpenSearch indexes, including regular, non-Flint indexes. It doesn't use the _meta for Flint indexes; it only uses the mappings and settings parts of FlintMetadata. Those won't be covered by FlintIndexMetadataService, then.

However, methods in FlintClient are designed for Flint indexes, not regular OS indexes. I think mixing the use cases for Flint indexes and regular OS indexes in FlintClient would be confusing. Having both FlintClient.getAllIndexMetadata and FlintIndexMetadataService.getAllIndexMetadata is also confusing.

I'll see how to do it. I don't think OpenSearchIndexTable should use FlintClient or FlintIndexMetadataService. It should either use a private method for that, or I'll move the method into some util class.

Collaborator Author

@seankao-az seankao-az Aug 16, 2024

Ended up getting rid of FlintClient, FlintIndexMetadataService, and FlintMetadata from OpenSearchCluster.
It has its own method for getting MetaData for regular non-Flint OpenSearch indices.

Had some trouble mocking it in ApplyFlintSparkCoveringIndexSuite, so I made it a Java class (rather than a Scala object), which can be easily mocked.

* OpenSearchCluster move to java file because scala object cannot be
  mocked in mockito

Signed-off-by: Sean Kao <seankao@amazon.com>
* @param indexNamePattern index name pattern
* @return list of OpenSearch table metadata
*/
public static List<MetaData> getAllOpenSearchTableMetadata(FlintOptions options, String... indexNamePattern) {
Collaborator

Why not keep the getOpenSearchTableMetadata method in FlintClient?

Collaborator Author

It's not really an operation for Flint indexes. FlintClient is for operations on Flint indexes, and those shouldn't be bound to a specific storage.

Flint index client that provides API for metadata and data operations on a Flint index regardless of concrete storage.

Having flintClient.getAllOpenSearchTableMetadata would violate that.

Do you see a better way to fit this method into FlintClient?

Collaborator Author

An alternative is to add yet another client, OpenSearchClient (a non-Flint version), but I think that's overkill.

Signed-off-by: Sean Kao <seankao@amazon.com>
@seankao-az seankao-az merged commit f5ad574 into opensearch-project:main Aug 17, 2024
5 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Aug 17, 2024
* fix missed renames

* rename for log entry properties
* correct typo planTransformer

Signed-off-by: Sean Kao <seankao@amazon.com>

* add FlintIndexMetadataService

* interface class for FlintIndexMetadataService
  * move FlintMetadata and FlintVersion to flint-commons
* remove ser/de from FlintMetadata; move to OS impl for
  FlintIndexMetadataService
* remove schema parser in FlintMetadata builder to remove dependency to
  opensearch
* FlintSparkIndex generate not only schema json but also map
* FlintMetadataSuite divided into two: one for builder and one for
  ser/de, which is merged to FlintOpenSearchIndexMetadataServiceSuite

Signed-off-by: Sean Kao <seankao@amazon.com>

* move get metadata functions to new service

* Remove getIndexMetadata and getAllIndexMetadata from FlintClient
* Implement the two for OpenSearch
  * TODO: sanitize index name
* Add builder for FlintIndexMetadataService and options
* Refactor caller of FlintClient.get(All)IndexMetadata with
  FlintIndexMetadataService
* TODO: test suite for getIndexMetadata and getAllIndexMetadata (might
  overlap with FlintOpenSearchClientSuite)

Signed-off-by: Sean Kao <seankao@amazon.com>

* update index metadata

* remove updateIndex from FlintClient
* implement updateIndexMetadata for FlintOpenSearchIndexMetadataService
* updateIndexMetadata upon create index in FlintSpark
  * for OS client + OS index metadata service, the call for update is
    redundant
  * it's for when some other index metadata service implementation is
    provided
* TODO: Suite for updateIndexMetadata (now shared with
  FlintOpenSearchClientSuite)

Signed-off-by: Sean Kao <seankao@amazon.com>

* empty implementation for OS deleteIndexMetadata

Signed-off-by: Sean Kao <seankao@amazon.com>

* sanitize index name

Signed-off-by: Sean Kao <seankao@amazon.com>

* fix new FlintOption missing from FlintSparkConf

Signed-off-by: Sean Kao <seankao@amazon.com>

* fix FlintOpenSearchClientSuite

Signed-off-by: Sean Kao <seankao@amazon.com>

* delete file (missed in resolving conflict)

Signed-off-by: Sean Kao <seankao@amazon.com>

* Use service builder in OpenSearchCluster

Signed-off-by: Sean Kao <seankao@amazon.com>

* fix service builder class

* fix FlintOptions for custom class spark properties
* remove SparkConf from builder argument

Signed-off-by: Sean Kao <seankao@amazon.com>

* fix IT

Signed-off-by: Sean Kao <seankao@amazon.com>

* add test suites

Signed-off-by: Sean Kao <seankao@amazon.com>

* sanitize index name for opensearch metadata log

Signed-off-by: Sean Kao <seankao@amazon.com>

* remove spark-warehouse files

Signed-off-by: Sean Kao <seankao@amazon.com>

* exclude _meta field for createIndex in OpenSearch

Signed-off-by: Sean Kao <seankao@amazon.com>

* catch client creation exception

Signed-off-by: Sean Kao <seankao@amazon.com>

* Fetch metadata for OpenSearch table

* OpenSearchCluster move to java file because scala object cannot be
  mocked in mockito

Signed-off-by: Sean Kao <seankao@amazon.com>

* update doc with spark config

Signed-off-by: Sean Kao <seankao@amazon.com>

---------

Signed-off-by: Sean Kao <seankao@amazon.com>
(cherry picked from commit f5ad574)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@opensearch-trigger-bot

The backport to 0.5-nexus failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/opensearch-spark/backport-0.5-nexus 0.5-nexus
# Navigate to the new working tree
pushd ../.worktrees/opensearch-spark/backport-0.5-nexus
# Create a new branch
git switch --create backport/backport-495-to-0.5-nexus
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 f5ad574957ab8616b7e16f20d6c93c2f8bd6905e
# Push it to GitHub
git push --set-upstream origin backport/backport-495-to-0.5-nexus
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/opensearch-spark/backport-0.5-nexus

Then, create a pull request where the base branch is 0.5-nexus and the compare/head branch is backport/backport-495-to-0.5-nexus.

seankao-az added a commit to seankao-az/opensearch-spark that referenced this pull request Aug 17, 2024
seankao-az added a commit to seankao-az/opensearch-spark that referenced this pull request Aug 17, 2024
seankao-az pushed a commit that referenced this pull request Aug 17, 2024
seankao-az added a commit to seankao-az/opensearch-spark that referenced this pull request Aug 17, 2024
seankao-az added a commit to seankao-az/opensearch-spark that referenced this pull request Aug 21, 2024
seankao-az added a commit that referenced this pull request Aug 21, 2024
2 participants