Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extract metadata log operations from FlintClient into FlintMetadataLogService #379

Merged
merged 20 commits into from
Jun 18, 2024

Conversation

seankao-az
Copy link
Collaborator

@seankao-az seankao-az commented Jun 11, 2024

Description

  1. Extract metadata log operations from FlintClient into FlintMetadataLogService.
  2. Add flint options for custom class path and instantiate FlintMetadataLogService through reflection. re-evaluate in another PR
  3. FlintClient.getIndexMetadata no longer attaches metadata log entry, as it has no access to FlintMetadataLogService. This is moved to FlintSpark.describeIndex.
  4. Abstract recordHeartbeat method
  5. Collect metadata log related test cases into a new suite FlintMetadataLogITSuite

Issues Resolved

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Sean Kao <seankao@amazon.com>
Signed-off-by: Sean Kao <seankao@amazon.com>
Signed-off-by: Sean Kao <seankao@amazon.com>
@seankao-az seankao-az self-assigned this Jun 11, 2024
@seankao-az seankao-az added maintenance Code refactoring 0.5 labels Jun 11, 2024
Signed-off-by: Sean Kao <seankao@amazon.com>
FlintClientBuilder uses reflection to instantiate
FlintMetadataLogService, which cannot be checked at compile time.

Signed-off-by: Sean Kao <seankao@amazon.com>
Signed-off-by: Sean Kao <seankao@amazon.com>
Signed-off-by: Sean Kao <seankao@amazon.com>
Signed-off-by: Sean Kao <seankao@amazon.com>
Signed-off-by: Sean Kao <seankao@amazon.com>
* Start a new optimistic transaction.
*
* @param indexName index name
* @param forceInit forceInit create empty metadata log if not exist.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

forceInit sounds more like it tries to init even metadata already exist. initIfNotExist or initWhenNeeded should be better naming.

Copy link
Collaborator Author

@seankao-az seankao-az Jun 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed to use initIfNotExist for getMetadataLog. Keeping it as forceInit for startTransaction, as the semantics I think is to force init the transaction, instead of init the metadata log

@seankao-az seankao-az marked this pull request as draft June 13, 2024 01:44
Signed-off-by: Sean Kao <seankao@amazon.com>
Signed-off-by: Sean Kao <seankao@amazon.com>
@seankao-az seankao-az marked this pull request as ready for review June 13, 2024 01:59
Signed-off-by: Sean Kao <seankao@amazon.com>

public FlintOpenSearchClient(FlintOptions options) {
public FlintOpenSearchClient(FlintOptions options, FlintMetadataLogService metadataLogService) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should better naming FlintOpenSearchClient, the reasons it FlintOpenSearchClient depend FlintMedataLogService, not OpenSearch. Or in the other way,

  1. Remove FlintMedataLogService operation from FlintOpenSearchClient
  2. Modify FlintOpenSearchClient caller to call FlintMedataLogService

Copy link
Member

@vamsimanohar vamsimanohar Jun 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for above comment.
Also, what is the value add of FlintOpenSearchClient if we are passing down everything to FlintMetadataLogService?

Can we replace FlintOpenSearchClient with FlintMetadataLogService where ever we make operations on FlintMetadataLog.

Copy link
Collaborator Author

@seankao-az seankao-az Jun 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For startTransaction I agree that we can make the callers use FlintMetadataLogService instead. However, FlintOpenSearchClient getIndexMetadata still needs FlintMetadataLogService to getMetadataLog.

Will try to move it away from getIndexMetadata and instead use it in its caller (FlintSpark describeIndex)

Copy link
Collaborator

@dai-chen dai-chen Jun 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously we only initialize request index when starting transaction in createIndex. Now current change is to init whenever new API getIndexMetadataLog called? Not sure the impact and if benefit on code structure.

Copy link
Collaborator Author

@seankao-az seankao-az Jun 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now current change is to init whenever new API getIndexMetadataLog called?

Sure... init is indeed not required. Changed so that the FlintMetadataLogService public interface for getIndexMetadataLog cannot init request index. It gets metadata log if available.
Now the initIfNotExist is only used as private method for FlintOpenSearchMetadataLogService as part of startTransaction with forceInit.

Copy link
Collaborator Author

@seankao-az seankao-az Jun 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed FlintMetadataLogService operation from FlintOpenSearchClient

@noCharger
Copy link
Collaborator

Any plan to move FlintClient / FlintMetadataLogService to commons module?

Signed-off-by: Sean Kao <seankao@amazon.com>
Signed-off-by: Sean Kao <seankao@amazon.com>
@seankao-az
Copy link
Collaborator Author

move FlintMetadataLogService to commons module

will do in separate PR

Signed-off-by: Sean Kao <seankao@amazon.com>
@seankao-az seankao-az force-pushed the metadata-log-service branch from d81a35d to 79c7937 Compare June 14, 2024 19:19
Signed-off-by: Sean Kao <seankao@amazon.com>
Signed-off-by: Sean Kao <seankao@amazon.com>
Signed-off-by: Sean Kao <seankao@amazon.com>
@seankao-az seankao-az force-pushed the metadata-log-service branch from 9502ad6 to 346e344 Compare June 14, 2024 22:16
Signed-off-by: Sean Kao <seankao@amazon.com>
val latest = flintMetadataLogService
.getIndexMetadataLog(indexName)
.flatMap(_.getLatest)
if (latest.isPresent) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may be not relelated to this PR, why latest metadatalog is requried? seems it could be optional?

Copy link
Collaborator Author

@seankao-az seankao-az Jun 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was introduced as part of show flint index statement, to get index state. And indeed it is optional. The log entry is added to the FlintMetadata if available

@seankao-az seankao-az requested a review from penghuo June 18, 2024 17:04
Copy link
Collaborator

@dai-chen dai-chen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes!

@seankao-az seankao-az merged commit d0caa7b into opensearch-project:main Jun 18, 2024
4 checks passed
seankao-az added a commit to seankao-az/opensearch-spark that referenced this pull request Aug 9, 2024
…gService (opensearch-project#379)

* metadata log service interface

Signed-off-by: Sean Kao <seankao@amazon.com>

* flint opensearch metadata log service impl

Signed-off-by: Sean Kao <seankao@amazon.com>

* reflection for metadata log service instantiate

Signed-off-by: Sean Kao <seankao@amazon.com>

* add comments

Signed-off-by: Sean Kao <seankao@amazon.com>

* undo using client builder in tests

FlintClientBuilder uses reflection to instantiate
FlintMetadataLogService, which cannot be checked at compile time.

Signed-off-by: Sean Kao <seankao@amazon.com>

* use metadata log service in transaction IT suites

Signed-off-by: Sean Kao <seankao@amazon.com>

* metadata log test suite

Signed-off-by: Sean Kao <seankao@amazon.com>

* flint client builder test

Signed-off-by: Sean Kao <seankao@amazon.com>

* scalafmtAll

Signed-off-by: Sean Kao <seankao@amazon.com>

* undo reflection

Signed-off-by: Sean Kao <seankao@amazon.com>

* rename descriptive symbols

Signed-off-by: Sean Kao <seankao@amazon.com>

* descriptive parameter name and update comment

Signed-off-by: Sean Kao <seankao@amazon.com>

* remove init from getMetadataLog interface

Signed-off-by: Sean Kao <seankao@amazon.com>

* remove startTransaction from FlintClient

Signed-off-by: Sean Kao <seankao@amazon.com>

* rm flint os client dep on metadata log service

Signed-off-by: Sean Kao <seankao@amazon.com>

* rename function and remove unused flint client

Signed-off-by: Sean Kao <seankao@amazon.com>

* remove test case for init in getMetadataLog

Signed-off-by: Sean Kao <seankao@amazon.com>

* abstract recordHeartbeat method

Signed-off-by: Sean Kao <seankao@amazon.com>

* consistent var naming for metadataLogIndexName

Signed-off-by: Sean Kao <seankao@amazon.com>

---------

Signed-off-by: Sean Kao <seankao@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0.5 maintenance Code refactoring
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants