[FLINK-25756] [connectors/opensearch] Dedicated Opensearch connectors #18541

Closed
wants to merge 1 commit into from

reta
Member

@reta reta commented Jan 27, 2022

Signed-off-by: Andriy Redko andriy.redko@aiven.io

What is the purpose of the change

The goal of this change is to provide dedicated Opensearch connectors.

Brief change log

The implementation is largely based on the existing Elasticsearch 7 connector with a few notable changes (besides the dependencies and APIs):

  • any mentions and uses of mapping types have been removed: they are a deprecated feature, scheduled for removal (indices with mapping types cannot be created in or migrated to Opensearch 1.x and beyond)
  • any mentions and uses of the transport client have been removed: it is a deprecated feature, scheduled for removal (only RestHighLevelClient is used)
  • the default distributions of Opensearch come with HTTPS turned on, using self-signed certificates: to simplify the integration, a new option allow-insecure has been added to suppress certificate validation for development and testing purposes
  • old streaming APIs are also supported to facilitate the migration of existing applications from Elasticsearch 7/6 to Opensearch (the classes will change but the familiar model will stay)
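
The description does not show how allow-insecure is implemented. As a rough illustration only, suppressing certificate validation in Java usually comes down to an SSLContext whose trust manager accepts every certificate. The sketch below uses only the JDK; the class and method names (InsecureSslSketch, trustAllContext) are hypothetical and are not taken from the connector.

```java
import java.security.GeneralSecurityException;
import java.security.cert.X509Certificate;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

// Hypothetical sketch, not the connector's code: builds an SSLContext that
// skips certificate validation. Suitable for development and testing only.
final class InsecureSslSketch {

    /** Returns an SSLContext that accepts any server certificate, including self-signed ones. */
    static SSLContext trustAllContext() {
        TrustManager trustAll = new X509TrustManager() {
            @Override
            public void checkClientTrusted(X509Certificate[] chain, String authType) {
                // Intentionally empty: client certificates are not validated.
            }

            @Override
            public void checkServerTrusted(X509Certificate[] chain, String authType) {
                // Intentionally empty: self-signed server certificates are accepted.
            }

            @Override
            public X509Certificate[] getAcceptedIssuers() {
                return new X509Certificate[0];
            }
        };
        try {
            SSLContext context = SSLContext.getInstance("TLS");
            // No key managers and the default SecureRandom; only trust is customised.
            context.init(null, new TrustManager[] {trustAll}, null);
            return context;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Could not create the SSL context", e);
        }
    }

    public static void main(String[] args) {
        SSLContext context = trustAllContext();
        System.out.println("Protocol: " + context.getProtocol());
    }
}
```

A context like this would typically be handed to the HTTP client only when the option is explicitly enabled, so that the default behaviour still validates certificates.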

The new connector name is opensearch and it follows the existing conventions:

CREATE TABLE users ( ... ) WITH (
  'connector' = 'opensearch', 
  'hosts' = 'https://localhost:9200',
  'index' = 'users', 
  'allow-insecure' = 'true', 
  'username' = 'admin', 
  'password' = 'admin');
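
To illustrate how a 'hosts' value such as the one above might be interpreted, here is a minimal JDK-only sketch. It assumes the option follows the Elasticsearch connector's convention of semicolon-separated scheme://host:port entries; that assumption, and the names HostsOptionSketch and parseHosts, are illustrative and not taken from this pull request.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the connector's code: validates a 'hosts' option value.
final class HostsOptionSketch {

    /** Splits a semicolon-separated 'hosts' value into URIs, requiring scheme, host and port. */
    static List<URI> parseHosts(String hostsOption) {
        List<URI> hosts = new ArrayList<>();
        for (String raw : hostsOption.split(";")) {
            String trimmed = raw.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate a trailing or doubled separator
            }
            // URI.create throws IllegalArgumentException on malformed input.
            URI uri = URI.create(trimmed);
            if (uri.getScheme() == null || uri.getHost() == null || uri.getPort() == -1) {
                throw new IllegalArgumentException(
                        "Expected scheme://host:port but got: " + trimmed);
            }
            hosts.add(uri);
        }
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("At least one host is required");
        }
        return hosts;
    }

    public static void main(String[] args) {
        for (URI host : parseHosts("https://localhost:9200")) {
            System.out.println(host.getScheme() + " " + host.getHost() + " " + host.getPort());
        }
    }
}
```

Rejecting entries without an explicit scheme matters here because the default Opensearch distributions serve HTTPS, so silently defaulting to HTTP would fail in a confusing way.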

Verifying this change

This change adds comprehensive tests (largely ported from the existing unit and integration tests for Elasticsearch 7) and can be verified as follows:

  • Added unit tests
  • Added integration tests for end-to-end
  • Added end-to-end tests
  • Manually verified the connector by running a node cluster

Does this pull request potentially affect one of the following parts:

  • Dependencies: yes (the latest Opensearch 1.2.4 APIs as of this moment)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): yes
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? yes
  • If yes, how is the feature documented? (docs - in progress, JavaDocs)

Huge thanks to @snuyanzin for the help.

@flinkbot
Collaborator

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit 95b34f9 (Thu Jan 27 15:31:50 UTC 2022)

Warnings:

  • 4 pom.xml files were touched: Check for build and licensing issues.
  • No documentation files were touched! Remember to keep the Flink docs up to date!
  • This pull request references an unassigned Jira ticket. According to the code contribution guide, tickets need to be assigned before starting with the implementation work.

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.


The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands

The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@flinkbot
Collaborator

flinkbot commented Jan 27, 2022

CI report:

Bot commands

The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

@reta reta force-pushed the FLINK-25756 branch 4 times, most recently from a520c43 to e31965a Compare January 31, 2022 21:38
@CrynetLogistics
Contributor

Hi @reta Thanks for the contribution!
I was wondering if you considered basing your implementation on the Generic Async Sink, new in 1.15?

@reta
Member Author

reta commented Feb 1, 2022

Hey @CrynetLogistics, thanks a lot for pointing it out. No, we have not considered the Generic Async Sink API; it seems to have been introduced around the same time this pull request was created (but it sounds like a good future improvement for sure). Thank you!

@reta reta marked this pull request as ready for review February 1, 2022 20:53
@reta reta force-pushed the FLINK-25756 branch 3 times, most recently from cb9026e to 588cb7a Compare February 4, 2022 14:05
@MartijnVisser
Contributor

@reta Thanks for your patience! This week we started our first external connector repository project, which moves the Elasticsearch connector out of this repository to https://github.com/apache/flink-connector-elasticsearch

I think it would be best to first get that one moved out, so we can understand the actual issues that we might run into. When that one is done, I propose to create a dedicated repo for Opensearch and move your code to that repo. What do you think?

@reta
Member Author

reta commented Mar 29, 2022

Sounds great, thank you @MartijnVisser! Would you mind looking at #18634 (cleanup of test dependencies) before moving the Elasticsearch repository? Thank you!

@reta reta changed the title from "[FLINK-25756] [connectors] Dedicated Opensearch connectors" to "[FLINK-25756] [connectors/opensearch] Dedicated Opensearch connectors" Mar 30, 2022
@MartijnVisser MartijnVisser self-assigned this Mar 30, 2022
@MartijnVisser MartijnVisser removed their assignment Mar 30, 2022
@reta reta force-pushed the FLINK-25756 branch 3 times, most recently from 3498b37 to b0ea656 Compare March 30, 2022 18:23
@reta reta force-pushed the FLINK-25756 branch 2 times, most recently from f285d71 to dd322f2 Compare May 4, 2022 14:16
@reta
Member Author

reta commented May 26, 2022

@MartijnVisser I see that https://github.com/apache/flink-connector-elasticsearch is getting filled in. Do you think I could re-target the pull request to this repository (or, alternatively, could a new one, https://github.com/apache/flink-connector-opensearch, be created)? Thank you.

@MartijnVisser
Contributor

Based on the latest discussion on the mailing list, we identified that if we want to create a new connector, we need to create a small FLIP (see https://cwiki.apache.org/confluence/display/FLINK/FLIP+Connector+Template). Do you have edit rights on the wiki? If not, I can ask a PMC to take care of that for you. After a FLIP discussion and a vote, we can create the repo directly. I think that makes the most sense. What do you think?

@reta
Member Author

reta commented May 30, 2022

Thanks @MartijnVisser, sure, I will write the FLIP and follow the process. I don't have write permissions for the Apache Flink space to create the new page for the connector, so I would really appreciate your help (same ASF username as on GitHub). Thank you!

@reta reta force-pushed the FLINK-25756 branch 3 times, most recently from a9dee23 to 55fbee8 Compare June 22, 2022 18:45
@reta reta force-pushed the FLINK-25756 branch 2 times, most recently from 1eae18f to eabb85e Compare September 1, 2022 20:22
@reta reta force-pushed the FLINK-25756 branch 2 times, most recently from 80e9b3a to 6b19cf6 Compare September 9, 2022 18:16
@reta reta force-pushed the FLINK-25756 branch 6 times, most recently from 4bb685d to 3e627df Compare November 3, 2022 12:33
Signed-off-by: Andriy Redko <andriy.redko@aiven.io>
@MartijnVisser
Contributor

@reta Could you sync this PR to use the same setup as https://github.com/apache/flink-connector-elasticsearch? Especially the parent POM setup and the CI configuration.

@MartijnVisser MartijnVisser removed the request for review from zentol November 14, 2022 14:19
@reta
Member Author

reta commented Nov 14, 2022

Will do, thanks @MartijnVisser !

@reta
Member Author

reta commented Nov 14, 2022

@MartijnVisser I am closing this one in favor of apache/flink-connector-opensearch#1
