Add spark connector #1780

Merged: 19 commits into opensearch-project:main, Jul 5, 2023


rupal-bq
Contributor

Description

  • Set up the Spark connector module

Issues Resolved

#1721

Check List

  • New functionality includes testing.
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented.
    • New functionality has javadoc added
    • New functionality has user manual doc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

vmmusings and others added 7 commits June 7, 2023 18:00
Signed-off-by: Vamsi Manohar <reddyvam@amazon.com>
Signed-off-by: Rupal Mahajan <maharup@amazon.com>
Signed-off-by: Rupal Mahajan <maharup@amazon.com>
Signed-off-by: Rupal Mahajan <maharup@amazon.com>
Signed-off-by: Rupal Mahajan <maharup@amazon.com>
Signed-off-by: Rupal Mahajan <maharup@amazon.com>
Signed-off-by: Rupal Mahajan <maharup@amazon.com>

codecov bot commented Jun 26, 2023

Codecov Report

Merging #1780 (be15dae) into main (fa840e0) will increase coverage by 0.01%.
The diff coverage is 100.00%.

@@             Coverage Diff              @@
##               main    #1780      +/-   ##
============================================
+ Coverage     97.32%   97.33%   +0.01%     
- Complexity     4458     4490      +32     
============================================
  Files           388      394       +6     
  Lines         11050    11118      +68     
  Branches        790      795       +5     
============================================
+ Hits          10754    10822      +68     
  Misses          289      289              
  Partials          7        7              
Flag Coverage Δ
sql-engine 97.33% <100.00%> (+0.01%) ⬆️


Impacted Files Coverage Δ
...implementation/SparkSqlFunctionImplementation.java 100.00% <100.00%> (ø)
...ctions/resolver/SparkSqlTableFunctionResolver.java 100.00% <100.00%> (ø)
...nctions/scan/SparkSqlFunctionTableScanBuilder.java 100.00% <100.00%> (ø)
...ensearch/sql/spark/storage/SparkStorageEngine.java 100.00% <100.00%> (ø)
...nsearch/sql/spark/storage/SparkStorageFactory.java 100.00% <100.00%> (ø)
...a/org/opensearch/sql/spark/storage/SparkTable.java 100.00% <100.00%> (ø)

rupal-bq added 8 commits June 26, 2023 11:13
Signed-off-by: Rupal Mahajan <maharup@amazon.com>
Signed-off-by: Rupal Mahajan <maharup@amazon.com>
Signed-off-by: Rupal Mahajan <maharup@amazon.com>
Signed-off-by: Rupal Mahajan <maharup@amazon.com>
Signed-off-by: Rupal Mahajan <maharup@amazon.com>
Signed-off-by: Rupal Mahajan <maharup@amazon.com>
Signed-off-by: Rupal Mahajan <maharup@amazon.com>
Signed-off-by: Rupal Mahajan <maharup@amazon.com>
spark/lombok.config (outdated, resolved)

SparkQueryRequest sparkQueryRequest = new SparkQueryRequest();
arguments.forEach(arg -> {
  String argName = ((NamedArgumentExpression) arg).getArgName();
Collaborator

Is arg always a NamedArgumentExpression? How do you guarantee that?
You could do the cast in the constructor, though.
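
One way to address this concern is to check the expression type before casting. The following is only a minimal sketch: `Expression`, `NamedArgumentExpression`, and `LiteralExpression` are simplified, hypothetical stand-ins rather than the project's real classes, and the argument name `query` and the error message are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class NamedArgumentCheck {

  // Hypothetical stand-ins for the project's expression types.
  interface Expression {}

  static final class NamedArgumentExpression implements Expression {
    private final String argName;

    NamedArgumentExpression(String argName) {
      this.argName = argName;
    }

    String getArgName() {
      return argName;
    }
  }

  static final class LiteralExpression implements Expression {}

  // Collects argument names, rejecting anything that is not a named argument
  // instead of letting an unchecked cast fail with a ClassCastException.
  static List<String> argumentNames(List<Expression> arguments) {
    List<String> names = new ArrayList<>();
    for (Expression arg : arguments) {
      if (!(arg instanceof NamedArgumentExpression)) {
        throw new IllegalArgumentException(
            "Expected a named argument but got: " + arg.getClass().getSimpleName());
      }
      names.add(((NamedArgumentExpression) arg).getArgName());
    }
    return names;
  }

  public static void main(String[] args) {
    // "query" is an illustrative argument name, not taken from the PR.
    List<Expression> valid = List.of(new NamedArgumentExpression("query"));
    System.out.println(argumentNames(valid));

    try {
      argumentNames(List.of(new LiteralExpression()));
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Doing the same check once in a constructor, as the reviewer suggests, would keep the cast in a single place.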


@Override
public Table getTable(DataSourceSchemaName dataSourceSchemaName, String tableName) {
  return null;
Collaborator

Why? Maybe throw?

Contributor Author

done
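
The thread does not show what the final change looked like. As a rough sketch of the reviewer's suggestion, an unsupported lookup could throw instead of returning null. The types below are hypothetical placeholders, not the project's real interfaces, and the exception type and message are assumptions.

```java
public class UnsupportedTableLookup {

  // Hypothetical placeholders for the project's types.
  interface Table {}

  static final class DataSourceSchemaName {}

  interface StorageEngine {
    Table getTable(DataSourceSchemaName dataSourceSchemaName, String tableName);
  }

  static final class FunctionOnlyStorageEngine implements StorageEngine {
    @Override
    public Table getTable(DataSourceSchemaName dataSourceSchemaName, String tableName) {
      // Fail loudly rather than returning null, which would otherwise surface
      // later as a NullPointerException far from the real cause.
      throw new UnsupportedOperationException(
          "Direct table access is not supported: " + tableName);
    }
  }

  public static void main(String[] args) {
    StorageEngine engine = new FunctionOnlyStorageEngine();
    try {
      engine.getTable(new DataSourceSchemaName(), "my_table");
    } catch (UnsupportedOperationException e) {
      System.out.println(e.getMessage());
    }
  }
}
```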

*/
StorageEngine getStorageEngine(Map<String, String> requiredConfig) {
SparkClient sparkClient = null;
//TODO: Initialize spark client
Collaborator

TODO

Contributor Author

All TODOs in this PR will be resolved in the follow-up #1790. I am splitting the Spark connector into three parts to avoid one large PR.

assertEquals("Invalid Function Argument:tmp", exception.getMessage());
}

}
Collaborator

blank line

Collaborator

Yury-Fridlyand commented Jun 27, 2023

Can you add ITs, docs and doctests?

@Yury-Fridlyand
Collaborator

Is explain supported for Spark queries? If not, please add an IT. If yes, please add an IT too (and maybe a doctest and docs as well).

Signed-off-by: Rupal Mahajan <maharup@amazon.com>
@rupal-bq
Contributor Author

> Can you add ITs, docs and doctests?

Yes. I need at least one client implementation before I can add proper ITs, so I plan to add ITs, docs and doctests in the follow-up #1790.

spark/lombok.config (outdated, resolved)
rupal-bq added 3 commits July 4, 2023 15:34
Signed-off-by: Rupal Mahajan <maharup@amazon.com>
Signed-off-by: Rupal Mahajan <maharup@amazon.com>
Signed-off-by: Rupal Mahajan <maharup@amazon.com>
Collaborator

penghuo left a comment

Thanks for the change. Please add ITs in the follow-up PR.

Collaborator

dai-chen left a comment

Thanks for the changes!

@rupal-bq rupal-bq merged commit a816a58 into opensearch-project:main Jul 5, 2023
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jul 5, 2023
* Create Spark Connector

Signed-off-by: Vamsi Manohar <reddyvam@amazon.com>

* Add spark client and engine

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Remove vars

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Spark connector draft

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* nit

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Fix checkstyle errors

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* nit

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Fix license header

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Add spark storage test

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Update comments

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Fix checkstyle in comments

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Update tests

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Address PR comments

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Refactor class name

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Address PR comment

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

---------

Signed-off-by: Vamsi Manohar <reddyvam@amazon.com>
Signed-off-by: Rupal Mahajan <maharup@amazon.com>
Co-authored-by: Vamsi Manohar <reddyvam@amazon.com>
(cherry picked from commit a816a58)
matthewryanwells pushed a commit to Bit-Quill/opensearch-project-sql that referenced this pull request Jul 7, 2023
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jul 11, 2023
(cherry picked from commit a816a58)
penghuo pushed a commit that referenced this pull request Jul 11, 2023
* Add spark connector (#1780)

(cherry picked from commit a816a58)

* Upgrade httpclient to 4.5.14 to fix build

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Revert Upgrade httpclient to 4.5.14

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

---------

Signed-off-by: Rupal Mahajan <maharup@amazon.com>
Co-authored-by: Rupal Mahajan <maharup@amazon.com>
penghuo pushed a commit that referenced this pull request Jul 11, 2023
(cherry picked from commit a816a58)

Co-authored-by: Rupal Mahajan <maharup@amazon.com>