Add JDBC datasource #1361

Closed
penghuo wants to merge 16 commits

penghuo (Collaborator) commented Feb 20, 2023

Description

  1. Add JDBC datasource support. Only org.apache.hive.jdbc.HiveDriver is supported.
  2. Doc: https://github.com/penghuo/os-sql/blob/pr/spark/jdbc/docs/user/ppl/admin/jdbc.rst#ppl-supported-for-jdbc-connector

Issues Resolved

#1331

Check List

  • New functionality includes testing.
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented.
    • New functionality has javadoc added
    • New functionality has user manual doc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

codecov-commenter commented Feb 20, 2023

Codecov Report

Merging #1361 (4c95018) into main (bc39346) will increase coverage by 0.05%.
The diff coverage is 100.00%.

@@             Coverage Diff              @@
##               main    #1361      +/-   ##
============================================
+ Coverage     98.43%   98.48%   +0.05%     
- Complexity     3775     3913     +138     
============================================
  Files           343      355      +12     
  Lines          9364     9708     +344     
  Branches        599      621      +22     
============================================
+ Hits           9217     9561     +344     
  Misses          142      142              
  Partials          5        5              
| Flag | Coverage Δ |
|---|---|
| sql-engine | 98.48% <100.00%> (+0.05%) ⬆️ |

| Impacted Files | Coverage Δ |
|---|---|
| ...ava/org/opensearch/sql/jdbc/JDBCStorageEngine.java | 100.00% <100.00%> (ø) |
| ...va/org/opensearch/sql/jdbc/JDBCStorageFactory.java | 100.00% <100.00%> (ø) |
| ...rg/opensearch/sql/jdbc/functions/JDBCFunction.java | 100.00% <100.00%> (ø) |
| .../sql/jdbc/functions/JDBCTableFunctionResolver.java | 100.00% <100.00%> (ø) |
| ...pensearch/sql/jdbc/operator/JDBCQueryOperator.java | 100.00% <100.00%> (ø) |
| ...ensearch/sql/jdbc/operator/JDBCResponseHandle.java | 100.00% <100.00%> (ø) |
| ...sql/jdbc/operator/JDBCResultSetResponseHandle.java | 100.00% <100.00%> (ø) |
| ...l/jdbc/operator/JDBCUpdateCountResponseHandle.java | 100.00% <100.00%> (ø) |
| ...in/java/org/opensearch/sql/jdbc/parser/Option.java | 100.00% <100.00%> (ø) |
| ...g/opensearch/sql/jdbc/parser/PropertiesParser.java | 100.00% <100.00%> (ø) |

... and 1 more

... and 28 files with indirect coverage changes

@penghuo penghuo self-assigned this Feb 21, 2023
penghuo added 5 commits March 6, 2023 08:03
Signed-off-by: Peng Huo <penghuo@gmail.com>
Signed-off-by: Peng Huo <penghuo@gmail.com>
Signed-off-by: Peng Huo <penghuo@gmail.com>
Signed-off-by: Peng Huo <penghuo@gmail.com>
Signed-off-by: Peng Huo <penghuo@gmail.com>
docs/user/ppl/admin/jdbc.rst (outdated, resolved)
doctest/build.gradle (outdated, resolved)
doctest/build.gradle (resolved)
docs/user/ppl/admin/jdbc.rst (outdated, resolved)
import org.opensearch.sql.exception.SemanticCheckException;

/** Describe a single datasource configuration option. */
@Builder
Collaborator commented:

Why not use a constructor or a static `of` method?
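
For readers unfamiliar with the alternative raised here, the following is a minimal sketch of a static `of` factory. The class name `DatasourceOption` and its two fields are hypothetical stand-ins and are not taken from the PR's `Option` class.

```java
import java.util.Objects;

/** Hypothetical stand-in for the PR's Option class; the fields are assumptions. */
final class DatasourceOption {
  private final String name;
  private final boolean required;

  // Private constructor: callers must go through the factory method.
  private DatasourceOption(String name, boolean required) {
    this.name = Objects.requireNonNull(name, "name");
    this.required = required;
  }

  /** Static factory, the alternative to a Lombok @Builder raised in the comment. */
  static DatasourceOption of(String name, boolean required) {
    return new DatasourceOption(name, required);
  }

  String name() {
    return name;
  }

  boolean isRequired() {
    return required;
  }
}

class DatasourceOptionDemo {
  public static void main(String[] args) {
    DatasourceOption url = DatasourceOption.of("url", true);
    System.out.println(url.name() + " required=" + url.isRequired());
  }
}
```

With only a couple of fields, a factory like this keeps construction in one place without the generated builder type. A builder tends to pay off once there are many optional fields.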

* ``url`` [Required].
* This parameter provides the URL used to connect to the database instance endpoint.
* ``driver`` [Required].
* This parameter provides the driver used to connect to the database instance endpoint. The only supported driver is ``org.apache.hive.jdbc.HiveDriver``.
Collaborator commented:

Why is only HiveDriver supported? There is no hardcoded restriction. Can I use anything else, for example, put the SQLite driver into the CLASSPATH and use it?

penghuo (Collaborator, Author) commented Mar 23, 2023

> Why is only HiveDriver supported? There is no hardcoded restriction.

Yes, only Hive is supported. The possible value is explained in the doc.

> Can I use anything else, for example, put the SQLite driver into the CLASSPATH and use it?

You can, if you put the SQLite driver on the classpath yourself, but it does not come prebuilt.
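
As an illustration of the point about the classpath, here is a minimal sketch of how one might check whether a given JDBC driver class is loadable before configuring a datasource with it. The helper `isDriverOnClasspath` is hypothetical and not part of the PR. `org.sqlite.JDBC` is the class name used by the Xerial SQLite driver and is not named in this thread.

```java
import java.sql.Driver;

class JdbcDriverCheck {
  /**
   * Returns true if className can be loaded and implements java.sql.Driver.
   * The class is loaded without initialization, so the driver does not
   * register itself with DriverManager as a side effect of this check.
   */
  static boolean isDriverOnClasspath(String className) {
    try {
      Class<?> candidate =
          Class.forName(className, false, JdbcDriverCheck.class.getClassLoader());
      return Driver.class.isAssignableFrom(candidate);
    } catch (ClassNotFoundException | LinkageError e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // Whether either name loads depends entirely on what is on the classpath.
    String[] names = {"org.apache.hive.jdbc.HiveDriver", "org.sqlite.JDBC"};
    for (String name : names) {
      System.out.println(name + " available: " + isDriverOnClasspath(name));
    }
  }
}
```

A check like this only shows that a driver can be loaded. It says nothing about whether the datasource's type mapping or response handling works with that driver.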

penghuo added 7 commits March 14, 2023 12:57
Signed-off-by: Peng Huo <penghuo@gmail.com>
Signed-off-by: Peng Huo <penghuo@gmail.com>
Signed-off-by: Peng Huo <penghuo@gmail.com>
Signed-off-by: Peng Huo <penghuo@gmail.com>
Signed-off-by: Peng Huo <penghuo@gmail.com>
Signed-off-by: Peng Huo <penghuo@gmail.com>
@penghuo penghuo marked this pull request as ready for review March 21, 2023 02:52
@penghuo penghuo requested a review from a team as a code owner March 21, 2023 02:52
penghuo added 2 commits March 20, 2023 19:58
Signed-off-by: Peng Huo <penghuo@gmail.com>
Signed-off-by: Peng Huo <penghuo@gmail.com>
Signed-off-by: Peng Huo <penghuo@gmail.com>
@penghuo penghuo removed the v2.7.0 label Mar 24, 2023
into "$projectDir/bin"
}
}
command "$projectDir/bin/${SPARK_BINARY}/bin/spark-class org.apache.spark.deploy.master.Master -h localhost -p 7077 --webui-port 8080"
Collaborator commented:

Is it cross-platform?

Collaborator commented:

To piggy-back on this, is it possible to run Spark in Docker?

MaxKsyunz (Collaborator) left a comment

My main question is: can this data source be used with any JDBC driver?
If not, is it possible to make it clear to the users that Spark is the only supported JDBC data source?
I see now this was addressed in the docs section.

I also had a question about whether the change to security policies is required.

As a minor point, I flagged a few instances where Immutable* types were used instead of Java core types.

Comment on lines +19 to +21
// hive2 jdbc required
permission java.io.FilePermission "*", "read, write";
permission java.security.SecurityPermission "*";
Collaborator commented:

Can this be made a configuration option?

If I understand this correctly, this will permanently broaden the permissions of the SQL plugin, yet they are only necessary when a JDBC data source with the Hive2 driver is used.
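
To make the breadth of the quoted `SecurityPermission "*"` grant concrete, here is a minimal sketch that uses the standard wildcard matching of `java.security.SecurityPermission`. The helper `covers` is hypothetical. The target names are standard `SecurityPermission` targets chosen for illustration, and nothing here claims to list what the Hive driver actually needs.

```java
import java.security.SecurityPermission;

class WildcardPermissionDemo {
  /** True if a grant named `granted` would cover a request named `requested`. */
  static boolean covers(String granted, String requested) {
    return new SecurityPermission(granted).implies(new SecurityPermission(requested));
  }

  public static void main(String[] args) {
    // A bare "*" covers every SecurityPermission target.
    System.out.println(covers("*", "insertProvider"));
    // A trailing ".*" covers only targets under that prefix.
    System.out.println(covers("putProviderProperty.*", "putProviderProperty.SUN"));
    // A specific name covers nothing but itself.
    System.out.println(covers("insertProvider", "setPolicy"));
  }
}
```

If the driver's real requirements were known, naming the individual targets instead of `"*"` would be one way to narrow the grant, independent of whether it is also made configurable.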

case Types.TIMESTAMP:
return TIMESTAMP;

// we assume the result is json encoded string. refer https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.0.0.2/ds_Hive/jdbc-hs2.html,
Collaborator commented:

Is this assumption safe to make for other JDBC drivers?
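
One way to avoid baking a driver-specific assumption into the type mapping is to fail loudly on any JDBC type code that has not been mapped explicitly. The sketch below is hypothetical: the enum `EngineType`, the method `toEngineType`, and the `VARCHAR` mapping are illustrative and do not reproduce the PR's actual mapping.

```java
import java.sql.Types;

class JdbcTypeMappingSketch {
  /** Hypothetical engine-side types; the real project defines its own. */
  enum EngineType {
    INTEGER,
    TIMESTAMP,
    STRING
  }

  /** Maps a java.sql.Types code to an engine type, rejecting unmapped codes. */
  static EngineType toEngineType(int jdbcType) {
    switch (jdbcType) {
      case Types.INTEGER:
        return EngineType.INTEGER;
      case Types.TIMESTAMP:
        return EngineType.TIMESTAMP;
      case Types.VARCHAR:
        return EngineType.STRING;
      default:
        // No guess about how a driver encodes complex values.
        throw new IllegalArgumentException("Unsupported JDBC type code: " + jdbcType);
    }
  }

  public static void main(String[] args) {
    System.out.println(toEngineType(Types.TIMESTAMP));
  }
}
```

Rejecting unmapped codes trades convenience for safety: a driver that reports an unexpected type produces an explicit error rather than a value decoded under another driver's conventions.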

@Override
public ExecutionEngine.Schema schema() {
return new ExecutionEngine.Schema(
ImmutableList.of(new ExecutionEngine.Schema.Column("result", "result", INTEGER)));
Collaborator commented:

Suggested change
- ImmutableList.of(new ExecutionEngine.Schema.Column("result", "result", INTEGER)));
+ List.of(new ExecutionEngine.Schema.Column("result", "result", INTEGER)));

*/
public PropertiesParser() {
options =
new ImmutableList.Builder<Option>()
Collaborator commented:

Can this be written using java.util.List?
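
For reference, a minimal sketch of the core-Java equivalents being suggested: `List.of` for a fixed set of elements and `List.copyOf` (Java 10+) for a list assembled step by step, as `ImmutableList.Builder` would do. The class and method names and the element values are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class CoreListDemo {
  /** Builds a list incrementally, then returns an unmodifiable snapshot of it. */
  static List<String> buildOptionNames() {
    List<String> names = new ArrayList<>();
    names.add("url");
    names.add("driver");
    return List.copyOf(names);
  }

  /** Returns true if the list rejects mutation. Note: a mutable list gains an element. */
  static boolean isUnmodifiable(List<String> list) {
    try {
      list.add("extra");
      return false;
    } catch (UnsupportedOperationException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(isUnmodifiable(List.of("result")));
    System.out.println(isUnmodifiable(buildOptionNames()));
  }
}
```

Like Guava's `ImmutableList`, both `List.of` and `List.copyOf` reject `null` elements, so the swap should not change null handling.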

@penghuo penghuo marked this pull request as draft April 10, 2023 16:48
@penghuo penghuo closed this May 16, 2023
luyuncheng commented Apr 16, 2024

May I ask why this PR was closed? Is there any plan to continue this work?

nazarovkv commented

> May I ask why this PR was closed? Is there any plan to continue this work?

+1

@penghuo, is it possible to proceed with these changes? It looks like the community could lose a valuable feature.
