Skip to content

Conversation

@Hisoka-X
Copy link
Member

@Hisoka-X Hisoka-X commented Jul 4, 2023

What changes were proposed in this pull request?

  1. This PR add dropTable function to JdbcDialect. So user can override dropTable SQL by other JdbcDialect like Neo4J
    Neo4J Drop case
MATCH (m:Person {name: 'Mark'})
DELETE m
  1. Also add getInsertStatement for same reason.
    Neo4J Insert case
MATCH (p:Person {name: 'Jennifer'})
SET p.birthdate = date('1980-01-01')
RETURN p

Neo4J SQL(in fact named CQL) not like normal SQL, but it have JDBC driver.

Why are the changes needed?

Make JdbcDialect more useful

Does this PR introduce any user-facing change?

No

How was this patch tested?

exist test

@github-actions github-actions bot added the SQL label Jul 4, 2023
@Hisoka-X
Copy link
Member Author

Hisoka-X commented Jul 4, 2023

cc @HyukjinKwon @MaxGekk

@Hisoka-X Hisoka-X force-pushed the SPARK-44262_JDBCUtils_improve branch from 4ed366d to 06ddb9a Compare July 4, 2023 15:01
Copy link
Member

@MaxGekk MaxGekk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

other JdbcDialect like Neo4J

@Hisoka-X What's the issue Neo4J? Could you provide more details in PR's description, please.

LGTM in general.

@HyukjinKwon
Copy link
Member

cc @sadikovi FYI

@Hisoka-X Hisoka-X changed the title [SPARK-44262][SQL] Add DropTable to JdbcDialect [SPARK-44262][SQL] Add DropTable and getInsertStatement to JdbcDialect Jul 5, 2023
@Hisoka-X
Copy link
Member Author

Hisoka-X commented Jul 5, 2023

@fbiville Hi, can you tell us more detail about how neo4j use drop table or insert statement in JDBC? Thanks.

@Hisoka-X
Copy link
Member Author

Hisoka-X commented Jul 5, 2023

other JdbcDialect like Neo4J

@Hisoka-X What's the issue Neo4J? Could you provide more details in PR's description, please.

LGTM in general.

Already Update. FYI, also move insertStatement to JdbcDialect.

@MaxGekk MaxGekk changed the title [SPARK-44262][SQL] Add DropTable and getInsertStatement to JdbcDialect [SPARK-44262][SQL] Add dropTable and getInsertStatement to JdbcDialect Jul 6, 2023
@fbiville
Copy link

fbiville commented Jul 7, 2023

@fbiville Hi, can you tell us more detail about how neo4j use drop table or insert statement in JDBC? Thanks.

Hello, I experimented using https://github.com/neo4j-contrib/neo4j-jdbc and a third-party, closed-source SDK based on Spark.

The Neo4j JDBC connector supports only Cypher queries, not SQL queries.

After a conversation with the third-party maintainer, I learnt they were calling org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils#getSchemaOption, which would fail right away since there is no JdbcDialect registered for Neo4j.

However, I quickly realize that supplying a Neo4j dialect would not be enough, as some SQL statements are hardcoded directly in JdbcUtils and those would not work against Neo4j.

Does that help?

@Hisoka-X Hisoka-X requested a review from MaxGekk July 10, 2023 11:40
@Hisoka-X
Copy link
Member Author

Hisoka-X commented Aug 3, 2023

Hi @MaxGekk , can you continue review this PR? Please.

@Hisoka-X
Copy link
Member Author

Hisoka-X commented Sep 1, 2023

Since the 3.5.0 rc3 will failed, shell we add this PR to 3.5.0 now? @cloud-fan @MaxGekk @HyukjinKwon

@HyukjinKwon
Copy link
Member

Nah, let's don't add a new API to 3.5.0 at this moment.

@Hisoka-X
Copy link
Member Author

Hisoka-X commented Sep 4, 2023

Nah, let's don't add a new API to 3.5.0 at this moment.

Got it, let me change since to 4.0.0

* @return The SQL query to use for insert data into table.
*/
@Since("4.0.0")
def getInsertStatement(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's use consistent naming in this file. How about def insertIntoTable?

Copy link
Member Author

@Hisoka-X Hisoka-X Sep 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concerns is this method only return sql string statement, not do insert into table.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we already have createTable, alterTable, and you just added dropTable :)

Copy link
Member Author

@Hisoka-X Hisoka-X Sep 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean the behavior of getInsertStatement is return the string SQL, unlike createTable, alterTable,dropTable do create/alter/drop to database. The getInsertStatement dosen't insert data into table, it just used to prepare JDBCStatement, change it to insertIntoTable seem like not match with it behavior.

oh just realized that other method doesn't actually perform either. Let me change it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another question, createTable do create table, but other method just return SQL string, should we let createTable only return string to make JdbcDialect behave more consistently?

@Hisoka-X Hisoka-X requested a review from cloud-fan September 12, 2023 02:06
@Hisoka-X Hisoka-X force-pushed the SPARK-44262_JDBCUtils_improve branch from c9f38da to 02339a3 Compare October 12, 2023 11:45
@Hisoka-X Hisoka-X force-pushed the SPARK-44262_JDBCUtils_improve branch from 4beea08 to a9dfc49 Compare October 13, 2023 12:27
Copy link
Contributor

@beliefer beliefer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM if tests passed.
cc @cloud-fan

rddSchema: StructType,
tableSchema: Option[StructType],
isCaseSensitive: Boolean,
dialect: JdbcDialect): String = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: getInsertStatement is not a public API but we don't stop people from calling JdbcUtils as there is no private[spark]. I'm fine keeping it to avoid troubles for third-party Spark vendors.

@MaxGekk
Copy link
Member

MaxGekk commented Oct 16, 2023

+1, LGTM. Merging to master.
Thank you, @Hisoka-X and @cloud-fan @beliefer @yaooqinn @HyukjinKwon @fbiville for review.

@MaxGekk MaxGekk closed this in 6994bad Oct 16, 2023
@Hisoka-X
Copy link
Member Author

Thanks @MaxGekk and all!

@Hisoka-X Hisoka-X deleted the SPARK-44262_JDBCUtils_improve branch October 16, 2023 10:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants