
Reliance on Spark's private JDBCRDD API breaks compilation against Spark 1.5.0 #40

@JoshRosen

Description

This library's use of Spark's private JDBCRDD API means that it no longer compiles against Spark 1.5.0:

/build/sbt -Dspark.version=1.5.0-SNAPSHOT clean test

[error] /Users/joshrosen/Documents/spark-redshift/src/main/scala/org/apache/spark/sql/jdbc/RedshiftJDBCWrapper.scala:33: not found: value JDBCRDD
[error]     JDBCRDD.getConnector(driver, url, properties)
[error]     ^
[error] /Users/joshrosen/Documents/spark-redshift/src/main/scala/org/apache/spark/sql/jdbc/RedshiftJDBCWrapper.scala:34: not found: value JdbcUtils
[error]   def tableExists(conn: Connection, table: String) = JdbcUtils.tableExists(conn, table)
[error]                                                      ^
[error] /Users/joshrosen/Documents/spark-redshift/src/main/scala/org/apache/spark/sql/jdbc/RedshiftJDBCWrapper.scala:29: not found: value DriverRegistry
[error]   def registerDriver(driverClass: String) = DriverRegistry.register(driverClass)
[error]                                             ^
[error] /Users/joshrosen/Documents/spark-redshift/src/main/scala/org/apache/spark/sql/jdbc/RedshiftJDBCWrapper.scala:31: not found: value JDBCRDD
[error]     JDBCRDD.resolveTable(jdbcUrl, table, properties)
[error]     ^
[error] /Users/joshrosen/Documents/spark-redshift/src/main/scala/org/apache/spark/sql/jdbc/RedshiftJDBCWrapper.scala:28: not found: value JDBCWriteDetails
[error]   def schemaString(dataFrame: DataFrame, url: String) = JDBCWriteDetails.schemaString(dataFrame, url)
[error]                                                         ^
[error] 5 errors found
[error] (compile:compile) Compilation failed
[error] Total time: 21 s, completed Aug 18, 2015 9:23:39 AM

The problem is that these classes moved from the org.apache.spark.sql.jdbc package to org.apache.spark.sql.execution.datasources.jdbc.

There are a few ways that we can work around this. Ideally we would only use public Spark SQL APIs; if that's not possible, then we can consider opening up certain internal SQL APIs as @DeveloperApi or public. If we want to support pre-1.5.x versions, then we might still need to rely on APIs that were private in those releases.

If we're fine with continuing to rely on private APIs, then we can use reflection to maintain compatibility with both 1.4.x and 1.5.x. If we do this, though, we're going to need really good tests that exercise all of the reflection code. As a result, I propose that we defer fixing this and wait until we've set up end-to-end test infrastructure for the 1.4.x-compatible code.
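If we go the reflection route, a minimal sketch could try each known package location in order and use whichever class loads. The helper object and method names below are hypothetical, invented for illustration; only the two JDBCRDD package locations come from this issue.

```scala
// Hypothetical compatibility helper for classes that moved packages
// between Spark releases. ReflectionCompat and loadFirst are
// illustrative names, not part of this library or of Spark.
object ReflectionCompat {
  /** Return the first class that loads from the candidate names, if any. */
  def loadFirst(candidates: Seq[String]): Option[Class[_]] =
    candidates.iterator
      .map { name =>
        try Some(Class.forName(name))
        catch { case _: ClassNotFoundException => None }
      }
      .collectFirst { case Some(cls) => cls }
}

// Usage: try the Spark 1.5.x location first, then fall back to 1.4.x.
// (A Scala object compiles to a JVM class named with a trailing `$`.)
val jdbcRdd: Option[Class[_]] = ReflectionCompat.loadFirst(Seq(
  "org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$",
  "org.apache.spark.sql.jdbc.JDBCRDD$"))
```

Resolving the class is only half the work: each method we call on it (getConnector, resolveTable, and so on) would still need to be looked up and invoked reflectively, and any signature difference between versions would only surface at runtime, which is exactly why the test coverage mentioned above matters.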
