Conversation

@taroplus
Contributor

TIMESTAMP (-101), BINARY_DOUBLE (101) and BINARY_FLOAT (100) are handled in OracleDialect

What changes were proposed in this pull request?

When an Oracle table contains columns of type BINARY_FLOAT or BINARY_DOUBLE, Spark SQL fails to load the table with a SQLException:

java.sql.SQLException: Unsupported type 101
 at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$getCatalystType(JdbcUtils.scala:235)
 at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$8.apply(JdbcUtils.scala:292)
 at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$8.apply(JdbcUtils.scala:292)
 at scala.Option.getOrElse(Option.scala:121)
 at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.getSchema(JdbcUtils.scala:291)
 at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:64)
 at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:113)
 at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:47)
 at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:306)
 at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
 at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:146)
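
The failure happens because the generic JdbcUtils mapping only knows standard java.sql.Types codes, so Oracle's vendor-specific codes fall through to the error branch unless the dialect claims them first. A minimal self-contained sketch of that flow, using hypothetical stand-in types rather than the actual Spark classes:

```scala
// Hypothetical stand-ins for Spark's Catalyst types, so this sketch runs
// without a Spark dependency.
sealed trait DataType
case object FloatType extends DataType
case object DoubleType extends DataType
case object TimestampType extends DataType

object GetCatalystTypeSketch {
  // Mirrors the shape of a JdbcDialect.getCatalystType hook: the dialect
  // gets a chance to claim vendor-specific JDBC type codes.
  def oracleDialectType(sqlType: Int): Option[DataType] = sqlType match {
    case -101 => Some(TimestampType) // TIMESTAMP WITH TIME ZONE
    case 100  => Some(FloatType)     // OracleTypes.BINARY_FLOAT
    case 101  => Some(DoubleType)    // OracleTypes.BINARY_DOUBLE
    case _    => None
  }

  // Without the dialect mapping, unknown codes raise, as in the stack trace.
  def resolve(sqlType: Int): DataType =
    oracleDialectType(sqlType).getOrElse(
      throw new java.sql.SQLException(s"Unsupported type $sqlType"))
}
```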

How was this patch tested?

I updated a unit test to cover type conversion for the type codes -101, 100, and 101. On top of that, I tested this change against an actual table with those columns and was able to read from and write to the table.

@gatorsmile
Member

ok to test

@gatorsmile
Member

Could you add a test case to OracleIntegrationSuite.scala?

Below are the instructions for running the docker-based test suite OracleIntegrationSuite.scala.

build/mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -Phive-thriftserver -Phive -DskipTests install

Before running the docker tests, you need to set the docker env variables. The following does the magic:

eval $(docker-machine env default)
build/mvn -Pdocker-integration-tests -pl :spark-docker-integration-tests_2.11 compile test

}
case -101 => Some(TimestampType) // Value for Timestamp with Time Zone in Oracle
case 100 => Some(FloatType) // Value for OracleTypes.BINARY_FLOAT
case 101 => Some(DoubleType) // Value for OracleTypes.BINARY_DOUBLE
Member

Question: Is the value range of OracleTypes.BINARY_DOUBLE identical to our Spark Double type?

Also, is OracleTypes.BINARY_FLOAT identical to the Spark Float type?

Contributor Author

These should match Java's double / float definitions.

@SparkQA

SparkQA commented Oct 22, 2017

Test build #82951 has finished for PR 19548 at commit 51c616c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@taroplus
Contributor Author

Okay, a new integration test has been added.

According to the Oracle documentation:

BINARY_DOUBLE is a 64-bit, double-precision, floating-point number.

BINARY_FLOAT is a 32-bit, single-precision, floating-point number.

Both BINARY_FLOAT and BINARY_DOUBLE support the special values Inf, -Inf, and NaN (not a number) and conform to the IEEE standard.

These should match Java's double / float definitions.
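
Since both Oracle types conform to IEEE 754 single and double precision, they line up with the JVM's 32-bit Float and 64-bit Double. A quick self-contained check of those widths and special values (illustrative only, not part of the PR):

```scala
// Self-contained check that the JVM's Float/Double have the widths Oracle
// documents for BINARY_FLOAT (32-bit) and BINARY_DOUBLE (64-bit), and that
// the IEEE 754 special values Inf, -Inf and NaN exist on the JVM side too.
object BinaryFloatDoubleCheck {
  val floatBits: Int  = java.lang.Float.SIZE   // 32 bits, single precision
  val doubleBits: Int = java.lang.Double.SIZE  // 64 bits, double precision

  val specialsExist: Boolean =
    Float.PositiveInfinity.isInfinity &&
      Float.NegativeInfinity.isInfinity &&
      Float.NaN.isNaN &&
      Double.PositiveInfinity.isInfinity &&
      Double.NaN.isNaN
}
```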

@srowen srowen left a comment (Member)

Are other DB-specific types supported in Spark?

val props = new Properties()

// write it back to the table (append mode)
val data = spark.sparkContext.parallelize(Seq(Row(1.1D, 2.2F)))
Member

Nit: I think it's more conventional to write "1.1" and "2.2f"

Contributor Author

updated

conn.commit();


conn.prepareStatement("CREATE TABLE oracle_types (d BINARY_DOUBLE, f BINARY_FLOAT)").executeUpdate();
Member

No semicolon at the end of lines

Contributor Author

Removed trailing semicolons throughout this file.

class OracleIntegrationSuite extends DockerJDBCIntegrationSuite with SharedSQLContext {
import testImplicits._
// To make === between double tolerate inexact values
implicit val doubleEquality = TolerantNumerics.tolerantDoubleEquality(0.01)
Member

Curious why values would be that unequal somewhere in these tests?

Contributor Author

Removed this line and the test still passes.
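
For context, a self-contained sketch of roughly what scalactic's TolerantNumerics.tolerantDoubleEquality(0.01) was supplying for === (approxEqual is my name for illustration, not scalactic's API):

```scala
// Tolerant double comparison: two doubles are "equal" if they differ by at
// most the given tolerance, here defaulting to 0.01 as in the removed line.
object TolerantEqualitySketch {
  def approxEqual(a: Double, b: Double, tolerance: Double = 0.01): Boolean =
    math.abs(a - b) <= tolerance
}
```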

Some(DecimalType(DecimalType.MAX_PRECISION, 10)))
assert(oracleDialect.getCatalystType(java.sql.Types.NUMERIC, "numeric", 0, null) ==
Some(DecimalType(DecimalType.MAX_PRECISION, 10)))
assert(oracleDialect.getCatalystType(100, "BINARY_FLOAT", 0, null) ==
Member

Let's define constants somewhere suitable for 100/101/-101

Contributor Author

Made them constants defined in OracleDialect
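
A sketch of the shape such constants might take (the names here are illustrative; the real codes come from the Oracle JDBC driver's oracle.jdbc.OracleTypes):

```scala
// Illustrative named constants for the Oracle JDBC driver's
// vendor-specific type codes, replacing the magic numbers 100/101/-101.
object OracleTypeCodes {
  val BINARY_FLOAT: Int  = 100   // single-precision binary float
  val BINARY_DOUBLE: Int = 101   // double-precision binary float
  val TIMESTAMP_TZ: Int  = -101  // TIMESTAMP WITH TIME ZONE
}
```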

@SparkQA

SparkQA commented Oct 23, 2017

Test build #82965 has finished for PR 19548 at commit 91e911d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

 - No trailing semi-colons
 - constants for 100, 101 and -101
 - just compare double / float
@SparkQA

SparkQA commented Oct 23, 2017

Test build #82968 has finished for PR 19548 at commit 28c7ce8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member

LGTM

@gatorsmile
Member

Thanks! Merged to master.

@asfgit asfgit closed this in 5a5b6b7 Oct 23, 2017
@taroplus taroplus deleted the oracle_sql_types_101 branch September 23, 2021 04:00