
Conversation

@viirya
Member

@viirya viirya commented May 1, 2015

JIRA: https://issues.apache.org/jira/browse/SPARK-7299

When connecting to an Oracle database through JDBC, the precision and scale of the BigDecimal object returned by ResultSet.getBigDecimal do not always match the table schema reported by ResultSetMetaData.getPrecision and ResultSetMetaData.getScale.

So if you insert a value such as 19999 into a column of type NUMBER(12, 2), the BigDecimal you get back has scale 0, while the DataFrame schema correctly reports DecimalType(12, 2). As a result, after saving the DataFrame to a Parquet file and reading it back, you get the wrong value 199.99.

Because this is only reported for JDBC connections to Oracle, it is difficult to add a test case for it here. But according to the user's test on the JIRA, this patch solves the problem.
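
As a minimal illustration of the scale mismatch (a sketch using only java.math.BigDecimal, not Spark's Decimal; the numbers mirror the example above):

import java.math.BigDecimal

// Oracle reports NUMBER(12, 2) in the metadata, but getBigDecimal returns the value with scale 0.
val fromJdbc = BigDecimal.valueOf(19999L, 0)                  // 19999, unscaled value 19999, scale 0

// Parquet keeps only the unscaled value; the reader re-applies the schema's scale of 2.
val readBack = BigDecimal.valueOf(fromJdbc.unscaledValue().longValue(), 2)
println(readBack)                                             // 199.99 -- the wrong result

// Rescaling to the schema's scale first (what this patch effectively does) keeps the value intact.
val rescaled = fromJdbc.setScale(2)                           // 19999.00
println(BigDecimal.valueOf(rescaled.unscaledValue().longValue(), 2))   // 19999.00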

@viirya viirya changed the title [SPARK-7299][SQL] Set up Decimal's precision and scale according to table schema instead of returned BigDecimal [SPARK-7299][SQL] Set precision and scale for Decimal according to JDBC metadata instead of returned BigDecimal May 1, 2015
@SparkQA

SparkQA commented May 1, 2015

Test build #31545 has finished for PR 5833 at commit 5f9da94.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class ExecutorUIData(
    • case class DecimalConversion(precisionInfo: Option[(Int, Int)]) extends JDBCConversion

Contributor

I think this is a bit clearer as:

case DecimalConversion(Some((precision, scale))) =>
  mutableRow.update(i, Decimal(rs.getBigDecimal(pos), precision, scale))
case DecimalConversion(None) =>
  mutableRow.update(i, Decimal(rs.getBigDecimal(pos)))
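
For context, a sketch of where the Option[(Int, Int)] carried by DecimalConversion could come from; decimalPrecisionInfo is a hypothetical helper name, while getPrecision and getScale are the standard java.sql.ResultSetMetaData calls mentioned in the description:

import java.sql.ResultSetMetaData

// Hypothetical helper: build the precisionInfo for DecimalConversion from the column metadata.
def decimalPrecisionInfo(metadata: ResultSetMetaData, column: Int): Option[(Int, Int)] = {
  val precision = metadata.getPrecision(column)
  val scale = metadata.getScale(column)
  // Some drivers report 0 precision for unconstrained decimal columns; fall back to None then.
  if (precision > 0) Some((precision, scale)) else None
}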

@squito
Contributor

squito commented May 1, 2015

It's a bummer that this is an Oracle-specific thing. Do you think it would be worthwhile to add a test for the behavior in any case, even if it works in our testing environment without the change? And I'd also add some comments to the code & test that very clearly indicate this is needed specifically for Oracle.

@viirya
Member Author

viirya commented May 2, 2015

Such a test might be confusing? Let's see if others have any comments on that. I will add some comments to the code.

@SparkQA

SparkQA commented May 2, 2015

Test build #31660 has finished for PR 5833 at commit 928f864.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class DecimalConversion(precisionInfo: Option[(Int, Int)]) extends JDBCConversion

@squito
Contributor

squito commented May 2, 2015

Yeah, I could go either way on the test -- I guess the point was mostly to make sure it still works with other DBs, and also to have some example code somebody could try if they did want to mess with Oracle integration. It was just a thought; I don't feel strongly about it.

Btw, thanks for adding the comment. I had meant something much briefer, along the lines of "this is needed for Oracle, see SPARK-7299", but what you have is good :)

@rxin
Contributor

rxin commented May 16, 2015

@viirya can you bring this up to date? I'd like to merge this.

Merge remote-tracking branch 'upstream/master' into jdbc_decimal_precision

Conflicts:
	sql/core/src/main/scala/org/apache/spark/sql/jdbc/JDBCRDD.scala
@viirya
Member Author

viirya commented May 16, 2015

@rxin ok. updated.

@SparkQA

SparkQA commented May 16, 2015

Test build #32895 has finished for PR 5833 at commit 69bc2b5.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class DecimalConversion(precisionInfo: Option[(Int, Int)]) extends JDBCConversion

@viirya
Member Author

viirya commented May 16, 2015

retest this please.

@rxin
Contributor

rxin commented May 16, 2015

Jenkins, retest this please.

@SparkQA

SparkQA commented May 16, 2015

Test build #32901 has finished for PR 5833 at commit 69bc2b5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class DecimalConversion(precisionInfo: Option[(Int, Int)]) extends JDBCConversion

@SparkQA

SparkQA commented May 16, 2015

Test build #32902 has finished for PR 5833 at commit 69bc2b5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • public class JavaOneVsRestExample
    • head("OneVsRest Example: multiclass to binary reduction using OneVsRest")
    • class DataTypeSingleton(type):
    • class NullType(DataType):
    • class AtomicType(DataType):
    • class NumericType(AtomicType):
    • class IntegralType(NumericType):
    • class FractionalType(NumericType):
    • class StringType(AtomicType):
    • class BinaryType(AtomicType):
    • class BooleanType(AtomicType):
    • class DateType(AtomicType):
    • class TimestampType(AtomicType):
    • class DecimalType(FractionalType):
    • class DoubleType(FractionalType):
    • class FloatType(FractionalType):
    • class ByteType(IntegralType):
    • class IntegerType(IntegralType):
    • class LongType(IntegralType):
    • class ShortType(IntegralType):
    • class Column(object):
    • class GroupedData(object):
    • case class DecimalConversion(precisionInfo: Option[(Int, Int)]) extends JDBCConversion

@viirya
Member Author

viirya commented May 18, 2015

@rxin The tests have passed. Please take a look.

@rxin
Contributor

rxin commented May 18, 2015

Thanks - merged in.

@asfgit asfgit closed this in e32c0f6 May 18, 2015
asfgit pushed a commit that referenced this pull request May 18, 2015
[SPARK-7299][SQL] Set precision and scale for Decimal according to JDBC metadata instead of returned BigDecimal

JIRA: https://issues.apache.org/jira/browse/SPARK-7299

When connecting to an Oracle database through JDBC, the precision and scale of the `BigDecimal` object returned by `ResultSet.getBigDecimal` do not always match the table schema reported by `ResultSetMetaData.getPrecision` and `ResultSetMetaData.getScale`.

So if you insert a value such as `19999` into a column of type `NUMBER(12, 2)`, the `BigDecimal` you get back has scale 0, while the DataFrame schema correctly reports `DecimalType(12, 2)`. As a result, after saving the DataFrame to a Parquet file and reading it back, you get the wrong value `199.99`.

Because this is only reported for JDBC connections to Oracle, it is difficult to add a test case for it here. But according to the user's test on the JIRA, this patch solves the problem.

Author: Liang-Chi Hsieh <viirya@gmail.com>

Closes #5833 from viirya/jdbc_decimal_precision and squashes the following commits:

69bc2b5 [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into jdbc_decimal_precision
928f864 [Liang-Chi Hsieh] Add comments.
5f9da94 [Liang-Chi Hsieh] Set up Decimal's precision and scale according to table schema instead of returned BigDecimal.

(cherry picked from commit e32c0f6)
Signed-off-by: Reynold Xin <rxin@databricks.com>
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request May 28, 2015
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request Jun 12, 2015
nemccarthy pushed a commit to nemccarthy/spark that referenced this pull request Jun 19, 2015
@viirya viirya deleted the jdbc_decimal_precision branch December 27, 2023 18:17