Skip to content

Conversation

@cloud-fan
Copy link
Contributor

@cloud-fan cloud-fan commented May 10, 2016

What changes were proposed in this pull request?

SPARK-15241: We now support java decimal and catalyst decimal in external row, it makes sense to also support scala decimal.

SPARK-15242: This is a long-standing bug, and is exposed after #12364, which eliminate the If expression if the field is not nullable:

val fieldValue = serializerFor(
  GetExternalRowField(inputObject, i, externalDataTypeForInput(f.dataType)),
  f.dataType)
if (f.nullable) {
  If(
    Invoke(inputObject, "isNullAt", BooleanType, Literal(i) :: Nil),
    Literal.create(null, f.dataType),
    fieldValue)
} else {
  fieldValue
}

Previously, we always use DecimalType.SYSTEM_DEFAULT as the output type of converted decimal field, which is wrong as it doesn't match the real decimal type. However, it works well because we always put converted field into If expression to do the null check, and If use its trueValue's data type as its output type.
Now if we have a not nullable decimal field, then the converted field's output type will be DecimalType.SYSTEM_DEFAULT, and we will write wrong data into unsafe row.

The fix is simple, just use the given decimal type as the output type of converted decimal field.

These 2 issues was found at #13008

How was this patch tested?

new tests in RowEncoderSuite

@cloud-fan
Copy link
Contributor Author

cc @yhuai @liancheng @viirya

@SparkQA
Copy link

SparkQA commented May 10, 2016

Test build #58202 has finished for PR 13019 at commit 9621fc6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rxin
Copy link
Contributor

rxin commented May 11, 2016

actually cc @davies also

.add("int", IntegerType)
.add("string", StringType)
.add("double", DoubleType)
.add("decimal", DecimalType.SYSTEM_DEFAULT)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel it is still good to keep int, string, and double columns?

@viirya
Copy link
Member

viirya commented May 11, 2016

LGTM

@SparkQA
Copy link

SparkQA commented May 11, 2016

Test build #58339 has finished for PR 13019 at commit a2c77ca.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@davies
Copy link
Contributor

davies commented May 11, 2016

LGTM

@davies
Copy link
Contributor

davies commented May 11, 2016

Merging this into master and 2.0, thanks!

@asfgit asfgit closed this in d8935db May 11, 2016
asfgit pushed a commit that referenced this pull request May 11, 2016
…Encoder

## What changes were proposed in this pull request?

SPARK-15241: We now support java decimal and catalyst decimal in external row, it makes sense to also support scala decimal.

SPARK-15242: This is a long-standing bug, and is exposed after #12364, which eliminate the `If` expression if the field is not nullable:
```
val fieldValue = serializerFor(
  GetExternalRowField(inputObject, i, externalDataTypeForInput(f.dataType)),
  f.dataType)
if (f.nullable) {
  If(
    Invoke(inputObject, "isNullAt", BooleanType, Literal(i) :: Nil),
    Literal.create(null, f.dataType),
    fieldValue)
} else {
  fieldValue
}
```

Previously, we always use `DecimalType.SYSTEM_DEFAULT` as the output type of converted decimal field, which is wrong as it doesn't match the real decimal type. However, it works well because we always put converted field into `If` expression to do the null check, and `If` use its `trueValue`'s data type as its output type.
Now if we have a not nullable decimal field, then the converted field's output type will be `DecimalType.SYSTEM_DEFAULT`, and we will write wrong data into unsafe row.

The fix is simple, just use the given decimal type as the output type of converted decimal field.

These 2 issues was found at #13008

## How was this patch tested?

new tests in RowEncoderSuite

Author: Wenchen Fan <wenchen@databricks.com>

Closes #13019 from cloud-fan/encoder-decimal.

(cherry picked from commit d8935db)
Signed-off-by: Davies Liu <davies.liu@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants