I tried to write a unit test: it first generates a few generic Records, then writes them to an ORC file (file1). A Spark reader then opens this file, reads it, and finally writes the records to another ORC file (file2).
There seems to be a bug here, because the Spark reader fails to produce the same result as the record reader. It throws an exception like this:
Value should match expected: schema.dec_11_2 expected:<623.9> but was:<62.39>
Expected :623.9
Actual :62.39
java.lang.AssertionError: Value should match expected: schema.dec_11_2 expected:<623.9> but was:<62.39>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.apache.iceberg.spark.data.TestHelpers.assertEquals(TestHelpers.java:631)
at org.apache.iceberg.spark.data.TestHelpers.assertEquals(TestHelpers.java:641)
at org.apache.iceberg.spark.data.TestHelpers.assertEquals(TestHelpers.java:612)
at org.apache.iceberg.spark.data.TestHelpers.assertEquals(TestHelpers.java:599)
at org.apache.iceberg.spark.data.TestSparkRecordOrcReaderWriter.writeAndValidate(TestSparkRecordOrcReaderWriter.java:86)
at org.apache.iceberg.spark.data.AvroDataTest.testSimpleStruct(AvroDataTest.java:67)
at java.lang.Thread.run(Thread.java:748)
After checking the Iceberg code, I found that HiveDecimal reduces the decimal's scale by stripping its trailing zeros (see here), while our GenericOrcWriter and SparkOrcWriter do not account for this case, so the scale of the decimal gets corrupted.
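The mismatch in the assertion above (623.9 vs. 62.39) is consistent with this explanation. A minimal sketch of the failure mode, using plain java.math.BigDecimal rather than the actual Iceberg writer code (the class and values here are illustrative, not taken from the writers themselves):

```java
import java.math.BigDecimal;

public class DecimalScaleDemo {
    public static void main(String[] args) {
        // A decimal(11, 2) value: "623.90" carries scale 2.
        BigDecimal original = new BigDecimal("623.90");
        System.out.println(original.scale());   // 2

        // Stripping trailing zeros (as HiveDecimal does) reduces the
        // scale: 623.90 becomes 623.9 with scale 1.
        BigDecimal stripped = original.stripTrailingZeros();
        System.out.println(stripped.scale());   // 1

        // A reader that trusts the column's declared scale (2) and
        // reinterprets the stored unscaled value gets a wrong result:
        // the unscaled value of 623.9 is 6239, and 6239 at scale 2
        // reads back as 62.39 -- exactly the mismatch in the test.
        BigDecimal misread = new BigDecimal(stripped.unscaledValue(), 2);
        System.out.println(misread);            // 62.39

        // Rescaling to the column's declared scale before writing the
        // unscaled value would avoid the corruption.
        BigDecimal fixed = stripped.setScale(2);
        System.out.println(fixed);              // 623.90
    }
}
```

This is why the writer needs to normalize the decimal back to the schema's declared scale (e.g. via setScale) before emitting the unscaled value, instead of writing whatever scale HiveDecimal happens to return.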
The unit test is here.
FYI @rdsr @rdblue @shardulm94