-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-21047] Add test suites for complicated cases in ColumnarBatchSuite #18327
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@kiszk |
|
Test build #78178 has finished for PR 18327 at commit
|
|
Thanks, let me take a look |
| return getArray(ordinal); | ||
| } else if (dataType instanceof StructType) { | ||
| return getStruct(ordinal, ((StructType)dataType).fields().length); | ||
| } else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to support other data types (e.g. ByteType, StringType, BinaryType, DateType, TimestampType, and MapType). It would be good to see copy() method.
| }} | ||
| } | ||
|
|
||
| test("Nest Array in Array.") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we prepare a test for primitive type array (e.g. new ArrayType(new ArrayType(IntegerType, false), true), too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I'm not pretty sure about this. Could you please give more details ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. My previous comment was a little bit wrong.
What I want to say is the following. It would be good to prepare the following two cases for array test.
In the case of new ArrayType(new ArrayType(IntegerType, false), true), all of the elements in each leaf array must not have null (i.e. this test case).
In the case of new ArrayType(new ArrayType(IntegerType, true), true), any element in each leaf array may have null (e.g. [[0]], [[1, 2], []], [[], [3, null, 5]]).
| c1.putArray(2, 3, 3) | ||
|
|
||
| assert(column.getStruct(0).getInt(0) === 0) | ||
| assert(column.getStruct(0).getArray(1).toIntArray === Array(0, 1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: toIntArray()? since other places use toIntArray().
|
LGTM except three comments. |
|
@kiszk |
|
Test build #78380 has finished for PR 18327 at commit
|
|
Test build #78381 has finished for PR 18327 at commit
|
|
Test build #78416 has finished for PR 18327 at commit
|
|
Test build #78423 has finished for PR 18327 at commit
|
|
@kiszk |
| @Override | ||
| public Object get(int ordinal, DataType dataType) { | ||
| throw new UnsupportedOperationException(); | ||
| if (dataType instanceof BooleanType) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest to use ‘match’ style.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for review :)
Could you be specific how to do that in this java code ?
| assert(column.getArray(1).getArray(0).toIntArray() === Array(1, 2)) | ||
| assert(column.getArray(1).getArray(1).toIntArray() === Array()) | ||
| assert(column.getArray(2).getArray(0).toIntArray() === Array()) | ||
| assert(column.getArray(2).getArray(1)isNullAt(0)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it be getArray(1).isNullAt(0)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For sure.
|
Test build #78456 has finished for PR 18327 at commit
|
|
LGTM |
| } | ||
| } | ||
|
|
||
| test("Nest Array(containing null) in Array.") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why we have 2 tests for containing null and not containing null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One is for ColumnVector.allocate(10, new ArrayType(new ArrayType(IntegerType, true), true)). The other is for ColumnVector.allocate(10, new ArrayType(new ArrayType(IntegerType, false), true)). For different schemes.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the containing null case covers the not containing null case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, you are right since now we are testing correct cases.
Is it better to add a test suite to put null into an array for the case ColumnVector.allocate(10, new ArrayType(new ArrayType(IntegerType, false), true))?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kiszk @cloud-fan
It seems that there is no check for putting null into a ArrayType(IntegerType, false).
And do you think it's better to fail the putNullmethod in OffHeapColumnVector and OnHeapColumnVector when containsNull=false. I'm happy to make another pr for that :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I agree. That's why I asked to prepare two cases.
|
LGTM except one comment |
| assert(column.getArray(2).getArray(0).toIntArray() === Array()) | ||
| assert(column.getArray(2).getArray(1).isNullAt(0)) | ||
| assert(Array(column.getArray(2).getArray(1).getInt(1), | ||
| column.getArray(2).getArray(1).getInt(2)) === Array(4, 5)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we can split this into 2 asserts
|
LGTM pending jenkins |
|
@cloud-fan |
|
Test build #78514 has finished for PR 18327 at commit
|
|
thanks, merging to master! |
|
@cloud-fan |
…uite ## What changes were proposed in this pull request? Current ColumnarBatchSuite has very simple test cases for `Array` and `Struct`. This pr wants to add some test suites for complicated cases in ColumnVector. Author: jinxing <jinxing6042@126.com> Closes apache#18327 from jinxing64/SPARK-21047.
What changes were proposed in this pull request?
Current ColumnarBatchSuite has very simple test cases for
ArrayandStruct. This pr wants to add some test suites for complicated cases in ColumnVector.