@jinxing64

What changes were proposed in this pull request?

The current ColumnarBatchSuite has only very simple test cases for Array and Struct. This PR adds test suites for more complicated cases in ColumnVector.

@jinxing64 jinxing64 changed the title [SPARK-21047] Add test suites for complicated cases in ColumnarBatchSuite [WIP][SPARK-21047] Add test suites for complicated cases in ColumnarBatchSuite Jun 16, 2017
@jinxing64
Author

@kiszk
Would you mind if I give this JIRA a try?

@SparkQA

SparkQA commented Jun 16, 2017

Test build #78178 has finished for PR 18327 at commit 5668978.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jinxing64 jinxing64 changed the title [WIP][SPARK-21047] Add test suites for complicated cases in ColumnarBatchSuite [SPARK-21047] Add test suites for complicated cases in ColumnarBatchSuite Jun 19, 2017
@kiszk
Member

kiszk commented Jun 19, 2017

Thanks, let me take a look tonight or tomorrow.

return getArray(ordinal);
} else if (dataType instanceof StructType) {
return getStruct(ordinal, ((StructType)dataType).fields().length);
} else {
Member

We need to support other data types (e.g. ByteType, StringType, BinaryType, DateType, TimestampType, and MapType). It would be good to look at the copy() method.

}}
}

test("Nest Array in Array.") {
Member

@kiszk kiszk Jun 20, 2017

Can we prepare a test for a primitive type array (e.g. new ArrayType(new ArrayType(IntegerType, false), true)), too?

Author

Sorry, I'm not quite sure about this. Could you please give more details?

Member

Sure. My previous comment was a little bit wrong.

What I meant is the following. It would be good to prepare the following two cases for the array test.
In the case of new ArrayType(new ArrayType(IntegerType, false), true), no element in any leaf array may be null (i.e. this test case).
In the case of new ArrayType(new ArrayType(IntegerType, true), true), any element in a leaf array may be null (e.g. [[0]], [[1, 2], []], [[], [3, null, 5]]).
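
The difference between the two schemas can be illustrated with a small, self-contained sketch. It uses plain java.util lists rather than Spark's ColumnVector, and the names NestedArrayExample, leafArraysAreNullFree, and exampleRows are hypothetical, introduced only for illustration. The rows are the example values from the comment above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only: plain lists stand in for an array-of-array column.
final class NestedArrayExample {

  // Returns true if no leaf array in the row contains a null element,
  // i.e. the row would also be valid when the inner containsNull flag is false.
  static boolean leafArraysAreNullFree(List<List<Integer>> row) {
    for (List<Integer> leaf : row) {
      for (Integer element : leaf) {
        if (element == null) {
          return false;
        }
      }
    }
    return true;
  }

  // The three example rows: [[0]], [[1, 2], []], [[], [3, null, 5]].
  static List<List<List<Integer>>> exampleRows() {
    List<List<Integer>> row0 = Arrays.asList(Arrays.asList(0));
    List<List<Integer>> row1 =
        Arrays.asList(Arrays.asList(1, 2), Collections.<Integer>emptyList());
    List<List<Integer>> row2 =
        Arrays.asList(Collections.<Integer>emptyList(), Arrays.asList(3, null, 5));
    return Arrays.asList(row0, row1, row2);
  }

  public static void main(String[] args) {
    for (List<List<Integer>> row : exampleRows()) {
      System.out.println(row + " null-free leaves: " + leafArraysAreNullFree(row));
    }
  }
}
```

In this sketch only the third row contains a null leaf element, so it is the only one that needs the inner containsNull = true schema.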

c1.putArray(2, 3, 3)

assert(column.getStruct(0).getInt(0) === 0)
assert(column.getStruct(0).getArray(1).toIntArray === Array(0, 1))
Member

nit: toIntArray()? since other places use toIntArray().

@kiszk
Member

kiszk commented Jun 20, 2017

LGTM except three comments.
@cloud-fan could you please look at this? I believe that these test suites would help make implementations of the API stable.

@jinxing64
Author

@kiszk
Thank you so much!
I will read your comments carefully and refine this PR :)

@SparkQA

SparkQA commented Jun 21, 2017

Test build #78380 has finished for PR 18327 at commit 66b4eff.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 21, 2017

Test build #78381 has finished for PR 18327 at commit 36dff00.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 22, 2017

Test build #78416 has finished for PR 18327 at commit 78102f3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 22, 2017

Test build #78423 has finished for PR 18327 at commit 9319a2f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jinxing64
Author

@kiszk
I tried to add a test, "Nest Array(containing null) in Array.". Please take a look when you have time, and I will continue working on this :)

@Override
public Object get(int ordinal, DataType dataType) {
throw new UnsupportedOperationException();
if (dataType instanceof BooleanType) {
Contributor

I suggest using 'match' style.

Author

Thanks for the review :)
Could you be more specific about how to do that in this Java code?
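
For context, Java versions without pattern matching for switch have no direct counterpart to a Scala match expression, so an instanceof chain is the usual way to dispatch on a type object. The following is a minimal sketch of that style, assuming a stand-in type hierarchy: TypeDispatchExample, ToyDataType, ToyRow, and the Toy*Type classes are hypothetical names and are not Spark's classes.

```java
// Sketch of type-based dispatch with an instanceof chain.
final class TypeDispatchExample {

  // Stand-in type hierarchy; these are NOT Spark's DataType classes.
  interface ToyDataType {}
  static final class ToyBooleanType implements ToyDataType {}
  static final class ToyIntegerType implements ToyDataType {}
  static final class ToyStringType implements ToyDataType {}

  // A toy row that stores its values as boxed objects.
  static final class ToyRow {
    private final Object[] values;

    ToyRow(Object... values) {
      this.values = values;
    }

    boolean getBoolean(int ordinal) { return (Boolean) values[ordinal]; }
    int getInt(int ordinal) { return (Integer) values[ordinal]; }
    String getString(int ordinal) { return (String) values[ordinal]; }

    // Generic getter: picks the typed getter that matches the requested type.
    Object get(int ordinal, ToyDataType dataType) {
      if (dataType instanceof ToyBooleanType) {
        return getBoolean(ordinal);
      } else if (dataType instanceof ToyIntegerType) {
        return getInt(ordinal);
      } else if (dataType instanceof ToyStringType) {
        return getString(ordinal);
      } else {
        throw new UnsupportedOperationException("Unsupported type: " + dataType);
      }
    }
  }
}
```

Newer Java releases (pattern matching for switch was finalized in Java 21) can express the same dispatch in a form closer to match, but that was not available when this PR was written.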

assert(column.getArray(1).getArray(0).toIntArray() === Array(1, 2))
assert(column.getArray(1).getArray(1).toIntArray() === Array())
assert(column.getArray(2).getArray(0).toIntArray() === Array())
assert(column.getArray(2).getArray(1)isNullAt(0))
Member

Should it be getArray(1).isNullAt(0)?

Author

For sure.

@SparkQA

SparkQA commented Jun 22, 2017

Test build #78456 has finished for PR 18327 at commit 47be362.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Member

kiszk commented Jun 23, 2017

LGTM
@cloud-fan could you please review this?

}
}

test("Nest Array(containing null) in Array.") {
Contributor

Why do we have two tests, one for containing null and one for not containing null?

Member

@kiszk kiszk Jun 23, 2017

One is for ColumnVector.allocate(10, new ArrayType(new ArrayType(IntegerType, true), true)). The other is for ColumnVector.allocate(10, new ArrayType(new ArrayType(IntegerType, false), true)). They cover different schemas.
What do you think?

Contributor

I think the containing null case covers the not containing null case?

Member

Ah, you are right since now we are testing correct cases.

Would it be better to add a test that puts null into an array for the case ColumnVector.allocate(10, new ArrayType(new ArrayType(IntegerType, false), true))?

Author

@jinxing64 jinxing64 Jun 23, 2017

@kiszk @cloud-fan
It seems that there is no check for putting null into an ArrayType(IntegerType, false).
Do you think it would be better to make the putNull method in OffHeapColumnVector and OnHeapColumnVector fail when containsNull=false? I'm happy to make another PR for that :)
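
The proposed check could look roughly like the sketch below. It is a toy model under stated assumptions: ToyIntVector is a hypothetical class and not Spark's OnHeapColumnVector or OffHeapColumnVector, and the choice of IllegalStateException is an assumption made for illustration.

```java
// Toy model only: a fixed-size int vector with a null mask.
final class ToyIntVector {
  private final int[] data;
  private final boolean[] nulls;
  private final boolean containsNull;

  ToyIntVector(int capacity, boolean containsNull) {
    this.data = new int[capacity];
    this.nulls = new boolean[capacity];
    this.containsNull = containsNull;
  }

  void putInt(int rowId, int value) {
    data[rowId] = value;
    nulls[rowId] = false;
  }

  // Hypothetical behavior discussed in the comment above: reject nulls
  // when the schema declares containsNull = false.
  void putNull(int rowId) {
    if (!containsNull) {
      throw new IllegalStateException(
          "Cannot put null at row " + rowId + ": containsNull is false");
    }
    nulls[rowId] = true;
  }

  boolean isNullAt(int rowId) {
    return nulls[rowId];
  }

  int getInt(int rowId) {
    return data[rowId];
  }
}
```

With such a check in place, a test for the containsNull = false schema could assert that putNull throws, which is what would make the two schemas behave differently.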

Member

Yeah, I agree. That's why I asked to prepare two cases.

@cloud-fan
Contributor

LGTM except one comment

assert(column.getArray(2).getArray(0).toIntArray() === Array())
assert(column.getArray(2).getArray(1).isNullAt(0))
assert(Array(column.getArray(2).getArray(1).getInt(1),
column.getArray(2).getArray(1).getInt(2)) === Array(4, 5))
Contributor

nit: we can split this into 2 asserts

@cloud-fan
Contributor

LGTM pending jenkins

@jinxing64
Author

@cloud-fan
Thanks a lot for taking the time to review this :)

@SparkQA

SparkQA commented Jun 23, 2017

Test build #78514 has finished for PR 18327 at commit ed210fb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

thanks, merging to master!

@asfgit asfgit closed this in 153dd49 Jun 23, 2017
@jinxing64
Author

@cloud-fan
Thanks for merging :)

robert3005 pushed a commit to palantir/spark that referenced this pull request Jun 29, 2017
…uite

## What changes were proposed in this pull request?
The current ColumnarBatchSuite has only very simple test cases for `Array` and `Struct`. This PR adds test suites for more complicated cases in ColumnVector.

Author: jinxing <jinxing6042@126.com>

Closes apache#18327 from jinxing64/SPARK-21047.