Skip to content

Conversation

@michalsenkyr
Copy link
Contributor

What changes were proposed in this pull request?

Add support for specific Java List subtypes in deserialization as well as a generic implicit encoder.

All List subtypes are supported by using either the size-specifying constructor (one int parameter) or the default constructor.

Interfaces/abstract classes use the following implementations:

  • java.util.List, java.util.AbstractList or java.util.AbstractSequentialList => java.util.ArrayList

How was this patch tested?

build/mvn -DskipTests clean package && dev/run-tests

Additionally in Spark shell:

scala> val jlist = new java.util.LinkedList[Int]; jlist.add(1)
jlist: java.util.LinkedList[Int] = [1]
res0: Boolean = true

scala> Seq(jlist).toDS().map(_.element()).collect()
res1: Array[Int] = Array(1)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not supported intentionally. If we want to support java classes and scala classes mixedly, we should unify the ScalaRelfection and JavaTypeInference, which is a lot of effort.

For now I think it's more important to completely support specific collections type in scala, and then try to unify the ScalaRelfection and JavaTypeInference.

what do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively you can improve JavaTypeInference to support specific java list and test it with java bean in JavaDatasetSuite

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not notice that there is separate inference code for Java classes. It would certainly be nice to unify the code for Java and Scala classes. Is this already being worked on/planned?

I moved the List support to JavaTypeInference and rewrote tests accordingly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's planned but no one is working on it yet. You can start by creating a JIRA ticket if you are interested :)

@cloud-fan
Copy link
Contributor

ping @michalsenkyr

Remove specific Java List support from ScalaReflection
Remove implicit encoder for Java Lists
Add relevant tests to JavaDatasetSuite
Remove tests from ScalaReflectionSuite and DatasetPrimitiveSuite
@cloud-fan
Copy link
Contributor

ok to test

@cloud-fan
Copy link
Contributor

LGTM, pending test

@SparkQA
Copy link

SparkQA commented Jun 10, 2017

Test build #77879 has finished for PR 18009 at commit 881e636.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@asfgit asfgit closed this in f48273c Jun 12, 2017
dataknocker pushed a commit to dataknocker/spark that referenced this pull request Jun 16, 2017
## What changes were proposed in this pull request?

Add support for specific Java `List` subtypes in deserialization as well as a generic implicit encoder.

All `List` subtypes are supported by using either the size-specifying constructor (one `int` parameter) or the default constructor.

Interfaces/abstract classes use the following implementations:

* `java.util.List`, `java.util.AbstractList` or `java.util.AbstractSequentialList` => `java.util.ArrayList`

## How was this patch tested?

```bash
build/mvn -DskipTests clean package && dev/run-tests
```

Additionally in Spark shell:

```
scala> val jlist = new java.util.LinkedList[Int]; jlist.add(1)
jlist: java.util.LinkedList[Int] = [1]
res0: Boolean = true

scala> Seq(jlist).toDS().map(_.element()).collect()
res1: Array[Int] = Array(1)
```

Author: Michal Senkyr <mike.senkyr@gmail.com>

Closes apache#18009 from michalsenkyr/dataset-java-lists.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants