@srowen
Member

@srowen srowen commented Mar 11, 2015

Avoid UnsupportedOperationException from JsonRDD.inferSchema on empty RDD.

Not sure if this is supposed to be an error (but a better one), but it seems like this case can come up if the input is down-sampled so much that nothing is sampled.

Now stuff like this:
```
sqlContext.jsonRDD(sc.parallelize(List[String]()))
```
just results in
```
org.apache.spark.sql.DataFrame = []
```
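
For context, the exception is consistent with how `reduce` behaves on an empty collection: there is no first element to start from, so it throws instead of returning a value. Below is a minimal plain-Scala sketch of that failure mode and of an empty-input guard. It uses local `Seq`s rather than RDDs, and `KeyTypes`, `mergeUnsafe`, and `mergeSafe` are hypothetical names for illustration, not code from this patch.

```scala
import scala.util.Try

object EmptyReduceSketch {
  // Hypothetical stand-in for the per-record sets of (field name, type) pairs.
  type KeyTypes = Set[(String, String)]

  // Unguarded merge: reduce has no seed value, so it throws on empty input.
  def mergeUnsafe(records: Seq[KeyTypes]): KeyTypes =
    records.reduce(_ ++ _)

  // Guarded merge: return an empty set when there is nothing to merge.
  def mergeSafe(records: Seq[KeyTypes]): KeyTypes =
    if (records.isEmpty) Set.empty else records.reduce(_ ++ _)

  def main(args: Array[String]): Unit = {
    val noRecords = Seq.empty[KeyTypes]
    val someRecords = Seq(Set("a" -> "string"), Set("b" -> "long"))

    // The unguarded version fails with UnsupportedOperationException.
    println(Try(mergeUnsafe(noRecords)).isFailure)

    // The guarded version yields an empty result for empty input
    // and behaves like the unguarded one otherwise.
    println(mergeSafe(noRecords).isEmpty)
    println(mergeSafe(someRecords) == mergeUnsafe(someRecords))
  }
}
```

An empty key set then naturally leads to an empty schema, which matches the empty `DataFrame` shown above.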

@SparkQA

SparkQA commented Mar 11, 2015

Test build #28450 has started for PR 4971 at commit 3c619e1.

  • This patch merges cleanly.

@marmbrus
Contributor

LGTM, thanks @srowen

Contributor

This is a super nit, but I think typically we would just do Set.empty here or maybe Set.empty[(String, DataType)] if you really want to be explicit.
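
To illustrate the nit, here is a small sketch of the three spellings. `KeyType` is a hypothetical alias that uses `String` in place of `DataType` so the snippet runs without Spark on the classpath.

```scala
object SetEmptySketch {
  // Hypothetical alias; the real element type would be (String, DataType).
  type KeyType = (String, String)

  // Element type inferred from the declared type of the val.
  val viaApply: Set[KeyType] = Set()
  val viaEmpty: Set[KeyType] = Set.empty

  // Element type spelled out at the call site.
  val viaExplicit = Set.empty[KeyType]

  def main(args: Array[String]): Unit = {
    // All three are equal empty sets; the difference is purely stylistic.
    println(viaApply == viaEmpty && viaEmpty == viaExplicit)
  }
}
```

The explicit form is mostly useful where there is no expected type for the compiler to infer the element type from.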

Member Author

No prob, will do, and if the tests succeed I'll merge.

@SparkQA

SparkQA commented Mar 11, 2015

Test build #28450 has finished for PR 4971 at commit 3c619e1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28450/

@SparkQA

SparkQA commented Mar 11, 2015

Test build #28473 has started for PR 4971 at commit 3699964.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Mar 11, 2015

Test build #28473 has finished for PR 4971 at commit 3699964.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28473/

@asfgit asfgit closed this in 55c4831 Mar 11, 2015
@srowen srowen deleted the SPARK-6245 branch March 13, 2015 15:06
asfgit pushed a commit that referenced this pull request Mar 16, 2015
Avoid `UnsupportedOperationException` from JsonRDD.inferSchema on empty RDD.

Not sure if this is supposed to be an error (but a better one), but it seems like this case can come up if the input is down-sampled so much that nothing is sampled.

Now stuff like this:
```
sqlContext.jsonRDD(sc.parallelize(List[String]()))
```
just results in
```
org.apache.spark.sql.DataFrame = []
```

Author: Sean Owen <sowen@cloudera.com>

Closes #4971 from srowen/SPARK-6245 and squashes the following commits:

3699964 [Sean Owen] Set() -> Set.empty
3c619e1 [Sean Owen] Avoid UnsupportedOperationException from JsonRDD.inferSchema on empty RDD