Skip to content

Conversation

@sun-rui
Copy link
Contributor

@sun-rui sun-rui commented Jul 15, 2015

This is a WIP patch for SPARK-8844 for collecting reviews.

This bug is about reading an empty DataFrame. in readCol(),
lapply(1:numRows, function(x) {
does not take into consideration the case where numRows = 0.

Will add unit test case.

@SparkQA
Copy link

SparkQA commented Jul 15, 2015

Test build #37361 has finished for PR 7419 at commit 965a2bd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So fundamentally it looks like this we want to create a vector with mode = colType if the column is empty. But does it matter if we get the colType wrong ? i.e. what happens if we just use the default mode (which I think is logical)

@shivaram
Copy link
Contributor

cc @davies

@shivaram
Copy link
Contributor

@sun-rui Any update on this ?

@sun-rui
Copy link
Contributor Author

sun-rui commented Aug 15, 2015

@shivaram , as you pointed out, I come with a simpler fix. I realized that simply creating an empty vector using vector() without mode is OK when there is no row in columns. Because vector() creates a zero-length vector of type "logical", and "logical" is of virutally lowest precedence in the R's type hierarchy, so it is supposed that "logical" type won't affect later coercion in operations on a data.frame of 0 row, for example, rbind().

@shivaram
Copy link
Contributor

Jenkins, retest this please

@shivaram
Copy link
Contributor

Thanks @sun-rui -- this fix is simple and looks good. Couple of minor things

  1. There is an empty file nohup.out added in the PR which needs to be removed
  2. Could we also call head in the test case ?

@SparkQA
Copy link

SparkQA commented Aug 15, 2015

Test build #40969 has finished for PR 7419 at commit d0f6d9a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@sun-rui
Copy link
Contributor Author

sun-rui commented Aug 16, 2015

@shivaram, fixed.

@davies
Copy link
Contributor

davies commented Aug 16, 2015

LGTM

@SparkQA
Copy link

SparkQA commented Aug 16, 2015

Test build #40978 has finished for PR 7419 at commit cce54aa.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shivaram
Copy link
Contributor

LGTM. Merging this

asfgit pushed a commit that referenced this pull request Aug 16, 2015
This is a WIP patch for SPARK-8844  for collecting reviews.

This bug is about reading an empty DataFrame. in readCol(),
      lapply(1:numRows, function(x) {
does not take into consideration the case where numRows = 0.

Will add unit test case.

Author: Sun Rui <rui.sun@intel.com>

Closes #7419 from sun-rui/SPARK-8844.

(cherry picked from commit 5f9ce73)
Signed-off-by: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
@asfgit asfgit closed this in 5f9ce73 Aug 16, 2015
CodingCat pushed a commit to CodingCat/spark that referenced this pull request Aug 17, 2015
This is a WIP patch for SPARK-8844  for collecting reviews.

This bug is about reading an empty DataFrame. in readCol(),
      lapply(1:numRows, function(x) {
does not take into consideration the case where numRows = 0.

Will add unit test case.

Author: Sun Rui <rui.sun@intel.com>

Closes apache#7419 from sun-rui/SPARK-8844.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants