
Conversation

@actuaryzhang
Contributor

Update R doc:

  1. columns, names and colnames return a vector of strings, not a list as stated in the current doc.
  2. colnames<- does allow subset assignment, so the length of value can be less than the number of columns, e.g., colnames(df)[1] <- "a".
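For reference, base R data.frames follow the same semantics the updated doc describes. This is a plain base-R sketch, not SparkR (a SparkDataFrame would need a running Spark session), but it illustrates both points:

```r
# Base-R illustration of the documented behaviour (no SparkR session needed).
df <- data.frame(x = 1:3, y = letters[1:3])

# colnames() returns a character vector, not a list.
str(colnames(df))  # chr [1:2] "x" "y"

# Subset assignment: the replacement value can be shorter than ncol(df).
colnames(df)[1] <- "a"
colnames(df)     # "a" "y"
```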

@felixcheung

@actuaryzhang
Contributor Author

@felixcheung I see lots of the SparkDataFrame methods use the following in examples:

 path <- "path/to/file.json"
 df <- read.json(path)

I'm not sure where this json file resides. Do you think it's better to use a more concrete data example?

@HyukjinKwon
Member

( @actuaryzhang , maybe we could make the PR title more meaningful to summarise the change, just for a better shaped PR )

@srowen
Member

srowen commented Mar 1, 2017

Yes please. The file doesn't exist. Its name suggests this is just an example path.

@SparkQA

SparkQA commented Mar 1, 2017

Test build #73679 has finished for PR 17115 at commit 8082c82.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@felixcheung
Member

great thanks!

  • we could say return a vector of in cases it makes sense
  • colnames<- does allow the subset assignment - would be good to add example and tests if we don't have it already! ;)
  • path <- "path/to/file.json" yeah, it doesn't exist; it's meant to illustrate reading from a file. The examples are wrapped in "don't run" so it doesn't matter much, but if you have ideas on how to make it clearer or more concrete I'd definitely welcome that
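One way to make the path example self-contained is to generate the file into a tempfile first. This is a hypothetical sketch: the file setup is plain base R, while the read.json call still assumes an active SparkR session, so it is left commented out:

```r
# Write a tiny JSON-lines file to a temporary path so the example
# does not depend on a pre-existing "path/to/file.json".
path <- tempfile(fileext = ".json")
writeLines(c('{"name":"Alice","age":30}',
             '{"name":"Bob","age":25}'), path)

# With an active SparkR session, the doc example then works as written:
# df <- read.json(path)
# head(df)
```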

@actuaryzhang actuaryzhang changed the title [Doc][Minor] Update R doc [Doc][Minor][SparkR] Update SparkR doc for names, columns and colnames Mar 1, 2017
@actuaryzhang
Contributor Author

@HyukjinKwon Thanks. Updated title.
@felixcheung Updated doc and added tests. Thanks!

@actuaryzhang
Contributor Author

@srowen @felixcheung
Thanks for the clarification. I will open another PR to add real data examples for the SparkDataFrame methods.

I have seen lots of R packages document related methods together in a single doc. This way, it's easier to see what methods are available, and the examples for all the methods can be documented together, sharing the same data set (so that we don't have to create one data set for each method). We can use the @describeIn tag to do this. Let me know if this is a good approach.
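A hedged sketch of what that roxygen2 grouping could look like. The function names here are stand-ins for illustration only, not the actual SparkR generics:

```r
#' @title Column accessors
#' @description Get or set the column names of a data set.
#' @name columnAccessors
NULL

#' @describeIn columnAccessors Returns the column names as a character vector.
getCols <- function(x) colnames(x)

#' @describeIn columnAccessors Replaces the column names; supports subset
#'   assignment such as colnames(x)[1] <- "a".
setCols <- function(x, value) { colnames(x) <- value; x }
```

With @describeIn, both functions are documented on the shared "columnAccessors" topic, so one `@examples` block (and one example data set) can cover them all.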

@SparkQA

SparkQA commented Mar 1, 2017

Test build #73704 has finished for PR 17115 at commit 921c437.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@felixcheung
Member

@actuaryzhang can you point me to an example you are thinking of?

@felixcheung
Member

yeah, looking at columns, the example there isn't that great (in fact, the examples for all the SQL functions are pretty simplistic)

@felixcheung
Member

this merged to master

@asfgit asfgit closed this in 2ff1467 Mar 1, 2017
@actuaryzhang actuaryzhang deleted the sparkRMinorDoc branch March 2, 2017 06:05
