Skip to content

Conversation

@mengxr
Copy link
Contributor

@mengxr mengxr commented Dec 13, 2016

What changes were proposed in this pull request?

Mention spark.randomForest and spark.gbt in vignettes. Keep the content minimal since users can type ?spark.randomForest to see the full doc.

cc: @jkbradley

@mengxr mengxr changed the title SPARK-18792] [R] add spark.randomForest to vignettes SPARK-18793] [R] add spark.randomForest to vignettes Dec 13, 2016
@mengxr mengxr changed the title SPARK-18793] [R] add spark.randomForest to vignettes [SPARK-18793] [R] add spark.randomForest to vignettes Dec 13, 2016
@mengxr mengxr changed the title [SPARK-18793] [R] add spark.randomForest to vignettes [SPARK-18793] [SPARK-18794] [R] add spark.randomForest/spark.gbt to vignettes Dec 13, 2016
@felixcheung
Copy link
Member

LGTM. maybe good to have one example for classification but optional

@felixcheung
Copy link
Member

and if we are doing "(Added in Spark 2.1.0)" in https://github.com/apache/spark/pull/16222/files perhaps we should have the same for these 2?

actually, I'd vote for removing them - vignettes is for that specific version of package you install. If you install SparkR 2.1.0 what is described there is in 2.1.0, by default

```{r}
df <- createDataFrame(longley)
rfModel <- spark.randomForest(df, Employed ~ ., type = "regression", numTrees = 5)
rfModel <- spark.randomForest(df, Employed ~ ., type = "regression", maxDepth = 2, numTrees = 2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could also do this to limit output

ops <- options()
options(max.print=40)

there is another example in the vignettes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current length of the output seems fine (<40 lines).


In the following example, we use the `longley` dataset to train a random forest and make predictions:

```{r}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

longley has columns with . like Armed.Forces
suggest putting warning = FALSE like here otherwise we will have a warning in the output vignettes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

@SparkQA
Copy link

SparkQA commented Dec 13, 2016

Test build #70068 has finished for PR 16264 at commit 292745b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 13, 2016

Test build #70070 has finished for PR 16264 at commit be9c846.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 13, 2016

Test build #70069 has finished for PR 16264 at commit fb1933d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mengxr
Copy link
Contributor Author

mengxr commented Dec 13, 2016

Agree that it is not necessary to mention the version added here. I will send a follow-up PR after this one to rearrange the ordering of the ML algorithms. There is no logical ordering now.

@SparkQA
Copy link

SparkQA commented Dec 13, 2016

Test build #70095 has finished for PR 16264 at commit b3bf19f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mengxr
Copy link
Contributor Author

mengxr commented Dec 14, 2016

Merged into master and branch-2.1. The last commit passed Jenkins.

@asfgit asfgit closed this in 594b14f Dec 14, 2016
asfgit pushed a commit that referenced this pull request Dec 14, 2016
…nettes

## What changes were proposed in this pull request?

Mention `spark.randomForest` and `spark.gbt` in vignettes. Keep the content minimal since users can type `?spark.randomForest` to see the full doc.

cc: jkbradley

Author: Xiangrui Meng <meng@databricks.com>

Closes #16264 from mengxr/SPARK-18793.

(cherry picked from commit 594b14f)
Signed-off-by: Xiangrui Meng <meng@databricks.com>
@mengxr
Copy link
Contributor Author

mengxr commented Dec 14, 2016

@HyukjinKwon Could you take a look at the AppVeyor error?

@felixcheung
Copy link
Member

felixcheung commented Dec 14, 2016 via email

@HyukjinKwon
Copy link
Member

HyukjinKwon commented Dec 14, 2016

Yup, it seems failed time to time due to network problem. FYI, there was some discussions about this in #15686 (comment) and #15697 (comment)

I could not find a good workaround to re-trigger this so far rather than closing and reopening and it seems (I manually checked and privately asked if it is true) some Apahce projects using Travis CI/AppVeyor are also re-triggering this via closing and opening.

robert3005 pushed a commit to palantir/spark that referenced this pull request Dec 15, 2016
…nettes

## What changes were proposed in this pull request?

Mention `spark.randomForest` and `spark.gbt` in vignettes. Keep the content minimal since users can type `?spark.randomForest` to see the full doc.

cc: jkbradley

Author: Xiangrui Meng <meng@databricks.com>

Closes apache#16264 from mengxr/SPARK-18793.
uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017
…nettes

## What changes were proposed in this pull request?

Mention `spark.randomForest` and `spark.gbt` in vignettes. Keep the content minimal since users can type `?spark.randomForest` to see the full doc.

cc: jkbradley

Author: Xiangrui Meng <meng@databricks.com>

Closes apache#16264 from mengxr/SPARK-18793.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants