Closed
32 commits
cf9f56c
Appveyor SparkR Windows test draft
HyukjinKwon Aug 29, 2016
b25eaba
Fix script path for R installation
HyukjinKwon Aug 29, 2016
2772002
Fix the name of script for R installation
HyukjinKwon Aug 29, 2016
6cb4416
Upgrade maven version to 3.3.9
HyukjinKwon Aug 29, 2016
a2852a0
Clean up and fix the path for Hadoop bin package
HyukjinKwon Aug 29, 2016
5aca104
Merged dependecies installation
HyukjinKwon Aug 29, 2016
fbcfe13
Remove R installation script
HyukjinKwon Aug 29, 2016
f3eb163
Clean up the dependencies installation script
HyukjinKwon Aug 29, 2016
fe95491
Fix comment
HyukjinKwon Aug 29, 2016
a8e74fc
Uppercase for Maven in the comment
HyukjinKwon Aug 29, 2016
69fd3f1
Clean up and make pretty
HyukjinKwon Aug 29, 2016
3a21367
Fix typo in variable names
HyukjinKwon Aug 29, 2016
2b9af15
Fix variable declaration
HyukjinKwon Aug 29, 2016
cdb24a7
Fix R version to 3.3.0
HyukjinKwon Aug 29, 2016
3988305
Remove meaningless CmdletBinding and Param
HyukjinKwon Aug 29, 2016
1f23b05
Consistent variable names
HyukjinKwon Aug 29, 2016
97f3ea7
Fix R version to 3.3.1 and minimize the codes
HyukjinKwon Aug 30, 2016
e7addc9
Consistent downloading via Start-FileDownload
HyukjinKwon Aug 30, 2016
1b7b5f3
Fix styles and nits
HyukjinKwon Aug 30, 2016
b1a5076
Tests with hive related ones as well
HyukjinKwon Aug 31, 2016
2cc7a47
Add a guide and file change detection for R
HyukjinKwon Sep 4, 2016
4f2db1e
Make the guide pretty
HyukjinKwon Sep 4, 2016
2e04911
Make the guide better
HyukjinKwon Sep 4, 2016
43a5a44
Build against master only
HyukjinKwon Sep 4, 2016
e6793a7
Add the documentation for UI configuration and appveyor.yml
HyukjinKwon Sep 4, 2016
ceef9bf
Change Appveyor to AppVeyor
HyukjinKwon Sep 4, 2016
ccf176d
Fix typo in the guide line
HyukjinKwon Sep 4, 2016
27075de
Fetch upstream to pass R tests
HyukjinKwon Sep 8, 2016
8e1954b
Add -Phadoop-2.6 profile
HyukjinKwon Sep 8, 2016
41dbdcf
Add the documenation for checking full details of failed tests
HyukjinKwon Sep 8, 2016
fe9419d
Consistent newlines for each title
HyukjinKwon Sep 8, 2016
9c06b92
Adds survival and e1071 for dependencies
HyukjinKwon Sep 8, 2016
56 changes: 56 additions & 0 deletions appveyor.yml
@@ -0,0 +1,56 @@
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

version: "{build}-{branch}"

shallow_clone: true

platform: x64
configuration: Debug

branches:
  only:
    - master

only_commits:
  files:
    - R/

cache:
- C:\Users\appveyor\.m2

install:
# Install maven and dependencies
- ps: .\dev\appveyor-install-dependencies.ps1
# Required package for R unit tests
- cmd: R -e "install.packages('testthat', repos='http://cran.us.r-project.org')"
Member

might want to include e1071, survival for a few more compatibility tests.
(see DESCRIPTION file)

Member

Newer versions of testthat have broken us before; not sure if we should pin the version we run with here to match Jenkins?

Member Author

Ah, thanks! I will.

Contributor

That's a good point, but it's actually tricky to specify a version number using install.packages. FWIW, on Jenkins I see

> packageVersion("testthat")
[1] ‘1.0.2’

Member

true - maybe just print out packageVersion into the log, in case it breaks

- cmd: R -e "packageVersion('testthat')"
- cmd: R -e "install.packages('e1071', repos='http://cran.us.r-project.org')"
- cmd: R -e "packageVersion('e1071')"
- cmd: R -e "install.packages('survival', repos='http://cran.us.r-project.org')"
- cmd: R -e "packageVersion('survival')"

build_script:
- cmd: mvn -DskipTests -Phadoop-2.6 -Psparkr -Phive -Phive-thriftserver package

test_script:
- cmd: .\bin\spark-submit2.cmd --conf spark.hadoop.fs.default.name="file:///" R\pkg\tests\run-all.R

notifications:
Member Author

@felixcheung Here is the configuration for the notifications. If there is a preferred scenario, I will test it and update the documentation. (BTW, the current setting might be okay because the success/failure appears in each PR and this currently doesn't run nightly builds.)

Here is the documentation for more details - https://www.appveyor.com/docs/notifications/#triggering-notifications

Contributor

Yeah, I think spark-test#1 was an example @HyukjinKwon shared before. I think we can see the status in GitHub.

Member

cool!

  - provider: Email
    on_build_success: false
    on_build_failure: false
    on_build_status_changed: false

168 changes: 168 additions & 0 deletions dev/appveyor-guide.md
@@ -0,0 +1,168 @@
# AppVeyor Guides

Currently, SparkR on Windows is tested with [AppVeyor](https://ci.appveyor.com). This page describes how to set up AppVeyor with Spark, and how to run the build, check its status and stop it via this tool. The AppVeyor documentation is available [here](https://www.appveyor.com/docs); please refer to it for full details.


### Setting up AppVeyor

#### Sign up for AppVeyor.

- Go to https://ci.appveyor.com, and then click "SIGN UP FOR FREE".

<img width="196" alt="2016-09-04 11 07 48" src="https://cloud.githubusercontent.com/assets/6477701/18228809/2c923aa4-7299-11e6-91b4-f39eff5727ba.png">

- As Apache Spark is an open source project, click "FREE - for open-source projects".

<img width="379" alt="2016-09-04 11 07 58" src="https://cloud.githubusercontent.com/assets/6477701/18228810/2f674e5e-7299-11e6-929d-5c2dff269ddc.png">

- Click "Github".

<img width="360" alt="2016-09-04 11 08 10" src="https://cloud.githubusercontent.com/assets/6477701/18228811/344263a0-7299-11e6-90b7-9b1c7b6b8b01.png">


#### After signing up, go to your profile to link GitHub and AppVeyor.

- Click your account and then click "Profile".

<img width="204" alt="2016-09-04 11 09 43" src="https://cloud.githubusercontent.com/assets/6477701/18228803/12a4b810-7299-11e6-9140-5cfc277297b1.png">

- Enable the link with GitHub by clicking "Link Github account".

<img width="256" alt="2016-09-04 11 09 52" src="https://cloud.githubusercontent.com/assets/6477701/18228808/23861584-7299-11e6-9352-640a9c747c83.png">

- Click "Authorize application" on the GitHub site.

<img width="491" alt="2016-09-04 11 10 05" src="https://cloud.githubusercontent.com/assets/6477701/18228814/5cc239e0-7299-11e6-8aeb-71305e22d930.png">


#### Add a project, Spark, to enable the builds.

- Go to the PROJECTS menu.

<img width="97" alt="2016-08-30 12 16 31" src="https://cloud.githubusercontent.com/assets/6477701/18075017/2e572ffc-6eac-11e6-8e72-1531c81717a0.png">

- Click "NEW PROJECT" to add Spark.

<img width="144" alt="2016-08-30 12 16 35" src="https://cloud.githubusercontent.com/assets/6477701/18075026/3ee57bc6-6eac-11e6-826e-5dd09aeb0e7c.png">

- Since we will use GitHub here, click the "GITHUB" button and then click "Authorize Github" so that AppVeyor can access the GitHub logs (e.g. commits).

<img width="517" alt="2016-09-04 11 10 22" src="https://cloud.githubusercontent.com/assets/6477701/18228819/9a4d5722-7299-11e6-900c-c5ff6b0450b1.png">

- Click "Authorize application" on GitHub (the above step will pop up this page).

<img width="484" alt="2016-09-04 11 10 27" src="https://cloud.githubusercontent.com/assets/6477701/18228820/a7cfce02-7299-11e6-8ec0-1dd7807eecb7.png">

- Come back to https://ci.appveyor.com/projects/new and then add "spark".

<img width="738" alt="2016-09-04 11 10 36" src="https://cloud.githubusercontent.com/assets/6477701/18228821/b4b35918-7299-11e6-968d-233f18bc2cc7.png">


#### Check that any event that is supposed to run the build actually triggers it.

- Click the "PROJECTS" menu.

<img width="97" alt="2016-08-30 12 16 31" src="https://cloud.githubusercontent.com/assets/6477701/18075017/2e572ffc-6eac-11e6-8e72-1531c81717a0.png">

- Click the Spark project.

<img width="707" alt="2016-09-04 11 22 37" src="https://cloud.githubusercontent.com/assets/6477701/18228828/5174cad4-729a-11e6-8737-bb7b9e0703c8.png">


### Checking the status, restarting and stopping the build

- Click the "PROJECTS" menu.

<img width="97" alt="2016-08-30 12 16 31" src="https://cloud.githubusercontent.com/assets/6477701/18075017/2e572ffc-6eac-11e6-8e72-1531c81717a0.png">

- Locate "spark" and click it.

<img width="707" alt="2016-09-04 11 22 37" src="https://cloud.githubusercontent.com/assets/6477701/18228828/5174cad4-729a-11e6-8737-bb7b9e0703c8.png">

- Here, we can check the status of the current build. Also, "HISTORY" shows the past build history.

<img width="709" alt="2016-09-04 11 23 24" src="https://cloud.githubusercontent.com/assets/6477701/18228825/01b4763e-729a-11e6-8486-1429a88d2bdd.png">

- If the build is stopped, the "RE-BUILD COMMIT" button appears. Click this button to restart the build.

<img width="176" alt="2016-08-30 12 29 41" src="https://cloud.githubusercontent.com/assets/6477701/18075336/de618b52-6eae-11e6-8f01-e4ce48963087.png">

- If the build is running, the "CANCEL BUILD" button appears. Click this button to cancel the current build.

<img width="158" alt="2016-08-30 1 11 13" src="https://cloud.githubusercontent.com/assets/6477701/18075806/4de68564-6eb3-11e6-855b-ee22918767f9.png">


### Specifying the branch for building and setting the build schedule

Note: It seems the configurations in the UI and in `appveyor.yml` are mutually exclusive according to the [documentation](https://www.appveyor.com/docs/build-configuration/#configuring-build).


- Click the settings button on the right.

<img width="1010" alt="2016-08-30 1 19 12" src="https://cloud.githubusercontent.com/assets/6477701/18075954/65d1aefa-6eb4-11e6-9a45-b9a9295f5085.png">

- Set the default branch to build as above.

<img width="422" alt="2016-08-30 12 42 25" src="https://cloud.githubusercontent.com/assets/6477701/18075416/8fac36c8-6eaf-11e6-9262-797a2a66fec4.png">

- Specify the branch in order to exclude the builds in other branches.

<img width="358" alt="2016-08-30 12 42 33" src="https://cloud.githubusercontent.com/assets/6477701/18075421/97b17734-6eaf-11e6-8b19-bc1dca840c96.png">

- Set the Crontab expression to start the build regularly. AppVeyor uses the Crontab expression format of [atifaziz/NCrontab](https://github.com/atifaziz/NCrontab/wiki/Crontab-Expression). Please refer to the examples [here](https://github.com/atifaziz/NCrontab/wiki/Crontab-Examples). For example, `0 0 * * *` starts the build once a day at midnight.


<img width="471" alt="2016-08-30 12 42 43" src="https://cloud.githubusercontent.com/assets/6477701/18075450/d4ef256a-6eaf-11e6-8e41-74e38dac8ca0.png">
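
For reference, the branch restriction set in the UI above can also be written in `appveyor.yml`. The fragment below is a minimal sketch that mirrors the `branches` section of the `appveyor.yml` added together with this guide, and it assumes the project is configured through that file rather than through the UI:

```yaml
# Build the master branch only.
branches:
  only:
    - master
```

That `appveyor.yml` does not set a build schedule, so the Crontab expression described above is configured in the UI only.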


### Filtering commits and Pull Requests

Currently, AppVeyor is only used for SparkR. So, the build is only triggered when R code is changed.

This is specified in `appveyor.yml` as below:

```
only_commits:
  files:
    - R/
```

Please refer to https://www.appveyor.com/docs/how-to/filtering-commits for more details.
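
As an illustration only, the same filter can list more than one path, and `skip_commits` is the inverse filter. The extra paths below are hypothetical and are not part of Spark's configuration; the exact matching rules are described in the AppVeyor documentation linked above.

```yaml
# Hypothetical sketch: start a build only when a commit touches one of these paths.
only_commits:
  files:
    - R/
    - appveyor.yml   # hypothetical addition: also build when the CI configuration changes

# Hypothetical inverse filter, shown commented out:
# skip the build when the commit only touches files under docs/.
# skip_commits:
#   files:
#     - docs/
```

Keeping the list short keeps the trigger narrow, which matches the intent above that only R changes start a Windows build.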


### Checking the full log of the build

Currently, the console in AppVeyor does not print full details, but they can be checked manually in the downloaded log. For example, AppVeyor shows the failed tests in the console as below:

```
Failed -------------------------------------------------------------------------
1. Error: union on two RDDs (@test_binary_function.R#38) -----------------------
1: textFile(sc, fileName) at C:/projects/spark/R/lib/SparkR/tests/testthat/test_binary_function.R:38
2: callJMethod(sc, "textFile", path, getMinPartitions(sc, minPartitions))
3: invokeJava(isStatic = FALSE, objId$id, methodName, ...)
4: stop(readString(conn))
```

After downloading the log by clicking the log button as below:

![2016-09-08 11 37 17](https://cloud.githubusercontent.com/assets/6477701/18335227/b07d0782-75b8-11e6-94da-1b88cd2a2402.png)

the details (e.g. exceptions) can be checked as below:

```
Failed -------------------------------------------------------------------------
1. Error: spark.lda with text input (@test_mllib.R#655) ------------------------
org.apache.spark.sql.AnalysisException: Path does not exist: file:/C:/projects/spark/R/lib/SparkR/tests/testthat/data/mllib/sample_lda_data.txt;
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$12.apply(DataSource.scala:376)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$12.apply(DataSource.scala:365)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
...

1: read.text("data/mllib/sample_lda_data.txt") at C:/projects/spark/R/lib/SparkR/tests/testthat/test_mllib.R:655
2: dispatchFunc("read.text(path)", x, ...)
3: f(x, ...)
4: callJMethod(read, "text", paths)
5: invokeJava(isStatic = FALSE, objId$id, methodName, ...)
6: stop(readString(conn))
```
126 changes: 126 additions & 0 deletions dev/appveyor-install-dependencies.ps1
@@ -0,0 +1,126 @@
<#
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
#>

$CRAN = "https://cloud.r-project.org"

Function InstallR {
  if ( -not(Test-Path Env:\R_ARCH) ) {
    $arch = "i386"
  }
  Else {
    $arch = $env:R_ARCH
  }

  $urlPath = ""
  $latestVer = $(ConvertFrom-JSON $(Invoke-WebRequest http://rversions.r-pkg.org/r-release).Content).version
  If ($rVer -ne $latestVer) {
    $urlPath = ("old/" + $rVer + "/")
  }

  $rurl = $CRAN + "/bin/windows/base/" + $urlPath + "R-" + $rVer + "-win.exe"

  # Downloading R
  Start-FileDownload $rurl "R-win.exe"

  # Running R installer
  Start-Process -FilePath .\R-win.exe -ArgumentList "/VERYSILENT /DIR=C:\R" -NoNewWindow -Wait

  $RDrive = "C:"
  echo "R is now available on drive $RDrive"

  $env:PATH = $RDrive + '\R\bin\' + $arch + ';' + 'C:\MinGW\msys\1.0\bin;' + $env:PATH

  # Testing R installation
  Rscript -e "sessionInfo()"
}

Function InstallRtools {
  $rtoolsver = $rToolsVer.Split('.')[0..1] -Join ''
  $rtoolsurl = $CRAN + "/bin/windows/Rtools/Rtools$rtoolsver.exe"

  # Downloading Rtools
  Start-FileDownload $rtoolsurl "Rtools-current.exe"

  # Running Rtools installer
  Start-Process -FilePath .\Rtools-current.exe -ArgumentList /VERYSILENT -NoNewWindow -Wait

  $RtoolsDrive = "C:"
  echo "Rtools is now available on drive $RtoolsDrive"

  if ( -not(Test-Path Env:\GCC_PATH) ) {
    $gccPath = "gcc-4.6.3"
  }
  Else {
    $gccPath = $env:GCC_PATH
  }
  $env:PATH = $RtoolsDrive + '\Rtools\bin;' + $RtoolsDrive + '\Rtools\MinGW\bin;' + $RtoolsDrive + '\Rtools\' + $gccPath + '\bin;' + $env:PATH
  $env:BINPREF=$RtoolsDrive + '/Rtools/mingw_$(WIN)/bin/'
}

# create tools directory outside of Spark directory
$up = (Get-Item -Path ".." -Verbose).FullName
$tools = "$up\tools"
if (!(Test-Path $tools)) {
  New-Item -ItemType Directory -Force -Path $tools | Out-Null
}

# ========================== Maven
Push-Location $tools

$mavenVer = "3.3.9"
Start-FileDownload "https://archive.apache.org/dist/maven/maven-3/$mavenVer/binaries/apache-maven-$mavenVer-bin.zip" "maven.zip"

# extract
Invoke-Expression "7z.exe x maven.zip"

# add maven to environment variables
$env:Path += ";$tools\apache-maven-$mavenVer\bin"
$env:M2_HOME = "$tools\apache-maven-$mavenVer"
$env:MAVEN_OPTS = "-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"

Pop-Location

# ========================== Hadoop bin package
$hadoopVer = "2.6.0"
$hadoopPath = "$tools\hadoop"
if (!(Test-Path $hadoopPath)) {
  New-Item -ItemType Directory -Force -Path $hadoopPath | Out-Null
}
Push-Location $hadoopPath

Start-FileDownload "https://github.com/steveloughran/winutils/archive/master.zip" "winutils-master.zip"

# extract
Invoke-Expression "7z.exe x winutils-master.zip"

# add hadoop bin to environment variables
$env:HADOOP_HOME = "$hadoopPath/winutils-master/hadoop-$hadoopVer"

Pop-Location

# ========================== R
$rVer = "3.3.1"
$rToolsVer = "3.4.0"

InstallR
InstallRtools

$env:R_LIBS_USER = 'c:\RLibrary'
if ( -not(Test-Path $env:R_LIBS_USER) ) {
  mkdir $env:R_LIBS_USER
}