8 changes: 7 additions & 1 deletion R/README.md
@@ -1,11 +1,13 @@
# R on Spark

SparkR is an R package that provides a lightweight frontend to use Spark from R.

### Installing SparkR

Libraries of SparkR need to be created in `$SPARK_HOME/R/lib`. This can be done by running the script `$SPARK_HOME/R/install-dev.sh`.
By default the script uses the system-wide installation of R. However, this can be changed to any user-installed location of R by setting the environment variable `R_HOME` to the full path of the base directory where R is installed, before running the install-dev.sh script.
Example:

```
# where /home/username/R is where R is installed and /home/username/R/bin contains the files R and Rscript
export R_HOME=/home/username/R
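# (a hedged sketch, not in the original) optionally verify which R the
# install-dev.sh script will pick up:
# $R_HOME/bin/R --version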
```

@@ -17,6 +19,7 @@ export R_HOME=/home/username/R
#### Build Spark

Build Spark with [Maven](http://spark.apache.org/docs/latest/building-spark.html#building-with-buildmvn) and include the `-Psparkr` profile to build the R package. For example, to use the default Hadoop versions you can run

```
build/mvn -DskipTests -Psparkr package
```
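To build against a specific Hadoop version instead, the profiles described in [Building Spark](http://spark.apache.org/docs/latest/building-spark.html) can be combined with `-Psparkr`. A hedged sketch (the profile and version flags shown are illustrative and depend on your Spark release):

```
build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests -Psparkr package
```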
@@ -38,6 +41,7 @@ To set other options like driver memory, executor memory etc. you can pass in the
#### Using SparkR from RStudio

If you wish to use SparkR from RStudio or other R frontends, you will need to set some environment variables that point SparkR to your Spark installation. For example:

```
# Set this to where Spark is installed
Sys.setenv(SPARK_HOME="/Users/username/spark")
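# (a hedged sketch, not part of the original snippet) a typical session then
# puts the SparkR package on the library path and loads it:
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
library(SparkR)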
```

@@ -64,13 +68,15 @@ To run one of them, use `./bin/spark-submit <filename> <args>`. For example:

./bin/spark-submit examples/src/main/r/dataframe.R

You can also run the unit-tests for SparkR by running (you need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first):
You can also run the unit tests for SparkR by running the commands below. You need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first:

R -e 'install.packages("testthat", repos="http://cran.us.r-project.org")'
./R/run-tests.sh

### Running on YARN

The `./bin/spark-submit` script can also be used to submit jobs to YARN clusters. You will need to set the YARN configuration directory before doing so. For example, on CDH you can run

```
export YARN_CONF_DIR=/etc/hadoop/conf
./bin/spark-submit --master yarn examples/src/main/r/dataframe.R
```
20 changes: 20 additions & 0 deletions R/WINDOWS.md
@@ -11,3 +11,23 @@ include Rtools and R in `PATH`.
directory in Maven in `PATH`.
4. Set `MAVEN_OPTS` as described in [Building Spark](http://spark.apache.org/docs/latest/building-spark.html).
5. Open a command shell (`cmd`) in the Spark directory and run `mvn -DskipTests -Psparkr package`.

## Unit tests

To run the SparkR unit tests on Windows, the following steps are required, assuming you are in the Spark root directory and do not already have Apache Hadoop installed:

1. Create a folder to download Hadoop-related files for Windows. For example, `cd ..` and `mkdir hadoop`.

2. Download the relevant Hadoop bin package from [steveloughran/winutils](https://github.com/steveloughran/winutils). While these are not official ASF artifacts, they are built from the ASF release git hashes by a Hadoop PMC member on a dedicated Windows VM. For further reading, consult [Windows Problems on the Hadoop wiki](https://wiki.apache.org/hadoop/WindowsProblems).

3. Install the files into `hadoop\bin`; make sure that `winutils.exe` and `hadoop.dll` are present.

4. Set the environment variable `HADOOP_HOME` to the full path to the newly created `hadoop` directory.

5. Run the SparkR unit tests by running the commands below. You need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first:

```
R -e "install.packages('testthat', repos='http://cran.us.r-project.org')"
.\bin\spark-submit2.cmd --conf spark.hadoop.fs.default.name="file:///" R\pkg\tests\run-all.R
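rem (a hedged sketch, not part of the original) the command above assumes the
rem HADOOP_HOME variable from step 4 is set in this cmd session, for example:
rem   set HADOOP_HOME=C:\Users\username\hadoop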
```