- Download [Apache Spark 2.3+](https://spark.apache.org/downloads.html) and extract it into a local folder (for example, *C:\bin\spark-2.3.2-bin-hadoop2.7\*) using [7-zip](https://www.7-zip.org/). (The supported Spark versions are 2.3.*, 2.4.0, 2.4.1, 2.4.3, and 2.4.4.)

- Add a [new environment variable](https://www.java.com/en/download/help/path.xml) `SPARK_HOME`. For example, *C:\bin\spark-2.3.2-bin-hadoop2.7\*.

```powershell
set SPARK_HOME=C:\bin\spark-2.3.2-bin-hadoop2.7\
```
- Add Apache Spark to your [PATH environment variable](https://www.java.com/en/download/help/path.xml). For example, *C:\bin\spark-2.3.2-bin-hadoop2.7\bin*.

```powershell
set PATH=%SPARK_HOME%\bin;%PATH%
```

- Download the `winutils.exe` binary from the [WinUtils repository](https://github.com/steveloughran/winutils). Select the version of Hadoop that your Spark distribution was compiled with. For example, use hadoop-2.7.1 for Spark 2.3.2.

- Save the `winutils.exe` binary to a directory of your choice. For example, *C:\hadoop\bin*.
- Set `HADOOP_HOME` to the directory containing `winutils.exe` (without the `bin` folder). For instance, using the command line:

```powershell
set HADOOP_HOME=C:\hadoop
```
- Set the `PATH` environment variable to include `%HADOOP_HOME%\bin`. For instance, using the command line:

```powershell
set PATH=%HADOOP_HOME%\bin;%PATH%
```
Make sure you are able to run `dotnet`, `java`, `mvn`, and `spark-shell` from your command line before you move to the next section. Feel there is a better way? Please [open an issue](https://github.com/dotnet/spark/issues) and feel free to contribute.
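As a quick sanity check, the version flags below will confirm that each tool is on your `PATH`. Note that `java` uses a single-dash `-version` flag, unlike the others:

```powershell
REM Each command should print a version banner rather than a "not recognized" error.
dotnet --version
java -version
mvn --version
spark-shell --version
```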
> [!NOTE]
> A new instance of the command line may be required if any environment variables were updated.
## Build
For the remainder of this guide, you will need to have cloned the .NET for Apache Spark repository onto your machine. You can choose any location for the cloned repository. For example, *C:\github\dotnet-spark\*.
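If you have not already cloned the repository, one way to do so (assuming Git is installed and on your `PATH`) is:

```powershell
REM Clone the dotnet/spark repository into the example location used in this guide.
git clone https://github.com/dotnet/spark.git C:\github\dotnet-spark
```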
### Build .NET for Apache Spark Scala extensions layer
When you submit a .NET application, .NET for Apache Spark has the necessary logic written in Scala that informs Apache Spark how to handle your requests (for example, a request to create a new Spark session, or a request to transfer data from the .NET side to the JVM side). This logic can be found in the [.NET for Spark Scala source code](https://github.com/dotnet/spark/tree/master/src/scala).
Regardless of whether you are using .NET Framework or .NET Core, you will need to build the .NET for Apache Spark Scala extension layer:
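As a sketch, the Scala extension layer lives under *src/scala* and is built with Maven. Assuming the repository was cloned to *C:\github\dotnet-spark\*, a typical invocation looks like the following; check the repository's build documentation for the exact goals your checkout expects:

```powershell
REM Build the Scala extension layer with Maven from the repository's src\scala folder.
cd C:\github\dotnet-spark\src\scala
mvn clean package
```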
This section explains how to build the sample applications for .NET for Apache Spark.
Once you build the samples, running them will be through `spark-submit` regardless of whether you are targeting .NET Framework or .NET Core. Make sure you have followed the [prerequisites](#prerequisites) section and installed Apache Spark.
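As a hedged sketch, a `spark-submit` invocation for a .NET for Apache Spark application generally takes the shape below. The jar path and application assembly are illustrative placeholders (left as `<...>`), so consult the repository's run instructions for the exact values for your Spark version:

```powershell
REM General shape of a spark-submit call; replace the <...> placeholders
REM with the jar produced by the Scala build and your compiled app.
spark-submit `
  --class org.apache.spark.deploy.dotnet.DotnetRunner `
  --master local `
  <path-to-microsoft-spark-jar> `
  dotnet <your-app>.dll
```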