scala-spark-example

This demo shows how to use Apache Spark with the Sentry Scala SDK in a simple application.

This demo uses Apache Spark 2.4.4 and sentry-java 1.7.27.

First Time Setup

Spark 2.4.4 requires Java 8. Using jenv to manage your Java versions is recommended.

Check your Java version with:

java -version

You should get something like this:

openjdk version "1.8.0_222"
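If you would rather script this check than read the output by eye, the sketch below may help. It assumes that a Java 8 runtime reports a version string beginning with `1.8.`, as in the sample output above; `is_java8` is a hypothetical helper name, not something shipped with this demo.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 when the given `java -version` output
# reports a 1.8.x runtime, and 1 otherwise.
is_java8() {
  case "$1" in
    *'version "1.8.'*) return 0 ;;
    *) return 1 ;;
  esac
}

# `java -version` writes to stderr, so redirect it before capturing.
# Only probe the local JVM when java is actually on the PATH.
if command -v java >/dev/null 2>&1; then
  if is_java8 "$(java -version 2>&1)"; then
    echo "Java 8 detected"
  else
    echo "Java 8 not detected; this demo expects Java 8" >&2
  fi
fi
```

The check is a plain string match, so it will not recognize a vendor that formats its version banner differently.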

Install sbt with Homebrew:

brew install sbt

Download Apache Spark 2.4.4, pre-built for Hadoop 2.7, from https://spark.apache.org/downloads.html

Set your $SPARK_HOME environment variable to point to your Spark folder:

export SPARK_HOME=path/to/spark/spark-2.4.4-bin-hadoop2.7
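A quick way to confirm that the variable points at a usable Spark folder is to look for the `spark-submit` launcher that the run commands rely on. This is a minimal sketch; `has_spark_submit` is a hypothetical helper name introduced here for illustration.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 when the given directory contains an
# executable bin/spark-submit, and 1 otherwise (including an empty argument).
has_spark_submit() {
  [ -n "$1" ] && [ -x "$1/bin/spark-submit" ]
}

if has_spark_submit "${SPARK_HOME:-}"; then
  echo "spark-submit found under $SPARK_HOME"
else
  echo "SPARK_HOME is unset or does not contain bin/spark-submit" >&2
fi
```

Note that `export` only affects the current shell session; add the line to your shell profile if you want it to persist.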

Run

Package your application jar:

sbt package

Run your application with spark-submit
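Each example passes `--files "sentry.properties"`, so that file is expected in the directory you run `spark-submit` from. If your checkout does not already have one, the sketch below writes a minimal file. It assumes the only setting you need is `dsn`, which sentry-java 1.x reads from `sentry.properties`; the DSN value is a placeholder that you must replace with the one from your own Sentry project.

```shell
#!/bin/sh
# Write a minimal sentry.properties only if one is not already present,
# so an existing configuration is never overwritten.
# The DSN below is a placeholder, not a working value.
if [ ! -f sentry.properties ]; then
  cat > sentry.properties <<'EOF'
dsn=https://<public_key>@<host>/<project_id>
EOF
fi
```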

Examples

SimpleApp - uses SentrySparkListener

$SPARK_HOME/bin/spark-submit \
  --class "SimpleApp" \
  --master "local[4]" \
  --files "sentry.properties" \
  --packages "io.sentry:sentry-spark_2.11:0.0.1-alpha04" \
  target/scala-2.11/simple-project_2.11-1.0.jar

SimpleQueryApp - uses SentryQueryExecutionListener

$SPARK_HOME/bin/spark-submit \
  --class "SimpleQueryApp" \
  --master "local[4]" \
  --files "sentry.properties" \
  --packages "io.sentry:sentry-spark_2.11:0.0.1-alpha04" \
  target/scala-2.11/simple-project_2.11-1.0.jar

SimpleStreamingQueryApp - uses SentryStreamingQueryListener

$SPARK_HOME/bin/spark-submit \
  --class "SimpleStreamingQueryApp" \
  --master "local[4]" \
  --files "sentry.properties" \
  --packages "io.sentry:sentry-spark_2.11:0.0.1-alpha04" \
  target/scala-2.11/simple-project_2.11-1.0.jar
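The three commands above differ only in the `--class` value, so a small wrapper can cut down on the repetition. The following is a sketch rather than part of the demo: `run_example` and the `DRY_RUN` variable are hypothetical names, and the other arguments are copied unchanged from the commands above.

```shell
#!/bin/sh
# Hypothetical helper: submits one of the example classes by name.
# With DRY_RUN set to a non-empty value, the command is printed instead of run.
run_example() {
  class_name="$1"
  # ${DRY_RUN:+echo} expands to `echo` only when DRY_RUN is non-empty;
  # otherwise it expands to nothing and spark-submit runs directly.
  ${DRY_RUN:+echo} "${SPARK_HOME}/bin/spark-submit" \
    --class "$class_name" \
    --master "local[4]" \
    --files "sentry.properties" \
    --packages "io.sentry:sentry-spark_2.11:0.0.1-alpha04" \
    target/scala-2.11/simple-project_2.11-1.0.jar
}

# Usage:
#   run_example SimpleApp
#   run_example SimpleQueryApp
#   DRY_RUN=1 run_example SimpleStreamingQueryApp
```

Keeping the shared arguments in one place means a change to the package version or jar path only has to be made once.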