Sandbox for learning Spark 2.1 using resources from Pluralsight and Learning Spark (O'Reilly)
- Intellij IDEA
- Maven
- Scala 2.11
- Spark 2.1
- PyCharm
- Python 3.5
- Spark 2.1
- Install Java 1.8 and Python 3
- Install Scala 2.11
- Download pre-built Spark 2.1
- Install py4j (I used PyCharm's repository search)
- Add following to '~/.bash_profile' 'export SPARK_HOME=/My/Spark/Directory/Spark' 'export PYTHONPATH=$SPARK_HOME/python'
- In PyCharm, add 'SPARK_HOME/python/' to interpreter path
Open terminal, cd to SPARK_HOME directory, then enter './bin/pyspark --master local[2]' Note that '[2]' represents the # of cores used