📓 Repository/Tutorial for initializing Jupyter Notebook and Spark cluster on Amazon EMR
Updated Dec 4, 2016 - Python
A Python library for submitting Spark jobs to a YARN cluster on different distributions (currently CDH and HDP)
A Spark-based data pipeline analyzing Japan's visa data using PySpark, Plotly, and Docker on an Azure-hosted distributed cluster.