Spark is an amazingly powerful big data engine that's written in Scala and exposes several programmatic APIs.
This repo contains style guides for the different APIs:
- PySpark style guide for the Python API.
- Spark Scala style guide.
This document draws on the Spark source code, the Spark examples, and popular open-source Spark libraries to outline coding conventions and best practices.
Comprehensive Python/Scala/Java style guides already exist, so this document focuses specifically on the style issues that matter to Spark programmers.
Reader beware:
Any style guide written in English is either so brief that it’s ambiguous, or so long that no one reads it.