Spark Integration with Localstack

This project enables local testing of PySpark jobs that upload files to or download files from S3 buckets. LocalStack provides an easy way to bring up various AWS services locally using Docker. The configuration in the spark/conf/ folder tells the application to connect to a custom local endpoint (localhost:4572) when executing PySpark commands (e.g. spark.read.json('s3://my-test/data')).
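For readers who want to see what such endpoint settings can look like, here is a minimal sketch that applies them programmatically instead of through spark/conf/. It is not the configuration shipped in this repository. The helper names (localstack_s3_conf, build_session), the dummy "test" credentials, and the mapping of the s3:// scheme onto Hadoop's S3A connector are all assumptions, and the sketch also assumes a hadoop-aws jar matching your Hadoop build is on the Spark classpath.

```python
def localstack_s3_conf(endpoint="http://localhost:4572"):
    """Return Spark settings that route S3 traffic to a local endpoint.

    Hypothetical helper: the keys are standard Hadoop S3A options, but the
    values (dummy credentials, endpoint default) are assumptions.
    """
    return {
        # Send S3 requests to the local endpoint instead of AWS.
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        # Address buckets as http://host:port/bucket rather than by subdomain.
        "spark.hadoop.fs.s3a.path.style.access": "true",
        # The local endpoint is assumed to be plain HTTP.
        "spark.hadoop.fs.s3a.connection.ssl.enabled": "false",
        # Dummy credentials (assumption: the local service does not check them).
        "spark.hadoop.fs.s3a.access.key": "test",
        "spark.hadoop.fs.s3a.secret.key": "test",
        # Let s3:// URLs be served by the S3A connector.
        "spark.hadoop.fs.s3.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
    }


def build_session(app_name="localstack-test", endpoint="http://localhost:4572"):
    """Create a SparkSession with the settings above applied."""
    # Imported lazily so the config helper is usable without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in localstack_s3_conf(endpoint).items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

With a session built this way, spark.read.json('s3://my-test/data') would be directed at the local endpoint, provided the LocalStack container is running and the bucket exists.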

See the tests/ folder for an integration test example.
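As a rough illustration of what an integration test against the local endpoint might involve (this is not the test from the tests/ folder), the sketch below seeds a bucket with boto3 and reads it back with PySpark. The helper names, the object key, the dummy credentials, and the region are hypothetical. It assumes boto3 is installed, LocalStack is running, and the caller passes in a SparkSession that is already configured for the local endpoint.

```python
import json

ENDPOINT = "http://localhost:4572"  # local S3 endpoint named in this README


def to_json_lines(records):
    """Serialize records as newline-delimited JSON, the layout spark.read.json expects."""
    return "\n".join(json.dumps(record, sort_keys=True) for record in records)


def seed_bucket(bucket, key, records, endpoint=ENDPOINT):
    """Create a bucket on the local endpoint and upload the records (hypothetical helper)."""
    import boto3  # imported lazily; assumed to be installed for integration runs

    s3 = boto3.client(
        "s3",
        endpoint_url=endpoint,
        aws_access_key_id="test",  # dummy credentials (assumption)
        aws_secret_access_key="test",
        region_name="us-east-1",
    )
    s3.create_bucket(Bucket=bucket)
    s3.put_object(Bucket=bucket, Key=key, Body=to_json_lines(records).encode("utf-8"))


def check_read_json_from_local_s3(spark):
    """Round-trip check: upload two records, then read them back through Spark."""
    records = [{"id": 1}, {"id": 2}]
    seed_bucket("my-test", "data/part-0.json", records)
    df = spark.read.json("s3://my-test/data")
    assert df.count() == len(records)
```

The serialization helper is kept separate from the network calls so it can be checked without any services running.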

The project runs with Spark version 2.4.4.

Dependencies:

pip install pyspark

Run the application:

make docker-run