
gfunc/aws-glue-docker

 
 


Supported tags and respective Dockerfile links

Simple Tags

Python Shell

Spark

You can use Python extension modules and libraries with your AWS Glue ETL scripts as long as they are written in pure Python. C libraries such as pandas are not supported at the present time, nor are extensions written in other languages.
-- AWS

AWS Glue Docker

Software License

AWS Glue development environment based on the svajiraya/aws-glue-libs fix.

Getting started

# install docker and configure aliases
curl -sSL https://raw.githubusercontent.com/webysther/aws-glue-docker/master/start.sh | sh

# start the Python Shell container (use this one for pandas)
glue

# or start the Spark container (PySpark)
glue-spark

# you are now inside the container

# Glue PySpark (REPL)
pyspark

# Glue PySpark
# /app is your current folder
glue-spark sparksubmit /app/spark_script.py

# Test
glue pytest

# aliases inside docker (backwards compatibility)
gluesparksubmit == sparksubmit
gluepyspark == pyspark
gluepytest == pytest
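
The `glue pytest` command above runs pytest inside the container. As a rough sketch of what it might pick up, here is a minimal test file in pure Python. The file name `test_transform.py` and the function `clean_record` are hypothetical and are not part of this repository; they only illustrate the kind of small, dependency-free logic that is easy to test before it goes into a Glue script.

```python
# test_transform.py (hypothetical file name, placed in your current folder)


def clean_record(record):
    """Lowercase and trim the keys, and trim whitespace from string values."""
    return {
        key.strip().lower(): value.strip() if isinstance(value, str) else value
        for key, value in record.items()
    }


def test_clean_record():
    # Keys are normalized; non-string values pass through unchanged.
    raw = {" Name ": "  Ada ", "AGE": 36}
    assert clean_record(raw) == {"name": "Ada", "age": 36}
```

Assuming your current folder is available as /app inside the container, as in the `sparksubmit` example, running `glue pytest` from that folder should collect files named `test_*.py` by pytest's default discovery rules.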

License

MIT License. Please see License File for more information.

About

🐋 Docker image for AWS Glue Spark/Python
