Example on how to run locally AWS Glue & Localstack (as S3 replacement)

ifoukarakis/local-aws-glue

local-aws-glue

AWS Glue offers a really nice set of tools. However, getting started requires either an AWS account or a Docker image plus some setup.

This repo offers an example docker-compose.yml file, accompanied by a project setup. You can use this setup to jump start your Glue experimentation.

Features

  • Run Glue locally either via jobs or via Jupyter Lab
  • Local S3 using Localstack

Requirements

Docker and Docker Compose (or similar) are all you need.

Configuration

S3 setup can optionally be done on container startup. Just edit .aws/buckets.sh. This bash script can contain any set of AWS CLI S3 operations.
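As a rough illustration, a startup script along these lines would create a bucket and upload a seed file. The names `ensure_bucket`, `seed_object`, `ENDPOINT_URL`, `example-bucket`, and `/tmp/orders.csv` are hypothetical. The endpoint that works from inside the container depends on how docker-compose.yml wires the services, so treat the default below as an assumption.

```shell
#!/usr/bin/env bash
# Sketch of what .aws/buckets.sh might contain; all names are illustrative.
# Mock AWS credentials are assumed to be set in the environment already.

# Assumption: Localstack is reachable at this URL from where the script runs.
ENDPOINT_URL="${ENDPOINT_URL:-http://localhost:4566}"

# Create a bucket only if it does not exist yet, so reruns stay harmless.
ensure_bucket() {
  local bucket="$1"
  if ! aws --endpoint-url="$ENDPOINT_URL" s3api head-bucket --bucket "$bucket" >/dev/null 2>&1; then
    aws --endpoint-url="$ENDPOINT_URL" s3 mb "s3://$bucket"
  fi
}

# Upload a local file into the bucket under the given key.
seed_object() {
  local file="$1" bucket="$2" key="$3"
  aws --endpoint-url="$ENDPOINT_URL" s3 cp "$file" "s3://$bucket/$key"
}

# Example calls (hypothetical bucket and file):
# ensure_bucket example-bucket
# seed_object /tmp/orders.csv example-bucket input/orders.csv
```

Wrapping the operations in functions keeps the script safe to rerun on every container start.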

Running

Simply run

docker compose up

Jupyter notebooks

JupyterLab is available at http://127.0.0.1:8888/. Any notebooks under notebooks/ will be available. A couple of sample notebooks exist to get you started.

S3 operations using AWS CLI

AWS CLI can be used for managing buckets and objects. The only requirement is that mock credentials have been defined. Here's an example:

AWS_ACCESS_KEY_ID=mock AWS_SECRET_ACCESS_KEY=mock aws --endpoint-url=http://localhost:4566 s3 ls
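To avoid retyping the credentials and endpoint, the command above can be wrapped in a small shell function. This is a convenience sketch: `s3local`, `example-bucket`, and the file names are hypothetical. Only the endpoint and the mock credentials come from the example above.

```shell
# Hypothetical helper: run any `aws s3` subcommand against Localstack.
s3local() {
  AWS_ACCESS_KEY_ID=mock AWS_SECRET_ACCESS_KEY=mock \
    aws --endpoint-url=http://localhost:4566 s3 "$@"
}

# Example usage (bucket and file names are illustrative):
# s3local mb s3://example-bucket
# s3local cp ./orders.csv s3://example-bucket/input/orders.csv
# s3local ls s3://example-bucket/input/
```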

Running test jobs

All jobs under jobs/ are copied automatically to /opt/jobs inside the Glue Docker container.

Connect to the Glue Docker container. The container name depends on your Compose project name (check docker ps), so the command should be similar to:

docker exec -it local-aws-glue-glue-1 /bin/bash

Then, from the container's bash shell, use glue-spark-submit to run a job. For example, to run orders.py:

glue-spark-submit --master local\[*\] /opt/jobs/orders.py
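The two steps can also be combined into a single command run from the host. This sketch assumes the container name shown above. It also assumes that glue-spark-submit is on the PATH for non-interactive docker exec sessions, which has not been verified for this image. `run_glue_job` and `GLUE_CONTAINER` are hypothetical names.

```shell
# Hypothetical helper: submit a job from /opt/jobs without opening a shell first.
# Assumption: the Glue container is named as in the docker exec example above.
GLUE_CONTAINER="${GLUE_CONTAINER:-local-aws-glue-glue-1}"

run_glue_job() {
  local job="$1"
  # Quote local[*] so the host shell does not treat it as a glob pattern.
  docker exec "$GLUE_CONTAINER" glue-spark-submit --master 'local[*]' "/opt/jobs/$job"
}

# Example usage:
# run_glue_job orders.py
```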
