AWS Batch + EFA

Sample code to set up the NAS Parallel Benchmarks using EFA and AWS Batch

This is the sample code for the AWS Batch blog post "Run High Performance Computing Workloads using AWS Batch MultiNode Jobs with Elastic Fabric Adapter".

License

This library is licensed under the MIT-0 License. See the LICENSE file.

AWS Batch + EFA

To get started, clone this repo locally:

git clone https://github.com/sean-smith/aws-batch-efa-blogpost.git
cd aws-batch-efa-blogpost/

AWS Batch Resources

In part 1, we'll create all the necessary AWS Batch resources.

cd batch-resources/

First we'll create a launch template. It installs EFA on the instance and configures the network interface to use EFA. In the launch_template.json file, substitute <Account Id>, <Security Group>, <Subnet Id> and <KEY-PAIR-NAME> with your information.
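If you would rather script the substitution than edit the file by hand, the following is a minimal sketch. The function name fill_launch_template and the environment variable names (ACCOUNT_ID, SECURITY_GROUP, SUBNET_ID, KEY_PAIR_NAME) are assumptions made for illustration and are not part of this repo. It also assumes the placeholders appear in the file exactly as spelled above.

```shell
# Hypothetical helper (not part of this repo):
#   fill_launch_template <input.json> <output.json>
# Replaces the four placeholders named in this README with values taken from
# the environment. The variable names are assumptions chosen for this sketch.
fill_launch_template() {
    local in="$1" out="$2" v
    for v in "$ACCOUNT_ID" "$SECURITY_GROUP" "$SUBNET_ID" "$KEY_PAIR_NAME"; do
        [ -n "$v" ] || { echo "set ACCOUNT_ID, SECURITY_GROUP, SUBNET_ID and KEY_PAIR_NAME first" >&2; return 1; }
    done
    # '|' is used as the sed delimiter; the values must not contain '|' or '&'.
    sed -e "s|<Account Id>|${ACCOUNT_ID}|g" \
        -e "s|<Security Group>|${SECURITY_GROUP}|g" \
        -e "s|<Subnet Id>|${SUBNET_ID}|g" \
        -e "s|<KEY-PAIR-NAME>|${KEY_PAIR_NAME}|g" \
        "$in" > "$out" || return 1
    # Heuristic check: complain if anything that looks like <placeholder> is left.
    # This assumes the template has no other legitimate angle-bracket text.
    if grep -q '<[^>]*>' "$out"; then
        echo "unfilled placeholder remains in $out" >&2
        return 1
    fi
}

# Example usage (values are placeholders for your own):
#   export ACCOUNT_ID=123456789012 SECURITY_GROUP=sg-0123 SUBNET_ID=subnet-0123 KEY_PAIR_NAME=my-key
#   fill_launch_template launch_template.json launch_template.filled.json
```

If you write to a separate output file as in the usage comment, pass that file (for example file://launch_template.filled.json) to the create-launch-template command instead.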

Now create the launch template:

aws ec2 create-launch-template --cli-input-json file://launch_template.json

To keep the instances physically close to each other for low-latency networking, we create a placement group with the cluster strategy:

aws ec2 create-placement-group --group-name "efa" --strategy "cluster" --region [your_region]
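To confirm that the placement group exists with the expected strategy, you can query it with describe-placement-groups. The sketch below wraps that call in a hypothetical helper named check_placement_group (the name is an assumption for illustration). It assumes the group is called efa, as in the command above.

```shell
# Hypothetical helper (not part of this repo):
#   check_placement_group <group-name> <region>
# Succeeds only if the group is in the "available" state with the
# "cluster" strategy.
check_placement_group() {
    local name="$1" region="$2" info
    info=$(aws ec2 describe-placement-groups --group-names "$name" \
        --region "$region" \
        --query 'PlacementGroups[0].[State,Strategy]' --output text) || return 1
    # With --output text the two fields are printed on one line, state first.
    case "$info" in
        available*cluster)
            echo "placement group $name is available (cluster)"
            return 0 ;;
        *)
            echo "unexpected placement group state: $info" >&2
            return 1 ;;
    esac
}

# Example usage:
#   check_placement_group efa us-east-1
```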

Next, we'll create the compute environment, which defines the instance type, subnet and IAM role to be used. In compute_environment.json, replace <same-subnet-as-in-LaunchTemplate> and <account-id> with your values. Then create the compute environment:

aws batch create-compute-environment --cli-input-json file://compute_environment.json
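A compute environment takes a moment to be created, and AWS Batch expects it to be in the VALID state before a job queue is attached to it. The sketch below polls the status with describe-compute-environments. The helper name wait_for_compute_environment and its retry parameters are assumptions for illustration. The environment name you pass must match the computeEnvironmentName in your compute_environment.json.

```shell
# Hypothetical helper (not part of this repo):
#   wait_for_compute_environment <name> [attempts] [delay-seconds]
# Polls until the compute environment reports VALID (success) or
# INVALID (failure), or until the attempts run out.
wait_for_compute_environment() {
    local name="$1" attempts="${2:-30}" delay="${3:-10}" i=0 status
    while [ "$i" -lt "$attempts" ]; do
        status=$(aws batch describe-compute-environments \
            --compute-environments "$name" \
            --query 'computeEnvironments[0].status' --output text)
        case "$status" in
            VALID)
                echo "compute environment $name is VALID"
                return 0 ;;
            INVALID)
                echo "compute environment $name is INVALID" >&2
                return 1 ;;
        esac
        i=$((i + 1))
        sleep "$delay"
    done
    echo "timed out waiting for compute environment $name (last status: $status)" >&2
    return 1
}

# Example usage (replace the name with the one from compute_environment.json):
#   wait_for_compute_environment <your-compute-environment-name>
```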

Finally, we need a job queue that points to the compute environment:

aws batch create-job-queue --cli-input-json file://job_queue.json

Dockerfile

In part 2, we build the Docker image and upload it to Amazon Elastic Container Registry (ECR) so we can use it in our job.

cd ..
pwd # you should be in the aws-batch-efa-blogpost/ directory

First we'll build the Docker image. To help with this, we included a Makefile; simply run:

make

Before you push the Docker image to ECR, modify the top of the Makefile with your region and account id:

AWS_REGION=[your region]
ACCOUNT_ID=[your account id]

Next, push to ECR. Note that the Makefile assumes you have an ECR repo named aws-batch-efa:

make push     # logs in, tags, and pushes to ECR
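Because the push step assumes the aws-batch-efa repository already exists, it may help to create it first if it is missing. The following is a minimal sketch using describe-repositories and create-repository. The helper name ensure_ecr_repo is an assumption for illustration. Any failure of the describe call (including missing credentials) is treated as "repository not found".

```shell
# Hypothetical helper (not part of this repo):
#   ensure_ecr_repo <repository-name> <region>
# Creates the ECR repository only if describe-repositories cannot find it.
ensure_ecr_repo() {
    local repo="$1" region="$2"
    if aws ecr describe-repositories --repository-names "$repo" \
            --region "$region" >/dev/null 2>&1; then
        echo "ECR repository $repo already exists"
    else
        aws ecr create-repository --repository-name "$repo" \
            --region "$region" >/dev/null || return 1
        echo "created ECR repository $repo"
    fi
}

# Example usage, run before "make push":
#   ensure_ecr_repo aws-batch-efa us-east-1
```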

Job Definition

Now we need a job definition, which specifies the Docker image to use for the job:

aws batch register-job-definition --cli-input-json file://job_definition.json
{
    "jobDefinitionArn": "arn:aws:batch:us-east-1:<account-id>:job-definition/EFA-MPI-JobDefinition:1",
    "jobDefinitionName": "EFA-MPI-JobDefinition",
    "revision": 1
}

Submit a job

Finally we can submit a job!

make submit
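Once the job is submitted, you can follow its progress with describe-jobs. The sketch below assumes you already have the job ID (for example from the output of the submit step or from the AWS Batch console). The helper name job_status and its exit-code convention are assumptions for illustration.

```shell
# Hypothetical helper (not part of this repo):
#   job_status <job-id>
# Prints the job status and maps it to an exit code:
#   0 = SUCCEEDED, 1 = FAILED, 2 = any other status, 3 = the CLI call failed.
# An unknown job ID yields "None" from the query and therefore exit code 2.
job_status() {
    local job_id="$1" status
    status=$(aws batch describe-jobs --jobs "$job_id" \
        --query 'jobs[0].status' --output text) || return 3
    echo "$status"
    case "$status" in
        SUCCEEDED) return 0 ;;
        FAILED)    return 1 ;;
        *)         return 2 ;;
    esac
}

# Example usage:
#   job_status <job-id>
```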

Credit

Arya Hezarkhani, Software Development Engineer, AWS Batch
Jason Rupard, Principal Systems Development Engineer, AWS Batch
Sean Smith, Software Development Engineer II, AWS HPC

