This module sets up the resources needed to run Metaflow steps on AWS Batch. You can adjust how much compute capacity is available and configure autoscaling.

This module is not required to use Metaflow: you can also run steps locally or in a Kubernetes cluster.

To read more, see the Metaflow docs.

Name | Description | Type | Default | Required |
---|---|---|---|---|
batch_type | AWS Batch Compute Type ('ec2', 'fargate') | string | "ec2" | no |
compute_environment_additional_security_group_ids | Additional security group ids to apply to the Batch Compute environment | list(string) | [] | no |
compute_environment_allocation_strategy | Allocation strategy for Batch Compute environment (BEST_FIT, BEST_FIT_PROGRESSIVE, SPOT_CAPACITY_OPTIMIZED) | string | "BEST_FIT" | no |
compute_environment_desired_vcpus | Desired Starting VCPUs for Batch Compute Environment [0-16] for EC2 Batch Compute Environment (ignored for Fargate) | number | n/a | yes |
compute_environment_egress_cidr_blocks | CIDR blocks to which egress is allowed from the Batch Compute environment's security group | list(string) | [ | no |
compute_environment_instance_types | The instance types for the compute environment as a comma-separated list | list(string) | n/a | yes |
compute_environment_max_vcpus | Maximum VCPUs for Batch Compute Environment [16-96] | number | n/a | yes |
compute_environment_min_vcpus | Minimum VCPUs for Batch Compute Environment [0-16] for EC2 Batch Compute Environment (ignored for Fargate) | number | n/a | yes |
iam_partition | IAM Partition (Select aws-us-gov for AWS GovCloud, otherwise leave as is) | string | "aws" | no |
launch_template_http_endpoint | Whether the metadata service is available. Can be 'enabled' or 'disabled' | string | "enabled" | no |
launch_template_http_put_response_hop_limit | The desired HTTP PUT response hop limit for instance metadata requests. Can be an integer from 1 to 64 | number | 2 | no |
launch_template_http_tokens | Whether or not the metadata service requires session tokens, also referred to as Instance Metadata Service Version 2 (IMDSv2). Can be 'optional' or 'required' | string | "optional" | no |
launch_template_image_id | AMI id for launch template, defaults to allow AWS Batch to decide | string | null | no |
metaflow_vpc_id | ID of the Metaflow VPC in which the Batch compute environment is deployed | string | n/a | yes |
resource_prefix | Prefix given to all AWS resources to differentiate between applications | string | n/a | yes |
resource_suffix | Suffix given to all AWS resources to differentiate between environment and workspace | string | n/a | yes |
standard_tags | The standard tags to apply to every AWS resource. | map(string) | n/a | yes |
subnet1_id | The first private subnet used for redundancy | string | n/a | yes |
subnet2_id | The second private subnet used for redundancy | string | n/a | yes |

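
The required inputs above can be wired together roughly as follows. This is a minimal sketch, not a tested configuration: the module label `metaflow_computation`, the `source` path, and every ID, tag, and instance type are placeholder assumptions to replace with your own values. Because `compute_environment_instance_types` is declared as `list(string)`, the sketch passes a list rather than a comma-separated string.

```hcl
# Hypothetical root-module usage; all values below are placeholders.
module "metaflow_computation" {
  # Assumed source path; point this at wherever the module actually lives.
  source = "outerbounds/metaflow/aws//modules/computation"

  # Naming and tagging (required)
  resource_prefix = "metaflow-"
  resource_suffix = "-dev"
  standard_tags   = { project = "metaflow-demo" }

  # Networking (required): one VPC and two private subnets
  metaflow_vpc_id = "vpc-0123456789abcdef0"
  subnet1_id      = "subnet-0123456789abcdef0"
  subnet2_id      = "subnet-0fedcba9876543210"

  # Capacity (required): scale from zero up to 32 vCPUs
  compute_environment_instance_types = ["c5.large", "c5.xlarge"]
  compute_environment_min_vcpus      = 0
  compute_environment_desired_vcpus  = 0
  compute_environment_max_vcpus      = 32

  # Optional overrides of the documented defaults
  batch_type                  = "ec2"
  launch_template_http_tokens = "required" # enforce IMDSv2
}
```

Setting the minimum and desired vCPUs to 0 lets the EC2 compute environment scale down completely when no jobs are queued; per the descriptions above, both values are ignored when `batch_type` is `"fargate"`.
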
Name | Description |
---|---|
METAFLOW_BATCH_JOB_QUEUE | AWS Batch Job Queue ARN for Metaflow |
batch_compute_environment_security_group_id | The ID of the security group attached to the Batch Compute environment. |
batch_job_queue_arn | The ARN of the job queue we'll use to accept Metaflow tasks |
ecs_execution_role_arn | The IAM role that grants access to ECS and Batch services which we'll use as our Metadata Service API's execution_role for our Fargate instance |
ecs_instance_role_arn | This role will be granted access to our S3 Bucket which acts as our blob storage. |
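
To use these outputs outside the module, a root configuration can re-export them. The sketch below assumes the module was instantiated under the hypothetical label `metaflow_computation`; the output names on the left are illustrative choices, while the attributes on the right come from the table above.

```hcl
# Re-export the job queue ARN, e.g. to copy it into a Metaflow configuration.
output "metaflow_batch_job_queue" {
  description = "AWS Batch job queue ARN that Metaflow should submit tasks to"
  value       = module.metaflow_computation.METAFLOW_BATCH_JOB_QUEUE
}

# Expose the security group so other resources can allow traffic from Batch.
output "batch_security_group_id" {
  description = "Security group attached to the Batch compute environment"
  value       = module.metaflow_computation.batch_compute_environment_security_group_id
}
```

After `terraform apply`, running `terraform output metaflow_batch_job_queue` prints the ARN.
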