feat(batch): fargate support for jobs #15848

Merged (6 commits) on Sep 12, 2021
17 changes: 16 additions & 1 deletion packages/@aws-cdk/aws-batch/README.md
@@ -37,7 +37,7 @@ For more information on **AWS Batch** visit the [AWS Docs for Batch](https://doc

## Compute Environment

At the core of AWS Batch is the compute environment. All batch jobs are processed within a compute environment, which uses resource like OnDemand or Spot EC2 instances.
At the core of AWS Batch is the compute environment. All batch jobs are processed within a compute environment, which uses resources like OnDemand/Spot EC2 instances or Fargate.

In **MANAGED** mode, AWS will handle the provisioning of compute resources to accommodate the demand. Otherwise, in **UNMANAGED** mode, you will need to manage the provisioning of those resources.

@@ -74,6 +74,21 @@ const spotEnvironment = new batch.ComputeEnvironment(stack, 'MySpotEnvironment',
});
```

### Fargate Compute Environment

It is possible to have AWS Batch submit jobs to be run on Fargate compute resources. Below is an example of how this can be done:

```ts
const vpc = new ec2.Vpc(this, 'VPC');

const fargateSpotEnvironment = new batch.ComputeEnvironment(stack, 'MyFargateEnvironment', {
computeResources: {
type: batch.ComputeResourceType.FARGATE_SPOT,
vpc,
},
});
```

### Understanding Progressive Allocation Strategies

AWS Batch uses an [allocation strategy](https://docs.aws.amazon.com/batch/latest/userguide/allocation-strategies.html) to determine what compute resource will efficiently handle incoming job requests. By default, **BEST_FIT** will pick an available compute instance based on vCPU requirements. If none exist, the job will wait until resources become available. However, with this strategy, you may have jobs waiting in the queue unnecessarily despite having more powerful instances available. Below is an example of what that situation might look like:
170 changes: 124 additions & 46 deletions packages/@aws-cdk/aws-batch/lib/compute-environment.ts
@@ -6,7 +6,7 @@ import { CfnComputeEnvironment } from './batch.generated';

/**
* Property to specify if the compute environment
* uses On-Demand or SpotFleet compute resources.
* uses On-Demand, SpotFleet, Fargate, or Fargate Spot compute resources.
*/
export enum ComputeResourceType {
/**
@@ -18,6 +18,20 @@ export enum ComputeResourceType {
* Resources will be EC2 SpotFleet resources.
*/
SPOT = 'SPOT',

/**
* Resources will be Fargate resources.
*/
FARGATE = 'FARGATE',

/**
* Resources will be Fargate Spot resources.
*
* Fargate Spot uses spare capacity in the AWS cloud to run your fault-tolerant,
* time-flexible jobs at up to a 70% discount. If AWS needs the resources back,
* jobs running on Fargate Spot will be interrupted with two minutes of notification.
*/
FARGATE_SPOT = 'FARGATE_SPOT',
}
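
As a reading aid for the enum above, here is a minimal standalone sketch of the four resource types and the Fargate check that the constructor later derives from them. The local `ComputeResourceType` copy and the `isFargateType` helper are illustrative assumptions, not part of the CDK API, and the `'EC2'` value for `ON_DEMAND` is assumed because that member is not visible in this hunk.

```ts
// Local copy of the enum so the sketch runs without the CDK installed.
enum ComputeResourceType {
  ON_DEMAND = 'EC2', // assumption: this value is not shown in the diff
  SPOT = 'SPOT',
  FARGATE = 'FARGATE',
  FARGATE_SPOT = 'FARGATE_SPOT',
}

// Hypothetical helper: true for both Fargate flavours, false otherwise.
// An undefined type counts as non-Fargate because the documented default is ON_DEMAND.
function isFargateType(type?: ComputeResourceType): boolean {
  return type === ComputeResourceType.FARGATE
    || type === ComputeResourceType.FARGATE_SPOT;
}

console.log(isFargateType(ComputeResourceType.FARGATE_SPOT));
console.log(isFargateType(undefined));
```

Treating a missing type as non-Fargate keeps the existing EC2 defaults intact for callers that never set `type`.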

/**
@@ -135,7 +149,7 @@ export interface ComputeResources {
readonly vpcSubnets?: ec2.SubnetSelection;

/**
* The type of compute environment: ON_DEMAND or SPOT.
* The type of compute environment: ON_DEMAND, SPOT, FARGATE, or FARGATE_SPOT.
*
* @default ON_DEMAND
*/
@@ -340,44 +354,49 @@ export class ComputeEnvironment extends Resource implements IComputeEnvironment
physicalName: props.computeEnvironmentName,
});

this.validateProps(props);
const isFargate = ComputeResourceType.FARGATE === props.computeResources?.type
|| ComputeResourceType.FARGATE_SPOT === props.computeResources?.type;

this.validateProps(props, isFargate);

const spotFleetRole = this.getSpotFleetRole(props);
let computeResources: CfnComputeEnvironment.ComputeResourcesProperty | undefined;

// Only allow compute resources to be set when using MANAGED type
if (props.computeResources && this.isManaged(props)) {
computeResources = {
allocationStrategy: props.computeResources.allocationStrategy
|| (
props.computeResources.type === ComputeResourceType.SPOT
? AllocationStrategy.SPOT_CAPACITY_OPTIMIZED
: AllocationStrategy.BEST_FIT
),
bidPercentage: props.computeResources.bidPercentage,
desiredvCpus: props.computeResources.desiredvCpus,
ec2KeyPair: props.computeResources.ec2KeyPair,
imageId: props.computeResources.image && props.computeResources.image.getImage(this).imageId,
instanceRole: props.computeResources.instanceRole
? props.computeResources.instanceRole
: new iam.CfnInstanceProfile(this, 'Instance-Profile', {
roles: [new iam.Role(this, 'Ecs-Instance-Role', {
assumedBy: new iam.ServicePrincipal('ec2.amazonaws.com'),
managedPolicies: [
iam.ManagedPolicy.fromAwsManagedPolicyName('service-role/AmazonEC2ContainerServiceforEC2Role'),
],
}).roleName],
}).attrArn,
instanceTypes: this.buildInstanceTypes(props.computeResources.instanceTypes),
launchTemplate: props.computeResources.launchTemplate,
maxvCpus: props.computeResources.maxvCpus || 256,
minvCpus: props.computeResources.minvCpus || 0,
placementGroup: props.computeResources.placementGroup,
securityGroupIds: this.buildSecurityGroupIds(props.computeResources.vpc, props.computeResources.securityGroups),
spotIamFleetRole: spotFleetRole?.roleArn,
subnets: props.computeResources.vpc.selectSubnets(props.computeResources.vpcSubnets).subnetIds,
tags: props.computeResources.computeResourcesTags,
type: props.computeResources.type || ComputeResourceType.ON_DEMAND,
...(!isFargate ? {
allocationStrategy: props.computeResources.allocationStrategy
|| (
props.computeResources.type === ComputeResourceType.SPOT
? AllocationStrategy.SPOT_CAPACITY_OPTIMIZED
: AllocationStrategy.BEST_FIT
),
instanceRole: props.computeResources.instanceRole
? props.computeResources.instanceRole
: new iam.CfnInstanceProfile(this, 'Instance-Profile', {
roles: [new iam.Role(this, 'Ecs-Instance-Role', {
assumedBy: new iam.ServicePrincipal('ec2.amazonaws.com'),
managedPolicies: [
iam.ManagedPolicy.fromAwsManagedPolicyName('service-role/AmazonEC2ContainerServiceforEC2Role'),
],
}).roleName],
}).attrArn,
instanceTypes: this.buildInstanceTypes(props.computeResources.instanceTypes),
minvCpus: props.computeResources.minvCpus || 0,
} : {}),
};
}
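
The conditional spread in the new constructor code can be read in isolation. The sketch below is a simplified stand-in, not the CDK implementation: `ResourceInput` and `buildResources` are hypothetical names, only a few properties are modelled, and `'EC2'` as the on-demand type value is an assumption. It shows how spreading an empty object leaves EC2-only keys out of the result for Fargate.

```ts
// Hypothetical, trimmed-down input shape; the real ComputeResources interface has more fields.
interface ResourceInput {
  type?: 'EC2' | 'SPOT' | 'FARGATE' | 'FARGATE_SPOT';
  allocationStrategy?: string;
  maxvCpus?: number;
  minvCpus?: number;
}

function buildResources(input: ResourceInput): Record<string, unknown> {
  const isFargate = input.type === 'FARGATE' || input.type === 'FARGATE_SPOT';
  return {
    // Keys shared by every resource type.
    maxvCpus: input.maxvCpus || 256,
    type: input.type || 'EC2',
    // EC2-only keys: spreading {} adds nothing, so Fargate output omits them entirely.
    ...(!isFargate ? {
      allocationStrategy: input.allocationStrategy
        || (input.type === 'SPOT' ? 'SPOT_CAPACITY_OPTIMIZED' : 'BEST_FIT'),
      minvCpus: input.minvCpus || 0,
    } : {}),
  };
}

console.log(Object.keys(buildResources({ type: 'FARGATE' })));
console.log(Object.keys(buildResources({ type: 'SPOT' })));
```

Omitting the keys, rather than setting them to `undefined`, keeps defaults such as `minvCpus: 0` from leaking into a Fargate definition.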

@@ -414,7 +433,7 @@ export class ComputeEnvironment extends Resource implements IComputeEnvironment
/**
* Validates the properties provided for a new batch compute environment.
*/
private validateProps(props: ComputeEnvironmentProps) {
private validateProps(props: ComputeEnvironmentProps, isFargate: boolean) {
if (props === undefined) {
return;
}
@@ -427,41 +446,100 @@ export class ComputeEnvironment extends Resource implements IComputeEnvironment
throw new Error('computeResources is missing but required on a managed compute environment');
}

// Setting a bid percentage is only allowed on SPOT resources +
// Cannot use SPOT_CAPACITY_OPTIMIZED when using ON_DEMAND
if (props.computeResources) {
if (props.computeResources.type === ComputeResourceType.ON_DEMAND) {
// VALIDATE FOR ON_DEMAND
if (isFargate) {
// VALIDATE FOR FARGATE

// Bid percentage is not allowed
// Bid percentage cannot be set for Fargate environments
if (props.computeResources.bidPercentage !== undefined) {
throw new Error('Setting the bid percentage is only allowed for SPOT type resources on a batch compute environment');
throw new Error('Bid percentage must not be set for Fargate compute environments');
}

// SPOT_CAPACITY_OPTIMIZED allocation is not allowed
if (props.computeResources.allocationStrategy && props.computeResources.allocationStrategy === AllocationStrategy.SPOT_CAPACITY_OPTIMIZED) {
throw new Error('The SPOT_CAPACITY_OPTIMIZED allocation strategy is only allowed if the environment is a SPOT type compute environment');
// Allocation strategy cannot be set for Fargate environments
if (props.computeResources.allocationStrategy !== undefined) {
throw new Error('Allocation strategy must not be set for Fargate compute environments');
}
} else {
// VALIDATE FOR SPOT

// Bid percentage must be from 0 - 100
if (props.computeResources.bidPercentage !== undefined &&
(props.computeResources.bidPercentage < 0 || props.computeResources.bidPercentage > 100)) {
throw new Error('Bid percentage can only be a value between 0 and 100');
// Desired vCPUs cannot be set for Fargate environments
if (props.computeResources.desiredvCpus !== undefined) {
throw new Error('Desired vCPUs must not be set for Fargate compute environments');
}
}

if (props.computeResources.minvCpus) {
// minvCpus cannot be less than 0
if (props.computeResources.minvCpus < 0) {
throw new Error('Minimum vCpus for a batch compute environment cannot be less than 0');
// Image ID cannot be set for Fargate environments
if (props.computeResources.image !== undefined) {
throw new Error('Image must not be set for Fargate compute environments');
}

// minvCpus cannot exceed max vCpus
if (props.computeResources.maxvCpus &&
props.computeResources.minvCpus > props.computeResources.maxvCpus) {
throw new Error('Minimum vCpus cannot be greater than the maximum vCpus');
// Instance types cannot be set for Fargate environments
if (props.computeResources.instanceTypes !== undefined) {
throw new Error('Instance types must not be set for Fargate compute environments');
}

// EC2 key pair cannot be set for Fargate environments
if (props.computeResources.ec2KeyPair !== undefined) {
throw new Error('EC2 key pair must not be set for Fargate compute environments');
}

// Instance role cannot be set for Fargate environments
if (props.computeResources.instanceRole !== undefined) {
throw new Error('Instance role must not be set for Fargate compute environments');
}

// Launch template cannot be set for Fargate environments
if (props.computeResources.launchTemplate !== undefined) {
throw new Error('Launch template must not be set for Fargate compute environments');
}

// Min vCPUs cannot be set for Fargate environments
if (props.computeResources.minvCpus !== undefined) {
throw new Error('Min vCPUs must not be set for Fargate compute environments');
}

// Placement group cannot be set for Fargate environments
if (props.computeResources.placementGroup !== undefined) {
throw new Error('Placement group must not be set for Fargate compute environments');
}

// Spot fleet role cannot be set for Fargate environments
if (props.computeResources.spotFleetRole !== undefined) {
throw new Error('Spot fleet role must not be set for Fargate compute environments');
}
} else {
// VALIDATE FOR ON_DEMAND AND SPOT
if (props.computeResources.minvCpus) {
// minvCpus cannot be less than 0
if (props.computeResources.minvCpus < 0) {
throw new Error('Minimum vCpus for a batch compute environment cannot be less than 0');
}

// minvCpus cannot exceed max vCpus
if (props.computeResources.maxvCpus &&
props.computeResources.minvCpus > props.computeResources.maxvCpus) {
throw new Error('Minimum vCpus cannot be greater than the maximum vCpus');
}
}
// Setting a bid percentage is only allowed on SPOT resources +
// Cannot use SPOT_CAPACITY_OPTIMIZED when using ON_DEMAND
if (props.computeResources.type === ComputeResourceType.ON_DEMAND) {
// VALIDATE FOR ON_DEMAND

// Bid percentage is not allowed
if (props.computeResources.bidPercentage !== undefined) {
throw new Error('Setting the bid percentage is only allowed for SPOT type resources on a batch compute environment');
}

// SPOT_CAPACITY_OPTIMIZED allocation is not allowed
if (props.computeResources.allocationStrategy && props.computeResources.allocationStrategy === AllocationStrategy.SPOT_CAPACITY_OPTIMIZED) {
throw new Error('The SPOT_CAPACITY_OPTIMIZED allocation strategy is only allowed if the environment is a SPOT type compute environment');
}
} else if (props.computeResources.type === ComputeResourceType.SPOT) {
// VALIDATE FOR SPOT

// Bid percentage must be from 0 - 100
if (props.computeResources.bidPercentage !== undefined &&
(props.computeResources.bidPercentage < 0 || props.computeResources.bidPercentage > 100)) {
throw new Error('Bid percentage can only be a value between 0 and 100');
}
}
}
}
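
The Fargate branch above repeats one pattern: if a property is defined, throw an error naming it. A possible table-driven version of that pattern is sketched below. It is a standalone illustration, not the code merged in this PR: `FargateCheckInput`, `FORBIDDEN_ON_FARGATE` and `validateFargate` are hypothetical names, and only a subset of the checked properties is modelled.

```ts
// Hypothetical subset of the properties that the diff rejects for Fargate.
interface FargateCheckInput {
  bidPercentage?: number;
  allocationStrategy?: string;
  desiredvCpus?: number;
  minvCpus?: number;
  instanceTypes?: string[];
}

// Property name paired with the label used in the error message,
// following the wording of the messages in the diff.
const FORBIDDEN_ON_FARGATE: Array<[keyof FargateCheckInput, string]> = [
  ['bidPercentage', 'Bid percentage'],
  ['allocationStrategy', 'Allocation strategy'],
  ['desiredvCpus', 'Desired vCPUs'],
  ['minvCpus', 'Min vCPUs'],
  ['instanceTypes', 'Instance types'],
];

function validateFargate(resources: FargateCheckInput): void {
  for (const [prop, label] of FORBIDDEN_ON_FARGATE) {
    // Compare against undefined, as the diff does, so that an explicit 0 is still rejected.
    if (resources[prop] !== undefined) {
      throw new Error(`${label} must not be set for Fargate compute environments`);
    }
  }
}

try {
  validateFargate({ minvCpus: 0 });
} catch (e) {
  console.log((e as Error).message);
}
```

The comparison with `undefined` matters here: the EC2 branch uses a truthiness check on `minvCpus`, whereas the Fargate checks reject any explicitly supplied value, including `0`.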