fix(aws-batch): Support omitting ComputeEnvironment security groups so that they can be specified in Launch Template #21579

Merged
merged 23 commits into aws:main from efaLT
Aug 23, 2022

Conversation

@tcutts tcutts commented Aug 12, 2022

HPC Batch applications frequently require Elastic Fabric Adapters for low-latency networking. Currently, the ComputeEnvironment construct always automatically defines a set of SecurityGroupIds in the CloudFormation it generates, and this prevents the stack deploying if the LaunchTemplate contains network interface definitions; Batch does not allow SecurityGroups at the ComputeEnvironment level if there are network interfaces defined in the CfnLaunchTemplate.

Since we do not currently have support for network interfaces, this PR adds a new boolean property in launchTemplate called useNetworkInterfaceSecurityGroups. When this is enabled, we will assume that security groups are being provided by the launch template.
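
As a rough illustration of the behaviour described above, the sketch below models the decision in plain TypeScript with no CDK imports. Only `useNetworkInterfaceSecurityGroups`, `launchTemplateName`, and `SecurityGroupIds` come from this PR. The `LaunchTemplateSpec` and `ComputeResourcesSketch` types, the `renderSecurityGroupIds` helper, and the fallback to a default security group ID are assumptions made for illustration, not the construct's actual implementation.

```typescript
// Hypothetical, simplified stand-ins for the construct props (not the real CDK types).
interface LaunchTemplateSpec {
  launchTemplateName: string;
  // Property added by this PR: security groups come from the launch template.
  useNetworkInterfaceSecurityGroups?: boolean;
}

interface ComputeResourcesSketch {
  securityGroupIds?: string[];
  launchTemplate?: LaunchTemplateSpec;
}

// Returns the value to render as SecurityGroupIds, or undefined to omit the property.
function renderSecurityGroupIds(
  resources: ComputeResourcesSketch,
  defaultSecurityGroupId: string, // assumed fallback when the user supplies nothing
): string[] | undefined {
  if (resources.launchTemplate?.useNetworkInterfaceSecurityGroups) {
    // The launch template's network interfaces carry the security groups.
    return undefined;
  }
  return resources.securityGroupIds ?? [defaultSecurityGroupId];
}
```

With the flag set, the helper returns `undefined`, so no `SecurityGroupIds` entry would be rendered at the environment level. Without it, the existing behaviour of always emitting a list is preserved.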

A long term solution may be to:

  • Add support for network interfaces in the L2 ec2.LaunchTemplate construct.
  • Update the batch.ComputeEnvironment construct to take an ILaunchTemplate instead of the name/id.
  • Check the ILaunchTemplate for whether the ComputeEnvironment needs to create any security groups.

closes #21577


All Submissions:

  • [yes] Have you followed the guidelines in our Contributing guide?

Adding new Unconventional Dependencies:

  • [no] This PR adds new unconventional dependencies following the process described here

New Features

  • [yes] Have you added the new feature to an integration test?
    • [yes] Did you use yarn integ to deploy the infrastructure and generate the snapshot (i.e. yarn integ without --dry-run)?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

@aws-cdk-automation aws-cdk-automation requested a review from a team August 12, 2022 11:55
@github-actions github-actions bot added bug This issue is a bug. p2 labels Aug 12, 2022
@tcutts tcutts changed the title from "bug(aws-batch): LaunchTemplates with network interfaces do not work with ComputeEnvironments" to "fix(aws-batch): LaunchTemplates with network interfaces do not work with ComputeEnvironments" on Aug 12, 2022

@corymhall corymhall left a comment

@tcutts thanks for taking the time to create this PR! Unfortunately the union type is not currently supported in CDK so I don't think this solution will work.

Without looking into this too much, I think the ideal solution is probably a much bigger effort. I'm thinking that we need to:

  1. Add support for network interfaces in the L2 ec2.LaunchTemplate construct.
  2. Update the batch.ComputeEnvironment construct to take an ILaunchTemplate instead of the name/id.
  3. Check the ILaunchTemplate for whether the ComputeEnvironment needs to create any security groups.
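
Step 3 of the list above could look roughly like the following sketch. It is purely hypothetical: as noted in this thread, the L2 ec2.LaunchTemplate does not model network interfaces, so the `LaunchTemplateSketch` and `NetworkInterfaceSketch` types and the `environmentNeedsSecurityGroups` function are invented names that stand in for whatever a future ILaunchTemplate might expose.

```typescript
// Hypothetical shape of a network interface definition on a launch template.
interface NetworkInterfaceSketch {
  deviceIndex: number;
  interfaceType?: string; // for example 'efa'
  securityGroupIds?: string[];
}

// Hypothetical stand-in for a launch template that exposes its network interfaces.
interface LaunchTemplateSketch {
  launchTemplateName: string;
  networkInterfaces?: NetworkInterfaceSketch[];
}

// The environment should define security groups only when the launch template
// defines no network interfaces, since Batch rejects having both.
function environmentNeedsSecurityGroups(template?: LaunchTemplateSketch): boolean {
  return (template?.networkInterfaces?.length ?? 0) === 0;
}
```

A check of this kind would remove the need for the user to state explicitly where the security groups come from, at the cost of the larger modelling effort described in steps 1 and 2.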

packages/@aws-cdk/aws-batch/lib/compute-environment.ts (outdated, resolved)

tcutts commented Aug 15, 2022

This is potentially an enormous can of worms. CDK has no abstraction for networkInterfaces at all, anywhere. That needs to be done first before anything else that ought to support them can do so (Instance, AutoScalingGroup). This is way beyond my relatively hobbyist ability to implement.

tcutts commented Aug 15, 2022

What I could do is revert to a simpler change, which is to change what happens if the user specifies [] as the list of security groups. If, in that case, we emit no SecurityGroupIds line in the resource, we still fix the bug, and we've no longer created the unsupported union type. That does leave us with slightly messy and misleading end user code:

const computeEnvironmentEFA = new batch.ComputeEnvironment(stack, 'EFAComputeEnv', {
  managed: true,
  computeResources: {
    securityGroups: [], // empty list: proposed to mean "emit no SecurityGroupIds"
    vpc,
    launchTemplate: {
      launchTemplateName: launchTemplateEFA.launchTemplateName as string,
    },
  },
});

which I don't like very much, but would achieve the desired outcome. Arguably, ComputeEnvironment needs updating to do this anyway, because emitting an empty list in a set SecurityGroupIds property doesn't seem to be the right behaviour to me.
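
The interim rule proposed in this comment can be sketched in plain TypeScript as follows. The `securityGroupIdsOrOmit` helper and the fallback ID are hypothetical names used only for illustration. The sketch relies on the fact that `JSON.stringify` drops properties whose value is `undefined`, which stands in here for omitting the line from the generated resource.

```typescript
// Hypothetical helper: maps the user's security group IDs to the rendered property value.
function securityGroupIdsOrOmit(
  ids: string[] | undefined,
  fallbackId: string, // assumed default when the user specifies nothing
): string[] | undefined {
  if (ids === undefined) {
    return [fallbackId]; // nothing specified: keep emitting a default group
  }
  if (ids.length === 0) {
    return undefined; // explicit []: emit no SecurityGroupIds at all
  }
  return ids;
}

// An undefined property is dropped when the object is serialised.
const rendered = JSON.stringify({
  Type: 'EC2',
  SecurityGroupIds: securityGroupIdsOrOmit([], 'sg-default'),
});
console.log(rendered); // prints {"Type":"EC2"}
```

The drawback raised in the comment remains visible here: an empty list silently changes meaning from "no groups" to "groups are defined elsewhere", which is why an explicit flag reads more clearly in user code.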

@corymhall

What I could do is revert to a simpler change, which is to change what happens if the user specifies [] as the list of security groups. If, in that case, we emit no SecurityGroupIds line in the resource, we still fix the bug, and we've no longer created the unsupported union type. That does leave us with slightly messy and misleading end user code:

I agree that this isn't what we want, but since this package is experimental and given the complexity of the preferred solution I think I'm fine with this as an interim solution.

@mergify mergify bot dismissed corymhall’s stale review August 16, 2022 13:56

Pull request has been modified.

@corymhall corymhall self-assigned this Aug 17, 2022
@tcutts tcutts changed the title from "fix(aws-batch): LaunchTemplates with network interfaces do not work with ComputeEnvironments" to "fix(aws-batch): Support no ComputeEnvironment security groups so that they can be specified in Launch Template instead" on Aug 17, 2022
@github-actions github-actions bot added the effort/small Small work item – less than a day of effort label Aug 17, 2022
@tcutts tcutts changed the title from "fix(aws-batch): Support no ComputeEnvironment security groups so that they can be specified in Launch Template instead" to "fix(aws-batch): Support omitting ComputeEnvironment security groups so that they can be specified in Launch Template instead" on Aug 17, 2022
@tcutts tcutts changed the title from "fix(aws-batch): Support omitting ComputeEnvironment security groups so that they can be specified in Launch Template instead" to "fix(aws-batch): Support omitting ComputeEnvironment security groups so that they can be specified in Launch Template" on Aug 17, 2022

@corymhall corymhall left a comment

Looks good! Just one minor change requested.

packages/@aws-cdk/aws-batch/README.md (outdated, resolved)
packages/@aws-cdk/aws-batch/lib/compute-environment.ts (outdated, resolved)
@mergify mergify bot dismissed corymhall’s stale review August 18, 2022 13:39

Pull request has been modified.

mergify bot commented Aug 23, 2022

Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

@aws-cdk-automation

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildv2Project1C6BFA3F-wQm2hXv2jqQv
  • Commit ID: 6d3668a
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@mergify mergify bot merged commit 33b00dd into aws:main Aug 23, 2022
@tcutts tcutts deleted the efaLT branch August 26, 2022 14:24
josephedward pushed a commit to josephedward/aws-cdk that referenced this pull request Aug 30, 2022
…o that they can be specified in Launch Template (aws#21579)

Labels
bug This issue is a bug. effort/small Small work item – less than a day of effort p2
Development

Successfully merging this pull request may close these issues.

(aws-batch): (Compute environments cannot be created with launch templates specifying network interface)
3 participants