Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(stepfunctions-tasks): add emr-containers support for calling CreateVirtualCluster, DeleteVirtualCluster, and StartJobRun #15262

Closed
wants to merge 77 commits into from

Conversation

matthewsvu
Copy link

@matthewsvu matthewsvu commented Jun 22, 2021

API as per documentation here:
https://docs.aws.amazon.com/step-functions/latest/dg/connect-emr-eks.html
https://docs.aws.amazon.com/emr-on-eks/latest/APIReference/API_Operations.html

Overview

This CDK feature has implemented a simplified API service integration for EMR on EKS, providing multiple defaults and automating setup processes to customers using existing CDK library features. Reducing time to setup.

Task features added

Design decisions

  • We've flattened the user facing API parameters for CreateVirtualCluster and StartJobRun and provided multiple different inputs and defaults for all EMR on EKS SFN service integrations.
  • We've simplified the setting up process by automating the creation of a Job Execution Role, so users don't have to provide their own role if they so choose.
  • We've implemented the IGrantable interface, so users can also add permissions to the automatically generated Job Execution Role in StartJobRun via grantPrincipal.
  • We have two Custom Resources which do the following:
  1. Executes an EMR Containers SDK call todescribeVirtualCluster and retrieves the EKS Cluster's namespace and virtualClusterId
  2. The other bundles the awscli lambda layer and updates the role trust policy of the Job Execution Role by executing a SingletonLambda using the namespace and virtualClusterId retrieved from the previous Custom Resource.

Required setup

Running Tests

Change directory into aws-cdk/packages/@aws-cdk/aws-stepfunctions-tasks and run the following commands

yarn build
npm test
cdk deploy --app integ.start-job-run.js
cdk deploy --app integ.job-submission-workflow.js

closes #15234


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

skinny85 and others added 2 commits June 21, 2021 12:35
…part fails

New-style ARNs are of the form 'arn:aws:s4:us-west-1:12345:/resource-type/resource-name'.
We didn't handle that correctly in parseArn(), and instead returned an `undefined` resource,
which funnily enough should never happen according to our types.

Introduce the concept of ARN formats,
represented by an enum in core,
and replace the `Stack.parseArn()` method by a new one `Stack.splitArn()`,
taking that enum as a required second argument.

Spotted in https://github.com/aws/aws-cdk/pull/15140/files#r653112073
@gitpod-io
Copy link

gitpod-io bot commented Jun 22, 2021

@BenChaimberg BenChaimberg self-requested a review June 23, 2021 15:59
@BenChaimberg BenChaimberg self-assigned this Jun 23, 2021
… createvirtualcluster, and deleteVirtualCluster finished
@matthewsvu matthewsvu marked this pull request as ready for review June 25, 2021 17:01
@matthewsvu
Copy link
Author

In the case of create-virtual-cluster.ts and delete-virtual-cluster.ts I will probably just write the integration tests and unit tests for those APIs because those are relatively simple. However, I will wait on Ben's approval for the start-job-run.ts because that's a fairly complicated API with lots of properties.

@matthewsvu matthewsvu changed the title feat(stepfunctions-tasks): emrcontainers support for calling CreateVirtualCluster, DeleteVirtualCluster, and StartJobRun feat(stepfunctions-tasks): add emr-containers support for calling CreateVirtualCluster, DeleteVirtualCluster, and StartJobRun Jul 2, 2021
@matthewsvu
Copy link
Author

I'll probably just start writing the implementation or tests for start-job-run to get as much done as possible and iterate on any failures later.

Copy link
Contributor

@BenChaimberg BenChaimberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate the correspondence with the underlying API, but one of the big benefits of the CDK is that we can gloss over some of the details and provide helpful defaults for our users. This means that we want to provide fairly flat interfaces, few required parameters, and rely on the type system. To that end, let's try to simplify the interfaces a bit.

interface EmrContainersCreateVirtualClusterProps {
  eksCluster: eks.ICluster;
  eksNamespace: string; // unless this is derivable somehow? will this almost always be "default"? is it important to be able to configure?
  virtualClusterName: string;
  tags?: { [key: string]: string };
}

// needs a better name
class VirtualClusterProp {
  static fromTaskInput(taskInput: sfn.TaskInput): VirtualClusterArgument
  static fromVirtualClusterId(virtualClusterId: string): VirtualClusterArgument
  // doesn't exist yet, but will eventually
  static fromVirtualCluster(virtualCluster: emrcontainers.IVirtualCluster): VirtualClusterArgument
  constructor(public readonly id: string);
}

interface ApplicationConfiguration {} // copy from Configuration

class ReleaseLabel {
  static EMR_5_32_0(): ReleaseLabel
  static EMR_5_33_0(): ReleaseLabel
  static EMR_6_2_0(): ReleaseLabel
  static EMR_6_3_0(): ReleaseLabel
  constructor(public readonly label: string);
}

interface EmrContainersStartJobRunProps {
  virtualCluster: VirtualClusterProp;
  executionRole: iam.IRole;
  releaseLabel: ReleaseLabel;
  entryPoint: sfn.TaskInput;
  jobArguments?: sfn.TaskInput;
  sparkSubmitParameters?: string; // is this needed or can everything be configured via the application configuration?
  applicationConfiguration?: ApplicationConfiguration[];
  logging?: boolean; // if set to true, creates a cloudwatch log group
  logGroup?: cloudwatch.ILogGroup; // possibly remove altogether
  logBucket?: s3.IBucket; // possibly remove altogether
  persistentApplicationUI?: boolean;
  jobName?: string;
  tags?: { [key: string]: string };
}

This is just a first crack at it, let's discuss what you think will give us both a smooth experience and enough flexibility.

matthewsvu and others added 4 commits July 3, 2021 21:56
…eate-virtual-cluster.ts

Co-authored-by: Ben Chaimberg <youppi3@gmail.com>
…eate-virtual-cluster.ts

Co-authored-by: Ben Chaimberg <youppi3@gmail.com>
…eate-virtual-cluster.ts

Co-authored-by: Ben Chaimberg <youppi3@gmail.com>
…eate-virtual-cluster.ts

Co-authored-by: Ben Chaimberg <youppi3@gmail.com>
@mergify mergify bot dismissed BenChaimberg’s stale review July 4, 2021 02:56

Pull request has been modified.

matthewsvu and others added 9 commits July 3, 2021 21:56
…eate-virtual-cluster.ts

Co-authored-by: Ben Chaimberg <youppi3@gmail.com>
…eate-virtual-cluster.ts

Co-authored-by: Ben Chaimberg <youppi3@gmail.com>
…eate-virtual-cluster.ts

Co-authored-by: Ben Chaimberg <youppi3@gmail.com>
…art-job-run.ts

Co-authored-by: Ben Chaimberg <youppi3@gmail.com>
…se-types.ts

Co-authored-by: Ben Chaimberg <youppi3@gmail.com>
…eate-virtual-cluster.ts

Co-authored-by: Ben Chaimberg <youppi3@gmail.com>
…lete-virtual-cluster.ts

Co-authored-by: Ben Chaimberg <youppi3@gmail.com>
…lete-virtual-cluster.ts

Co-authored-by: Ben Chaimberg <youppi3@gmail.com>
…art-job-run.ts

Co-authored-by: Ben Chaimberg <youppi3@gmail.com>
@matthewsvu
Copy link
Author

matthewsvu commented Aug 19, 2021

Getting this when using the L1 cfnvirtualcluster structure -

[no-experimental-dependencies] It is not allowed to depend on experimental modules. @aws-cdk/aws-stepfunctions-tasks added a dependency on experimental module @aws-cdk/aws-emrcontainers

Copy link
Contributor

@BenChaimberg BenChaimberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also re-opened some resolved comments from a previous review that need further work. Once you are done making changes on the integration tests, please confirm that you have deployed both of them and successfully execute the state machines.

Comment on lines 446 to 448
'Fn::GetAtt': [
'EMRContainersStartJobRunMonitoringBucket8BB3FC54',
'Arn',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is something broken in your implementation because this should just be "providedbucket"; looks like you are still creating a new bucket even if one is passed in

Co-authored-by: Ben Chaimberg <youppi3@gmail.com>
@mergify mergify bot dismissed BenChaimberg’s stale review August 20, 2021 09:10

Pull request has been modified.

Co-authored-by: Ben Chaimberg <youppi3@gmail.com>
@BenChaimberg
Copy link
Contributor

Please replace the current PR description with a brief overview of what you created (three tasks, some custom resources that do X) and some of the design decisions we made (what we do automatically, what the customer needs to do, why that's the case)

throw new Error('Execution role cannot be undefined when the virtual cluster ID is not a concrete value. Provide an execution role with the correct trust policy');
}

this.logGroup = this.props.monitoring?.logGroup ?? this.props.monitoring?.logging ? new logs.LogGroup(this, 'Monitoring Log Group') : undefined;
Copy link
Author

@matthewsvu matthewsvu Aug 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something wrong with the implementation of monitoring configuration for when users provide their own log group and log bucket. Currently, it always results in the automatically generated log group and buckets being used instead of user provided ones.

@BenChaimberg
Copy link
Contributor

@matthewsvu – will you be completing the work on this PR?

@matthewsvu
Copy link
Author

No. Don't have the necessary resources on a personal laptop to run the integration tests and confirm if they work. Someone from my team will pick it up.

@kaizencc kaizencc assigned kaizencc and unassigned BenChaimberg Oct 20, 2021
Co-authored-by: kaizen3031593 <36202692+kaizen3031593@users.noreply.github.com>
@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject89A8053A-LhjRyN9kxr8o
  • Commit ID: a4424a0
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject89A8053A-LhjRyN9kxr8o
  • Commit ID: a4424a0
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@kaizencc
Copy link
Contributor

Closing in favor of #17103.

@kaizencc kaizencc closed this Oct 22, 2021
mergify bot pushed a commit that referenced this pull request Nov 24, 2021
This CDK feature adds support for Emr on Eks by implementing API service integrations for the following three APIs.

This PR adds three tasks which support Emr on Eks:
1) [Create Virtual Cluster](https://docs.aws.amazon.com/emr-on-eks/latest/APIReference/API_CreateVirtualCluster.html)
2) [ Start a job run](https://docs.aws.amazon.com/emr-on-eks/latest/APIReference/API_StartJobRun.html)
3) [Delete virtual cluster ](https://docs.aws.amazon.com/emr-on-eks/latest/APIReference/API_DeleteVirtualCluster.html)


Continuation of #15262 by @matthewsvu and @BenChaimberg:

Closes #15234.

----
*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
beezly pushed a commit to beezly/aws-cdk that referenced this pull request Nov 29, 2021
This CDK feature adds support for Emr on Eks by implementing API service integrations for the following three APIs.

This PR adds three tasks which support Emr on Eks:
1) [Create Virtual Cluster](https://docs.aws.amazon.com/emr-on-eks/latest/APIReference/API_CreateVirtualCluster.html)
2) [ Start a job run](https://docs.aws.amazon.com/emr-on-eks/latest/APIReference/API_StartJobRun.html)
3) [Delete virtual cluster ](https://docs.aws.amazon.com/emr-on-eks/latest/APIReference/API_DeleteVirtualCluster.html)


Continuation of aws#15262 by @matthewsvu and @BenChaimberg:

Closes aws#15234.

----
*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
TikiTDO pushed a commit to TikiTDO/aws-cdk that referenced this pull request Feb 21, 2022
This CDK feature adds support for Emr on Eks by implementing API service integrations for the following three APIs.

This PR adds three tasks which support Emr on Eks:
1) [Create Virtual Cluster](https://docs.aws.amazon.com/emr-on-eks/latest/APIReference/API_CreateVirtualCluster.html)
2) [ Start a job run](https://docs.aws.amazon.com/emr-on-eks/latest/APIReference/API_StartJobRun.html)
3) [Delete virtual cluster ](https://docs.aws.amazon.com/emr-on-eks/latest/APIReference/API_DeleteVirtualCluster.html)


Continuation of aws#15262 by @matthewsvu and @BenChaimberg:

Closes aws#15234.

----
*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

(stepfunctions-tasks): emr-containers CreateVirtualCluster, DeleteVirtualCluster, and StartJobRun task
6 participants