Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(stepfunctions-tasks): support for sagemaker tasks #6639

Closed
wants to merge 10 commits into from

Conversation

Hari-Nagarajan
Copy link

@Hari-Nagarajan Hari-Nagarajan commented Mar 9, 2020

Description

I needed support for these SageMaker endpoints for a project I'm working on. I verified this works by using this in my project to create a workflow that creates a training job, updates the model and then updates the endpoint.

Commit Message

feat(stepfunctions-tasks): support for sagemaker tasks (#6639)

closes #6572

End Commit Message


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

Hari Nagarajan and others added 4 commits March 5, 2020 16:24
@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-qxepHUsryhcu
  • Commit ID: 0c3b26a
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-qxepHUsryhcu
  • Commit ID: dcc20b8
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@Hari-Nagarajan Hari-Nagarajan marked this pull request as ready for review March 9, 2020 18:35
@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-qxepHUsryhcu
  • Commit ID: 7424301
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-qxepHUsryhcu
  • Commit ID: 96de9c8
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-qxepHUsryhcu
  • Commit ID: e820ce2
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-qxepHUsryhcu
  • Commit ID: b0de38f
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-qxepHUsryhcu
  • Commit ID: d9352d0
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

*
* @experimental
*/
export interface ProductionVariants {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This interface should be renamed to ProductionVariant instead of ProductionVariants, as it defines one ProductionVariant.
Also defining this interface here will introduce code duplication when PR for L2 sagemaker constructs is merged and ProductionVariant gets into the package. After mentioned PR being live, created step function tasks should use the real defined ProductionVariant in sagemaker instead of introducing another one.

*
* @experimental
*/
export interface ModelContainer {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This interface should be named ContainerDefinition instead of ModelContainer to be consistent with sagemaker api
Also it should be replaced with ContainerDefinition in sagemaker constructs when #6107 is merged to avoid code duplication

readonly endpointConfigName: string;

/**
* Isolates the model container. No inbound or outbound network calls can be made to or from the model container.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is not relevant to the below property

}

private makePolicyStatements(task: sfn.Task): iam.PolicyStatement[] {
Stack.of(task);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stack.of(task); is unused in this methods context

Comment on lines +139 to +148
const policyStatements = [
new iam.PolicyStatement({
actions: ['sagemaker:CreateEndpointConfig'],
resources: ['*'],
}),
new iam.PolicyStatement({
actions: ['sagemaker:ListTags'],
resources: ['*']
}),
];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is better to define the resources as follows

const stack = Stack.of(task);
const policyStatements = [
            new PolicyStatement({
                actions: ['sagemaker:CreateEndpointConfig'],
                resources: [
                    stack.formatArn({
                        service: 'sagemaker',
                        resource: 'endpoint-config',
                        resourceName: '*',
                    }),
                ],
            }),
            new PolicyStatement({
                actions: ['sagemaker:ListTags'],
                resources: [
                    stack.formatArn({
                        service: 'sagemaker',
                        resource: 'endpoint-config',
                        resourceName: '*',
                    }),
                ],
            }),
        ];

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 but I'll also add - given that we know the name of the endpoint config at this point, we should be able to add that to resourceName and avoid *.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to suggest specifying the resource name as well similar to:

            new PolicyStatement({
                actions: ['sagemaker:CreateEndpointConfig'],
                resources: [
                    stack.formatArn({
                        service: 'sagemaker',
                        resource: 'endpoint-config',
                        // resource name in sagemaker is lowercase
                        resourceName: Data.isJsonPathString(this.props.endpointConfigName) ? '*' : `${this.props.endpointConfigName.toLowerCase()}`,
                    }),
                ],
            }),

Please note that it is essential to convert resource name to lower case as Sagemaker arns lowercase the name by default(for any resource be it training job, endpoint, ,model etc). I found this case undocumented

}

private makePolicyStatements(task: sfn.Task): iam.PolicyStatement[] {
Stack.of(task);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is unused.

Comment on lines +85 to +94
const policyStatements = [
new iam.PolicyStatement({
actions: ['sagemaker:createEndpoint'],
resources: ['*'],
}),
new iam.PolicyStatement({
actions: ['sagemaker:ListTags'],
resources: ['*']
})
];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refer to comment for EndpointConfig task

}

private makePolicyStatements(task: sfn.Task): iam.PolicyStatement[] {
Stack.of(task);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unused line

}

private makePolicyStatements(task: sfn.Task): iam.PolicyStatement[] {
Stack.of(task);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unused line

@Stacy-D
Copy link
Contributor

Stacy-D commented Mar 11, 2020

This PR will introduce code repetition if merged before #6107 as it defines interfaces that should come from L2 sagemaker constructs

P.S. I've left my suggestions on potential improvements as I've also implemented those tasks as they were required for my project

Copy link
Contributor

@nija-at nija-at left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for submitting this PR. This is going to add much value to CDK customers when it's made available.

Given that this is a big one, I wanted to make you aware that this is going to take several iterations and several weeks for it to be ready to be merged.

Comment on lines +138 to +141
* `tasks.SagemakerCreateModelTask` -- create a SageMaker model
* `tasks.SagemakerCreateEndpointConfigTask` -- create a SageMaker Endpoint Config
* `tasks.SagemakerCreateEndpointTask` -- create a new SageMaker Endpoint
* `tasks.SagemakerUpdateEndpointTask` -- update an existing SageMaker Endpoint
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please provide examples for each of these. (see below)

import { Stack } from '@aws-cdk/core';
import { getResourceArn } from './resource-arn-suffix';
import { DataCaptureConfig, ProductionVariants } from './sagemaker-task-base-types';

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switch to using 2 space indent everywhere. From what I can see the rest of the stepfunction-tasks package uses 2-space indent. Follow the same consistency.

import { DataCaptureConfig, ProductionVariants } from './sagemaker-task-base-types';

/**
* @experimental
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add the following text and to all other modules to which this apply.

Suggested change
* @experimental
* @experimental - since the API may change when higher level constructs in the `aws-sagemaker` module is available

Comment on lines +12 to +17
/**
* The request accepts the following data in JSON format.
*
* @default - None
*/
readonly dataCaptureConfig?: DataCaptureConfig;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't see this as a supported parameter here - https://docs.aws.amazon.com/step-functions/latest/dg/connect-sagemaker.html

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@nija-at nija-at Mar 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not all parameters of the API is supported by step functions. You can only configure the ones that are explicitly mentioned in the stepfunctions document (that I linked above).

It seems to me that DataCaptureConfig is not supported.

/**
* The service integration pattern indicates different ways to call SageMaker APIs.
*
* The valid value is FIRE_AND_FORGE.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo

@@ -129,6 +129,51 @@
"stability": "experimental",
"awslint": {
"exclude": [
"props-default-doc:@aws-cdk/aws-stepfunctions-tasks.SagemakerUpdateEndpointTaskProps.retainAllVariantProperties",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please address all of these exclusions. The general guidance is no new exclusions should be added. It should be rare that we add one here.

Comment on lines +139 to +148
const policyStatements = [
new iam.PolicyStatement({
actions: ['sagemaker:CreateEndpointConfig'],
resources: ['*'],
}),
new iam.PolicyStatement({
actions: ['sagemaker:ListTags'],
resources: ['*']
}),
];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 but I'll also add - given that we know the name of the endpoint config at this point, we should be able to add that to resourceName and avoid *.

@nija-at
Copy link
Contributor

nija-at commented Mar 12, 2020

Thanks for your comments on here @Stacy-D. We appreciate all contributions that keep our CDK repo healthy and updated. 😊

This PR will introduce code repetition if merged before #6107 as it defines interfaces that should come from L2 sagemaker constructs

I wouldn't be too worried about that yet. It's not clear when PR #6107 would land.
We're intentionally marking the constructs here as experimental, so we can go and bring in the Sagemaker constructs into this library, when it's ready. This way we can provide CDK customers the ability to still use sagemaker via stepfunctions, if they so wish to, with the warning that it is subject to change in future versions.
Hope that makes sense!

@nija-at nija-at changed the title feat(stepfunctions-tasks): support for SageMaker createModel, createEndpointConfig, createEndpoint and updateEndpoint feat(stepfunctions-tasks): support for sagemaker tasks Mar 12, 2020
@Hari-Nagarajan
Copy link
Author

Hari-Nagarajan commented Mar 13, 2020

Thanks for submitting this PR. This is going to add much value to CDK customers when it's made available.

Given that this is a big one, I wanted to make you aware that this is going to take several iterations and several weeks for it to be ready to be merged.

I'll put out an iteration soon.

@aws aws deleted a comment from mergify bot May 19, 2020
@nija-at nija-at added the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label May 19, 2020
@github-actions
Copy link

This PR has not received a response in a while. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.

@github-actions github-actions bot added closing-soon This issue will automatically close in 4 days unless further comments are made. closed-for-staleness This issue was automatically closed because it hadn't received any attention in a while. and removed closing-soon This issue will automatically close in 4 days unless further comments are made. labels May 20, 2020
@github-actions github-actions bot closed this May 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
closed-for-staleness This issue was automatically closed because it hadn't received any attention in a while. response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Stepfunctions: support for SageMaker createModel, createEndpointConfig, createEndpoint and updateEndpoint
4 participants