-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(sagemaker): Initial version of the L2 construct for SageMaker #2888
Conversation
Sorry for leaving this PR out in the cold @mattmcclean . Unfortunately, you submitted it right before our mad dash to release Now that that's over, I'll be taking over the PR. Expect a review soon. Thanks, and apologies again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an extremely high-quality contribution @mattmcclean , thank you so much!
I have some comments, but most of them are from conventions in the Construct Library that we follow, but maybe haven't documented (and are not checking with our linter) well enough.
One big change though: we use 2 spaces for indentation in our code. I understand 4 is the default in most tools, but could you please change it to 2, and re-format the code?
Thanks again for the awesome contribution!
@@ -15,6 +15,78 @@ | |||
--- | |||
<!--END STABILITY BANNER--> | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We usually like to start the ReadMe with a short blurb about the service itself (usually copied from AWS documentation), and then installation and import instructions. See CodeBuild for an example.
So, this should be something like:
Amazon SageMaker provides every developer and data scientist with the ability to build, train, and deploy machine learning models quickly. Amazon SageMaker is a fully-managed service that covers the entire machine learning workflow to label and prepare your data, choose an algorithm, train the model, tune and optimize it for deployment, make predictions, and take action. Your models get to production faster with much less effort and lower cost.
Installation
Install the module:
$ npm i @aws-cdk/aws-sagemaker
Import it into your code:
import sagemaker = require('@aws-cdk/aws-sagemaker');
new NotebookInstance(this, 'MyNotebook', { | ||
vpc, | ||
kmsKeyId: key, | ||
instanceType = new ec2.InstanceType('p3.2xlarge'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use the InstanceType.of
method here instead? It's a little "nicer" I think :)
new NotebookInstance(this, 'MyNotebook'); | ||
``` | ||
|
||
Add a KMS encryption key, launch in own VPC, set the instance type to 'ml.p3.xlarge', disable direct internet access and set the EBS volume size to 100 GB. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sentence should probably should end with a colon instead of a period.
}); | ||
``` | ||
|
||
Add custom scripts when notebook instance is created and started. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here (period -> colon).
|
||
### Models | ||
|
||
Define a model. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Colon :)
const result = super.validate(); | ||
// check we have at least one production variant | ||
if (this.productionVariants.length === 0) { | ||
result.push("Must have at least one Production Variant"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So here's an interesting pattern that you can use here.
Since at least one ProductionVariant is required, have in props:
productionVariant: ProductionVariant; // required
extraProductionVariants?: ProductionVariant[]; // optional
This way, you communicate to the clients of this class, at compile time, that at least one variant has to be provided.
* Validates the provided instance type. | ||
* @param instanceType the instance type of the SageMaker endpoint production variant. | ||
*/ | ||
private validateInstanceType(instanceType: string, result: string[]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just have this return string[]
, and merge the errors in the client of this method, instead of mutating a parameter. I think keeping methods pure is simply easier.
*/ | ||
private validateInstanceType(instanceType: string, result: string[]) { | ||
// check if a valid sagemaker instance type | ||
if (!['c4', 'c5', 'm4', 'm5', 'p2', 'p3', 't2'].some(instanceClass => instanceType.indexOf(instanceClass) >= 0)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is exactly the same as in Notebook - extract to a module-private function (and include the allowed types in the error message).
}); | ||
|
||
// THEN | ||
expect(stack).to(haveResource("AWS::SageMaker::EndpointConfig", { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
General comment for all tests: we have also a haveResourceLike
matcher that allows you to check only the parts of the resource that you're interested in in a given test. Might come in handy for these.
|
||
test.done(); | ||
}, | ||
"it throws error when incorrect variant weight given"(test: Test) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please always have an empty line between tests on the same level of indentation.
test.equal(errors.length, 1); | ||
const error = errors[0]; | ||
test.same(error.source, endpoint); | ||
test.equal(error.message, "Must have at least one Production Variant"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can simplify this with something like:
test.deepEqual(errors.map(e => e.message), ["Must have at least one Production Variant"]);
Same comment for all tests dealing with error checking :)
Thanks so much for taking the time to contribute to the AWS CDK ❤️ We will shortly assign someone to review this pull request and help get it
|
6 similar comments
Thanks so much for taking the time to contribute to the AWS CDK ❤️ We will shortly assign someone to review this pull request and help get it
|
Thanks so much for taking the time to contribute to the AWS CDK ❤️ We will shortly assign someone to review this pull request and help get it
|
Thanks so much for taking the time to contribute to the AWS CDK ❤️ We will shortly assign someone to review this pull request and help get it
|
Thanks so much for taking the time to contribute to the AWS CDK ❤️ We will shortly assign someone to review this pull request and help get it
|
Thanks so much for taking the time to contribute to the AWS CDK ❤️ We will shortly assign someone to review this pull request and help get it
|
Thanks so much for taking the time to contribute to the AWS CDK ❤️ We will shortly assign someone to review this pull request and help get it
|
Based closely on PR aws#2888, this commit introduces the Endpoint L2 construct and the L2 constructs on which it depends, including Model and EndpointConfig. Departures from PR aws#2888 include: - EndpointConfig definition moved into its own L2 construct as one EndpointConfig resource may be shared by multiple Endpoints. - An Endpoint-specific IEndpointProductionVariant interface was added to support "metric*" and "autoScale*" APIs per endpoint-variant combination. - The Notebook construct was excluded from this new commit to limit changes to model hosting use-cases only. - Feedback on the earlier PR has been incorporated into this new commit. fixes aws#2809 Co-authored-by: Matt McClean <mmcclean@amazon.com> Co-authored-by: Yao <longyao@amazon.com> Co-authored-by: Drew Jetter <60628154+jetterdj@users.noreply.github.com> Co-authored-by: Murali Ganesh <59461079+foxpro24@users.noreply.github.com> Co-authored-by: Abilash Rangoju <988529+rangoju@users.noreply.github.com>
Closing in favor of #6107. |
I have written up the initial L2 Construct for SageMaker. It is quite basic and there can be further enhancements made over time such as adding support for Autoscaling of the endpoint.
It creates 3 L2 resources including:
Fixes #2809
Pull Request Checklist
design
folderBy submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license.