Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(sagemaker): Initial version of the L2 construct for SageMaker #2888

Closed
wants to merge 11 commits into from
Closed
74 changes: 73 additions & 1 deletion packages/@aws-cdk/aws-sagemaker/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,78 @@
---
<!--END STABILITY BANNER-->

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually like to start the ReadMe with a short blurb about the service itself (usually copied from AWS documentation), and then installation and import instructions. See CodeBuild for an example.

So, this should be something like:


Amazon SageMaker provides every developer and data scientist with the ability to build, train, and deploy machine learning models quickly. Amazon SageMaker is a fully-managed service that covers the entire machine learning workflow to label and prepare your data, choose an algorithm, train the model, tune and optimize it for deployment, make predictions, and take action. Your models get to production faster with much less effort and lower cost.

Installation

Install the module:

$ npm i @aws-cdk/aws-sagemaker

Import it into your code:

import sagemaker = require('@aws-cdk/aws-sagemaker');

### Notebook instance

Define a notebook instance.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Colon :)


```ts
new NotebookInstance(this, 'MyNotebook');
```

Add a KMS encryption key, launch in own VPC, set the instance type to 'ml.p3.xlarge', disable direct internet access and set the EBS volume size to 100 GB.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sentence should probably should end with a colon instead of a period.


```ts
const vpc = new ec2.Vpc(this, "Vpc");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Single quote string? The other ones are single quote, so we should probably try to keep them consistent.

const key = new kms.Key(this, 'Key');
new NotebookInstance(this, 'MyNotebook', {
vpc,
kmsKeyId: key,
instanceType = new ec2.InstanceType('p3.2xlarge'),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use the InstanceType.of method here instead? It's a little "nicer" I think :)

enableDirectInternetAccess: false,
volumeSizeInGB: 100,
});
```

Add custom scripts when notebook instance is created and started.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here (period -> colon).


```ts
const notebook =new NotebookInstance(this, 'MyNotebook');
notebook.addOnCreateScript('echo "Creating Notebook"');
notebook.addOnStartScript('echo "Starting Notebook"');
```

Add a security group to the notebook instance.

```ts
const vpc = new ec2.Vpc(this, "Vpc");
const sg = new ec2.SecurityGroup(stack, 'SecurityGroup', { vpc });
const notebook =new NotebookInstance(this, 'MyNotebook', {
vpc
});
notebook.addSecurityGroup(sg);
```

### Models

Define a model.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Colon :)


```ts
new Model(this, 'MyModel', {
primaryContainer: new GenericContainerDefinition( { 'us-west-2': '123456789012.dkr.ecr.us-west-2.amazonaws.com/mymodel:latest' })
});
```

Define a model with a container inference pipeline.

```ts
const model = new Model(this, 'MyModel');
model.addContainer(new GenericContainerDefinition({ 'us-west-2': '123456789012.dkr.ecr.us-west-2.amazonaws.com/mymodel1:latest' }));
model.addContainer(new GenericContainerDefinition({ 'us-west-2': '123456789012.dkr.ecr.us-west-2.amazonaws.com/mymodel2:latest' }));
```

### Endpoonts
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo ("endpoonts").


Define an endpoint.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I'll just say "colons everywhere" :p


```ts
const sagemaker = require('@aws-cdk/aws-sagemaker');
const model = new Model(this, 'MyModel', {
primaryContainer: new GenericContainerDefinition( { 'us-west-2': '123456789012.dkr.ecr.us-west-2.amazonaws.com/mymodel:latest' })
});
const endpooint = new Endpoint(stack, 'MyEndpoint', {
productionVariants: [
{
model,
}
],
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General comment for all the code samples in this file: we use 2 spaces for code indentation, and so the samples should also be intended with 2 spaces.

```
228 changes: 228 additions & 0 deletions packages/@aws-cdk/aws-sagemaker/lib/endpoint.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,228 @@
import ec2 = require('@aws-cdk/aws-ec2');
import kms = require('@aws-cdk/aws-kms');
import { Construct, Lazy, Resource, Tag } from '@aws-cdk/core';
import { IModel } from './model';
import { CfnEndpoint, CfnEndpointConfig } from './sagemaker.generated';

/**
* Name tag constant
*/
const NAME_TAG: string = 'Name';

/**
* Construction properties for a SageMaker Endpoint.
*
* @experimental
*/
export interface EndpointProps {

/**
* Name of the endpoint.
*
* @default none
*/
readonly endpointName?: string;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again - this is a physical name, docs should reflect that.


/**
* Name of the endpoint configuration.
*
* @default same as the endpoint name with 'Config' appended.
*/
readonly configName?: string;

/**
* Optional KMS encryption key associated with this stream.
*
* @default none
*/
readonly kmsKey?: kms.IKey;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be called encryptionKey.


/**
* List of production variants.
*
* @default none
*/
readonly productionVariants?: ProductionVariant[];

}

/**
* Construction properties for the Production Variant.
*
* @experimental
*/
export interface ProductionVariant {

/**
* Optional accelerateor type.
*
* @default none
*/
readonly acceleratorType?: AcceleratorType;

/**
* Initial number of instances to be deployed.
*
* @default 1
*/
readonly initialInstanceCount?: number;

/**
* Inital weight of the production variant.
*
* @default 100
*/
readonly initialVariantWeight?: number;

/**
* Instance type of the production variant.
*
* @default ml.c4.xlarge instance type.
*/
readonly instanceType?: ec2.InstanceType;

/**
* The model aligned to the production variant.
*
*/
readonly model: IModel;

/**
* Name of the production variant.
*
* @default same name as the model name
*/
readonly variantName?: string;
}

export enum AcceleratorType {

LARGE = 'ml.eia1.large ',

MEDIUM = 'ml.eia1.medium',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrong order? Looks like Medium should be the first one.


XLARGE = 'ml.eia1.xlarge',
}

/**
* Defines a SageMaker Endpoint and associated configuration resource.
*
* @experimental
*/
export class Endpoint extends Resource {

/**
* The name of the endpoint.
*
* @attribute
*/
public readonly endpointName: string;

/**
* Name of the endpoint configuration.
*
* @attribute
*/
public readonly endpointConfigName: string;

private readonly productionVariants: ProductionVariant[] = [];

constructor(scope: Construct, id: string, props: EndpointProps= {}) {
super(scope, id);

// check that the production variants are defined
if (props.productionVariants && props.productionVariants.length > 0) {
this.productionVariants = props.productionVariants;
}

// apply a name tag to the endpoint resource
this.node.applyAspect(new Tag(NAME_TAG, this.node.path));

// create the endpoint configuration resource
const endpointConfig = new CfnEndpointConfig(this, 'EndpointConfig', {
kmsKeyId: (props.kmsKey) ? props.kmsKey.keyArn : undefined,
endpointConfigName: (props.configName) ? props.configName : (props.endpointName) ? props.endpointName + "Config" : undefined,
productionVariants: Lazy.anyValue({ produce: () => this.renderProductionVariants(this.productionVariants) })
});
this.endpointConfigName = endpointConfig.attrEndpointConfigName;

// create the endpoint resource
const endpoint = new CfnEndpoint(this, 'Endpoint', {
endpointConfigName: this.endpointConfigName,
endpointName: (props.endpointName) ? props.endpointName : undefined,
});
this.endpointName = endpoint.attrEndpointName;
}

/**
* Add production variant to the endpoint configution.
*
* @param productionVariant: The production variant to add.
*/
public addProductionVariant(productionVariant: ProductionVariant): void {
this.productionVariants.push(productionVariant);
}

protected validate(): string[] {
const result = super.validate();
// check we have at least one production variant
if (this.productionVariants.length === 0) {
result.push("Must have at least one Production Variant");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So here's an interesting pattern that you can use here.

Since at least one ProductionVariant is required, have in props:

productionVariant: ProductionVariant; // required
extraProductionVariants?: ProductionVariant[]; // optional

This way, you communicate to the clients of this class, at compile time, that at least one variant has to be provided.

}

// check instance count is greater than zero
this.productionVariants.forEach(v => {
if (v.initialInstanceCount && v.initialInstanceCount < 1) {
result.push("Must have at least one instance");
}
});

// check variant weight is not negative
this.productionVariants.forEach(v => {
if (v.initialVariantWeight && v.initialVariantWeight < 0) {
result.push("Cannot have negative variant weight");
}
});

// validate the instance type
this.productionVariants.forEach(v => {
if (v.instanceType) {
this.validateInstanceType(v.instanceType.toString(), result);
}
});
return result;
}

/**
* Render the list of production variants. Set default values if not defined.
* @param productionVariants the variants to render
*/
private renderProductionVariants(productionVariants: ProductionVariant[]): CfnEndpointConfig.ProductionVariantProperty[] {
return productionVariants.map(v => (
{
// get the initial instance count. Set to 1 if not defined
initialInstanceCount: (v.initialInstanceCount) ? v.initialInstanceCount : 1,
// get the initial variant weight. Set to 100 if not defined
initialVariantWeight: (v.initialVariantWeight) ? v.initialVariantWeight : 100,
// get the instance type. Set to 'ml.c4.xlarge' if not defined
instanceType: 'ml.' + ((v.instanceType) ? (v.instanceType.toString()) :
ec2.InstanceType.of(ec2.InstanceClass.C4, ec2.InstanceSize.XLARGE)),
modelName: v.model.modelName,
// get the variant name. Set to the model name if not defined
variantName: (v.variantName) ? v.variantName : v.model.modelName,
}
));
}

/**
* Validates the provided instance type.
* @param instanceType the instance type of the SageMaker endpoint production variant.
*/
private validateInstanceType(instanceType: string, result: string[]) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just have this return string[], and merge the errors in the client of this method, instead of mutating a parameter. I think keeping methods pure is simply easier.

// check if a valid sagemaker instance type
if (!['c4', 'c5', 'm4', 'm5', 'p2', 'p3', 't2'].some(instanceClass => instanceType.indexOf(instanceClass) >= 0)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is exactly the same as in Notebook - extract to a module-private function (and include the allowed types in the error message).

result.push(`Invalid instance type for a Sagemaker Endpoint Production Variant: ${instanceType}`);
}
}

}
4 changes: 4 additions & 0 deletions packages/@aws-cdk/aws-sagemaker/lib/index.ts
Original file line number Diff line number Diff line change
@@ -1,2 +1,6 @@
export * from './endpoint';
export * from './model';
export * from './notebook-instance';

// AWS::SageMaker CloudFormation Resources:
export * from './sagemaker.generated';
Loading