feat(sagemaker): add Endpoint L2 construct
This is the third and final PR to complete the implementation of RFC 431:
aws/aws-cdk-rfcs#431

closes aws#2809

Co-authored-by: Matt McClean <mmcclean@amazon.com>
Co-authored-by: Long Yao <yl1984108@gmail.com>
Co-authored-by: Drew Jetter <60628154+jetterdj@users.noreply.github.com>
Co-authored-by: Murali Ganesh <59461079+foxpro24@users.noreply.github.com>
Co-authored-by: Abilash Rangoju <988529+rangoju@users.noreply.github.com>
6 people committed Nov 11, 2022
1 parent 0e97c15 commit a139468
Showing 21 changed files with 4,513 additions and 0 deletions.
57 changes: 57 additions & 0 deletions packages/@aws-cdk/aws-sagemaker/README.md
@@ -195,3 +195,60 @@ const endpointConfig = new sagemaker.EndpointConfig(this, 'EndpointConfig', {
]
});
```

### Endpoint

When you create an endpoint from an `EndpointConfig`, Amazon SageMaker launches the ML compute
instances and deploys the model or models as specified in the configuration. To get inferences from
the model, client applications send requests to the Amazon SageMaker Runtime HTTPS endpoint. For
more information, see the
[InvokeEndpoint](https://docs.aws.amazon.com/sagemaker/latest/dg/API_runtime_InvokeEndpoint.html)
API reference. Defining an endpoint requires, at minimum, the associated endpoint configuration:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker';

declare const endpointConfig: sagemaker.EndpointConfig;

const endpoint = new sagemaker.Endpoint(this, 'Endpoint', { endpointConfig });
```
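
As a rough illustration of where client requests go once the endpoint is in service, the sketch
below builds the InvokeEndpoint URL for a given region and endpoint name. The helper
`invokeEndpointUrl`, the region, and the endpoint name are hypothetical and not part of this
module; the sketch assumes the standard `aws` partition, and a real request must also be a
SigV4-signed `POST`, which the AWS SDK normally handles for you.

```typescript
// Hypothetical helper, not part of @aws-cdk/aws-sagemaker.
// Assumes the standard AWS partition (hosts ending in amazonaws.com).
function invokeEndpointUrl(region: string, endpointName: string): string {
  const host = `runtime.sagemaker.${region}.amazonaws.com`;
  // InvokeEndpoint is served at POST /endpoints/{EndpointName}/invocations.
  return `https://${host}/endpoints/${encodeURIComponent(endpointName)}/invocations`;
}

// Illustrative values only.
const exampleUrl = invokeEndpointUrl('us-east-1', 'my-endpoint');
console.log(exampleUrl);
```

In practice, most clients call InvokeEndpoint through an AWS SDK rather than constructing the URL by
hand.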

### AutoScaling

To enable autoscaling on the production variant, use the `autoScaleInstanceCount` method:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker';

declare const endpointConfig: sagemaker.EndpointConfig;

const endpoint = new sagemaker.Endpoint(this, 'Endpoint', { endpointConfig });
const productionVariant = endpoint.findInstanceProductionVariant('variantName');
const instanceCount = productionVariant.autoScaleInstanceCount({
  maxCapacity: 3
});
instanceCount.scaleOnInvocations('LimitRPS', {
  maxRequestsPerSecond: 30,
});
```

For guidance on load testing to determine the maximum requests per second that a single instance
can sustain, see the
[load testing documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/endpoint-scaling-loadtest.html).
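
The load testing guide linked above derives the per-minute `SageMakerVariantInvocationsPerInstance`
target as `MAX_RPS * SAFETY_FACTOR * 60`, with a suggested safety factor of 0.5. The sketch below
works through that arithmetic for the `maxRequestsPerSecond: 30` value used in the example. The
function name `invocationsPerInstanceTarget` is hypothetical, and whether the construct applies
exactly this formula and default internally is an assumption here, not something this README states.

```typescript
// Hypothetical helper illustrating the formula from the load testing guide.
// Not part of @aws-cdk/aws-sagemaker; the 0.5 default is the guide's suggested safety factor.
function invocationsPerInstanceTarget(maxRequestsPerSecond: number, safetyFactor: number = 0.5): number {
  if (maxRequestsPerSecond <= 0) {
    throw new RangeError('maxRequestsPerSecond must be positive');
  }
  if (safetyFactor <= 0 || safetyFactor > 1) {
    throw new RangeError('safetyFactor must be in the range (0, 1]');
  }
  // The metric is tracked per minute, hence the factor of 60.
  return maxRequestsPerSecond * safetyFactor * 60;
}

// 30 requests/second with a 0.5 safety factor gives a per-minute target of 900.
const scalingTarget = invocationsPerInstanceTarget(30);
console.log(scalingTarget);
```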

### Metrics

To monitor CloudWatch metrics for a production variant, use one or more of the metric convenience
methods:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker';

declare const endpointConfig: sagemaker.EndpointConfig;

const endpoint = new sagemaker.Endpoint(this, 'Endpoint', { endpointConfig });
const productionVariant = endpoint.findInstanceProductionVariant('variantName');
productionVariant.metricModelLatency().createAlarm(this, 'ModelLatencyAlarm', {
  threshold: 100000,
  evaluationPeriods: 3,
});
```
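
CloudWatch reports the SageMaker `ModelLatency` metric in microseconds, so the `threshold: 100000`
in the alarm above corresponds to 100 milliseconds. A minimal sketch of that conversion, using a
hypothetical helper `millisToMicros` that is not part of this module:

```typescript
// Hypothetical helper, not part of @aws-cdk/aws-sagemaker.
const MICROSECONDS_PER_MILLISECOND = 1000;

function millisToMicros(milliseconds: number): number {
  return milliseconds * MICROSECONDS_PER_MILLISECOND;
}

// A 100 ms latency budget expressed in the metric's unit (microseconds).
const latencyThreshold = millisToMicros(100);
console.log(latencyThreshold);
```

Expressing the threshold in milliseconds and converting at the call site can make the alarm
definition easier to review.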