Amazon Sagemaker Training, Deployment, and Inference API's #28
## Sagemaker Image and Hyperparameters below are required fields.
## Common parameters and image info can be found at https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-algo-docker-registry-paths.html
sagemaker_image = "433757028032.dkr.ecr.us-west-2.amazonaws.com/xgboost:latest"
This is a legacy XGBoost image (0.72) that is not synced with upstream open-source XGBoost. I suggest using 246618743249.dkr.ecr.us-west-2.amazonaws.com/sagemaker-xgboost:0.90-2-cpu-py3. This image is based on open-source XGBoost and is pinned to v0.90 of XGBoost, with the -2 indicating a SageMaker version. For a given version, backwards compatibility is guaranteed, so it is better to use that instead of :latest.
@queueburt Thanks for this great proposal! Do you have any news on the topic?
Closing this PR in favor of native support for hosting models on Sagemaker with Metaflow.
Pull request for an initial implementation of streamlined APIs for Amazon Sagemaker. It includes three functions: "fit", "deploy", and "predict". The primary algorithms this was built around were XGBoost and Linear Learner, but it should be compatible with any built-in algorithm that accepts 'text/csv' as a content type.
Two additional environment variables are required to run these flows.
Usage consists of
from metaflow import Sagemaker
Sagemaker.fit(data, image, hyperparameters, stopping_condition, resource_config)
Returns a string object with the S3 URI of the model artifact generated by the fit.
data is a dictionary with keys that reference Sagemaker "channel names" found here, and values that consist of CSV data with no headers or indexes.
image is a string consisting of a Sagemaker built-in algorithm container registry path, also found here. Automatic mapping will be coming soon.
hyperparameters is a dictionary with hyperparameters for the specific algorithm referenced by image. An example for XGBoost can be found here.
stopping_condition and resource_config are optional dictionaries for overriding the defaults: a single ml.m4.xlarge training instance with a 5 GB volume and a 1-hour maximum runtime. Syntax for those overrides can be found here.
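As a rough illustration of how the optional overrides could combine with the defaults listed above, here is a minimal sketch. It assumes the dictionaries use the same key names as the boto3 `create_training_job` request (`ResourceConfig` and `StoppingCondition`); the helper `resolve_training_settings` is hypothetical and is not taken from this PR.

```python
# Defaults as described in the PR text. The key names follow the boto3
# create_training_job request and are an assumption about this PR's internals.
DEFAULT_RESOURCE_CONFIG = {
    "InstanceType": "ml.m4.xlarge",
    "InstanceCount": 1,
    "VolumeSizeInGB": 5,
}
DEFAULT_STOPPING_CONDITION = {"MaxRuntimeInSeconds": 3600}


def resolve_training_settings(resource_config=None, stopping_condition=None):
    """Hypothetical helper: overlay caller overrides on the documented defaults."""
    resources = {**DEFAULT_RESOURCE_CONFIG, **(resource_config or {})}
    stopping = {**DEFAULT_STOPPING_CONDITION, **(stopping_condition or {})}
    return resources, stopping


# Override the instance type, volume size, and runtime; keep the instance count.
resources, stopping = resolve_training_settings(
    resource_config={"InstanceType": "ml.m5.2xlarge", "VolumeSizeInGB": 20},
    stopping_condition={"MaxRuntimeInSeconds": 2 * 3600},
)
print(resources)
print(stopping)
```

Merging per key, rather than replacing the whole dictionary, lets a caller change only the instance type without restating the volume size.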
Sagemaker.deploy(model_uri, image, instanceType, instanceCount, instanceWeight, variantName)
Returns a string object with the endpoint name generated by the model deployment.
model_uri is a string object with the S3 path for the model. This string is returned by Sagemaker.fit.
image is a string object. It should be the same image used for training.
instanceType, instanceCount, instanceWeight, and variantName are all optional parameters for overriding the defaults of, respectively, "ml.m4.xlarge", 1, 1, and "AllTraffic".
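The optional deploy parameters line up naturally with one entry of the `ProductionVariants` list accepted by boto3's `create_endpoint_config`. The sketch below shows that mapping under the assumption that the PR passes them through this way; `build_production_variant` and the `model_name` argument are hypothetical names used only for illustration.

```python
def build_production_variant(
    model_name,
    instanceType="ml.m4.xlarge",
    instanceCount=1,
    instanceWeight=1,
    variantName="AllTraffic",
):
    """Hypothetical helper: map the deploy() options onto one ProductionVariants
    entry, using the field names of the boto3 create_endpoint_config request."""
    return {
        "VariantName": variantName,
        "ModelName": model_name,
        "InstanceType": instanceType,
        "InitialInstanceCount": instanceCount,
        "InitialVariantWeight": instanceWeight,
    }


# With no overrides, the documented defaults are used.
default_variant = build_production_variant("my-xgboost-model")
# Overriding only the instance count leaves the other defaults in place.
scaled_variant = build_production_variant("my-xgboost-model", instanceCount=2)
print(default_variant)
print(scaled_variant)
```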
Sagemaker.predict(data, endpoint_name)
Returns a list of predictions.
data is a CSV object with no headers or indexes representing the features for inference.
endpoint_name is a string object with the Sagemaker endpoint to be inferred against. This value is returned by Sagemaker.deploy.
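To make the "CSV object with no headers or indexes" input for Sagemaker.predict concrete, the following sketch serializes feature rows to header-less CSV and parses a delimited response body into a list of floats. Both helper names are hypothetical, and the response format (comma- or newline-separated numbers) is an assumption about what a 'text/csv' endpoint returns, not something confirmed by this PR.

```python
import csv
import io


def rows_to_csv(rows):
    """Hypothetical helper: serialize feature rows as CSV with no header or index."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def parse_predictions(body):
    """Hypothetical helper: turn a comma- or newline-separated body into floats.
    The response layout is an assumption, not taken from the PR."""
    text = body.decode("utf-8") if isinstance(body, bytes) else body
    return [float(v) for v in text.replace("\n", ",").split(",") if v.strip()]


payload = rows_to_csv([[5.1, 3.5], [6.2, 2.9]])
predictions = parse_predictions(b"0.12,0.87\n")
print(repr(payload))
print(predictions)
```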
A short example of the usage can be found here. This PR also brings in an 08-sagemaker tutorial for use with metaflow tutorials pull that demonstrates the above sample flow.