Describe the bug
When using a custom container to serve a PyTorch model, defined as below, the endpoint fails with "No space left on device":
import boto3

# sm is the SageMaker client; image, model_artifact, model_name, role,
# and endpoint_config_name are defined earlier in my setup.
sm = boto3.client("sagemaker")

container = {"Image": image, "ModelDataUrl": model_artifact}

# Register the model with the custom serving container and model artifact.
create_model_response = sm.create_model(
    ModelName=model_name, ExecutionRoleArn=role, PrimaryContainer=container
)

# Configure a single-instance endpoint on ml.g4dn.8xlarge.
create_endpoint_config_response = sm.create_endpoint_config(
    EndpointConfigName=endpoint_config_name,
    ProductionVariants=[
        {
            "InstanceType": "ml.g4dn.8xlarge",
            "InitialVariantWeight": 1,
            "InitialInstanceCount": 1,
            "ModelName": model_name,
            "VariantName": "AllTraffic",
        }
    ],
)
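For completeness, the endpoint is then created from this configuration. The call below is a standard boto3 follow-up rather than part of the snippet above, and endpoint_name is assumed to be defined elsewhere.

# Standard follow-up step (assumed, not shown in the original snippet):
# create the endpoint from the configuration above.
create_endpoint_response = sm.create_endpoint(
    EndpointName=endpoint_name, EndpointConfigName=endpoint_config_name
)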
The Docker image is 17 GB and the TorchServe .mar file is 8 GB. I was wondering if there is any way to increase the storage available to the instances serving the model. Going through the documentation for the endpoint configuration, there seems to be no setting for instance storage.
-- CloudWatch log
Expected behavior
A knob to set the storage volume size for the serving instances, for example something like the sketch below.
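As a point of reference, here is a minimal sketch of what such a knob could look like, assuming the endpoint config accepts a VolumeSizeInGB field on the production variant. The CreateEndpointConfig API does expose such a field, but whether it is honored depends on the instance type; instance families with local NVMe storage such as g4dn may not support it, and the 64 GB value is purely illustrative.

# Sketch only: request a larger ML storage volume per serving instance.
# VolumeSizeInGB support depends on the instance type; 64 is an illustrative value.
create_endpoint_config_response = sm.create_endpoint_config(
    EndpointConfigName=endpoint_config_name,
    ProductionVariants=[
        {
            "InstanceType": "ml.g4dn.8xlarge",
            "InitialVariantWeight": 1,
            "InitialInstanceCount": 1,
            "ModelName": model_name,
            "VariantName": "AllTraffic",
            "VolumeSizeInGB": 64,
        }
    ],
)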