Describe the bug
When using a custom container to serve a PyTorch model, defined as below, the endpoint fails with "No space left on device":
import boto3

# sm is the SageMaker client; image, model_artifact, model_name, role,
# and endpoint_config_name are defined earlier in my setup.
sm = boto3.client("sagemaker")

container = {"Image": image, "ModelDataUrl": model_artifact}

# Register the model with the custom serving container and model artifact.
create_model_response = sm.create_model(
    ModelName=model_name, ExecutionRoleArn=role, PrimaryContainer=container
)

# Configure a single-instance endpoint on ml.g4dn.8xlarge.
create_endpoint_config_response = sm.create_endpoint_config(
    EndpointConfigName=endpoint_config_name,
    ProductionVariants=[
        {
            "InstanceType": "ml.g4dn.8xlarge",
            "InitialVariantWeight": 1,
            "InitialInstanceCount": 1,
            "ModelName": model_name,
            "VariantName": "AllTraffic",
        }
    ],
)
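For completeness, the endpoint is then created from this configuration. The call below is a standard boto3 follow-up rather than part of the snippet above, and endpoint_name is assumed to be defined elsewhere.

# Standard follow-up step (assumed, not shown in the original snippet):
# create the endpoint from the configuration above.
create_endpoint_response = sm.create_endpoint(
    EndpointName=endpoint_name, EndpointConfigName=endpoint_config_name
)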
The Docker image is 17 GB and the TorchServe .mar file is 8 GB. I was wondering if there is any way to increase the storage available to the instances serving the model. Going through the documentation for the endpoint configuration, there seems to be no setting for instance storage.
-- CloudWatch log
Expected behavior
A knob to set the storage volume size for the serving instances, for example something like the sketch below.
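As a point of reference, here is a minimal sketch of what such a knob could look like, assuming the endpoint config accepts a VolumeSizeInGB field on the production variant. The CreateEndpointConfig API does expose such a field, but whether it is honored depends on the instance type; instance families with local NVMe storage such as g4dn may not support it, and the 64 GB value is purely illustrative.

# Sketch only: request a larger ML storage volume per serving instance.
# VolumeSizeInGB support depends on the instance type; 64 is an illustrative value.
create_endpoint_config_response = sm.create_endpoint_config(
    EndpointConfigName=endpoint_config_name,
    ProductionVariants=[
        {
            "InstanceType": "ml.g4dn.8xlarge",
            "InitialVariantWeight": 1,
            "InitialInstanceCount": 1,
            "ModelName": model_name,
            "VariantName": "AllTraffic",
            "VolumeSizeInGB": 64,
        }
    ],
)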