Serving a model using a custom container, instance runs out of disk #112
Comments
cc @nskool
I believe exposing a few knobs for some of these settings, including storage for the host instances, would be helpful. Thanks @lxning for the offline discussions; it would be great if we could add this as a feature to the SageMaker SDK.
According to the SM hosting team, the SM SDK currently does not support configuring the storage size. The only available workaround is to change the instance type. Please refer to host-instance-storage-volumes-table.
@lxning this is a limiting factor, as it is easy to hit the limit, especially the 30 GB on GPU instances: some NVIDIA base images, as in this case, can reach 21 GB, and heavier workloads that chain multiple models can end up with a model artifact size that exceeds the limit.
Describe the bug
Using a custom container to serve a PyTorch model, defined as below, the endpoint throws "No space left on device".
The Docker image size is 17 GB and the TorchServe .mar file is 8 GB. I was wondering if there is any way to increase the storage for the instances serving the model. Going through the docs for endpoint configuration, there seems to be no setting for instance specifics.
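A quick back-of-the-envelope sketch of the disk budget described above, using the sizes from this report (17 GB image, 8 GB archive, 30 GB volume). The 2x extraction factor is an assumption: SageMaker downloads the model archive and unpacks it, so the compressed archive and its extracted contents can coexist on the volume.

```python
IMAGE_GB = 17         # custom container image (from the report)
MODEL_ARCHIVE_GB = 8  # TorchServe .mar / model.tar.gz (from the report)
VOLUME_GB = 30        # typical GPU-instance host storage (per the discussion above)

def fits(volume_gb, image_gb, archive_gb, extraction_factor=2.0):
    """Return True if the image plus the archive (and its extracted copy,
    approximated by extraction_factor) fit on the host volume."""
    needed_gb = image_gb + archive_gb * extraction_factor
    return needed_gb <= volume_gb

print(fits(VOLUME_GB, IMAGE_GB, MODEL_ARCHIVE_GB))  # False: 17 + 8*2 = 33 GB > 30 GB
```

Under these assumptions the workload needs roughly 33 GB, which explains why a 30 GB volume hits "No space left on device" even though the image and archive alone total only 25 GB.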
-- CloudWatch log
Expected behavior
Having knobs to set the storage size for the serving instances.
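A hypothetical sketch of the kind of knob being requested, expressed as a boto3 `create_endpoint_config` production variant. `VolumeSizeInGB` is part of the SageMaker `CreateEndpointConfig` API schema today, but per the discussion above it was not exposed when this issue was filed, and it only applies to instance types with EBS-backed storage (types with local NVMe disks ignore it). The model name, instance type, and volume size below are illustrative, not taken from the report.

```python
import json

# Hypothetical production variant showing where a storage knob would live.
variant = {
    "VariantName": "AllTraffic",
    "ModelName": "my-torchserve-model",  # hypothetical model name
    "InstanceType": "ml.p3.2xlarge",     # illustrative EBS-backed GPU type
    "InitialInstanceCount": 1,
    "VolumeSizeInGB": 64,                # the requested storage setting
}

# In a real deployment this dict would be passed to
# boto3.client("sagemaker").create_endpoint_config(..., ProductionVariants=[variant]).
print(json.dumps(variant, indent=2))
```

The point of the sketch is that the storage size would be configured per production variant, alongside the instance type, rather than being fixed by the instance type alone.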