Describe the bug
sagemaker-inference recently (10/15) released v1.5.3, which included this commit renaming the model server artifact and command from mxnet-model-server to multi-model-server.
All containers defined in this repository install sagemaker-inference as a dependency of this repo itself, via lines like
RUN pip install --no-cache-dir "sagemaker-pytorch-inference<2"
and this repo's setup.py has an install_requires that includes sagemaker-inference>=1.3.1. As a result, sagemaker-inference==1.5.3 gets installed.
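A possible interim workaround (my own suggestion, not a fix shipped by this repo) is to pin sagemaker-inference below the renaming release when building the image, so the installed package still invokes the mxnet-model-server command that exists in the container:

```dockerfile
# Hypothetical workaround: pin sagemaker-inference below 1.5.3 so the
# toolkit keeps calling the old mxnet-model-server command. The extra
# pin must run after the toolkit install so it wins the resolution.
RUN pip install --no-cache-dir "sagemaker-pytorch-inference<2" \
    && pip install --no-cache-dir "sagemaker-inference<1.5.3"
```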
So while the Dockerfile's CMD value (which calls mxnet-model-server directly) will succeed, attempts to use the ENTRYPOINT with serve as a run argument will fail with this message:
Traceback (most recent call last):
File "/usr/local/bin/dockerd-entrypoint.py", line 22, in <module>
serving.main()
File "/opt/conda/lib/python3.6/site-packages/sagemaker_pytorch_serving_container/serving.py", line 39, in main
_start_model_server()
File "/opt/conda/lib/python3.6/site-packages/retrying.py", line 49, in wrapped_f
return Retrying(*dargs, **dkw).call(f, *args, **kw)
File "/opt/conda/lib/python3.6/site-packages/retrying.py", line 206, in call
return attempt.get(self._wrap_exception)
File "/opt/conda/lib/python3.6/site-packages/retrying.py", line 247, in get
six.reraise(self.value[0], self.value[1], self.value[2])
File "/opt/conda/lib/python3.6/site-packages/six.py", line 703, in reraise
raise value
File "/opt/conda/lib/python3.6/site-packages/retrying.py", line 200, in call
attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
File "/opt/conda/lib/python3.6/site-packages/sagemaker_pytorch_serving_container/serving.py", line 35, in _start_model_server
model_server.start_model_server(handler_service=HANDLER_SERVICE)
File "/opt/conda/lib/python3.6/site-packages/sagemaker_inference/model_server.py", line 94, in start_model_server
subprocess.Popen(multi_model_server_cmd)
File "/opt/conda/lib/python3.6/subprocess.py", line 709, in __init__
restore_signals, start_new_session)
File "/opt/conda/lib/python3.6/subprocess.py", line 1344, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'multi-model-server': 'multi-model-server'
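The last two frames show the root cause: subprocess.Popen raises FileNotFoundError when the command it is given is not on PATH. A minimal, hypothetical check (not part of the toolkit; the function name is mine) that reports which model-server command a container can actually invoke:

```python
import shutil


def available_model_server():
    """Return the first model-server command found on PATH, or None.

    sagemaker-inference < 1.5.3 shells out to 'mxnet-model-server';
    >= 1.5.3 shells out to 'multi-model-server'. When the installed
    package and the installed binary disagree, Popen raises
    FileNotFoundError (errno 2), as in the traceback above.
    """
    for name in ("multi-model-server", "mxnet-model-server"):
        if shutil.which(name):
            return name
    return None


print(available_model_server())
```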
To reproduce
Build any container.
Mount a model and inference.py (e.g. half_plus_three) into /opt/ml/model.
docker run [tag name] serve
Expected behavior
The model server serves the mounted model / inference.py.
System information
A description of your system. Please provide:
Toolkit version: 2.0.5, but should apply to all versions
Framework version: 1.4, but should apply to all versions
Python version: 3.7
CPU or GPU: cpu, but should apply to both
Custom Docker image (Y/N): N
Hi @RZachLamberty, I stumbled upon your issue here. I was trying to create a custom docker image and had a similar issue. Installing multi-model-server (pip install multi-model-server) did away with this issue. You can give it a try :)
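The commenter's suggestion can be folded into a custom image; a minimal sketch, assuming you extend one of this repo's images (the base image tag below is a placeholder, not a real tag):

```dockerfile
# Hypothetical custom image applying the suggested fix: install the
# multi-model-server package so the renamed command exists on PATH.
FROM your-pytorch-serving-base:latest
RUN pip install --no-cache-dir multi-model-server
```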