
notebooks/deploy-mixtral.ipynb issue #8

Open · existme opened this issue Dec 16, 2023 · 6 comments

Comments

@existme commented Dec 16, 2023

This is not really an issue, but I couldn't find any other way to contact you. I was trying to follow your instructions on https://www.philschmid.de/sagemaker-deploy-mixtral and ended up in this repository.

I tried to follow the deployment instructions, but the deployment was not successful. I got the following error logs on the inference endpoint:

2023-12-15T20:06:10.216+01:00
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 161, in serve_inner
    model = get_model(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/__init__.py", line 310, in get_model
    return FlashMixtral(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash_mixtral.py", line 21, in __init__
    super(FlashMixtral, self).__init__(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash_mistral.py", line 318, in __init__
    SLIDING_WINDOW_BLOCKS = math.ceil(config.sliding_window / BLOCK_SIZE)
TypeError: unsupported operand type(s) for /: 'NoneType' and 'int'
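As far as I can tell, the failure is just TGI dividing Mixtral's missing sliding_window by the block size: Mixtral's config.json ships "sliding_window": null. A minimal sketch of what that line in flash_mistral.py ends up doing (the BLOCK_SIZE value here is my assumption):

import math

BLOCK_SIZE = 16        # assumption: TGI's paged-attention block size
sliding_window = None  # Mixtral's config.json sets sliding_window to null

# Effectively what flash_mistral.py line 318 executes:
SLIDING_WINDOW_BLOCKS = math.ceil(sliding_window / BLOCK_SIZE)
# TypeError: unsupported operand type(s) for /: 'NoneType' and 'int'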

The HF image that I ended up using was 763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-tgi-inference:2.1.1-tgi1.3.1-gpu-py310-cu121-ubuntu20.04

Looking into the TGI issues, I found this thread; the problem seems to be fixed by a commit mentioned there. But I don't know how I can get the latest 1.3.3 DLC image for a SageMaker deployment, because when I specify that version in image_uris.retrieve or in get_huggingface_llm_image_uri, it complains:

ValueError: Unsupported huggingface-llm version: 1.3.3. You may need to upgrade your SDK version (pip install -U sagemaker) for newer huggingface-llm versions. Supported huggingface-llm version(s): 0.6.0, 0.8.2, 0.9.3, 1.0.3, 1.1.0, 1.2.0, 1.3.1, 0.6, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3. 
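For reference, this is roughly the call that raises the error above (a minimal sketch; session and region details omitted):

from sagemaker.huggingface import get_huggingface_llm_image_uri

# Requesting the fixed TGI release fails because the installed
# sagemaker SDK only knows the versions listed in the error above.
image_uri = get_huggingface_llm_image_uri("huggingface", version="1.3.3")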

I don't know the procedure for getting the latest version into the AWS ECR registry, or how we can use a custom-built DLC image when deploying to SageMaker.
Can you help in any way, or explain how your deployment works?

Thanks in advance

@LvffY commented Dec 17, 2023

I just opened an issue on the SageMaker side, because I think it is the sagemaker SDK itself that is limiting the supported versions.

@existme commented Dec 17, 2023

Thank you for taking the time to create the issue 🙏 I hope it gets the needed attention.

@existme commented Dec 18, 2023

@LvffY, by the way, do you know any other way of deploying the model as an inference endpoint? I want to try the model on AWS, but so far I have found no way to do that.

@sdkramer10 commented
Thanks for adding the ticket! I am also blocked by this issue.

@LvffY commented Dec 18, 2023

> @LvffY, by the way, do you know any other way of deploying the model as an inference endpoint? I want to try the model on AWS, but so far I have found no way to do that.

@existme Not at this time.

@rhoentier commented
I ran into the same problem.

Hugging Face has released a newer version of the image, which is accessible via SageMaker: 763104351884.dkr.ecr.eu-central-1.amazonaws.com/huggingface-pytorch-tgi-inference:2.1.1-tgi1.3.3-gpu-py310-cu121-ubuntu20.04-v1.0
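A minimal sketch of pinning that image directly, bypassing the SDK's version lookup (untested; the role, region, model ID, GPU count, and instance type are assumptions to adjust):

import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()  # assumes a SageMaker execution context

# Pin the TGI 1.3.3 image explicitly instead of calling
# get_huggingface_llm_image_uri; adjust the region in the URI.
image_uri = (
    "763104351884.dkr.ecr.eu-central-1.amazonaws.com/"
    "huggingface-pytorch-tgi-inference:2.1.1-tgi1.3.3-gpu-py310-cu121-ubuntu20.04-v1.0"
)

model = HuggingFaceModel(
    image_uri=image_uri,
    role=role,
    env={
        "HF_MODEL_ID": "mistralai/Mixtral-8x7B-Instruct-v0.1",
        "SM_NUM_GPUS": "8",  # assumption: shard across 8 GPUs
    },
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.48xlarge",  # assumption: an 8-GPU instance
    container_startup_health_check_timeout=600,
)

With image_uri set explicitly, the SDK skips its huggingface-llm version table entirely, so the ValueError above never triggers.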
