
notebooks/deploy-mixtral.ipynb issue #8

Open · existme opened this issue Dec 16, 2023 · 6 comments

Comments

@existme commented Dec 16, 2023

This is not really an issue, but I couldn't find any other way to contact you. I was trying to follow your instructions on https://www.philschmid.de/sagemaker-deploy-mixtral and ended up in this repository.

I tried to follow the deployment instructions, but the deployment was not successful. I got the following error logs on the inference endpoint:

2023-12-15T20:06:10.216+01:00
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 161, in serve_inner
    model = get_model(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/__init__.py", line 310, in get_model
    return FlashMixtral(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash_mixtral.py", line 21, in __init__
    super(FlashMixtral, self).__init__(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash_mistral.py", line 318, in __init__
    SLIDING_WINDOW_BLOCKS = math.ceil(config.sliding_window / BLOCK_SIZE)
TypeError: unsupported operand type(s) for /: 'NoneType' and 'int'
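As far as I can tell, the failure is just TGI dividing Mixtral's missing sliding_window by the block size: Mixtral's config.json ships "sliding_window": null. A minimal sketch of what that line in flash_mistral.py ends up doing (the BLOCK_SIZE value here is my assumption):

import math

BLOCK_SIZE = 16        # assumption: TGI's paged-attention block size
sliding_window = None  # Mixtral's config.json sets sliding_window to null

# Effectively what flash_mistral.py line 318 executes:
SLIDING_WINDOW_BLOCKS = math.ceil(sliding_window / BLOCK_SIZE)
# TypeError: unsupported operand type(s) for /: 'NoneType' and 'int'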

The HF image that I ended up using was 763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-tgi-inference:2.1.1-tgi1.3.1-gpu-py310-cu121-ubuntu20.04

Looking into the TGI issues, I found this thread; the problem seems to be fixed by a commit mentioned there. But I don't know how I can get the latest 1.3.3 DLC image for a SageMaker deployment, because when I specify that version in image_uris.retrieve or in get_huggingface_llm_image_uri, it complains:

ValueError: Unsupported huggingface-llm version: 1.3.3. You may need to upgrade your SDK version (pip install -U sagemaker) for newer huggingface-llm versions. Supported huggingface-llm version(s): 0.6.0, 0.8.2, 0.9.3, 1.0.3, 1.1.0, 1.2.0, 1.3.1, 0.6, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3. 
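For reference, this is roughly the call that raises the error above (a minimal sketch; session and region details omitted):

from sagemaker.huggingface import get_huggingface_llm_image_uri

# Requesting the fixed TGI release fails because the installed
# sagemaker SDK only knows the versions listed in the error above.
image_uri = get_huggingface_llm_image_uri("huggingface", version="1.3.3")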

I don't know the procedure for getting the latest version into the AWS ECR registry, or how we can use a custom-built DLC image when deploying to SageMaker.
Can you help in any way, or explain how your deployment works?

Thanks in advance

@LvffY commented Dec 17, 2023

I just opened an issue on the SageMaker side, because I think it is the sagemaker SDK itself that is limiting the supported versions.

@existme commented Dec 17, 2023

Thank you for taking the time to create the issue 🙏 I hope it gets the needed attention.

@existme commented Dec 18, 2023

@LvffY, by the way, do you know any other way of deploying the model as an inference endpoint? I want to try the model on AWS, but so far I have found no way to do that.

@sdkramer10 commented
Thanks for adding the ticket! I am also blocked by this issue.

@LvffY commented Dec 18, 2023

> @LvffY, by the way, do you know any other way of deploying the model as an inference endpoint? I want to try the model on AWS, but so far I have found no way to do that.

@existme Not at this time.

@rhoentier commented
I ran into the same problem.

Hugging Face has released a newer version of the image, which is accessible via SageMaker: 763104351884.dkr.ecr.eu-central-1.amazonaws.com/huggingface-pytorch-tgi-inference:2.1.1-tgi1.3.3-gpu-py310-cu121-ubuntu20.04-v1.0
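A minimal sketch of pinning that image directly, bypassing the SDK's version lookup (untested; the role, region, model ID, GPU count, and instance type are assumptions to adjust):

import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()  # assumes a SageMaker execution context

# Pin the TGI 1.3.3 image explicitly instead of calling
# get_huggingface_llm_image_uri; adjust the region in the URI.
image_uri = (
    "763104351884.dkr.ecr.eu-central-1.amazonaws.com/"
    "huggingface-pytorch-tgi-inference:2.1.1-tgi1.3.3-gpu-py310-cu121-ubuntu20.04-v1.0"
)

model = HuggingFaceModel(
    image_uri=image_uri,
    role=role,
    env={
        "HF_MODEL_ID": "mistralai/Mixtral-8x7B-Instruct-v0.1",
        "SM_NUM_GPUS": "8",  # assumption: shard across 8 GPUs
    },
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.48xlarge",  # assumption: an 8-GPU instance
    container_startup_health_check_timeout=600,
)

With image_uri set explicitly, the SDK skips its huggingface-llm version table entirely, so the ValueError above never triggers.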
