Not able to host Llama3.2-11b on Azure A100 80GB server #679

Open
2 of 4 tasks
alokgupta1996 opened this issue Nov 13, 2024 · 2 comments

Comments

alokgupta1996 commented Nov 13, 2024

System Info

lorax_version=0.12.0
I am using Docker to host the model. It runs perfectly for Llama3.1-8b,
but with Llama3.2-11b I get the following error:

ModuleNotFoundError: No module named 'lorax_server.utils.attention.utils'

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

Set the model path in an environment variable,
then use docker run to host the model.

Expected behavior

The model should be hosted seamlessly, as it is for the older model.

@gothaleshubham

Hi @alokgupta1996 ,

Please rebuild the Docker image using the following Dockerfile:

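FROM ghcr.io/predibase/lorax:0.12.0
# Add the attention utils module that the error reports as missing
COPY utils.py /opt/conda/lib/python3.10/site-packages/lorax_server/utils/attention/utils.py
# Pin the transformers version
RUN pip install transformers==4.45.2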

Get the utils.py file from:
https://github.com/predibase/lorax/blob/1d2b5146688547e5023680cd3f1b8b7112edd8a6/server/lorax_server/utils/attention/utils.py
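
The steps above could be scripted roughly as follows. This is only a sketch: the raw-content URL is derived from the GitHub link above, while the image tag `lorax-patched:0.12.0` and the helper name `build_patched_image` are hypothetical and not part of the original suggestion. It assumes `curl` and `docker` are installed and that the Dockerfile sits in the current directory.

```shell
# Commit hash taken from the utils.py link above.
COMMIT="1d2b5146688547e5023680cd3f1b8b7112edd8a6"
# Raw-content form of the same GitHub file URL.
RAW_URL="https://raw.githubusercontent.com/predibase/lorax/${COMMIT}/server/lorax_server/utils/attention/utils.py"
# Hypothetical local tag for the rebuilt image.
IMAGE_TAG="lorax-patched:0.12.0"

build_patched_image() {
  # Download utils.py next to the Dockerfile, then build the image.
  curl -fsSL -o utils.py "$RAW_URL" &&
  docker build -t "$IMAGE_TAG" .
}

# Run from the directory that contains the Dockerfile:
# build_patched_image
```

The function is defined but not called, so the network download and the build only happen once `build_patched_image` is invoked explicitly.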

@gothaleshubham

The above solution can only host the base model, without adapter switching.

For the mllama model with adapter switching, use the following Docker image:

ghcr.io/predibase/lorax:fc2cebb
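
A possible way to start this image is sketched below. The thread does not give the exact command, so everything except the image name is an assumption: the model identifier, the host port, the data volume, and the helper name `start_lorax` are illustrative, and the flags follow the usual `docker run` pattern for LoRAX images.

```shell
# Image named in the comment above.
IMAGE="ghcr.io/predibase/lorax:fc2cebb"
# Assumed values; adjust to your own model path, cache directory, and port.
MODEL_ID="meta-llama/Llama-3.2-11B-Vision-Instruct"
VOLUME="${PWD}/data"
PORT=8080

start_lorax() {
  # Expose the server on $PORT and mount a local directory for model weights.
  docker run --gpus all --shm-size 1g \
    -p "${PORT}:80" \
    -v "${VOLUME}:/data" \
    "$IMAGE" --model-id "$MODEL_ID"
}

# start_lorax
```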
