[Usage]: Trying to add codeshell 7b model, but garbled characters #11681
Comments
Please help. I really can’t find a solution.
Sorry, I don't have time to debug in detail, but what I would do is have two debuggers open and step through the vLLM and HF models line by line (during inference, not in the profile run) and see where the outputs diverge. You can refer to the model tests on how to call vLLM and HF in a consistent way.
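Not the actual model tests, but a minimal sketch of such a side-by-side comparison (greedy decoding on both sides so the outputs are directly comparable; the model path and prompt are placeholders):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from vllm import LLM, SamplingParams

MODEL = "WisdomShell/CodeShell-7B"  # placeholder: use your local model path
prompt = "def quick_sort(arr):"

# vLLM side: temperature=0 means greedy decoding.
llm = LLM(model=MODEL, trust_remote_code=True)
out = llm.generate([prompt], SamplingParams(temperature=0, max_tokens=64))
print("vLLM:", out[0].outputs[0].text)

# HF side: do_sample=False is also greedy decoding.
# (If GPU memory is tight, run the two halves in separate processes.)
tok = AutoTokenizer.from_pretrained(MODEL, trust_remote_code=True)
hf = AutoModelForCausalLM.from_pretrained(
    MODEL, trust_remote_code=True, torch_dtype="auto"
).cuda()
ids = tok(prompt, return_tensors="pt").input_ids.cuda()
gen = hf.generate(ids, max_new_tokens=64, do_sample=False)
print("HF:  ", tok.decode(gen[0][ids.shape[1]:], skip_special_tokens=True))
```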
Can you show the full stack trace? It's hard to see what the problem is from this short snippet.
After loading the model, in codeshell.py the line `hidden_states = self.wte(input_ids)` runs. That call goes into /usr/local/lib/python3.10/site-packages/vllm/model_executor/layers/vocab_parallel_embedding.py, where `output_parallel = self.linear_method.embedding(self, ...)` is executed. Why is `layer.weight` nan?
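One quick way to narrow this down is to assert on NaNs right where the lookup happens (a debugging sketch for a local checkout; `self.weight` is the parameter that `VocabParallelEmbedding` exposes):

```python
import torch

def assert_no_nan(tensor: torch.Tensor, name: str) -> None:
    """Raise if a tensor contains NaNs; a breadcrumb while debugging."""
    n = torch.isnan(tensor).sum().item()
    if n:
        raise RuntimeError(f"{name} contains {n}/{tensor.numel()} NaN entries")

# Drop a call like this just before the lookup in vocab_parallel_embedding.py
# (or before self.wte(input_ids) in codeshell.py):
#     assert_no_nan(self.weight, "wte.weight")
```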
Maybe you didn't load the weights correctly.
codeshell.py
Can you print out the weights of the embedding layer before and after it is loaded?
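For instance, something along these lines inside the model's `load_weights` (a sketch only; the parameter name `transformer.wte.weight` and the fallback copy are assumptions about the attached codeshell.py):

```python
import torch

def load_weights(self, weights):
    params_dict = dict(self.named_parameters())
    # Assumed parameter name; adjust to whatever named_parameters() reports.
    wte = params_dict["transformer.wte.weight"]
    print("wte before load:", wte.flatten()[:5].tolist(),
          "any NaN:", torch.isnan(wte).any().item())

    for name, loaded_weight in weights:
        if name not in params_dict:
            continue  # e.g. tied or renamed weights; log these too if unsure
        param = params_dict[name]
        # vLLM parameters may carry a custom weight_loader; fall back to a copy.
        weight_loader = getattr(param, "weight_loader", None)
        if weight_loader is not None:
            weight_loader(param, loaded_weight)
        else:
            param.data.copy_(loaded_weight)

    print("wte after load: ", wte.flatten()[:5].tolist(),
          "any NaN:", torch.isnan(wte).any().item())
```

If the "before" print already shows NaN but the "after" print does not, the problem is not the loader; if both show NaN, the embedding weight is never being assigned from the checkpoint.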
As you found before, the embedding layer's weights are set to nan, so of course the outputs are garbage. You should find out why they are set to nan.
Could you please help me verify whether it behaves normally on your end?
I updated my code: codeshell.py
It also outputs garbled characters on my end. |
The position encoding in the attention must not be commented out. When I don't comment it out, there are no garbled characters. The relevant code is in `class CodeShellAttention(nn.Module)`.
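For context, vLLM decoder models apply the rotary position encoding inside the attention forward; a simplified sketch of that pattern (the signature and names follow common vLLM models and are illustrative, not the exact CodeShell code):

```python
# Simplified vLLM-style attention forward. The rotary_emb call is the
# position encoding in question: removing it leaves the model with no
# positional information, which typically yields repetitive or garbled output.
def forward(self, positions, hidden_states, kv_cache, attn_metadata):
    qkv, _ = self.c_attn(hidden_states)                 # fused QKV projection
    q, k, v = qkv.split([self.q_size, self.kv_size, self.kv_size], dim=-1)
    q, k = self.rotary_emb(positions, q, k)             # must NOT be commented out
    attn_output = self.attn(q, k, v, kv_cache, attn_metadata)
    output, _ = self.c_proj(attn_output)
    return output
```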
Use this code.
I am still getting garbled text using this code. |
Your current environment
but it outputs garbled characters. Can you help me solve this problem?
Output:
[screenshot: garbled model output]
Link to a previously submitted related question: #11451
How would you like to use vllm
I want to run inference of a [specific model](put link here). I don't know how to integrate it with vllm.