TensorRT Quantization Breaks for LlamaLinearScalingRotaryEmbedding
#1083
Comments
Is there a solution? I have the same problem.
@Sanger2000 I have the same problem with the deepseek-coder-6.7b-base model. Have you solved it?
Thank you for pointing out this issue. We will add a fix to more robustly distinguish the actual dense linear layer.
I am facing the same issue with v0.8.0. Help needed.
Hi @RalphMao, is there any temporary way to avoid this problem now?
@activezhao A hotfix would be to modify the
@Opdoop OK, thanks.
Hi @Opdoop, I have a question: if I set ... Thanks
@Sanger2000 Do you still have the problem? If not, we will close it soon.
System Info
NVIDIA 4090
TensorRT-0.7.1
In nvidia-ammo, it appears these lines in `ammo/torch/export/layer_utils.py` fail unexpectedly for some Llama variants. In particular, the deepseek models use `LlamaLinearScalingRotaryEmbedding`. This means the module is picked up by the `is_linear` check and is treated as the dense case. However, there is no `.weight` for this module, so `build_linear_config` fails.

There are lots of easy fixes for this (for example, just checking if "Rotary" is in the name and skipping that case); happy to contribute (but I don't think there is an OSS repo to do so).
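The misclassification described above can be sketched in a few lines. This is a minimal illustration, not the real ammo code: `is_linear` and `build_linear_config` here are hypothetical stand-ins for the functions named above, and the classes are simplified mock-ups of the real modules.

```python
# Sketch of the failure mode: a name-based "is this a dense layer?"
# check also matches LlamaLinearScalingRotaryEmbedding, which carries
# no .weight. All names below are illustrative stand-ins.

class LlamaLinearScalingRotaryEmbedding:
    """Stand-in for the HF rotary embedding module: it has no .weight."""
    def __init__(self, dim: int):
        self.dim = dim  # the real module only registers buffers, no weight


class Linear:
    """Stand-in for a dense layer, which does carry a .weight."""
    def __init__(self):
        self.weight = [[0.0]]


def is_linear(module) -> bool:
    # A substring check like this matches the rotary embedding too,
    # because "Linear" appears in "LlamaLinearScalingRotaryEmbedding".
    return "Linear" in type(module).__name__


def build_linear_config(module):
    # Raises AttributeError for any module without a .weight.
    return {"weight": module.weight}


def is_linear_fixed(module) -> bool:
    # One of the "easy fixes" suggested above: skip Rotary modules.
    name = type(module).__name__
    return "Linear" in name and "Rotary" not in name


rotary = LlamaLinearScalingRotaryEmbedding(dim=64)
print(is_linear(rotary))        # True: misclassified as dense
print(is_linear_fixed(rotary))  # False: correctly skipped
```

With the original check, `build_linear_config(rotary)` would raise because the module has no `.weight`; the guarded check skips it while still accepting real dense layers.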
Who can help?
@Tracin
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
Try compiling and then running deepseek-coder-6.7b-base with fp8.
Expected behavior
I expect the model to generate tokens.
Actual behavior
The code throws the error: "no .weight for this module"
Additional notes
N/A