I can successfully deploy llama3-8b-instruct with EAGLE, but there is a problem when deploying qwen2-7b-instruct with EAGLE.
I converted the EAGLE-Qwen2-7B-Instruct model according to vllm/model_executor/models/eagle.py:L126.
I then encountered the error below:
AssertionError: Attempted to load weight (torch.Size([3584])) into parameter (torch.Size([3584, 7168]))
I looked at the code at vllm/model_executor/models/eagle.py:L139. I think it assumes that any weight whose name starts with 'fc.' must be 'fc.weight', but the fc layer of eagle-qwen2 also has a bias attribute, so the name variable can also be 'fc.bias'.
Moreover, the qkv_proj layer of EAGLE-Qwen2-7B-Instruct also has a bias.
I hope this can be fixed in an upcoming release!
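To illustrate the mismatch, here is a minimal standalone sketch (not the actual vLLM code) of how the weight-name dispatch around eagle.py:L139 could route both 'fc.weight' and 'fc.bias' to the right parameters. The `params` dict and `load_fc_weight` helper are hypothetical stand-ins; the shapes mirror the assertion message above (fc.weight is [3584, 7168], fc.bias is [3584] for Qwen2-7B).

```python
import torch

hidden = 3584  # Qwen2-7B hidden size, taken from the assertion message

# Hypothetical parameter dict standing in for the EAGLE head's fc layer.
params = {
    "fc.weight": torch.empty(hidden, 2 * hidden),
    "fc.bias": torch.empty(hidden),
}

def load_fc_weight(name: str, loaded_weight: torch.Tensor) -> None:
    """Route a checkpoint tensor to the matching fc parameter.

    The reported bug amounts to mapping every name that starts with
    'fc.' onto the 'fc.weight' parameter, which fails with a shape
    mismatch when the checkpoint also contains 'fc.bias'. Dispatching
    on the full name avoids that.
    """
    if name.startswith("fc."):
        param = params[name]  # look up by full name, not just the prefix
        assert param.shape == loaded_weight.shape, (
            f"Attempted to load weight ({loaded_weight.shape}) "
            f"into parameter ({param.shape})"
        )
        param.data.copy_(loaded_weight)

# Both tensors now load without tripping the shape assertion.
load_fc_weight("fc.weight", torch.randn(hidden, 2 * hidden))
load_fc_weight("fc.bias", torch.randn(hidden))
```

The same full-name dispatch would apply to the biased qkv_proj weights mentioned above.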