Optional `bias` for qwen2 model #32892

wavy-jung · 2024-08-20T05:20:33Z

Feature request

The `bias` argument of the linear layers in the qwen2 model is hard-coded, as follows:

transformers/src/transformers/models/qwen2/modeling_qwen2.py

Lines 217 to 219 in 85345bb

    
    self.gate_proj = nn.Linear(self.hidden_size, self.intermediate_size, bias=False)
    self.up_proj = nn.Linear(self.hidden_size, self.intermediate_size, bias=False)
    self.down_proj = nn.Linear(self.intermediate_size, self.hidden_size, bias=False)

transformers/src/transformers/models/qwen2/modeling_qwen2.py

Lines 271 to 274 in 85345bb

    
    self.q_proj = nn.Linear(self.hidden_size, self.num_heads * self.head_dim, bias=True)
    self.k_proj = nn.Linear(self.hidden_size, self.num_key_value_heads * self.head_dim, bias=True)
    self.v_proj = nn.Linear(self.hidden_size, self.num_key_value_heads * self.head_dim, bias=True)
    self.o_proj = nn.Linear(self.num_heads * self.head_dim, self.hidden_size, bias=False)

It would be good to make `bias` configurable through the config file, to stay compatible with the latest models (e.g. llama).
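A minimal sketch of what this could look like for the attention projections. It assumes a config field named `attention_qkv_bias` (the name is taken from the linked PR title; whether it ends up in the library in this form is not confirmed here). `SketchConfig` and `SketchAttentionProjections` are illustrative stand-ins, not the actual transformers classes. The defaults reproduce the currently hard-coded values, so existing checkpoints would behave as before.

```python
from dataclasses import dataclass

from torch import nn


@dataclass
class SketchConfig:
    # Hypothetical stand-in for the qwen2 config; only the fields this sketch needs.
    hidden_size: int = 64
    num_attention_heads: int = 4
    num_key_value_heads: int = 2
    # Assumed field name; the default matches the hard-coded bias=True.
    attention_qkv_bias: bool = True


class SketchAttentionProjections(nn.Module):
    """Only the projection layers; the attention computation itself is omitted."""

    def __init__(self, config: SketchConfig):
        super().__init__()
        self.hidden_size = config.hidden_size
        self.num_heads = config.num_attention_heads
        self.num_key_value_heads = config.num_key_value_heads
        self.head_dim = self.hidden_size // self.num_heads

        # Fall back to True when the field is absent, e.g. for a config
        # object created before the option existed.
        qkv_bias = getattr(config, "attention_qkv_bias", True)

        self.q_proj = nn.Linear(self.hidden_size, self.num_heads * self.head_dim, bias=qkv_bias)
        self.k_proj = nn.Linear(self.hidden_size, self.num_key_value_heads * self.head_dim, bias=qkv_bias)
        self.v_proj = nn.Linear(self.hidden_size, self.num_key_value_heads * self.head_dim, bias=qkv_bias)
        # The output projection keeps bias=False, as in the current code.
        self.o_proj = nn.Linear(self.num_heads * self.head_dim, self.hidden_size, bias=False)
```

With `SketchConfig()` the q/k/v layers carry a bias as they do today; with `SketchConfig(attention_qkv_bias=False)` their `bias` attribute is `None`.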

Motivation

In the llama model, `bias` is already configurable, as follows:

transformers/src/transformers/models/llama/modeling_llama.py

Lines 286 to 288 in 85345bb

    
    self.gate_proj = nn.Linear(self.hidden_size, self.intermediate_size, bias=config.mlp_bias)
    self.up_proj = nn.Linear(self.hidden_size, self.intermediate_size, bias=config.mlp_bias)
    self.down_proj = nn.Linear(self.intermediate_size, self.hidden_size, bias=config.mlp_bias)
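One way to keep older qwen2 configs working would be to default any new flags to the values that are hard-coded today. The sketch below illustrates that idea with plain Python; `BiasFlags` and `bias_flags_from_dict` are hypothetical names, and reusing `mlp_bias` for qwen2 is an assumption borrowed from the llama field shown above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BiasFlags:
    # Defaults mirror the values currently hard-coded in modeling_qwen2.py.
    attention_qkv_bias: bool = True  # assumed field name
    mlp_bias: bool = False           # assumed field name, borrowed from llama


def bias_flags_from_dict(config_dict: dict) -> BiasFlags:
    """Read the bias flags from a parsed config, falling back to today's defaults."""
    return BiasFlags(
        attention_qkv_bias=bool(config_dict.get("attention_qkv_bias", True)),
        mlp_bias=bool(config_dict.get("mlp_bias", False)),
    )


# A config written before the flags existed resolves to the current behaviour.
legacy_flags = bias_flags_from_dict({"hidden_size": 4096})

# A newer config can opt out of the q/k/v bias explicitly.
custom_flags = bias_flags_from_dict({"attention_qkv_bias": False})
```

Defaulting this way means a config without the new keys builds exactly the same layers as before.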

Your contribution

I'll submit a PR for this feature.


amyeroberts · 2024-08-20T08:26:12Z

cc @ArthurZucker

ArthurZucker · 2024-08-20T17:11:10Z

Answered on the PR~

wavy-jung added the "Feature request" label Aug 20, 2024

wavy-jung linked a pull request Aug 20, 2024 that will close this issue

Make qwen2 attention_qkv_bias optional #32893

Open
