
Is it possible to specify use_flash_attention_2 at launch? #25

Closed
awzhgw opened this issue Feb 5, 2024 · 2 comments


awzhgw commented Feb 5, 2024

Is it possible to specify use_flash_attention_2 at launch?

LinB203 (Member) commented Feb 5, 2024

Yes, it is supported; modify the loading code as shown below. This applies to every backbone except Qwen, because the Qwen-based models enable flash attention automatically.

model = LlavaPhiForCausalLM.from_pretrained(
    model_args.model_name_or_path,
    cache_dir=training_args.cache_dir,
    attn_implementation="flash_attention_2",  # add this line
    **bnb_model_from_pretrained_args,
)
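
For reference, here is a minimal standalone sketch of loading a checkpoint with flash attention 2 enabled outside the training script. The import path and checkpoint path below are placeholders (adjust them to your checkout), and the half-precision requirement is an added assumption: flash-attn 2 only runs with fp16/bf16 weights on a supported GPU.

import torch
from moellava.model import LlavaPhiForCausalLM  # placeholder import; use the path in your checkout

# Load with flash attention 2; flash-attn needs half-precision weights,
# so torch_dtype is set to bfloat16 explicitly.
model = LlavaPhiForCausalLM.from_pretrained(
    "path/to/MoE-LLaVA-checkpoint",            # placeholder checkpoint path
    torch_dtype=torch.bfloat16,                # flash-attn 2 requires fp16 or bf16
    attn_implementation="flash_attention_2",
).to("cuda")
model.eval()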

LinB203 (Member) commented Feb 7, 2024

Hi, we have tested flash attention 2 and found that it degrades performance; others have reported the same issue. We therefore do not recommend enabling flash attention 2.
See huggingface/transformers#28488

@LinB203 LinB203 reopened this Feb 7, 2024
@awzhgw awzhgw closed this as completed Feb 8, 2024