[Feat] Add WhisperFlashAttention2 #2015

hongziqi · 2025-04-08T08:54:59Z

Test Report

Hardware Environment:
Ascend (snt9b | 32G)

Software Environment (Mandatory):
-- MindSpore version (e.g., 1.7.0.Bxxx) : 2.5.0
-- Python version (e.g., Python 3.7.5) : 3.10.0
-- OS platform and distribution (e.g., Linux Ubuntu 16.04): Ubuntu 22.04.4 LTS
-- GCC/Compiler version (if compiled from source): 11.04

Test Code:

MindSpore

import mindspore
from mindnlp.transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline
import time

mindspore.set_device("Ascend", 2)

def generate_with_time(pipe, file_path):
    start_time = time.time()
    result = pipe(file_path)
    generation_time = time.time() - start_time
    return result, generation_time

model_id = "openai/whisper-large-v3"

# default mode: eager
# model = AutoModelForSpeechSeq2Seq.from_pretrained(
#     model_id, 
#     ms_dtype=mindspore.float16, 
#     low_cpu_mem_usage=True,
#     use_safetensors=True,
# )

model = AutoModelForSpeechSeq2Seq.from_pretrained(
    model_id, 
    ms_dtype=mindspore.float16, 
    low_cpu_mem_usage=True,
    use_safetensors=True,
    attn_implementation="flash_attention_2",
)

processor = AutoProcessor.from_pretrained(model_id)

pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    ms_dtype=mindspore.float16,
    return_timestamps=True,
)


# ------ eager mode test result ------ 
# generation_time: 93.65066742897034, result: 青光闪动一柄青钢剑疏地刺出指向中年汉子左肩使肩少年不带剑招用劳外斗
# 剑斜剑锋以削向那汉子右颈哪中年汉子竖剑挡格张来一声响双剑相击嗡嗡作声震声未竭双刃剑功复合你拆了三招中年汉子长剑猛地击落直转少年顶
# 门那少年臂向右侧左手剑绝学隐青钢剑鞠刺呐喊子大腿威两人剑法迅绝全力相搏威徒练武厅东边坐着爱人上手是个四十左右的中年道姑铁青着脸嘴唇
# 紧闭下手是个五十余岁的老者右手掠着长须神情甚是得意两人的座位相距一丈有余身后各站着二十余名男女弟子西边一排椅子上坐着十余位宾客东西
# 双方的目光都集中于场中二人的相斗眼下眼尖的少年与中年汉子已拆到七十余招前招越来越紧物资未分胜败突然周年汉子长剑挥出用力猛了身子微晃肆意摔跌席边
# 宾客中一个身穿青衫的年轻男子忍不住吃得一声笑他随即指导师太忙伸手按住了口


# ------ flash_attention_2 mode test result ------ 
# generation_time: 79.74670958518982, result: 青光闪动一柄青钢剑疏地刺出指向中年汉子左肩使肩少年不带剑招用劳外斗
# 剑斜剑锋以削向那汉子右颈哪中年汉子竖剑挡格张来一声响双剑相击嗡嗡作声震声未竭双刃剑功复合你拆了三招中年汉子长剑猛地击落直转少年顶
# 门那少年臂向右侧左手剑绝学隐青钢剑鞠刺呐喊子大腿威两人剑法迅绝全力相搏威徒练武厅东边坐着爱人上手是个四十左右的中年道姑铁青着脸嘴唇
# 紧闭下手是个五十余岁的老者右手掠着长须神情甚是得意两人的座位相距一丈有余身后各站着二十余名男女弟子西边一排椅子上坐着十余位宾客东西
# 双方的目光都集中于场中二人的相斗眼下眼尖的少年与中年汉子已拆到七十余招前招越来越紧物资未分胜败突然周年汉子长剑挥出用力猛了身子微晃肆意摔跌席边
# 宾客中一个身穿青衫的年轻男子忍不住吃得一声笑他随即指导师太忙伸手按住了口
result, generation_time = generate_with_time(pipe, "/home/candyhong/workspace/whisper_large/tianlong0925.mp3")
print(f"generation_time: {generation_time}, result: {result['text']}")
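The script above times a single call, so the reported figure may include one-off costs such as compilation or kernel loading (an assumption; this report does not measure them separately). A minimal sketch of a more stable measurement, using a hypothetical `benchmark` helper that is not part of mindnlp or this PR, discards a few warm-up calls and then times several repeats:

```python
import statistics
import time


def benchmark(fn, *args, warmup=1, repeats=3, **kwargs):
    """Call fn(*args, **kwargs) `warmup` times untimed, then time `repeats` calls.

    Hypothetical helper for illustration; returns the last result and a
    dict with the per-run durations in seconds plus their min and mean.
    """
    # Warm-up calls absorb one-off setup costs and are not timed.
    for _ in range(warmup):
        fn(*args, **kwargs)

    durations = []
    result = None
    for _ in range(repeats):
        start = time.perf_counter()  # monotonic clock suited to interval timing
        result = fn(*args, **kwargs)
        durations.append(time.perf_counter() - start)

    stats = {
        "runs": durations,
        "min": min(durations),
        "mean": statistics.mean(durations),
    }
    return result, stats


# Stand-in workload so the sketch runs without a model or an audio file.
_, demo_stats = benchmark(sum, range(100_000), warmup=1, repeats=3)
print(f"min: {demo_stats['min']:.6f}s, mean: {demo_stats['mean']:.6f}s")
```

With the pipeline above, the call would look like `result, stats = benchmark(pipe, file_path)`, where `file_path` is the audio file being transcribed. Reporting the minimum alongside the mean makes a slow outlier run easy to spot.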

PyTorch + Ascend

import torch
import torch_npu
import time
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline

torch_npu.npu.set_compile_mode(jit_compile=False)
torch_npu.npu.config.allow_internal_format = False

def generate_with_time(pipe, file_path):
    start_time = time.time()
    result = pipe(file_path)
    generation_time = time.time() - start_time
    return result, generation_time


device = "npu:0"
torch_dtype = torch.float16

model_id = "openai/whisper-large-v3"

# default mode: eager
# model = AutoModelForSpeechSeq2Seq.from_pretrained(
#     model_id, 
#     torch_dtype=torch_dtype, 
#     low_cpu_mem_usage=True,
#     use_safetensors=True,
# )

model = AutoModelForSpeechSeq2Seq.from_pretrained(
    model_id,
    torch_dtype=torch_dtype,
    low_cpu_mem_usage=True,
    use_safetensors=True,
    attn_implementation="flash_attention_2",
)
model.to(device)

processor = AutoProcessor.from_pretrained(model_id)

pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    torch_dtype=torch_dtype,
    device=device,
    return_timestamps=True,
)

# ------ eager mode test result ------ 
# generation_time: 58.42126774787903, result: 青光闪动一柄青钢剑疏地刺出指向中年汉子左肩使肩少年不带剑招用劳外斗
# 剑斜剑锋以削向那汉子右颈哪中年汉子竖剑挡格张来一声响双剑相击嗡嗡作声震声未竭双刃剑功复合你拆了三招中年汉子长剑猛地击落直展少年顶
# 门那少年臂向右侧左手剑绝学隐青钢剑鞠刺呐喊子大腿威两人剑法迅绝全力相搏威徒练武厅东边坐着爱人上手是个四十左右的中年道姑铁青着脸嘴唇
# 紧闭下手是个五十余岁的老者右手掠着长须神情甚是得意两人的座位相距一丈有余身后各站着二十余名男女弟子西边一排椅子上坐着十余位宾客东西
# 双方的目光都集中于场中二人的相斗眼下眼尖的少年与中年汉子已拆到七十余招前招越来越紧物资未分胜败突然周年汉子长剑挥出用力猛了身子微晃肆意摔跌席边
# 宾客中一个身穿青衫的年轻男子忍不住吃得一声笑他随即知道失态忙伸手按住了口

# ------ flash_attention_2 mode test result ------ 
# generation_time: 252.1833713054657, result: 青光闪动一柄青钢剑疏地刺出指向中年汉子左肩使肩少年不带剑招用劳外斗
# 剑斜剑锋以削向那汉子右颈哪中年汉子竖剑挡格张来一声响双剑相击嗡嗡作声震声未竭双刃剑功复合你拆了三招中年汉子长剑猛地击落直展少年顶
# 门那少年臂向右侧左手剑绝学隐青钢剑鞠刺呐喊子大腿威两人剑法迅绝全力相搏威徒练武厅东边坐着爱人上手是个四十左右的中年道姑铁青着脸嘴唇
# 紧闭下手是个五十余岁的老者右手掠着长须神情甚是得意两人的座位相距一丈有余身后各站着二十余名男女弟子西边一排椅子上坐着十余位宾客东西
# 双方的目光都集中于场中二人的相斗眼下眼尖的少年与中年汉子已拆到七十余招前招越来越紧物资未分胜败突然周年汉子长剑挥出用力猛了身子微晃肆意摔跌席边
# 宾客中一个身穿青衫的年轻男子忍不住吃得一声笑他随即知道失态忙伸手按住了口
result, generation_time = generate_with_time(pipe, "/home/candyhong/workspace/whisper_large/tianlong0925.mp3")
print(f"generation_time: {generation_time}, result: {result['text']}")
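To put the four timings side by side, the sketch below computes the eager-to-flash_attention_2 ratio for each framework from the numbers reported in the comments above. The `speedup` and `summarize` names are illustrative only. A ratio above 1 means flash_attention_2 was faster; the reported figures give roughly 1.17x for MindSpore and roughly 0.23x for PyTorch + Ascend (flash_attention_2 about 4.3 times slower there). Each figure comes from a single run, so the ratios are indicative rather than conclusive.

```python
# Wall-clock times in seconds, copied from the test results reported above.
timings = {
    "MindSpore": {
        "eager": 93.65066742897034,
        "flash_attention_2": 79.74670958518982,
    },
    "PyTorch + Ascend": {
        "eager": 58.42126774787903,
        "flash_attention_2": 252.1833713054657,
    },
}


def speedup(eager_s, fa2_s):
    """Ratio of eager time to flash_attention_2 time (>1 means FA2 is faster)."""
    return eager_s / fa2_s


def summarize(all_timings):
    """Map each framework name to its eager / flash_attention_2 ratio."""
    return {
        name: speedup(t["eager"], t["flash_attention_2"])
        for name, t in all_timings.items()
    }


for name, ratio in summarize(timings).items():
    print(f"{name}: eager / flash_attention_2 = {ratio:.2f}x")
```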

Related Issues

Fixes #2014

lvyufeng and others added 10 commits March 20, 2025 17:11

use official mindspore for CI (mindspore-lab#1999)

e9e2039

Update make_wheel_releases.yml

a660aec

Fix mint.nonzero interface call (mindspore-lab#2001)

797fade

[Open Source Internship] align model fine-tuning IAUOS5 (mindspore-lab#1997)

b228c1d

Fix dtype mismatch before and after PeftModel.from_pretrained loads weights (mindspore-lab#2007)

4adcdb8

[Open Source Internship] bit model fine-tuning (mindspore-lab#1995)

59c6eda

[Open Source Internship] Albert model fine-tuning (mindspore-lab#2008)

10b74e8

[Open Source Internship] Mamba2 model migration (mindspore-lab#2009)

895e5c0

[Feat] Add WhisperFlashAttention2

c29a748

fix pylint-check

c412e3b

hongziqi force-pushed the feat-whisper-flash-attention-new branch from bf8abe9 to c412e3b Compare April 8, 2025 10:52

lvyufeng force-pushed the master branch from 895e5c0 to 7706367 Compare April 9, 2025 10:51

hongziqi closed this Apr 14, 2025
