Add Flash Attention 2 support to Musicgen and Musicgen Melody #29939
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Thanks for adding this!
@@ -239,3 +239,20 @@ def from_sub_models_config(
     # This is a property because you might want to change the codec model on the fly
     def sampling_rate(self):
         return self.audio_encoder.sampling_rate
+
+    @property
+    def _attn_implementation(self):
This method is one-to-one the same as `_attn_implementation` in the `PreTrainedConfig` class. Can we remove it from here?
Not if we want to keep the setter part!
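For reviewers following along, a minimal sketch of the getter/setter pattern at stake, assuming a simplified composite config (names follow the transformers convention, but this is illustrative, not the PR's exact code):

```python
class MusicgenConfigSketch:
    """Illustrative composite config; `decoder_config` stands in for the sub-config."""

    def __init__(self, decoder_config):
        self.decoder = decoder_config
        self._attn_implementation_internal = None

    @property
    def _attn_implementation(self):
        # Same getter behaviour as the base class: default to "eager" when unset.
        return self._attn_implementation_internal or "eager"

    @_attn_implementation.setter
    def _attn_implementation(self, value):
        self._attn_implementation_internal = value
        # The reason the override is kept: propagate the choice to the decoder
        # sub-config so its attention classes are selected accordingly.
        self.decoder._attn_implementation = value
```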
MUSICGEN_ATTENTION_CLASSES = {
    "eager": MusicgenAttention,
    "flash_attention_2": MusicgenFlashAttention2,
Worth adding SDPA in one go as well? It would let you showcase the attention implementation through SDPA on a free-tier Colab T4 GPU (where FA2 is not available).
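For illustration, the mapping would then take this shape (a sketch: `MusicgenSdpaAttention` is the class this suggestion asks for; the other two come from the diff above):

```python
MUSICGEN_ATTENTION_CLASSES = {
    "eager": MusicgenAttention,
    "sdpa": MusicgenSdpaAttention,
    "flash_attention_2": MusicgenFlashAttention2,
}
```

A decoder layer can then pick its attention class via `MUSICGEN_ATTENTION_CLASSES[config._attn_implementation]`, which is the dispatch pattern used across transformers models.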
@@ -254,3 +252,20 @@ def from_sub_models_config(
     # This is a property because you might want to change the codec model on the fly
     def sampling_rate(self):
         return self.audio_encoder.sampling_rate
+
+    @property
+    def _attn_implementation(self):
Same here
    else outputs_fa.decoder_hidden_states[-1]
)

assert torch.allclose(logits_fa[1:], logits[1:], atol=4e-2, rtol=4e-2)
Good enough for a generative audio model with FA2
I've copied the same tolerance threshold as the other models (regardless of modality), btw.
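For context, a condensed sketch of the equivalence pattern these tests follow (the real tests build full Musicgen inputs; `model` and `model_fa` stand for the eager and FA2 variants of the same fp16 checkpoint):

```python
import torch

def assert_fa2_matches_eager(model, model_fa, inputs):
    with torch.no_grad():
        logits = model(**inputs).logits
        logits_fa = model_fa(**inputs).logits
    # The first position is skipped and the tolerance is loose: FA2 runs a
    # different kernel, so bit-exact fp16 equality is not expected.
    assert torch.allclose(logits_fa[1:], logits[1:], atol=4e-2, rtol=4e-2)
```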
I've also added SDPA! cc @amyeroberts or @ArthurZucker, could you review when you have time?
LGTM! Tests are ... huge, it would be nice if you could use `# Copied from`, that would help the review 😅
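For readers unfamiliar with the convention: a `# Copied from` marker lets `make fix-copies` keep a method byte-identical to its source, so reviewers only need to check the source once. A sketch on a hypothetical class (the target path is real Bart code; the class here is illustrative):

```python
import torch

class MusicgenAttentionSketch:
    def __init__(self, num_heads: int, head_dim: int):
        self.num_heads = num_heads
        self.head_dim = head_dim

    # Copied from transformers.models.bart.modeling_bart.BartAttention._shape with Bart->Musicgen
    def _shape(self, tensor: torch.Tensor, seq_len: int, bsz: int) -> torch.Tensor:
        # (bsz, seq_len, embed_dim) -> (bsz, num_heads, seq_len, head_dim)
        return tensor.view(bsz, seq_len, self.num_heads, self.head_dim).transpose(1, 2).contiguous()
```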
self._use_flash_attention_2 = config._attn_implementation == "flash_attention_2"
self._use_sdpa = config._attn_implementation == "sdpa"
Let's only save `self._attn_implementation` please
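A minimal sketch of what the suggested refactor looks like, assuming an illustrative decoder class (not the PR's exact code):

```python
class MusicgenDecoderSketch:
    def __init__(self, config):
        # Before: self._use_flash_attention_2 and self._use_sdpa, two booleans
        # that must stay in sync. After: one string as the single source of truth.
        self._attn_implementation = config._attn_implementation

    def _uses_2d_mask(self) -> bool:
        # FA2 consumes the raw 2D padding mask; eager/SDPA build a 4D causal mask.
        return self._attn_implementation == "flash_attention_2"
```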
        return attn_output, None, past_key_value
`# Copied from` can be used here as well!
Ouf! Thanks for the big PR and adding those tests!
* add FA2 to o.g Musicgen
* make style
* add FA2 support to Musicgen Melody
* add generation FA2 tests to o.g Musicgen
* make style and fix copies
* add Musicgen to FA2 docs + deprecate list
* add SDPA support to Musicgen's
* make style and fix copies
* refactor attention implementation arguments
* add Copied from to SDPA tests
* add Copied from in SDPA tests melody
* add Copied from for FA2 generation tests
* add FA2 inference Copied from
* make style
What does this PR do?
Supersedes #27924
The attention tests all pass, but there is no integration-level equivalence between the original attention models and the FA2 ones. Though the generations are not the same song, I don't hear any difference in quality.
cc @sanchit-gandhi and @amyeroberts, could you review please?
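For context, a typical way to exercise the new code path once merged (checkpoint, prompt, and generation length are illustrative; FA2 needs a supported GPU and half precision):

```python
import torch
from transformers import AutoProcessor, MusicgenForConditionalGeneration

processor = AutoProcessor.from_pretrained("facebook/musicgen-small")
model = MusicgenForConditionalGeneration.from_pretrained(
    "facebook/musicgen-small",
    torch_dtype=torch.float16,
    attn_implementation="flash_attention_2",  # or "sdpa" on GPUs without FA2
).to("cuda")

inputs = processor(text=["lo-fi hip hop beat"], return_tensors="pt").to("cuda")
audio_values = model.generate(**inputs, max_new_tokens=256)
```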