
Implement MultiHeadAttention Layer #7875

Merged · 25 commits merged into tensorflow:master from mha-impl on Aug 7, 2023

Conversation

pforderique (Contributor) commented:

Implements the MultiHeadAttention layer from Keras attention layers.
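
For reference, here is a minimal sketch of the scaled dot-product attention that each head of a multi-head attention layer computes, written with plain TensorFlow.js ops. It is illustrative only and is not the code added in this PR:

```ts
import * as tf from '@tensorflow/tfjs';

// Illustrative sketch only (not the layer added in this PR): the scaled
// dot-product attention computed per head of MultiHeadAttention.
// Shapes: query [batch, Tq, dim], key and value [batch, Tv, dim].
function scaledDotProductAttention(
    query: tf.Tensor3D, key: tf.Tensor3D, value: tf.Tensor3D): tf.Tensor3D {
  return tf.tidy(() => {
    const dim = query.shape[2];
    // Attention scores [batch, Tq, Tv], scaled by sqrt of the key dimension.
    const scores = tf.matMul(query, key, false, true)
        .div(tf.sqrt(tf.scalar(dim)));
    // Normalize along the key axis to get attention weights.
    const weights = tf.softmax(scores, -1);
    // Weighted sum of values: [batch, Tq, dim].
    return tf.matMul(weights, value) as tf.Tensor3D;
  });
}
```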

NOTE:

  • This implementation does not support RaggedTensors yet.
  • The Softmax layer was changed to support masking (see the sketch after this list for a common masking pattern). Let me know if having a separate class for this is better; regardless, the current change should be non-breaking.
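
For illustration, one common way to make softmax mask-aware is to add a large negative penalty to masked positions before normalizing, so they receive near-zero attention weight. The sketch below uses plain TFJS ops and is not necessarily the exact change made to the Softmax layer in this PR:

```ts
import * as tf from '@tensorflow/tfjs';

// Illustrative sketch only: masked softmax via a large negative additive
// penalty on positions where the mask is 0, so they get ~zero weight.
// Not necessarily how the modified Softmax layer implements masking.
function maskedSoftmax(logits: tf.Tensor, mask: tf.Tensor): tf.Tensor {
  return tf.tidy(() => {
    const penalty = tf.sub(tf.scalar(1), tf.cast(mask, 'float32')).mul(-1e9);
    return tf.softmax(tf.add(logits, penalty), -1);
  });
}
```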

Depends on #7860.

@mattsoulanille (Member) left a comment:

LGTM

@mattsoulanille (Member) commented:

@Linchenn Please take a look when you get a chance. Thanks!

pforderique enabled auto-merge (squash) on August 7, 2023, 17:22
@Linchenn (Collaborator) left a comment:

LGTM!

pforderique merged commit 0fd462d into tensorflow:master on Aug 7, 2023
pforderique deleted the mha-impl branch on August 7, 2023, 21:50
pforderique added a commit that referenced this pull request on Aug 7, 2023:
* Implement position embedding

* Strip debug ops in jax conversion tests (#7889)

INTERNAL
This fixes an internal issue with jax tests. See cl/550054296.

* Update weights loading (#7872)

* Update weights loading

* fix tests

* remove

* fix

* fix comments

* fix lint

* Load python rules in tfjs-converter converters dir (#7892)

* Implement MultiHeadAttention Layer (#7875)

* Add spec for multi-head attention

* Add CachedMultiHeadAttention cache

* Fix typos

* Lint

* Add Transformer Decoder spec

* lint

* Add Einsum spec

* lint

* Remove unused type declaration

* Move helper functions outside EinsumDense class

* Implement Einsum Dense

* Address comments

* Implement MHA Layer

* Add masked softmax support

* Fix typo

* Check for undef and null

* Make buildFromSignature public

* Wrap softmax call in tf.tidy

* Implement position embedding

---------

Co-authored-by: Matthew Soulanille <msoulanille@google.com>
Co-authored-by: fengwuyao <131706622+fengwuyao@users.noreply.github.com>