[Flax] Add FlaxMBart #12236
Conversation
Thanks a lot for adding this @stancld!
It's looking great overall; I left a few comments. Specifically:
- the order of the layer norm and attention layers, and
- could we add as many `# Copied from` statements as possible?
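For context on the first point: MBart uses pre-layer-norm transformer blocks (norm applied before attention), whereas Bart-style post-LN normalizes after the residual add. A minimal NumPy sketch of the two orderings, with a stand-in `attention` callable (hypothetical, for illustration only — not the actual Flax modules):

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # normalize over the feature (last) axis
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def post_ln_block(x, attention):
    # Bart ordering: attention first, layer norm applied to the residual sum
    return layer_norm(x + attention(x))

def pre_ln_block(x, attention):
    # MBart ordering: layer norm first, residual added after attention
    return x + attention(layer_norm(x))
```

With a zero attention stand-in the difference is easy to see: the post-LN block still normalizes its input, while the pre-LN block returns it unchanged.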
* Fix `shift_tokens_right` method according to the MBart implementation
* Update `shift_tokens_right` in tests accordingly
* Fix the import issue and update the docs file
* make style quality
* Change the order of the normalization layer and attention
* Add some copy statements
@patil-suraj Thank you a lot for your suggestions. I fixed the order of the attention and normalization layers and some other minor bugs, and added some additional copy statements. I also changed the
Looks good to me! Just left a couple of comments that need to be taken care of before merging.
I will run all slow tests and push the checkpoints to the hub before merging.
Besides, add `lang_code_to_id` to `MBartTokenizerFast`.
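`lang_code_to_id` is simply a mapping from language codes to the token ids appended after the regular vocabulary. A hypothetical sketch of the attribute (the codes and offset below are illustrative; real values come from the MBart vocab):

```python
# Illustrative only: a short subset of codes and a made-up vocab offset
FAIRSEQ_LANGUAGE_CODES = ["ar_AR", "de_DE", "en_XX", "ro_RO"]
BASE_VOCAB_SIZE = 250001  # hypothetical offset where language codes start

lang_code_to_id = {
    code: BASE_VOCAB_SIZE + i for i, code in enumerate(FAIRSEQ_LANGUAGE_CODES)
}
```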
The encoder and decoder require an extra `layer_norm` at the end.
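Why: in a pre-LN stack each block normalizes its own input, so the residual stream coming out of the last block is never normalized; the final `layer_norm` fixes that. A minimal sketch, assuming hypothetical `blocks` callables rather than the actual Flax modules:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # normalize over the feature (last) axis
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def run_stack(x, blocks):
    # pre-LN blocks: normalize, transform, add residual
    for block in blocks:
        x = x + block(layer_norm(x))
    # the extra final layer_norm this comment asks for
    return layer_norm(x)
```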
@stancld I pushed a couple of commits to add the
```python
        [prev_output_tokens[i, eos_idx] for i, eos_idx in enumerate(index_of_eos)]
    ).squeeze()
    # for loop basically does a jax-compatible version of
    # prev_output_tokens[:, 1:] = prev_output_tokens[:, :-1].clone()
    for i in range(prev_output_tokens.shape[1], 0, -1):
```
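The excerpt above works around JAX's immutable arrays. As a minimal NumPy sketch of the same MBart-style shift — the last non-pad token (the language code) becomes the first decoder input and everything else moves one step right — with illustrative token ids, not values from the real vocab:

```python
import numpy as np

def shift_tokens_right(input_ids: np.ndarray, pad_token_id: int) -> np.ndarray:
    # MBart-style shift: move the last non-pad token of each row to
    # position 0 and shift the rest one step to the right.
    prev_output_tokens = input_ids.copy()
    # index of the last non-pad token in each row
    index_of_eos = (input_ids != pad_token_id).sum(axis=1) - 1
    decoder_start_tokens = input_ids[np.arange(input_ids.shape[0]), index_of_eos]
    prev_output_tokens[:, 1:] = input_ids[:, :-1]
    prev_output_tokens[:, 0] = decoder_start_tokens
    return prev_output_tokens
```

In the JAX version, the in-place slice assignments would instead use functional `.at[...].set(...)` updates, which is what the loop in the diff is emulating.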
nice!
Very nice!
What does this PR do?
This PR adds the Flax implementation of MBart.
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
@patrickvonplaten @patil-suraj