[Umt5] Add google's umt5 to transformers #24477
Conversation
The documentation is not available anymore as the PR was closed or merged.
Hi @ArthurZucker, thanks for updating this! As far as we can tell, it is not just mT5, because of the joined/separate key-value in attention. Was this problem solved in the latest conversion script of this PR? 🤔 /cc @agemagician
The conversion went well; the outputs are still a bit gibberish, but there was no problem with mismatched shapes.
So far, I can see you made similar changes as we did before, which led to gibberish output. I believe the issue still exists because of the way we reshape and convert the q, k, and v for the attention, as @stefan-it mentioned.
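For context, here is a minimal sketch of the kind of reshaping being discussed. The shapes and the fused key/value layout below are assumptions for illustration, not the actual checkpoint format used by the conversion script:

```python
import numpy as np

# Hypothetical shapes: assume the JAX checkpoint stores attention kernels as
# (d_model, n_heads, d_head), while torch.nn.Linear expects (out_features, in_features).
d_model, n_heads, d_head = 512, 6, 64

def kernel_to_torch(kernel: np.ndarray) -> np.ndarray:
    """Flatten the per-head axes and transpose to the PyTorch (out, in) layout."""
    return kernel.reshape(d_model, n_heads * d_head).T

# If a checkpoint ships a fused key/value kernel, e.g. (d_model, 2, n_heads, d_head),
# it has to be split before reshaping:
fused_kv = np.random.randn(d_model, 2, n_heads, d_head).astype(np.float32)
k_kernel, v_kernel = fused_kv[:, 0], fused_kv[:, 1]

k_torch = kernel_to_torch(k_kernel)  # shape (n_heads * d_head, d_model)
v_torch = kernel_to_torch(v_kernel)
print(k_torch.shape, v_torch.shape)
```

Getting the reshape order wrong still produces tensors of the right shape, which is why the result can load cleanly and yet generate gibberish.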
There is also a different logic for
Regarding the split / merge, I don't really see a problem with the code. The checkpoints are split, and the actual code is similar to mt5 with the difference being
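For what it's worth, the main architectural difference we are aware of is that a UMT5-style stack computes its own relative position bias in every attention layer, whereas an mT5-style stack only owns the bias table in the first layer and reuses its output elsewhere. A minimal PyTorch sketch of that idea (names and sizes are illustrative, not the actual transformers code):

```python
import torch.nn as nn

class SketchAttention(nn.Module):
    """Illustrative only, not the actual transformers implementation."""

    def __init__(self, n_heads: int, num_buckets: int, has_relative_bias: bool):
        super().__init__()
        # mT5-style: only the first layer owns the bias table, the others reuse its output.
        # UMT5-style: every layer instantiates its own table.
        self.relative_attention_bias = (
            nn.Embedding(num_buckets, n_heads) if has_relative_bias else None
        )

mt5_like = nn.ModuleList([SketchAttention(6, 32, has_relative_bias=(i == 0)) for i in range(8)])
umt5_like = nn.ModuleList([SketchAttention(6, 32, has_relative_bias=True) for i in range(8)])
```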
Co-authored-by: agemagician <ahmed.elnaggar@tum.de>
Co-authored-by: stefan-it <>
Update: the outputs match 🔥 The issue was: the tokenizer
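For anyone following along, a quick tokenizer sanity check once the files are converted could look like this. The repo id `google/umt5-small` is an assumption; point it at whatever tokenizer files the conversion actually produces:

```python
from transformers import AutoTokenizer

# Assumed repo id; replace with the converted checkpoint's location.
tokenizer = AutoTokenizer.from_pretrained("google/umt5-small")

text = "The quick brown fox."
ids = tokenizer(text).input_ids
print(ids)
print(tokenizer.convert_ids_to_tokens(ids))
# Decoding the ids should give the input back (modulo special tokens like </s>).
print(tokenizer.decode(ids, skip_special_tokens=True))
```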
Thanks for adding the modeling file. Have a couple more nits.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Stefan Schweter <stefan@schweter.it>
Currently setting up an instance to convert and upload the
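If it helps, uploading the converted weights could look roughly like this; the repo id and local path below are placeholders, not the actual destinations:

```python
from huggingface_hub import HfApi

api = HfApi()
# Placeholder repo id and local path for the converted checkpoint.
repo_id = "your-org/umt5-small"
api.create_repo(repo_id, exist_ok=True)
api.upload_folder(folder_path="./converted/umt5-small", repo_id=repo_id)
```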
What does this PR do?
Supersedes #22626, which has been stale for quite some time.
A Kaggle notebook for reproducing and running the original model:
https://www.kaggle.com/arthurzucker/umt5-inference
84 tokens are free to use apparently.
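Once the conversion is done, the model should be usable like any other transformers seq2seq model. A minimal sketch, assuming the converted checkpoint ends up under the repo id `google/umt5-small`:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

ckpt = "google/umt5-small"  # assumed repo id for the converted checkpoint
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForSeq2SeqLM.from_pretrained(ckpt)

inputs = tokenizer("A <extra_id_0> walks into a bar.", return_tensors="pt")
generated = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.batch_decode(generated, skip_special_tokens=False))
```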
For a first conversion I'll be using this: