Add Seamless M4T model #25693

Merged on Oct 23, 2023 (250 commits)
Changes from 152 commits
fb7d0ab
first raw commit
ylacombe Aug 16, 2023
48da0bf
still POC
ylacombe Aug 17, 2023
2c493a5
tentative convert script
ylacombe Aug 17, 2023
ef5106d
almost working speech encoder conversion scripts
ylacombe Aug 18, 2023
d83ea6b
intermediate code for encoder/decoders
ylacombe Aug 18, 2023
f0bc513
add modeling code
ylacombe Aug 18, 2023
70661ae
first version of speech encoder
ylacombe Aug 18, 2023
3874353
make style
ylacombe Aug 18, 2023
c37b7bd
add new adapter layer architecture
ylacombe Aug 18, 2023
3ded19b
add adapter block
ylacombe Aug 18, 2023
0bf81fd
add first tentative config
ylacombe Aug 18, 2023
4bbf681
add working speech encoder conversion
ylacombe Aug 18, 2023
e54bdd5
base model convert works now
ylacombe Aug 20, 2023
0de52f7
make style
ylacombe Aug 20, 2023
ca7a980
remove unnecessary classes
ylacombe Aug 20, 2023
aac2a34
remove unecessary functions
ylacombe Aug 20, 2023
3735b07
add modeling code speech encoder
ylacombe Aug 20, 2023
66225db
rework logics
ylacombe Aug 20, 2023
41a826f
forward pass of sub components work
ylacombe Aug 21, 2023
ae3a7e0
add modeling codes
ylacombe Aug 21, 2023
4f29e2e
some config modifs and modeling code modifs
ylacombe Aug 21, 2023
4692f59
save WIP
ylacombe Aug 22, 2023
451d11e
new edits
ylacombe Aug 22, 2023
319333e
same output speech encoder
ylacombe Aug 22, 2023
5928c72
correct attention mask
ylacombe Aug 23, 2023
8bd3a17
correct attention mask
ylacombe Aug 23, 2023
0342b69
fix generation
ylacombe Aug 23, 2023
09331ac
new generation logics
ylacombe Aug 23, 2023
b20f23b
erase comments
ylacombe Aug 23, 2023
74d06c1
make style
ylacombe Aug 23, 2023
38446d5
fix typo
ylacombe Aug 23, 2023
67cf10e
add some descriptions
ylacombe Aug 23, 2023
8568cfb
new state
ylacombe Aug 23, 2023
5fed5c0
clean imports
ylacombe Aug 23, 2023
66920c1
add tests
ylacombe Aug 23, 2023
b6a5368
make style
ylacombe Aug 23, 2023
6909a02
make beam search and num_return_sequences>1 works
ylacombe Aug 24, 2023
c96f127
correct edge case issue
ylacombe Aug 24, 2023
f525f24
correct SeamlessM4TConformerSamePadLayer copied from
ylacombe Aug 24, 2023
850990b
replace ACT2FN relu by nn.relu
ylacombe Aug 24, 2023
c8d00ea
remove unecessary return variable
ylacombe Aug 24, 2023
fc031e4
move back a class
ylacombe Aug 24, 2023
6ca23e3
change name conformer_attention_mask ->conv_attention_mask
ylacombe Aug 24, 2023
8a907ce
better nit code
ylacombe Aug 24, 2023
f9ae3ac
add some Copied from statements
ylacombe Aug 24, 2023
b5a33fc
small nits
ylacombe Aug 24, 2023
ab97f67
small nit in dict.get
ylacombe Aug 24, 2023
88d1d76
rename t2u model -> conditionalgeneration
ylacombe Aug 24, 2023
ffafd66
ongoing refactoring of structure
ylacombe Aug 24, 2023
66ded60
update models architecture
ylacombe Aug 24, 2023
3fb3100
remove SeamlessM4TMultiModal classes
ylacombe Aug 24, 2023
bf81144
add tests
ylacombe Aug 24, 2023
d0310af
adapt tests
ylacombe Aug 24, 2023
5226aac
some non-working code for vocoder
ylacombe Aug 24, 2023
4b470ea
add seamlessM4T vocoder
ylacombe Aug 25, 2023
8bf0e37
remove buggy line
ylacombe Aug 25, 2023
1e48bc7
fix some hifigan related bugs
ylacombe Aug 25, 2023
42eb3e2
remove hifigan specifc config
ylacombe Aug 25, 2023
e0d8eb9
change
ylacombe Aug 25, 2023
ae11f30
add WIP tokenization
ylacombe Aug 25, 2023
7fa366d
add seamlessM4T working tokenzier
ylacombe Aug 28, 2023
aef9ac3
update tokenization
ylacombe Aug 28, 2023
75099dd
add tentative feature extractor
ylacombe Aug 28, 2023
c97a7a7
Update converting script
ylacombe Aug 28, 2023
a82f7b3
update working FE
ylacombe Aug 29, 2023
9786302
refactor input_values -> input_features
ylacombe Aug 29, 2023
837e160
update FE
ylacombe Aug 29, 2023
9e2ea89
changes in generation, tokenizer and modeling
ylacombe Aug 30, 2023
6a8bd6f
make style and add t2u_decoder_input_ids
ylacombe Aug 30, 2023
c676019
add intermediate outputs for ToSpeech models
ylacombe Aug 30, 2023
5894115
add vocoder to speech models
ylacombe Aug 30, 2023
a9ad3dc
update valueerror
ylacombe Aug 30, 2023
03915d7
update FE with languages
ylacombe Aug 30, 2023
0ebc542
add vocoder convert
ylacombe Aug 30, 2023
f6d5e7c
update config docstrings and names
ylacombe Aug 31, 2023
02b2ba4
update generation code and configuration
ylacombe Aug 31, 2023
82acf95
remove todos and update config.pad_token_id to generation_config.pad_…
ylacombe Aug 31, 2023
7f447b6
move block vocoder
ylacombe Aug 31, 2023
75230e4
remove unecessary code and uniformize tospeech code
ylacombe Aug 31, 2023
e2c4a68
add feature extractor import
ylacombe Aug 31, 2023
87ed6bc
make style and fix some copies from
ylacombe Aug 31, 2023
a1cffc2
correct consistency + make fix-copies
ylacombe Aug 31, 2023
f540155
add processor code
ylacombe Aug 31, 2023
da17767
remove comments
ylacombe Aug 31, 2023
ec4b204
add fast tokenizer support
ylacombe Aug 31, 2023
4a8c7af
correct pad_token_id in M4TModel
ylacombe Sep 1, 2023
e91c55b
correct config
ylacombe Sep 1, 2023
b6e0bc8
update tests and codes + make style
ylacombe Sep 3, 2023
5c1df1f
make some suggested correstion - correct comments and change naming
ylacombe Sep 3, 2023
e92c64e
rename some attributes
ylacombe Sep 3, 2023
46d6085
rename some attributes
ylacombe Sep 3, 2023
d26e04e
remove unecessary sequential
ylacombe Sep 3, 2023
f490ac1
remove option to use dur predictor
ylacombe Sep 3, 2023
3384612
nit
ylacombe Sep 3, 2023
69d5508
refactor hifigan
ylacombe Sep 3, 2023
c45fe50
replace normalize_mean and normalize_var with do_normalize + save lan…
ylacombe Sep 4, 2023
7c0d981
add tests
ylacombe Sep 4, 2023
c2e3547
Merge branch 'main' into add-S2S-model
ylacombe Sep 4, 2023
2d59fa0
change tgt_lang logic
ylacombe Sep 5, 2023
7173baa
update generation ToSpeech
ylacombe Sep 5, 2023
f1a38f7
add support import SeamlessM4TProcessor
ylacombe Sep 5, 2023
305e16c
fix generate
ylacombe Sep 5, 2023
067d918
make tests
ylacombe Sep 5, 2023
c4fb4ce
update integration tests, add option to only return text and update t…
ylacombe Sep 5, 2023
7d39862
fix wrong function call
ylacombe Sep 5, 2023
d177e01
update import and convert script
ylacombe Sep 5, 2023
a85ae94
update integration tests + update repo id
ylacombe Sep 5, 2023
f662725
correct paths and add first test
ylacombe Sep 5, 2023
47c0bc5
update how new attention masks are computed
ylacombe Sep 6, 2023
8060aa4
update tests
ylacombe Sep 6, 2023
cd3878b
take first care of batching in vocoder code
ylacombe Sep 6, 2023
bbb398d
add batching with the vocoder
ylacombe Sep 6, 2023
808366f
add waveform lengths to model outputs
ylacombe Sep 6, 2023
d96eba5
make style
ylacombe Sep 6, 2023
aeb1a67
add generate kwargs + forward kwargs of M4TModel
ylacombe Sep 6, 2023
e62d681
add docstrings forward methods
ylacombe Sep 7, 2023
1d68419
reformate docstrings
ylacombe Sep 7, 2023
ea08dc3
add docstrings t2u model
ylacombe Sep 7, 2023
9e8a8b8
add another round of modeling docstrings + reformate speaker_id -> sp…
ylacombe Sep 7, 2023
7c65688
make style
ylacombe Sep 7, 2023
7779477
fix check_repo
ylacombe Sep 7, 2023
7f613ae
make style
ylacombe Sep 7, 2023
b804e3d
add seamlessm4t to toctree
ylacombe Sep 7, 2023
6af3b28
correct check_config_attributes
ylacombe Sep 7, 2023
cd9e2b4
write config docstrings + some modifs
ylacombe Sep 7, 2023
dff8d8f
make style
ylacombe Sep 7, 2023
a046830
add docstrings tokenizer
ylacombe Sep 7, 2023
703863a
add docstrings to processor, fe and tokenizers
ylacombe Sep 7, 2023
02cc3e7
make style
ylacombe Sep 7, 2023
8128c66
write first version of model docs
ylacombe Sep 7, 2023
e08c86f
fix FE + correct FE test
ylacombe Sep 12, 2023
1bee27d
fix tokenizer + add correct integration tests
ylacombe Sep 12, 2023
22edd86
fix most tokenization tests
ylacombe Sep 12, 2023
22edbb1
make style
ylacombe Sep 12, 2023
9087bcf
correct most processor test
ylacombe Sep 12, 2023
da31ddb
add generation tests and fix num_return_sequences > 1
ylacombe Sep 12, 2023
a2d4f7f
correct integration tests -still one left
ylacombe Sep 12, 2023
548e79a
make style
ylacombe Sep 12, 2023
31a8ea9
correct position embedding
ylacombe Sep 13, 2023
5d6caba
change numbeams to 1
ylacombe Sep 13, 2023
b9deb48
refactor some modeling code and correct one test
ylacombe Sep 13, 2023
43b92cd
make style
ylacombe Sep 13, 2023
1d35ba4
correct typo
ylacombe Sep 13, 2023
ad1e476
refactor intermediate fnn
ylacombe Sep 13, 2023
b5967c1
refactor feedforward conformer
ylacombe Sep 13, 2023
0f2682d
make style
ylacombe Sep 13, 2023
a1d9238
remove comments
ylacombe Sep 13, 2023
95aefed
make style
ylacombe Sep 13, 2023
872789f
fix tokenizer tests
ylacombe Sep 14, 2023
f50ff49
make style
ylacombe Sep 14, 2023
b0ee7e1
correct processor tests
ylacombe Sep 14, 2023
61e880a
make style
ylacombe Sep 14, 2023
95e8c85
correct S2TT integration
ylacombe Sep 15, 2023
8220a9e
Apply suggestions from Sanchit code review
ylacombe Sep 18, 2023
816559d
correct typo
ylacombe Sep 18, 2023
60b8755
replace torch.nn->nn + make style
ylacombe Sep 18, 2023
286960b
change Output naming (waveforms -> waveform) and ordering
ylacombe Sep 18, 2023
411d5bd
nit renaming and formating
ylacombe Sep 18, 2023
c8afa46
remove return None when not necessary
ylacombe Sep 18, 2023
8c407b1
refactor SeamlessM4TConformerFeedForward
ylacombe Sep 18, 2023
25a83ef
nit typo
ylacombe Sep 18, 2023
771f988
remove almost copied from comments
ylacombe Sep 18, 2023
6add43a
add a copied from comment and remove an unecessary dropout
ylacombe Sep 18, 2023
fb85bb4
remove inputs_embeds from speechencoder
ylacombe Sep 18, 2023
82123b7
remove backward compatibiliy function
ylacombe Sep 18, 2023
7c04630
reformate class docstrings for a few components
ylacombe Sep 18, 2023
f02a3cb
remove unecessary methods
ylacombe Sep 18, 2023
7475a9f
split over 2 lines smthg hard to read
ylacombe Sep 18, 2023
19c5700
make style
ylacombe Sep 18, 2023
f7724ed
replace two steps offset by one step as suggested
ylacombe Sep 18, 2023
e1ace1a
nice typo
ylacombe Sep 18, 2023
4effd11
move warnings
ylacombe Sep 18, 2023
bf52c78
remove useless lines from processor
ylacombe Sep 18, 2023
d10fb09
make generation non-standard test more robusts
ylacombe Sep 18, 2023
5cb8df6
remove torch.inference_mode from tests
ylacombe Sep 18, 2023
24038ed
split integration tests
ylacombe Sep 18, 2023
35951a7
enrich md
ylacombe Sep 18, 2023
506fd19
rename control_symbol_vocoder_offset->vocoder_offset
ylacombe Sep 18, 2023
bfab469
clean convert file
ylacombe Sep 18, 2023
4fc1f0f
remove tgt_lang and src_lang from FE
ylacombe Sep 18, 2023
415f674
change generate docstring of ToText models
ylacombe Sep 18, 2023
f69314c
update generate docstring of tospeech models
ylacombe Sep 18, 2023
1d4ce12
unify how to deal withtext_decoder_input_ids
ylacombe Sep 18, 2023
dde7de0
add default spkr_id
ylacombe Sep 18, 2023
d6994c3
unify tgt_lang for t2u_model
ylacombe Sep 18, 2023
46efba8
simplify tgt_lang verification
ylacombe Sep 18, 2023
8b82f20
remove a todo
ylacombe Sep 18, 2023
a0e00a6
change config docstring
ylacombe Sep 18, 2023
4ead78c
make style
ylacombe Sep 18, 2023
ada4824
simplify t2u_tgt_lang_id
ylacombe Sep 18, 2023
a0897f1
make style
ylacombe Sep 18, 2023
5b2367d
enrich/correct comments
ylacombe Sep 18, 2023
eb597c9
enrich .md
ylacombe Sep 18, 2023
c7ec3ce
correct typo in docstrings
ylacombe Sep 19, 2023
1af4ee1
add torchaudio dependency
ylacombe Sep 19, 2023
4138711
Merge branch 'huggingface:main' into add-S2S-model
ylacombe Sep 19, 2023
ded425c
update tokenizer
ylacombe Sep 19, 2023
a527ed0
make style and fix copies
ylacombe Sep 19, 2023
39a8265
modify SeamlessM4TConverter with new tokenizer behaviour
ylacombe Sep 19, 2023
d0f82f4
make style
ylacombe Sep 19, 2023
57b5ad4
correct small typo docs
ylacombe Sep 19, 2023
3785ebe
fix import
ylacombe Sep 19, 2023
d094293
update docs and add requirement to tests
ylacombe Sep 20, 2023
6b41584
Merge branch 'main' into add-S2S-model
ylacombe Sep 20, 2023
273dd9e
add convert_fairseq2_to_hf in utils/not_doctested.txt
ylacombe Sep 21, 2023
a10ff31
Merge branch 'huggingface:main' into add-S2S-model
ylacombe Sep 21, 2023
faae35d
update FE
ylacombe Sep 22, 2023
4e7ea18
fix imports and make style
ylacombe Sep 22, 2023
d9a35a3
remove torchaudio in FE test
ylacombe Sep 22, 2023
ce126eb
add seamless_m4t.md to utils/not_doctested.txt
ylacombe Sep 22, 2023
cb4ccf7
nits and change the way docstring dataset is loaded
ylacombe Sep 28, 2023
0a1bdd4
move checkpoints from ylacombe/ to facebook/ orga
ylacombe Sep 28, 2023
63a01ad
refactor warning/error to be in the 119 line width limit
ylacombe Sep 28, 2023
b1f375b
round overly precised floats
ylacombe Sep 28, 2023
a28f6a2
add stereo audio behaviour
ylacombe Sep 28, 2023
b32bcd2
refactor .md and make style
ylacombe Sep 28, 2023
e9cb1a4
enrich docs with more precised architecture description
ylacombe Oct 6, 2023
1b310fc
readd undocumented models
ylacombe Oct 6, 2023
c4b70fd
Merge branch 'main' into add-S2S-model
ylacombe Oct 6, 2023
0772b68
make fix-copies
ylacombe Oct 6, 2023
9c47abd
apply some suggestions
ylacombe Oct 6, 2023
782c8e3
Apply suggestions from code review
ylacombe Oct 6, 2023
4257721
correct bug from previous commit
ylacombe Oct 6, 2023
102a448
refactor a parameter allowing to clean the code + some small nits
ylacombe Oct 6, 2023
fe9ceca
clean tokenizer
ylacombe Oct 9, 2023
a68ff89
make style and fix
ylacombe Oct 9, 2023
cc4fbfb
make style
ylacombe Oct 9, 2023
15c5bce
clean tokenizers arguments
ylacombe Oct 10, 2023
071532f
add precisions for some tests
ylacombe Oct 10, 2023
789f421
move docs from not_tested to slow
ylacombe Oct 10, 2023
48b3488
modify tokenizer according to last comments
ylacombe Oct 10, 2023
ebee245
add copied from statements in tests
ylacombe Oct 10, 2023
87a5886
correct convert script
ylacombe Oct 11, 2023
e4685cb
Merge branch 'huggingface:main' into add-S2S-model
ylacombe Oct 17, 2023
cad5136
correct parameter docstring style
ylacombe Oct 17, 2023
a4f437d
correct tokenization
ylacombe Oct 18, 2023
c367cb9
correct multi gpus
ylacombe Oct 18, 2023
b137431
make style
ylacombe Oct 18, 2023
8c7f5a4
clean modeling code
ylacombe Oct 18, 2023
22aca15
make style
ylacombe Oct 18, 2023
b0e2626
add copied from statements
ylacombe Oct 18, 2023
bec7235
add copied statements
ylacombe Oct 18, 2023
14c4d4a
add support with ASR pipeline
ylacombe Oct 18, 2023
121187a
remove file added inadvertently
ylacombe Oct 18, 2023
0563778
fix docstrings seamlessM4TModel
ylacombe Oct 19, 2023
7620fd6
add seamlessM4TConfig to OBJECTS_TO_IGNORE due of unconventional mark…
ylacombe Oct 19, 2023
79dda0a
Merge branch 'huggingface:main' into add-S2S-model
ylacombe Oct 20, 2023
e65cf14
add seamlessm4t to assisted generation ignored models
ylacombe Oct 20, 2023
8682fc1
Merge branch 'huggingface:main' into add-S2S-model
ylacombe Oct 20, 2023
6369fd6
Merge branch 'huggingface:main' into add-S2S-model
ylacombe Oct 23, 2023
1 change: 1 addition & 0 deletions README.md
@@ -450,6 +450,7 @@ Current number of checkpoints: ![](https://img.shields.io/endpoint?url=https://h
1. **[RoCBert](https://huggingface.co/docs/transformers/model_doc/roc_bert)** (from WeChatAI) released with the paper [RoCBert: Robust Chinese Bert with Multimodal Contrastive Pretraining](https://aclanthology.org/2022.acl-long.65.pdf) by HuiSu, WeiweiShi, XiaoyuShen, XiaoZhou, TuoJi, JiaruiFang, JieZhou.
1. **[RoFormer](https://huggingface.co/docs/transformers/model_doc/roformer)** (from ZhuiyiTechnology), released together with the paper [RoFormer: Enhanced Transformer with Rotary Position Embedding](https://arxiv.org/abs/2104.09864) by Jianlin Su and Yu Lu and Shengfeng Pan and Bo Wen and Yunfeng Liu.
1. **[RWKV](https://huggingface.co/docs/transformers/model_doc/rwkv)** (from Bo Peng), released on [this repo](https://github.com/BlinkDL/RWKV-LM) by Bo Peng.
1. **[SeamlessM4T](https://huggingface.co/docs/transformers/main/model_doc/seamless_m4t)** (from Meta AI) released with the paper [SeamlessM4T — Massively Multilingual & Multimodal Machine Translation](https://dl.fbaipublicfiles.com/seamless/seamless_m4t_paper.pdf) by the Seamless Communication team.
Collaborator

Seamless communication team did not share the author names? 😅

Contributor Author

This is a really long list here, and there are at least 10 authors with equal contribution!
[image]

1. **[SegFormer](https://huggingface.co/docs/transformers/model_doc/segformer)** (from NVIDIA) released with the paper [SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers](https://arxiv.org/abs/2105.15203) by Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, Ping Luo.
1. **[Segment Anything](https://huggingface.co/docs/transformers/model_doc/sam)** (from Meta AI) released with the paper [Segment Anything](https://arxiv.org/pdf/2304.02643v1.pdf) by Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alex Berg, Wan-Yen Lo, Piotr Dollar, Ross Girshick.
1. **[SEW](https://huggingface.co/docs/transformers/model_doc/sew)** (from ASAPP) released with the paper [Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition](https://arxiv.org/abs/2109.06870) by Felix Wu, Kwangyoun Kim, Jing Pan, Kyu Han, Kilian Q. Weinberger, Yoav Artzi.
1 change: 1 addition & 0 deletions README_es.md
@@ -427,6 +427,7 @@ Número actual de puntos de control: ![](https://img.shields.io/endpoint?url=htt
1. **[RoCBert](https://huggingface.co/docs/transformers/model_doc/roc_bert)** (from WeChatAI) released with the paper [RoCBert: Robust Chinese Bert with Multimodal Contrastive Pretraining](https://aclanthology.org/2022.acl-long.65.pdf) by HuiSu, WeiweiShi, XiaoyuShen, XiaoZhou, TuoJi, JiaruiFang, JieZhou.
1. **[RoFormer](https://huggingface.co/docs/transformers/model_doc/roformer)** (from ZhuiyiTechnology), released together with the paper [RoFormer: Enhanced Transformer with Rotary Position Embedding](https://arxiv.org/abs/2104.09864) by Jianlin Su and Yu Lu and Shengfeng Pan and Bo Wen and Yunfeng Liu.
1. **[RWKV](https://huggingface.co/docs/transformers/model_doc/rwkv)** (from Bo Peng) released with the paper [this repo](https://github.com/BlinkDL/RWKV-LM) by Bo Peng.
1. **[SeamlessM4T](https://huggingface.co/docs/transformers/main/model_doc/seamless_m4t)** (from Meta AI) released with the paper [SeamlessM4T — Massively Multilingual & Multimodal Machine Translation](https://dl.fbaipublicfiles.com/seamless/seamless_m4t_paper.pdf) by the Seamless Communication team.
1. **[SegFormer](https://huggingface.co/docs/transformers/model_doc/segformer)** (from NVIDIA) released with the paper [SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers](https://arxiv.org/abs/2105.15203) by Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, Ping Luo.
1. **[Segment Anything](https://huggingface.co/docs/transformers/model_doc/sam)** (from Meta AI) released with the paper [Segment Anything](https://arxiv.org/pdf/2304.02643v1.pdf) by Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alex Berg, Wan-Yen Lo, Piotr Dollar, Ross Girshick.
1. **[SEW](https://huggingface.co/docs/transformers/model_doc/sew)** (from ASAPP) released with the paper [Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition](https://arxiv.org/abs/2109.06870) by Felix Wu, Kwangyoun Kim, Jing Pan, Kyu Han, Kilian Q. Weinberger, Yoav Artzi.
1 change: 1 addition & 0 deletions README_hd.md
@@ -399,6 +399,7 @@ conda install -c huggingface transformers
1. **[RoCBert](https://huggingface.co/docs/transformers/model_doc/roc_bert)** (from WeChatAI) released with the paper [RoCBert: Robust Chinese Bert with Multimodal Contrastive Pretraining](https://aclanthology.org/2022.acl-long.65.pdf) by HuiSu, WeiweiShi, XiaoyuShen, XiaoZhou, TuoJi, JiaruiFang, JieZhou.
1. **[RoFormer](https://huggingface.co/docs/transformers/model_doc/roformer)** (झुईई टेक्नोलॉजी से), साथ में पेपर [रोफॉर्मर: रोटरी पोजिशन एंबेडिंग के साथ एन्हांस्ड ट्रांसफॉर्मर] (https://arxiv.org/pdf/2104.09864v1.pdf) जियानलिन सु और यू लू और शेंगफेंग पैन और बो वेन और युनफेंग लियू द्वारा प्रकाशित।
1. **[RWKV](https://huggingface.co/docs/transformers/model_doc/rwkv)** (Bo Peng से) Bo Peng. द्वाराअनुसंधान पत्र [this repo](https://github.com/BlinkDL/RWKV-LM) के साथ जारी किया गया
1. **[SeamlessM4T](https://huggingface.co/docs/transformers/main/model_doc/seamless_m4t)** (from Meta AI) released with the paper [SeamlessM4T — Massively Multilingual & Multimodal Machine Translation](https://dl.fbaipublicfiles.com/seamless/seamless_m4t_paper.pdf) by the Seamless Communication team.
1. **[SegFormer](https://huggingface.co/docs/transformers/model_doc/segformer)** (from NVIDIA) released with the paper [SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers](https://arxiv.org/abs/2105.15203) by Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, Ping Luo.
1. **[Segment Anything](https://huggingface.co/docs/transformers/model_doc/sam)** (Meta AI से) Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alex Berg, Wan-Yen Lo, Piotr Dollar, Ross Girshick. द्वाराअनुसंधान पत्र [Segment Anything](https://arxiv.org/pdf/2304.02643v1.pdf) के साथ जारी किया गया
1. **[SEW](https://huggingface.co/docs/transformers/model_doc/sew)** (ASAPP से) साथ देने वाला पेपर [भाषण पहचान के लिए अनसुपरवाइज्ड प्री-ट्रेनिंग में परफॉर्मेंस-एफिशिएंसी ट्रेड-ऑफ्स](https ://arxiv.org/abs/2109.06870) फेलिक्स वू, क्वांगयुन किम, जिंग पैन, क्यू हान, किलियन क्यू. वेनबर्गर, योव आर्टज़ी द्वारा।
1 change: 1 addition & 0 deletions README_ja.md
@@ -461,6 +461,7 @@ Flax、PyTorch、TensorFlowをcondaでインストールする方法は、それ
1. **[RoCBert](https://huggingface.co/docs/transformers/model_doc/roc_bert)** (WeChatAI から) HuiSu, WeiweiShi, XiaoyuShen, XiaoZhou, TuoJi, JiaruiFang, JieZhou から公開された研究論文: [RoCBert: Robust Chinese Bert with Multimodal Contrastive Pretraining](https://aclanthology.org/2022.acl-long.65.pdf)
1. **[RoFormer](https://huggingface.co/docs/transformers/model_doc/roformer)** (ZhuiyiTechnology から), Jianlin Su and Yu Lu and Shengfeng Pan and Bo Wen and Yunfeng Liu から公開された研究論文: [RoFormer: Enhanced Transformer with Rotary Position Embedding](https://arxiv.org/abs/2104.09864)
1. **[RWKV](https://huggingface.co/docs/transformers/model_doc/rwkv)** (Bo Peng から) Bo Peng. から公開された研究論文 [this repo](https://github.com/BlinkDL/RWKV-LM)
1. **[SeamlessM4T](https://huggingface.co/docs/transformers/main/model_doc/seamless_m4t)** (from Meta AI) released with the paper [SeamlessM4T — Massively Multilingual & Multimodal Machine Translation](https://dl.fbaipublicfiles.com/seamless/seamless_m4t_paper.pdf) by the Seamless Communication team.
1. **[SegFormer](https://huggingface.co/docs/transformers/model_doc/segformer)** (NVIDIA から) Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, Ping Luo から公開された研究論文: [SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers](https://arxiv.org/abs/2105.15203)
1. **[Segment Anything](https://huggingface.co/docs/transformers/model_doc/sam)** (Meta AI から) Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alex Berg, Wan-Yen Lo, Piotr Dollar, Ross Girshick. から公開された研究論文 [Segment Anything](https://arxiv.org/pdf/2304.02643v1.pdf)
1. **[SEW](https://huggingface.co/docs/transformers/model_doc/sew)** (ASAPP から) Felix Wu, Kwangyoun Kim, Jing Pan, Kyu Han, Kilian Q. Weinberger, Yoav Artzi から公開された研究論文: [Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition](https://arxiv.org/abs/2109.06870)
1 change: 1 addition & 0 deletions README_ko.md
@@ -376,6 +376,7 @@ Flax, PyTorch, TensorFlow 설치 페이지에서 이들을 conda로 설치하는
1. **[RoCBert](https://huggingface.co/docs/transformers/model_doc/roc_bert)** (WeChatAI 에서) HuiSu, WeiweiShi, XiaoyuShen, XiaoZhou, TuoJi, JiaruiFang, JieZhou 의 [RoCBert: Robust Chinese Bert with Multimodal Contrastive Pretraining](https://aclanthology.org/2022.acl-long.65.pdf) 논문과 함께 발표했습니다.
1. **[RoFormer](https://huggingface.co/docs/transformers/model_doc/roformer)** (ZhuiyiTechnology 에서) Jianlin Su and Yu Lu and Shengfeng Pan and Bo Wen and Yunfeng Liu 의 a [RoFormer: Enhanced Transformer with Rotary Position Embedding](https://arxiv.org/pdf/2104.09864v1.pdf) 논문과 함께 발표했습니다.
1. **[RWKV](https://huggingface.co/docs/transformers/model_doc/rwkv)** (Bo Peng 에서 제공)은 Bo Peng.의 [this repo](https://github.com/BlinkDL/RWKV-LM)논문과 함께 발표했습니다.
1. **[SeamlessM4T](https://huggingface.co/docs/transformers/main/model_doc/seamless_m4t)** (from Meta AI) released with the paper [SeamlessM4T — Massively Multilingual & Multimodal Machine Translation](https://dl.fbaipublicfiles.com/seamless/seamless_m4t_paper.pdf) by the Seamless Communication team.
1. **[SegFormer](https://huggingface.co/docs/transformers/model_doc/segformer)** (NVIDIA 에서) Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, Ping Luo 의 [SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers](https://arxiv.org/abs/2105.15203) 논문과 함께 발표했습니다.
1. **[Segment Anything](https://huggingface.co/docs/transformers/model_doc/sam)** (Meta AI 에서 제공)은 Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alex Berg, Wan-Yen Lo, Piotr Dollar, Ross Girshick.의 [Segment Anything](https://arxiv.org/pdf/2304.02643v1.pdf)논문과 함께 발표했습니다.
1. **[SEW](https://huggingface.co/docs/transformers/model_doc/sew)** (ASAPP 에서) Felix Wu, Kwangyoun Kim, Jing Pan, Kyu Han, Kilian Q. Weinberger, Yoav Artzi 의 [Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition](https://arxiv.org/abs/2109.06870) 논문과 함께 발표했습니다.
1 change: 1 addition & 0 deletions README_zh-hans.md
@@ -400,6 +400,7 @@ conda install -c huggingface transformers
1. **[RoCBert](https://huggingface.co/docs/transformers/model_doc/roc_bert)** (来自 WeChatAI), 伴随论文 [RoCBert: Robust Chinese Bert with Multimodal Contrastive Pretraining](https://aclanthology.org/2022.acl-long.65.pdf) 由 HuiSu, WeiweiShi, XiaoyuShen, XiaoZhou, TuoJi, JiaruiFang, JieZhou 发布。
1. **[RoFormer](https://huggingface.co/docs/transformers/model_doc/roformer)** (来自 ZhuiyiTechnology), 伴随论文 [RoFormer: Enhanced Transformer with Rotary Position Embedding](https://arxiv.org/pdf/2104.09864v1.pdf) 由 Jianlin Su and Yu Lu and Shengfeng Pan and Bo Wen and Yunfeng Liu 发布。
1. **[RWKV](https://huggingface.co/docs/transformers/model_doc/rwkv)** (来自 Bo Peng) 伴随论文 [this repo](https://github.com/BlinkDL/RWKV-LM) 由 Bo Peng 发布。
1. **[SeamlessM4T](https://huggingface.co/docs/transformers/main/model_doc/seamless_m4t)** (from Meta AI) released with the paper [SeamlessM4T — Massively Multilingual & Multimodal Machine Translation](https://dl.fbaipublicfiles.com/seamless/seamless_m4t_paper.pdf) by the Seamless Communication team.
1. **[SegFormer](https://huggingface.co/docs/transformers/model_doc/segformer)** (来自 NVIDIA) 伴随论文 [SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers](https://arxiv.org/abs/2105.15203) 由 Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, Ping Luo 发布。
1. **[Segment Anything](https://huggingface.co/docs/transformers/model_doc/sam)** (来自 Meta AI) 伴随论文 [Segment Anything](https://arxiv.org/pdf/2304.02643v1.pdf) 由 Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alex Berg, Wan-Yen Lo, Piotr Dollar, Ross Girshick 发布。
1. **[SEW](https://huggingface.co/docs/transformers/model_doc/sew)** (来自 ASAPP) 伴随论文 [Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition](https://arxiv.org/abs/2109.06870) 由 Felix Wu, Kwangyoun Kim, Jing Pan, Kyu Han, Kilian Q. Weinberger, Yoav Artzi 发布。
1 change: 1 addition & 0 deletions README_zh-hant.md
@@ -412,6 +412,7 @@ conda install -c huggingface transformers
1. **[RoCBert](https://huggingface.co/docs/transformers/model_doc/roc_bert)** (from WeChatAI) released with the paper [RoCBert: Robust Chinese Bert with Multimodal Contrastive Pretraining](https://aclanthology.org/2022.acl-long.65.pdf) by HuiSu, WeiweiShi, XiaoyuShen, XiaoZhou, TuoJi, JiaruiFang, JieZhou.
1. **[RoFormer](https://huggingface.co/docs/transformers/model_doc/roformer)** (from ZhuiyiTechnology), released together with the paper a [RoFormer: Enhanced Transformer with Rotary Position Embedding](https://arxiv.org/pdf/2104.09864v1.pdf) by Jianlin Su and Yu Lu and Shengfeng Pan and Bo Wen and Yunfeng Liu.
1. **[RWKV](https://huggingface.co/docs/transformers/model_doc/rwkv)** (from Bo Peng) released with the paper [this repo](https://github.com/BlinkDL/RWKV-LM) by Bo Peng.
1. **[SeamlessM4T](https://huggingface.co/docs/transformers/main/model_doc/seamless_m4t)** (from Meta AI) released with the paper [SeamlessM4T — Massively Multilingual & Multimodal Machine Translation](https://dl.fbaipublicfiles.com/seamless/seamless_m4t_paper.pdf) by the Seamless Communication team.
1. **[SegFormer](https://huggingface.co/docs/transformers/model_doc/segformer)** (from NVIDIA) released with the paper [SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers](https://arxiv.org/abs/2105.15203) by Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, Ping Luo.
1. **[Segment Anything](https://huggingface.co/docs/transformers/model_doc/sam)** (from Meta AI) released with the paper [Segment Anything](https://arxiv.org/pdf/2304.02643v1.pdf) by Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alex Berg, Wan-Yen Lo, Piotr Dollar, Ross Girshick.
1. **[SEW](https://huggingface.co/docs/transformers/model_doc/sew)** (from ASAPP) released with the paper [Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition](https://arxiv.org/abs/2109.06870) by Felix Wu, Kwangyoun Kim, Jing Pan, Kyu Han, Kilian Q. Weinberger, Yoav Artzi.
2 changes: 2 additions & 0 deletions docs/source/en/_toctree.yml
@@ -589,6 +589,8 @@
  title: MusicGen
- local: model_doc/pop2piano
  title: Pop2Piano
- local: model_doc/seamless_m4t
  title: Seamless-M4T
- local: model_doc/sew
  title: SEW
- local: model_doc/sew-d
2 changes: 2 additions & 0 deletions docs/source/en/index.md
@@ -216,6 +216,7 @@ The documentation is organized into five sections:
1. **[RoCBert](model_doc/roc_bert)** (from WeChatAI) released with the paper [RoCBert: Robust Chinese Bert with Multimodal Contrastive Pretraining](https://aclanthology.org/2022.acl-long.65.pdf) by HuiSu, WeiweiShi, XiaoyuShen, XiaoZhou, TuoJi, JiaruiFang, JieZhou.
1. **[RoFormer](model_doc/roformer)** (from ZhuiyiTechnology), released together with the paper [RoFormer: Enhanced Transformer with Rotary Position Embedding](https://arxiv.org/abs/2104.09864) by Jianlin Su and Yu Lu and Shengfeng Pan and Bo Wen and Yunfeng Liu.
1. **[RWKV](model_doc/rwkv)** (from Bo Peng), released on [this repo](https://github.com/BlinkDL/RWKV-LM) by Bo Peng.
1. **[SeamlessM4T](model_doc/seamless_m4t)** (from Meta AI) released with the paper [SeamlessM4T — Massively Multilingual & Multimodal Machine Translation](https://dl.fbaipublicfiles.com/seamless/seamless_m4t_paper.pdf) by the Seamless Communication team.
1. **[SegFormer](model_doc/segformer)** (from NVIDIA) released with the paper [SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers](https://arxiv.org/abs/2105.15203) by Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, Ping Luo.
1. **[Segment Anything](model_doc/sam)** (from Meta AI) released with the paper [Segment Anything](https://arxiv.org/pdf/2304.02643v1.pdf) by Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alex Berg, Wan-Yen Lo, Piotr Dollar, Ross Girshick.
1. **[SEW](model_doc/sew)** (from ASAPP) released with the paper [Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition](https://arxiv.org/abs/2109.06870) by Felix Wu, Kwangyoun Kim, Jing Pan, Kyu Han, Kilian Q. Weinberger, Yoav Artzi.
@@ -437,6 +438,7 @@ Flax), PyTorch, and/or TensorFlow.
| RoFormer | ✅ | ✅ | ✅ |
| RWKV | ✅ | ❌ | ❌ |
| SAM | ✅ | ✅ | ❌ |
| SeamlessM4T | ✅ | ❌ | ❌ |
| SegFormer | ✅ | ✅ | ❌ |
| SEW | ✅ | ❌ | ❌ |
| SEW-D | ✅ | ❌ | ❌ |