[gemma] Adds support for Gemma 💎 #29167

Merged 142 commits on Feb 21, 2024.
Commits (142)
7434ea2
inital commit
ArthurZucker Jan 21, 2024
6165925
update
ArthurZucker Jan 21, 2024
69006fa
update conversion checkpoint
ArthurZucker Jan 21, 2024
cb80199
update conversion script
ArthurZucker Jan 21, 2024
08252ce
nits
ArthurZucker Jan 21, 2024
32ea5fb
some fixes
younesbelkada Jan 21, 2024
888ab95
nits
ArthurZucker Jan 21, 2024
ce0aa57
Merge branch 'add-golden-gate' of https://github.com/huggingface/new-…
ArthurZucker Jan 21, 2024
e303e48
merge
ArthurZucker Jan 21, 2024
aefc4bc
fix permute
younesbelkada Jan 21, 2024
78de9f5
nits
ArthurZucker Jan 21, 2024
fb2917d
Merge branch 'add-golden-gate' of https://github.com/huggingface/new-…
younesbelkada Jan 21, 2024
f3ad1b8
fix
younesbelkada Jan 21, 2024
e3a0bbd
nits
ArthurZucker Jan 21, 2024
bbee069
nits
ArthurZucker Jan 21, 2024
e0e0646
nits
ArthurZucker Jan 21, 2024
5f63b4c
fix rope
ArthurZucker Jan 21, 2024
4734805
fix both rope
ArthurZucker Jan 21, 2024
9b27edd
merge
ArthurZucker Jan 21, 2024
d20574d
nites
ArthurZucker Jan 21, 2024
3f76b2f
style
ArthurZucker Jan 21, 2024
3449b4b
make sure flax works
ArthurZucker Jan 21, 2024
fbbe149
fix flax init code
ArthurZucker Jan 21, 2024
bf8ed52
fix foward
ArthurZucker Jan 21, 2024
8273aa7
nits
ArthurZucker Jan 21, 2024
cf6345c
print flax generation out
ArthurZucker Jan 21, 2024
6f49e21
current code
ArthurZucker Jan 21, 2024
e2b09ee
nits
ArthurZucker Jan 22, 2024
94e1020
SIIIIIIIIIIIIIIIIIII
ArthurZucker Jan 22, 2024
afd89e8
update
ArthurZucker Jan 22, 2024
f21564b
add new tokenizer
ArthurZucker Jan 22, 2024
5c85aa3
correct fast tokenizer
ArthurZucker Jan 22, 2024
342095b
fix conversion
ArthurZucker Jan 22, 2024
b6b6f35
more comments
ArthurZucker Jan 22, 2024
f16f9b0
fix modeling and conversion
ArthurZucker Jan 22, 2024
62077c3
nits and nits
ArthurZucker Jan 22, 2024
274aea7
nits testing
ArthurZucker Jan 22, 2024
9e629dc
add some tokenization tests
ArthurZucker Jan 22, 2024
b128ceb
add some edge cases
ArthurZucker Jan 22, 2024
7bec5c6
add slow tests and fix them
younesbelkada Jan 22, 2024
51872ef
fixup
younesbelkada Jan 22, 2024
c8f89d2
fix copies for modeling
younesbelkada Jan 22, 2024
367cdcd
fix copies
younesbelkada Jan 22, 2024
9fc4fd8
add 7B slow tests
younesbelkada Jan 22, 2024
ea24f25
fix
younesbelkada Jan 22, 2024
e337cf2
fix
younesbelkada Jan 22, 2024
b356561
fix tests
younesbelkada Jan 22, 2024
423d0b1
make tokenizer cis go green
ArthurZucker Jan 23, 2024
ac8ee3d
styling
ArthurZucker Jan 23, 2024
ee4eebf
last tokenizer nits
ArthurZucker Jan 23, 2024
06d3ff7
update jax tests
ArthurZucker Jan 23, 2024
b7b31ca
fix flax for 7b
ArthurZucker Jan 24, 2024
7626e71
add jit testing 🤗
ArthurZucker Jan 24, 2024
5db6ddc
cleanups
ArthurZucker Jan 25, 2024
5123d3c
isolated nit, inv_freq for rotary_emb.inv_freq
ArthurZucker Jan 25, 2024
ca75915
propagate to jax
ArthurZucker Jan 25, 2024
36ecdc9
Apply suggestions from code review
ArthurZucker Jan 26, 2024
792dcf0
adjust test
younesbelkada Jan 26, 2024
d700211
Merge branch 'add-golden-gate' of https://github.com/huggingface/new-…
younesbelkada Jan 26, 2024
dbc976e
fix conversion script
younesbelkada Jan 31, 2024
10ace9f
change name
ArthurZucker Feb 7, 2024
6ff459c
correct file names
ArthurZucker Feb 8, 2024
a5bb7a2
Merge branch 'add-golden-gate' of github.com:huggingface/new-model-ad…
ArthurZucker Feb 8, 2024
2126d1c
update conversion script
ArthurZucker Feb 8, 2024
ac7ac87
Fix bos and eos token ids in the model configuration (#3)
pcuenca Feb 13, 2024
bd2a760
update modelling
ArthurZucker Feb 13, 2024
123d72c
update conversion script
ArthurZucker Feb 13, 2024
aab3110
merge
ArthurZucker Feb 13, 2024
0d2acff
Merge remote-tracking branch 'gg/main' into HEAD
younesbelkada Feb 13, 2024
4e08dd9
add static cache for gemma
younesbelkada Feb 13, 2024
6976cab
fix sdpa generate
younesbelkada Feb 14, 2024
92c7a7f
fix batched
younesbelkada Feb 14, 2024
f453a42
multiple fixes
younesbelkada Feb 14, 2024
491ea9f
fix FA2
younesbelkada Feb 14, 2024
4b4b621
final fix
younesbelkada Feb 14, 2024
e9d36be
Rename a few missing strings and filenames (#4)
pcuenca Feb 15, 2024
cdf9333
Merge branch 'main' into add-golden-gate
younesbelkada Feb 19, 2024
51a7d9f
merge with upstream main
younesbelkada Feb 19, 2024
89384e8
fix copies
younesbelkada Feb 19, 2024
648ef04
fix copies
younesbelkada Feb 19, 2024
fd35d32
fix fixup
younesbelkada Feb 19, 2024
58d7146
fix fixup
younesbelkada Feb 19, 2024
14d0f54
fix
younesbelkada Feb 19, 2024
3822c4c
fix
younesbelkada Feb 19, 2024
e50721d
final tests
younesbelkada Feb 19, 2024
a764543
fix fx gemma tests
sanchit-gandhi Feb 20, 2024
32a0492
fix fx bf16/fp16 tests
sanchit-gandhi Feb 20, 2024
069b8b5
update slow fx tests
sanchit-gandhi Feb 20, 2024
154e1df
fx slow tests: one logits, one generation
sanchit-gandhi Feb 20, 2024
a4074b0
move jit test standalone
sanchit-gandhi Feb 20, 2024
d7c2eb2
Merge commit '0996a10077219de0556281511fc02f3ab68002d5' into add-gold…
ArthurZucker Feb 21, 2024
fc6ac3b
Apply suggestions from code review
ArthurZucker Feb 21, 2024
a259c04
nits
ArthurZucker Feb 21, 2024
c590a9a
Merge branch 'add-golden-gate' of github.com:huggingface/new-model-ad…
ArthurZucker Feb 21, 2024
b8d8edb
tokenizer updates
ArthurZucker Feb 21, 2024
dc9ecf6
more tokenization updates: custom GemmaSentencepieceExtrator
ArthurZucker Feb 21, 2024
1a381bd
style
ArthurZucker Feb 21, 2024
f7496a4
Update src/transformers/cache_utils.py
ArthurZucker Feb 21, 2024
170832a
Update src/transformers/models/gemma/__init__.py
ArthurZucker Feb 21, 2024
b32adc0
Update tests/models/gemma/test_modeling_flax_gemma.py
ArthurZucker Feb 21, 2024
24d5191
small nits
ArthurZucker Feb 21, 2024
6d5a2fe
Merge branch 'add-golden-gate' of github.com:huggingface/new-model-ad…
ArthurZucker Feb 21, 2024
403c796
style
ArthurZucker Feb 21, 2024
3633bb0
update tokenization test
ArthurZucker Feb 21, 2024
df86580
fix the rotary embedding
ArthurZucker Feb 21, 2024
565028a
with style
ArthurZucker Feb 21, 2024
362d2cd
fix slow tests
younesbelkada Feb 21, 2024
fb87315
WARNING this commit might be very important for precisions
ArthurZucker Feb 21, 2024
b63d2e0
Merge branch 'add-golden-gate' of github.com:huggingface/new-model-ad…
ArthurZucker Feb 21, 2024
9b834f7
Update tests/models/gemma/test_modeling_flax_gemma.py
ArthurZucker Feb 21, 2024
e33fb3d
Update src/transformers/models/gemma/configuration_gemma.py
ArthurZucker Feb 21, 2024
0aa94f1
Update src/transformers/models/gemma/modeling_flax_gemma.py
ArthurZucker Feb 21, 2024
e06809c
small nits here and there!
ArthurZucker Feb 21, 2024
eb2c69d
Merge branch 'add-golden-gate' of github.com:huggingface/new-model-ad…
ArthurZucker Feb 21, 2024
db63274
forgotten nit
ArthurZucker Feb 21, 2024
275ee4a
remove on the fly computation of inv_freq
ArthurZucker Feb 21, 2024
6d81a99
revert previous change, let's be safe and for now re-compute freq cis…
ArthurZucker Feb 21, 2024
c9a4d86
Apply suggestions from code review
younesbelkada Feb 21, 2024
1c42d4a
Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py
younesbelkada Feb 21, 2024
9c4d439
Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py
younesbelkada Feb 21, 2024
a56f555
Update tests/models/gemma/test_modeling_gemma.py
younesbelkada Feb 21, 2024
03c8e8f
Update tests/models/gemma/test_modeling_gemma.py
younesbelkada Feb 21, 2024
f37f517
Update tests/models/gemma/test_modeling_gemma.py
younesbelkada Feb 21, 2024
e5bfdd0
Update tests/models/gemma/test_modeling_flax_gemma.py
younesbelkada Feb 21, 2024
d83e098
Update tests/models/gemma/test_modeling_gemma.py
younesbelkada Feb 21, 2024
09717b6
Update tests/models/gemma/test_modeling_gemma.py
younesbelkada Feb 21, 2024
cce69c0
Update tests/models/gemma/test_tokenization_gemma.py
younesbelkada Feb 21, 2024
dde30a5
Update tests/models/gemma/test_tokenization_gemma.py
younesbelkada Feb 21, 2024
02a2d38
Update tests/models/gemma/test_tokenization_gemma.py
younesbelkada Feb 21, 2024
c975dae
Update tests/models/gemma/test_tokenization_gemma.py
younesbelkada Feb 21, 2024
5cf4da3
Update tests/models/gemma/test_modeling_gemma.py
younesbelkada Feb 21, 2024
f198015
Update tests/models/gemma/test_modeling_gemma.py
younesbelkada Feb 21, 2024
ac82e00
Update tests/models/gemma/test_modeling_gemma.py
younesbelkada Feb 21, 2024
7253c9f
Update tests/models/gemma/test_modeling_gemma.py
younesbelkada Feb 21, 2024
bc7ebaf
Update tests/models/gemma/test_modeling_gemma.py
younesbelkada Feb 21, 2024
4ad0d52
nit conversion script link
ArthurZucker Feb 21, 2024
7db13fa
fix some tests
younesbelkada Feb 21, 2024
60f8ba6
add not doctest and pr doctest
ArthurZucker Feb 21, 2024
1bf51b3
Merge branch 'add-golden-gate' of github.com:huggingface/transformers…
ArthurZucker Feb 21, 2024
2a4d326
repo consistency
ArthurZucker Feb 21, 2024
ea9eb10
fix last CIs 🚀
ArthurZucker Feb 21, 2024
556f743
update all readmes
ArthurZucker Feb 21, 2024
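Once these commits are merged, Gemma is exposed through the standard Auto classes like any other decoder-only model in the library. The snippet below is an illustrative sketch only, not part of this diff; it assumes a Hugging Face Hub checkpoint id such as `google/gemma-2b`.

```python
# Illustrative sketch (not part of this PR's diff).
# Assumes a Hub checkpoint id such as "google/gemma-2b" is available.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b")

# Greedy generation with the PyTorch model added in this PR
inputs = tokenizer("The Golden Gate Bridge is", return_tensors="pt")
generated = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```

The same checkpoint id would also resolve to the slow and fast tokenizers (`GemmaTokenizer` / `GemmaTokenizerFast`) added in the tokenization commits above.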
README.md: 1 addition, 0 deletions
@@ -374,6 +374,7 @@ Current number of checkpoints: ![](https://img.shields.io/endpoint?url=https://h
1. **[FocalNet](https://huggingface.co/docs/transformers/model_doc/focalnet)** (from Microsoft Research) released with the paper [Focal Modulation Networks](https://arxiv.org/abs/2203.11926) by Jianwei Yang, Chunyuan Li, Xiyang Dai, Lu Yuan, Jianfeng Gao.
1. **[Funnel Transformer](https://huggingface.co/docs/transformers/model_doc/funnel)** (from CMU/Google Brain) released with the paper [Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing](https://arxiv.org/abs/2006.03236) by Zihang Dai, Guokun Lai, Yiming Yang, Quoc V. Le.
1. **[Fuyu](https://huggingface.co/docs/transformers/model_doc/fuyu)** (from ADEPT) Rohan Bavishi, Erich Elsen, Curtis Hawthorne, Maxwell Nye, Augustus Odena, Arushi Somani, Sağnak Taşırlar. Released with the paper [blog post](https://www.adept.ai/blog/fuyu-8b)
1. **[Gemma](https://huggingface.co/docs/transformers/main/model_doc/gemma)** (from Google) released with the paper [Gemma: Open Models Based on Gemini Technology and Research](https://blog.google/technology/developers/gemma-open-models/) by the Gemma Google team.
1. **[GIT](https://huggingface.co/docs/transformers/model_doc/git)** (from Microsoft Research) released with the paper [GIT: A Generative Image-to-text Transformer for Vision and Language](https://arxiv.org/abs/2205.14100) by Jianfeng Wang, Zhengyuan Yang, Xiaowei Hu, Linjie Li, Kevin Lin, Zhe Gan, Zicheng Liu, Ce Liu, Lijuan Wang.
1. **[GLPN](https://huggingface.co/docs/transformers/model_doc/glpn)** (from KAIST) released with the paper [Global-Local Path Networks for Monocular Depth Estimation with Vertical CutDepth](https://arxiv.org/abs/2201.07436) by Doyeon Kim, Woonghyun Ga, Pyungwhan Ahn, Donggyu Joo, Sehwan Chun, Junmo Kim.
1. **[GPT](https://huggingface.co/docs/transformers/model_doc/openai-gpt)** (from OpenAI) released with the paper [Improving Language Understanding by Generative Pre-Training](https://openai.com/research/language-unsupervised/) by Alec Radford, Karthik Narasimhan, Tim Salimans and Ilya Sutskever.
README_es.md: 1 addition, 0 deletions
@@ -347,6 +347,7 @@ Número actual de puntos de control: ![](https://img.shields.io/endpoint?url=htt
1. **[FocalNet](https://huggingface.co/docs/transformers/model_doc/focalnet)** (from Microsoft Research) released with the paper [Focal Modulation Networks](https://arxiv.org/abs/2203.11926) by Jianwei Yang, Chunyuan Li, Xiyang Dai, Lu Yuan, Jianfeng Gao.
1. **[Funnel Transformer](https://huggingface.co/docs/transformers/model_doc/funnel)** (from CMU/Google Brain) released with the paper [Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing](https://arxiv.org/abs/2006.03236) by Zihang Dai, Guokun Lai, Yiming Yang, Quoc V. Le.
1. **[Fuyu](https://huggingface.co/docs/transformers/model_doc/fuyu)** (from ADEPT) Rohan Bavishi, Erich Elsen, Curtis Hawthorne, Maxwell Nye, Augustus Odena, Arushi Somani, Sağnak Taşırlar. Released with the paper [blog post](https://www.adept.ai/blog/fuyu-8b)
1. **[Gemma](https://huggingface.co/docs/transformers/main/model_doc/gemma)** (from Google) released with the paper [Gemma: Open Models Based on Gemini Technology and Research](https://blog.google/technology/developers/gemma-open-models/) by the Gemma Google team.
1. **[GIT](https://huggingface.co/docs/transformers/model_doc/git)** (from Microsoft Research) released with the paper [GIT: A Generative Image-to-text Transformer for Vision and Language](https://arxiv.org/abs/2205.14100) by Jianfeng Wang, Zhengyuan Yang, Xiaowei Hu, Linjie Li, Kevin Lin, Zhe Gan, Zicheng Liu, Ce Liu, Lijuan Wang.
1. **[GLPN](https://huggingface.co/docs/transformers/model_doc/glpn)** (from KAIST) released with the paper [Global-Local Path Networks for Monocular Depth Estimation with Vertical CutDepth](https://arxiv.org/abs/2201.07436) by Doyeon Kim, Woonghyun Ga, Pyungwhan Ahn, Donggyu Joo, Sehwan Chun, Junmo Kim.
1. **[GPT](https://huggingface.co/docs/transformers/model_doc/openai-gpt)** (from OpenAI) released with the paper [Improving Language Understanding by Generative Pre-Training](https://openai.com/research/language-unsupervised/) by Alec Radford, Karthik Narasimhan, Tim Salimans and Ilya Sutskever.
README_fr.md: 1 addition, 0 deletions
@@ -368,6 +368,7 @@ Nombre actuel de points de contrôle : ![](https://img.shields.io/endpoint?url=h
1. **[FocalNet](https://huggingface.co/docs/transformers/model_doc/focalnet)** (de Microsoft Research) publié dans l'article [Réseaux de modulation focale](https://arxiv.org/abs/2203.11926) par Jianwei Yang, Chunyuan Li, Xiyang Dai, Lu Yuan, Jianfeng Gao.
1. **[Funnel Transformer](https://huggingface.co/docs/transformers/model_doc/funnel)** (de l'Université Carnegie Mellon/Google Brain) publié dans l'article [Funnel-Transformer : Filtrer la redondance séquentielle pour un traitement efficace du langage](https://arxiv.org/abs/2006.03236) par Zihang Dai, Guokun Lai, Yiming Yang, Quoc V. Le.
1. **[Fuyu](https://huggingface.co/docs/transformers/model_doc/fuyu)** (de ADEPT) Rohan Bavishi, Erich Elsen, Curtis Hawthorne, Maxwell Nye, Augustus Odena, Arushi Somani, Sağnak Taşırlar. Publié dans l'article [billet de blog](https://www.adept.ai/blog/fuyu-8b)
1. **[Gemma](https://huggingface.co/docs/transformers/main/model_doc/gemma)** (de Google) publié dans l'article [Gemma: Open Models Based on Gemini Technology and Research](https://blog.google/technology/developers/gemma-open-models/) par the Gemma Google team.
1. **[GIT](https://huggingface.co/docs/transformers/model_doc/git)** (de Microsoft Research) publié dans l'article [GIT : Un transformateur génératif d'images en texte pour la vision et le langage](https://arxiv.org/abs/2205.14100) par Jianfeng Wang, Zhengyuan Yang, Xiaowei Hu, Linjie Li, Kevin Lin, Zhe Gan, Zicheng Liu, Ce Liu, Lijuan Wang.
1. **[GLPN](https://huggingface.co/docs/transformers/model_doc/glpn)** (de la KAIST) publié dans l'article [Réseaux de chemins globaux-locaux pour l'estimation de profondeur monoculaire avec Vertical CutDepth](https://arxiv.org/abs/2201.07436) par Doyeon Kim, Woonghyun Ga, Pyungwhan Ahn, Donggyu Joo, Sehwan Chun, Junmo Kim.
1. **[GPT](https://huggingface.co/docs/transformers/model_doc/openai-gpt)** (d'OpenAI) publié dans l'article [Améliorer la compréhension du langage par l'apprentissage préalable génératif](https://openai.com/research/language-unsupervised/) par Alec Radford, Karthik Narasimhan, Tim Salimans et Ilya Sutskever.
README_hd.md: 1 addition, 0 deletions
@@ -321,6 +321,7 @@ conda install conda-forge::transformers
1. **[FocalNet](https://huggingface.co/docs/transformers/model_doc/focalnet)** (Microsoft Research से) Jianwei Yang, Chunyuan Li, Xiyang Dai, Lu Yuan, Jianfeng Gao. द्वाराअनुसंधान पत्र [Focal Modulation Networks](https://arxiv.org/abs/2203.11926) के साथ जारी किया गया
1. **[Funnel Transformer](https://huggingface.co/docs/transformers/model_doc/funnel)** (सीएमयू/गूगल ब्रेन से) साथ में कागज [फ़नल-ट्रांसफॉर्मर: कुशल भाषा प्रसंस्करण के लिए अनुक्रमिक अतिरेक को छानना](https://arxiv.org/abs/2006.03236) जिहांग दाई, गुओकुन लाई, यिमिंग यांग, क्वोक वी. ले द्वारा रिहाई।
1. **[Fuyu](https://huggingface.co/docs/transformers/model_doc/fuyu)** (ADEPT से) रोहन बाविशी, एरिच एलसेन, कर्टिस हॉथोर्न, मैक्सवेल नी, ऑगस्टस ओडेना, अरुशी सोमानी, सागनाक तासिरलार [blog post](https://www.adept.ai/blog/fuyu-8b)
1. **[Gemma](https://huggingface.co/docs/transformers/main/model_doc/gemma)** (Google से) the Gemma Google team. द्वाराअनुसंधान पत्र [Gemma: Open Models Based on Gemini Technology and Research](https://blog.google/technology/developers/gemma-open-models/) के साथ जारी किया गया
1. **[GIT](https://huggingface.co/docs/transformers/model_doc/git)** (from Microsoft Research) released with the paper [GIT: A Generative Image-to-text Transformer for Vision and Language](https://arxiv.org/abs/2205.14100) by Jianfeng Wang, Zhengyuan Yang, Xiaowei Hu, Linjie Li, Kevin Lin, Zhe Gan, Zicheng Liu, Ce Liu, Lijuan Wang.
1. **[GLPN](https://huggingface.co/docs/transformers/model_doc/glpn)** (KAIST से) साथ वाला पेपर [वर्टिकल कटडेप्थ के साथ मोनोकुलर डेप्थ एस्टीमेशन के लिए ग्लोबल-लोकल पाथ नेटवर्क्स](https://arxiv.org/abs/2201.07436) डोयोन किम, वूंगह्युन गा, प्युंगवान आह, डोंगग्यू जू, सेहवान चुन, जुनमो किम द्वारा।
1. **[GPT](https://huggingface.co/docs/transformers/model_doc/openai-gpt)** (OpenAI से) साथ में दिया गया पेपर [जेनरेटिव प्री-ट्रेनिंग द्वारा भाषा की समझ में सुधार](https://openai.com/research/language-unsupervised/) एलेक रैडफोर्ड, कार्तिक नरसिम्हन, टिम सालिमन्स और इल्या सुत्स्केवर द्वारा।
README_ja.md: 1 addition, 0 deletions
@@ -381,6 +381,7 @@ Flax、PyTorch、TensorFlowをcondaでインストールする方法は、それ
1. **[FocalNet](https://huggingface.co/docs/transformers/model_doc/focalnet)** (Microsoft Research から) Jianwei Yang, Chunyuan Li, Xiyang Dai, Lu Yuan, Jianfeng Gao. から公開された研究論文 [Focal Modulation Networks](https://arxiv.org/abs/2203.11926)
1. **[Funnel Transformer](https://huggingface.co/docs/transformers/model_doc/funnel)** (CMU/Google Brain から) Zihang Dai, Guokun Lai, Yiming Yang, Quoc V. Le から公開された研究論文: [Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing](https://arxiv.org/abs/2006.03236)
1. **[Fuyu](https://huggingface.co/docs/transformers/model_doc/fuyu)** (ADEPT から) Rohan Bavishi, Erich Elsen, Curtis Hawthorne, Maxwell Nye, Augustus Odena, Arushi Somani, Sağnak Taşırlar. から公開された研究論文 [blog post](https://www.adept.ai/blog/fuyu-8b)
1. **[Gemma](https://huggingface.co/docs/transformers/main/model_doc/gemma)** (Google から) the Gemma Google team. から公開された研究論文 [Gemma: Open Models Based on Gemini Technology and Research](https://blog.google/technology/developers/gemma-open-models/)
1. **[GIT](https://huggingface.co/docs/transformers/model_doc/git)** (Microsoft Research から) Jianfeng Wang, Zhengyuan Yang, Xiaowei Hu, Linjie Li, Kevin Lin, Zhe Gan, Zicheng Liu, Ce Liu, Lijuan Wang. から公開された研究論文 [GIT: A Generative Image-to-text Transformer for Vision and Language](https://arxiv.org/abs/2205.14100)
1. **[GLPN](https://huggingface.co/docs/transformers/model_doc/glpn)** (KAIST から) Doyeon Kim, Woonghyun Ga, Pyungwhan Ahn, Donggyu Joo, Sehwan Chun, Junmo Kim から公開された研究論文: [Global-Local Path Networks for Monocular Depth Estimation with Vertical CutDepth](https://arxiv.org/abs/2201.07436)
1. **[GPT](https://huggingface.co/docs/transformers/model_doc/openai-gpt)** (OpenAI から) Alec Radford, Karthik Narasimhan, Tim Salimans and Ilya Sutskever から公開された研究論文: [Improving Language Understanding by Generative Pre-Training](https://openai.com/research/language-unsupervised/)
README_ko.md: 1 addition, 0 deletions
@@ -296,6 +296,7 @@ Flax, PyTorch, TensorFlow 설치 페이지에서 이들을 conda로 설치하는
1. **[FocalNet](https://huggingface.co/docs/transformers/model_doc/focalnet)** (from Microsoft Research) released with the paper [Focal Modulation Networks](https://arxiv.org/abs/2203.11926) by Jianwei Yang, Chunyuan Li, Xiyang Dai, Lu Yuan, Jianfeng Gao.
1. **[Funnel Transformer](https://huggingface.co/docs/transformers/model_doc/funnel)** (from CMU/Google Brain) released with the paper [Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing](https://arxiv.org/abs/2006.03236) by Zihang Dai, Guokun Lai, Yiming Yang, Quoc V. Le.
1. **[Fuyu](https://huggingface.co/docs/transformers/model_doc/fuyu)** (from ADEPT) Rohan Bavishi, Erich Elsen, Curtis Hawthorne, Maxwell Nye, Augustus Odena, Arushi Somani, Sağnak Taşırlar. 논문과 함께 공개 [blog post](https://www.adept.ai/blog/fuyu-8b)
1. **[Gemma](https://huggingface.co/docs/transformers/main/model_doc/gemma)** (Google 에서 제공)은 the Gemma Google team.의 [Gemma: Open Models Based on Gemini Technology and Research](https://blog.google/technology/developers/gemma-open-models/)논문과 함께 발표했습니다.
1. **[GIT](https://huggingface.co/docs/transformers/model_doc/git)** (from Microsoft Research) released with the paper [GIT: A Generative Image-to-text Transformer for Vision and Language](https://arxiv.org/abs/2205.14100) by Jianfeng Wang, Zhengyuan Yang, Xiaowei Hu, Linjie Li, Kevin Lin, Zhe Gan, Zicheng Liu, Ce Liu, Lijuan Wang.
1. **[GLPN](https://huggingface.co/docs/transformers/model_doc/glpn)** (from KAIST) released with the paper [Global-Local Path Networks for Monocular Depth Estimation with Vertical CutDepth](https://arxiv.org/abs/2201.07436) by Doyeon Kim, Woonghyun Ga, Pyungwhan Ahn, Donggyu Joo, Sehwan Chun, Junmo Kim.
1. **[GPT](https://huggingface.co/docs/transformers/model_doc/openai-gpt)** (from OpenAI) released with the paper [Improving Language Understanding by Generative Pre-Training](https://openai.com/research/language-unsupervised/) by Alec Radford, Karthik Narasimhan, Tim Salimans and Ilya Sutskever.
README_zh-hans.md: 1 addition, 0 deletions
@@ -320,6 +320,7 @@ conda install conda-forge::transformers
1. **[FocalNet](https://huggingface.co/docs/transformers/model_doc/focalnet)** (来自 Microsoft Research) 伴随论文 [Focal Modulation Networks](https://arxiv.org/abs/2203.11926) 由 Jianwei Yang, Chunyuan Li, Xiyang Dai, Lu Yuan, Jianfeng Gao 发布。
1. **[Funnel Transformer](https://huggingface.co/docs/transformers/model_doc/funnel)** (来自 CMU/Google Brain) 伴随论文 [Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing](https://arxiv.org/abs/2006.03236) 由 Zihang Dai, Guokun Lai, Yiming Yang, Quoc V. Le 发布。
1. **[Fuyu](https://huggingface.co/docs/transformers/model_doc/fuyu)** (来自 ADEPT) 伴随论文 [blog post](https://www.adept.ai/blog/fuyu-8b) 由 Rohan Bavishi, Erich Elsen, Curtis Hawthorne, Maxwell Nye, Augustus Odena, Arushi Somani, Sağnak Taşırlar 发布。
1. **[Gemma](https://huggingface.co/docs/transformers/main/model_doc/gemma)** (来自 Google) 伴随论文 [Gemma: Open Models Based on Gemini Technology and Research](https://blog.google/technology/developers/gemma-open-models/) 由 the Gemma Google team 发布。
1. **[GIT](https://huggingface.co/docs/transformers/model_doc/git)** (来自 Microsoft Research) 伴随论文 [GIT: A Generative Image-to-text Transformer for Vision and Language](https://arxiv.org/abs/2205.14100) 由 Jianfeng Wang, Zhengyuan Yang, Xiaowei Hu, Linjie Li, Kevin Lin, Zhe Gan, Zicheng Liu, Ce Liu, Lijuan Wang 发布。
1. **[GLPN](https://huggingface.co/docs/transformers/model_doc/glpn)** (来自 KAIST) 伴随论文 [Global-Local Path Networks for Monocular Depth Estimation with Vertical CutDepth](https://arxiv.org/abs/2201.07436) 由 Doyeon Kim, Woonghyun Ga, Pyungwhan Ahn, Donggyu Joo, Sehwan Chun, Junmo Kim 发布。
1. **[GPT](https://huggingface.co/docs/transformers/model_doc/openai-gpt)** (来自 OpenAI) 伴随论文 [Improving Language Understanding by Generative Pre-Training](https://openai.com/research/language-unsupervised/) 由 Alec Radford, Karthik Narasimhan, Tim Salimans and Ilya Sutskever 发布。
README_zh-hant.md: 1 addition, 0 deletions
@@ -332,6 +332,7 @@ conda install conda-forge::transformers
1. **[FocalNet](https://huggingface.co/docs/transformers/model_doc/focalnet)** (from Microsoft Research) released with the paper [Focal Modulation Networks](https://arxiv.org/abs/2203.11926) by Jianwei Yang, Chunyuan Li, Xiyang Dai, Lu Yuan, Jianfeng Gao.
1. **[Funnel Transformer](https://huggingface.co/docs/transformers/model_doc/funnel)** (from CMU/Google Brain) released with the paper [Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing](https://arxiv.org/abs/2006.03236) by Zihang Dai, Guokun Lai, Yiming Yang, Quoc V. Le.
1. **[Fuyu](https://huggingface.co/docs/transformers/model_doc/fuyu)** (from ADEPT) Rohan Bavishi, Erich Elsen, Curtis Hawthorne, Maxwell Nye, Augustus Odena, Arushi Somani, Sağnak Taşırlar. Released with the paper [blog post](https://www.adept.ai/blog/fuyu-8b)
1. **[Gemma](https://huggingface.co/docs/transformers/main/model_doc/gemma)** (from Google) released with the paper [Gemma: Open Models Based on Gemini Technology and Research](https://blog.google/technology/developers/gemma-open-models/) by the Gemma Google team.
1. **[GIT](https://huggingface.co/docs/transformers/model_doc/git)** (from Microsoft Research) released with the paper [GIT: A Generative Image-to-text Transformer for Vision and Language](https://arxiv.org/abs/2205.14100) by Jianfeng Wang, Zhengyuan Yang, Xiaowei Hu, Linjie Li, Kevin Lin, Zhe Gan, Zicheng Liu, Ce Liu, Lijuan Wang.
1. **[GLPN](https://huggingface.co/docs/transformers/model_doc/glpn)** (from KAIST) released with the paper [Global-Local Path Networks for Monocular Depth Estimation with Vertical CutDepth](https://arxiv.org/abs/2201.07436) by Doyeon Kim, Woonghyun Ga, Pyungwhan Ahn, Donggyu Joo, Sehwan Chun, Junmo Kim.
1. **[GPT](https://huggingface.co/docs/transformers/model_doc/openai-gpt)** (from OpenAI) released with the paper [Improving Language Understanding by Generative Pre-Training](https://openai.com/research/language-unsupervised/) by Alec Radford, Karthik Narasimhan, Tim Salimans and Ilya Sutskever.
docs/source/en/_toctree.yml: 2 additions, 0 deletions
@@ -354,6 +354,8 @@
title: Funnel Transformer
- local: model_doc/fuyu
title: Fuyu
- local: model_doc/gemma
title: Gemma
- local: model_doc/openai-gpt
title: GPT
- local: model_doc/gpt_neo
docs/source/en/index.md: 1 addition, 0 deletions
@@ -142,6 +142,7 @@ Flax), PyTorch, and/or TensorFlow.
| [FocalNet](model_doc/focalnet) | ✅ | ❌ | ❌ |
| [Funnel Transformer](model_doc/funnel) | ✅ | ✅ | ❌ |
| [Fuyu](model_doc/fuyu) | ✅ | ❌ | ❌ |
| [Gemma](model_doc/gemma) | ✅ | ❌ | ✅ |
| [GIT](model_doc/git) | ✅ | ❌ | ❌ |
| [GLPN](model_doc/glpn) | ✅ | ❌ | ❌ |
| [GPT Neo](model_doc/gpt_neo) | ✅ | ❌ | ✅ |
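The framework table above records PyTorch and Flax support for Gemma (no TensorFlow), matching the Flax modeling and test files touched in the commits. A minimal Flax-side sketch, again assuming a `google/gemma-2b` checkpoint with published Flax weights, might look like this:

```python
# Illustrative Flax sketch (not part of this PR's diff).
# Assumes Flax weights exist for a checkpoint such as "google/gemma-2b".
from transformers import AutoTokenizer, FlaxGemmaForCausalLM

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")
model = FlaxGemmaForCausalLM.from_pretrained("google/gemma-2b")

# Greedy generation with the Flax model; generate returns an output object
# whose .sequences field holds the generated token ids.
inputs = tokenizer("The Golden Gate Bridge is", return_tensors="np")
outputs = model.generate(inputs["input_ids"], max_new_tokens=20)
print(tokenizer.batch_decode(outputs.sequences, skip_special_tokens=True)[0])
```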