Add REALM #13292

Merged: 114 commits, Jan 18, 2022
Changes from 110 commits

Commits
e68583f
REALM initial commit
qqaatw Aug 26, 2021
4d85596
Retriever OK (Update new_gelu).
qqaatw Aug 28, 2021
1ff4364
Encoder prediction score OK
qqaatw Aug 29, 2021
baee376
Encoder pretrained model OK
qqaatw Aug 29, 2021
7ed7265
Update retriever comments
qqaatw Aug 29, 2021
dd3fb73
Update docs, tests, and imports
qqaatw Aug 29, 2021
927b106
Prune unused models
qqaatw Aug 29, 2021
96615bd
Make embedder as a module `RealmEmbedder`
qqaatw Aug 29, 2021
dea3b2f
Add RealmRetrieverOutput
qqaatw Aug 29, 2021
66859e1
Update tokenization
qqaatw Aug 30, 2021
ae889ee
Pass all tests in test_modeling_realm.py
qqaatw Aug 31, 2021
1b3bba2
Prune RealmModel
qqaatw Aug 31, 2021
766d663
Update docs
qqaatw Aug 31, 2021
eb1837b
Add training test.
qqaatw Aug 31, 2021
b316149
Remove completed TODO
qqaatw Aug 31, 2021
1b14c70
Style & Quality
qqaatw Aug 31, 2021
ce0ef70
Prune `RealmModel`
qqaatw Aug 31, 2021
4760751
Merge branch 'master' into add_realm
qqaatw Sep 1, 2021
92a6a5b
Fixup
qqaatw Sep 1, 2021
28b8dac
Changes:
qqaatw Sep 1, 2021
bd6d2eb
Fix up
qqaatw Sep 1, 2021
089fd65
Merge branch 'master' into add_realm
qqaatw Sep 1, 2021
633e452
Style
qqaatw Sep 1, 2021
728ef3c
Merge branch 'master' into add_realm
qqaatw Sep 1, 2021
609d7f3
Add tokenization tests
qqaatw Sep 1, 2021
8066399
Update `from_pretrained` tests
qqaatw Sep 3, 2021
09a280c
Apply suggestions
qqaatw Sep 3, 2021
769e8fb
Style & Quality
qqaatw Sep 4, 2021
d89a3bd
Merge branch 'master' into add_realm
qqaatw Sep 9, 2021
9d5175b
Copy BERT model
qqaatw Sep 9, 2021
6f64029
Fix comment to avoid docstring copying
qqaatw Sep 9, 2021
dc3695b
Make RealmBertModel private
qqaatw Sep 9, 2021
850c38c
Fix bug
qqaatw Sep 12, 2021
a11d5c5
Style
qqaatw Sep 13, 2021
831a230
Basic QA
qqaatw Sep 3, 2021
81985e6
Save
qqaatw Sep 19, 2021
dbd925d
Complete reader logits
qqaatw Sep 21, 2021
f6ffc1e
Add searcher
qqaatw Sep 23, 2021
8ee98d7
Complete searcher & reader
qqaatw Sep 26, 2021
7158fe8
Move block records init to constructor
qqaatw Sep 27, 2021
938ad0a
Fix training bug
qqaatw Sep 28, 2021
55f6531
Add some outputs to RealmReader
qqaatw Sep 28, 2021
89fd9c7
Add finetuned checkpoint variable names parsing
qqaatw Sep 29, 2021
3e57b52
Fix bug
qqaatw Sep 29, 2021
136b3ff
Update REALM config
qqaatw Oct 1, 2021
9f62961
Add RealmForOpenQA
qqaatw Oct 1, 2021
de1f3f0
Update convert_tfrecord logits
qqaatw Oct 2, 2021
bd16314
Fix bugs
qqaatw Oct 2, 2021
c917ef9
Complete imports
qqaatw Oct 2, 2021
113807f
Update docs
qqaatw Oct 2, 2021
f46b43e
Update naming
qqaatw Oct 2, 2021
a7b727d
Add brute-force searcher
qqaatw Oct 4, 2021
e1f3a45
Merge branch 'add_realmqa' into add_realm
qqaatw Oct 5, 2021
426121c
Pass realm model tests
qqaatw Oct 8, 2021
15d3978
Merge branch 'add_realmqa' into add_realm
qqaatw Oct 8, 2021
63e9e70
Merge branch 'master' into add_realmqa
qqaatw Oct 8, 2021
225b2e6
Style
qqaatw Oct 8, 2021
93f315a
Exclude RealmReader from common tests
qqaatw Oct 8, 2021
f656bd4
Merge branch 'add_realmqa' into add_realm
qqaatw Oct 8, 2021
4cad343
Fix
qqaatw Oct 8, 2021
dd81591
Fix
qqaatw Oct 8, 2021
01ca717
Merge branch 'add_realmqa' into add_realm
qqaatw Oct 8, 2021
afa214b
fix readme
patrickvonplaten Dec 24, 2021
36e7f1a
convert docs
patrickvonplaten Dec 24, 2021
a41734c
up
patrickvonplaten Dec 24, 2021
fb53dad
up
patrickvonplaten Dec 24, 2021
02bae05
more make style
patrickvonplaten Dec 27, 2021
8f50e8c
up
patrickvonplaten Dec 27, 2021
fef8cf3
upload
patrickvonplaten Dec 27, 2021
8b723ae
up
patrickvonplaten Dec 27, 2021
348936e
Fix
qqaatw Dec 28, 2021
b86139b
Update src/transformers/__init__.py
patrickvonplaten Dec 29, 2021
581c20c
adapt testing
patrickvonplaten Dec 29, 2021
2e6f91f
Merge branch 'add_realm' of https://github.com/qqaatw/transformers in…
patrickvonplaten Dec 29, 2021
c39b31f
change modeling code
patrickvonplaten Dec 29, 2021
bb78ce5
fix test
patrickvonplaten Dec 29, 2021
851d9ea
up
patrickvonplaten Dec 29, 2021
8e340a1
up
patrickvonplaten Dec 29, 2021
620ac36
up
patrickvonplaten Dec 29, 2021
9708e44
correct more
patrickvonplaten Dec 29, 2021
818718c
make retriever work
patrickvonplaten Dec 30, 2021
6492942
update
patrickvonplaten Dec 30, 2021
0fb7b86
make style
patrickvonplaten Dec 30, 2021
38e2ed0
finish main structure
patrickvonplaten Jan 3, 2022
bc56dbb
Resolve merge conflict
qqaatw Jan 4, 2022
a62ae6f
Make everything work
qqaatw Jan 4, 2022
167b17b
Style
qqaatw Jan 4, 2022
9627fe8
Fixup
qqaatw Jan 4, 2022
8642eca
Merge upstream master
qqaatw Jan 4, 2022
8bbebd4
Fixup
qqaatw Jan 4, 2022
5c9118a
Update training test
qqaatw Jan 5, 2022
f0a2438
Merge branch 'master' of https://github.com/huggingface/transformers …
patrickvonplaten Jan 5, 2022
d6d94be
fix retriever
patrickvonplaten Jan 5, 2022
e172e73
remove hardcoded path
patrickvonplaten Jan 5, 2022
ec695cb
Fix
qqaatw Jan 6, 2022
8550f4e
Merge branch 'add_realm' of https://github.com/qqaatw/transformers in…
qqaatw Jan 6, 2022
dc865ca
Fix modeling test
qqaatw Jan 6, 2022
ebd507c
Update model links
qqaatw Jan 6, 2022
db2f4fe
Initial retrieval test
qqaatw Jan 6, 2022
16577d7
Fix modeling test
qqaatw Jan 6, 2022
881bbd2
Complete retrieval tests
qqaatw Jan 6, 2022
1927e4f
Fix
qqaatw Jan 6, 2022
4048d7d
style
qqaatw Jan 6, 2022
34322b5
Fix tests
qqaatw Jan 6, 2022
06a5412
Fix docstring example
qqaatw Jan 6, 2022
712c4b7
Minor fix of retrieval test
qqaatw Jan 8, 2022
1701f70
Update license headers and docs
qqaatw Jan 8, 2022
493aa10
Apply suggestions from code review
qqaatw Jan 9, 2022
fb43dd5
Style
qqaatw Jan 9, 2022
54ee5cb
Merge branch 'master' into add_realm
sgugger Jan 17, 2022
a3cbaf0
Apply suggestions from code review
qqaatw Jan 17, 2022
0f4721b
Add an example to RealmEmbedder
qqaatw Jan 17, 2022
894ce5f
Fix
qqaatw Jan 17, 2022
d655e5f
Merge branch 'master' into add_realm
qqaatw Jan 18, 2022
1 change: 1 addition & 0 deletions README.md
@@ -291,6 +291,7 @@ Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih.
1. **[PhoBERT](https://huggingface.co/docs/transformers/model_doc/phobert)** (from VinAI Research) released with the paper [PhoBERT: Pre-trained language models for Vietnamese](https://www.aclweb.org/anthology/2020.findings-emnlp.92/) by Dat Quoc Nguyen and Anh Tuan Nguyen.
1. **[ProphetNet](https://huggingface.co/docs/transformers/model_doc/prophetnet)** (from Microsoft Research) released with the paper [ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training](https://arxiv.org/abs/2001.04063) by Yu Yan, Weizhen Qi, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang and Ming Zhou.
1. **[QDQBert](https://huggingface.co/docs/transformers/model_doc/qdqbert)** (from NVIDIA) released with the paper [Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation](https://arxiv.org/abs/2004.09602) by Hao Wu, Patrick Judd, Xiaojie Zhang, Mikhail Isaev and Paulius Micikevicius.
1. **[REALM](https://huggingface.co/transformers/master/model_doc/realm.html)** (from Google Research) released with the paper [REALM: Retrieval-Augmented Language Model Pre-Training](https://arxiv.org/abs/2002.08909) by Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat and Ming-Wei Chang.
1. **[Reformer](https://huggingface.co/docs/transformers/model_doc/reformer)** (from Google Research) released with the paper [Reformer: The Efficient Transformer](https://arxiv.org/abs/2001.04451) by Nikita Kitaev, Łukasz Kaiser, Anselm Levskaya.
1. **[RemBERT](https://huggingface.co/docs/transformers/model_doc/rembert)** (from Google Research) released with the paper [Rethinking embedding coupling in pre-trained language models](https://arxiv.org/pdf/2010.12821.pdf) by Hyung Won Chung, Thibault Févry, Henry Tsai, M. Johnson, Sebastian Ruder.
1. **[RoBERTa](https://huggingface.co/docs/transformers/model_doc/roberta)** (from Facebook), released together with the paper a [Robustly Optimized BERT Pretraining Approach](https://arxiv.org/abs/1907.11692) by Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov.
1 change: 1 addition & 0 deletions README_ko.md
@@ -270,6 +270,7 @@ Flax, PyTorch, TensorFlow 설치 페이지에서 이들을 conda로 설치하는
1. **[PhoBERT](https://huggingface.co/docs/transformers/model_doc/phobert)** (from VinAI Research) released with the paper [PhoBERT: Pre-trained language models for Vietnamese](https://www.aclweb.org/anthology/2020.findings-emnlp.92/) by Dat Quoc Nguyen and Anh Tuan Nguyen.
1. **[ProphetNet](https://huggingface.co/docs/transformers/model_doc/prophetnet)** (from Microsoft Research) released with the paper [ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training](https://arxiv.org/abs/2001.04063) by Yu Yan, Weizhen Qi, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang and Ming Zhou.
1. **[QDQBert](https://huggingface.co/docs/transformers/model_doc/qdqbert)** (from NVIDIA) released with the paper [Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation](https://arxiv.org/abs/2004.09602) by Hao Wu, Patrick Judd, Xiaojie Zhang, Mikhail Isaev and Paulius Micikevicius.
1. **[REALM](https://huggingface.co/transformers/master/model_doc/realm.html)** (from Google Research) released with the paper [REALM: Retrieval-Augmented Language Model Pre-Training](https://arxiv.org/abs/2002.08909) by Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat and Ming-Wei Chang.
1. **[Reformer](https://huggingface.co/docs/transformers/model_doc/reformer)** (from Google Research) released with the paper [Reformer: The Efficient Transformer](https://arxiv.org/abs/2001.04451) by Nikita Kitaev, Łukasz Kaiser, Anselm Levskaya.
1. **[RemBERT](https://huggingface.co/docs/transformers/model_doc/rembert)** (from Google Research) released with the paper [Rethinking embedding coupling in pre-trained language models](https://arxiv.org/pdf/2010.12821.pdf) by Hyung Won Chung, Thibault Févry, Henry Tsai, M. Johnson, Sebastian Ruder.
1. **[RoBERTa](https://huggingface.co/docs/transformers/model_doc/roberta)** (from Facebook), released together with the paper a [Robustly Optimized BERT Pretraining Approach](https://arxiv.org/abs/1907.11692) by Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov.
1 change: 1 addition & 0 deletions README_zh-hans.md
@@ -294,6 +294,7 @@ conda install -c huggingface transformers
1. **[PhoBERT](https://huggingface.co/docs/transformers/model_doc/phobert)** (来自 VinAI Research) 伴随论文 [PhoBERT: Pre-trained language models for Vietnamese](https://www.aclweb.org/anthology/2020.findings-emnlp.92/) 由 Dat Quoc Nguyen and Anh Tuan Nguyen 发布。
1. **[ProphetNet](https://huggingface.co/docs/transformers/model_doc/prophetnet)** (来自 Microsoft Research) 伴随论文 [ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training](https://arxiv.org/abs/2001.04063) 由 Yu Yan, Weizhen Qi, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang and Ming Zhou 发布。
1. **[QDQBert](https://huggingface.co/docs/transformers/model_doc/qdqbert)** (来自 NVIDIA) 伴随论文 [Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation](https://arxiv.org/abs/2004.09602) 由 Hao Wu, Patrick Judd, Xiaojie Zhang, Mikhail Isaev and Paulius Micikevicius 发布。
1. **[REALM](https://huggingface.co/transformers/master/model_doc/realm.html)** (来自 Google Research) 伴随论文 [REALM: Retrieval-Augmented Language Model Pre-Training](https://arxiv.org/abs/2002.08909) 由 Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat and Ming-Wei Chang 发布。
1. **[Reformer](https://huggingface.co/docs/transformers/model_doc/reformer)** (来自 Google Research) 伴随论文 [Reformer: The Efficient Transformer](https://arxiv.org/abs/2001.04451) 由 Nikita Kitaev, Łukasz Kaiser, Anselm Levskaya 发布。
1. **[RemBERT](https://huggingface.co/docs/transformers/model_doc/rembert)** (来自 Google Research) 伴随论文 [Rethinking embedding coupling in pre-trained language models](https://arxiv.org/pdf/2010.12821.pdf) 由 Hyung Won Chung, Thibault Févry, Henry Tsai, M. Johnson, Sebastian Ruder 发布。
1. **[RoBERTa](https://huggingface.co/docs/transformers/model_doc/roberta)** (来自 Facebook), 伴随论文 [Robustly Optimized BERT Pretraining Approach](https://arxiv.org/abs/1907.11692) 由 Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov 发布。
1 change: 1 addition & 0 deletions README_zh-hant.md
@@ -306,6 +306,7 @@ conda install -c huggingface transformers
1. **[PhoBERT](https://huggingface.co/docs/transformers/model_doc/phobert)** (from VinAI Research) released with the paper [PhoBERT: Pre-trained language models for Vietnamese](https://www.aclweb.org/anthology/2020.findings-emnlp.92/) by Dat Quoc Nguyen and Anh Tuan Nguyen.
1. **[ProphetNet](https://huggingface.co/docs/transformers/model_doc/prophetnet)** (from Microsoft Research) released with the paper [ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training](https://arxiv.org/abs/2001.04063) by Yu Yan, Weizhen Qi, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang and Ming Zhou.
1. **[QDQBert](https://huggingface.co/docs/transformers/model_doc/qdqbert)** (from NVIDIA) released with the paper [Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation](https://arxiv.org/abs/2004.09602) by Hao Wu, Patrick Judd, Xiaojie Zhang, Mikhail Isaev and Paulius Micikevicius.
1. **[REALM](https://huggingface.co/transformers/master/model_doc/realm.html)** (from Google Research) released with the paper [REALM: Retrieval-Augmented Language Model Pre-Training](https://arxiv.org/abs/2002.08909) by Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat and Ming-Wei Chang.
1. **[Reformer](https://huggingface.co/docs/transformers/model_doc/reformer)** (from Google Research) released with the paper [Reformer: The Efficient Transformer](https://arxiv.org/abs/2001.04451) by Nikita Kitaev, Łukasz Kaiser, Anselm Levskaya.
1. **[RemBERT](https://huggingface.co/docs/transformers/model_doc/rembert)** (from Google Research) released with the paper [Rethinking embedding coupling in pre-trained language models](https://arxiv.org/pdf/2010.12821.pdf) by Hyung Won Chung, Thibault Févry, Henry Tsai, M. Johnson, Sebastian Ruder.
1. **[RoBERTa](https://huggingface.co/docs/transformers/model_doc/roberta)** (from Facebook), released together with the paper a [Robustly Optimized BERT Pretraining Approach](https://arxiv.org/abs/1907.11692) by Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov.
2 changes: 2 additions & 0 deletions docs/source/_toctree.yml
@@ -240,6 +240,8 @@
  title: QDQBert
- local: model_doc/rag
  title: RAG
- local: model_doc/realm
  title: REALM
- local: model_doc/reformer
  title: Reformer
- local: model_doc/rembert
2 changes: 2 additions & 0 deletions docs/source/index.mdx
@@ -151,6 +151,7 @@ conversion utilities for the following models.
1. **[PhoBERT](model_doc/phobert)** (from VinAI Research) released with the paper [PhoBERT: Pre-trained language models for Vietnamese](https://www.aclweb.org/anthology/2020.findings-emnlp.92/) by Dat Quoc Nguyen and Anh Tuan Nguyen.
1. **[ProphetNet](model_doc/prophetnet)** (from Microsoft Research) released with the paper [ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training](https://arxiv.org/abs/2001.04063) by Yu Yan, Weizhen Qi, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang and Ming Zhou.
1. **[QDQBert](model_doc/qdqbert)** (from NVIDIA) released with the paper [Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation](https://arxiv.org/abs/2004.09602) by Hao Wu, Patrick Judd, Xiaojie Zhang, Mikhail Isaev and Paulius Micikevicius.
1. **[REALM](https://huggingface.co/transformers/master/model_doc/realm.html)** (from Google Research) released with the paper [REALM: Retrieval-Augmented Language Model Pre-Training](https://arxiv.org/abs/2002.08909) by Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat and Ming-Wei Chang.
1. **[Reformer](model_doc/reformer)** (from Google Research) released with the paper [Reformer: The Efficient Transformer](https://arxiv.org/abs/2001.04451) by Nikita Kitaev, Łukasz Kaiser, Anselm Levskaya.
1. **[RemBERT](model_doc/rembert)** (from Google Research) released with the paper [Rethinking embedding coupling in pre-trained language models](https://arxiv.org/pdf/2010.12821.pdf) by Hyung Won Chung, Thibault Févry, Henry Tsai, M. Johnson, Sebastian Ruder.
1. **[RoBERTa](model_doc/roberta)** (from Facebook), released together with the paper a [Robustly Optimized BERT Pretraining Approach](https://arxiv.org/abs/1907.11692) by Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov.
@@ -244,6 +245,7 @@ Flax), PyTorch, and/or TensorFlow.
| ProphetNet | ✅ | ❌ | ✅ | ❌ | ❌ |
| QDQBert | ❌ | ❌ | ✅ | ❌ | ❌ |
| RAG | ✅ | ❌ | ✅ | ✅ | ❌ |
| Realm | ✅ | ❌ | ✅ | ❌ | ❌ |
| Reformer | ✅ | ✅ | ✅ | ❌ | ❌ |
| RemBERT | ✅ | ✅ | ✅ | ✅ | ❌ |
| RetriBERT | ✅ | ✅ | ✅ | ❌ | ❌ |
80 changes: 80 additions & 0 deletions docs/source/model_doc/realm.mdx
@@ -0,0 +1,80 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# REALM

## Overview

The REALM model was proposed in `REALM: Retrieval-Augmented Language Model Pre-Training
<https://arxiv.org/abs/2002.08909>`__ by Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat and Ming-Wei Chang. It's a
retrieval-augmented language model that firstly retrieves documents from a textual knowledge corpus and then
utilizes retrieved documents to process question answering tasks.

The abstract from the paper is the following:

*Language model pre-training has been shown to capture a surprising amount of world knowledge, crucial for NLP tasks
such as question answering. However, this knowledge is stored implicitly in the parameters of a neural network,
requiring ever-larger networks to cover more facts. To capture knowledge in a more modular and interpretable way, we
augment language model pre-training with a latent knowledge retriever, which allows the model to retrieve and attend
over documents from a large corpus such as Wikipedia, used during pre-training, fine-tuning and inference. For the
first time, we show how to pre-train such a knowledge retriever in an unsupervised manner, using masked language
modeling as the learning signal and backpropagating through a retrieval step that considers millions of documents. We
demonstrate the effectiveness of Retrieval-Augmented Language Model pre-training (REALM) by fine-tuning on the
challenging task of Open-domain Question Answering (Open-QA). We compare against state-of-the-art models for both
explicit and implicit knowledge storage on three popular Open-QA benchmarks, and find that we outperform all previous
methods by a significant margin (4-16% absolute accuracy), while also providing qualitative benefits such as
interpretability and modularity.*

Comment on lines +34 to +36
Contributor

One more nit for a follow-up PR.

Since this is a complex model, it would be awesome to add a usage example, either here or in the research_projects dir as was done for RAG.
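
A rough illustration of the retrieve-then-read flow that the overview describes, offered as a sketch rather than the model's actual API. It uses toy two-dimensional embeddings and made-up reader probabilities; `inner_product`, `retrieve_top_k` and `marginal_answer_prob` are hypothetical helper names, not functions from `transformers`. The scoring rule (inner product between query and block embeddings, a softmax over relevance scores, then a retrieval-weighted sum of per-block answer probabilities) follows the formulation in the REALM paper, with the softmax approximated here over only the top-k blocks.

```python
import math


def inner_product(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def retrieve_top_k(query_vec, block_vecs, k):
    """Return the indices of the k best-scoring blocks and their retrieval probabilities."""
    scores = [inner_product(query_vec, b) for b in block_vecs]
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # Softmax over the retrieved scores only; subtract the max for numerical stability.
    best = scores[top[0]]
    exps = [math.exp(scores[i] - best) for i in top]
    total = sum(exps)
    return top, [e / total for e in exps]


def marginal_answer_prob(reader_probs, retrieval_probs):
    """Weight each block's answer probability by how likely that block was retrieved."""
    return sum(r * p for r, p in zip(reader_probs, retrieval_probs))


# Toy data (assumed for illustration): three evidence blocks, one query.
block_vecs = [[0.9, 0.1], [0.0, 1.0], [0.7, 0.2]]
query_vec = [1.0, 0.0]

top, retrieval_probs = retrieve_top_k(query_vec, block_vecs, k=2)

# Made-up probabilities that a reader assigns to the correct answer given each retrieved block.
reader_probs = [0.8, 0.5]
marginal = marginal_answer_prob(reader_probs, retrieval_probs)
print(top, retrieval_probs, marginal)
```

In the real model the embeddings come from learned encoders and the reader is a span-prediction head, so this only shows how the retrieval scores and reader outputs are combined.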

This model was contributed by `qqaatw <https://huggingface.co/qqaatw>`__. The original code can be found `here
<https://github.com/google-research/language/tree/master/language/realm>`__.

## RealmConfig

[[autodoc]] RealmConfig

## RealmTokenizer

[[autodoc]] RealmTokenizer
    - build_inputs_with_special_tokens
    - get_special_tokens_mask
    - create_token_type_ids_from_sequences
    - save_vocabulary
    - batch_encode_candidates

## RealmRetriever

[[autodoc]] RealmRetriever

## RealmEmbedder

[[autodoc]] RealmEmbedder
    - forward

## RealmScorer

[[autodoc]] RealmScorer
    - forward

## RealmKnowledgeAugEncoder

[[autodoc]] RealmKnowledgeAugEncoder
    - forward

## RealmReader

[[autodoc]] RealmReader
    - forward

## RealmForOpenQA

[[autodoc]] RealmForOpenQA
    - forward
26 changes: 26 additions & 0 deletions src/transformers/__init__.py
@@ -265,6 +265,7 @@
"models.prophetnet": ["PROPHETNET_PRETRAINED_CONFIG_ARCHIVE_MAP", "ProphetNetConfig", "ProphetNetTokenizer"],
"models.qdqbert": ["QDQBERT_PRETRAINED_CONFIG_ARCHIVE_MAP", "QDQBertConfig"],
"models.rag": ["RagConfig", "RagRetriever", "RagTokenizer"],
"models.realm": ["REALM_PRETRAINED_CONFIG_ARCHIVE_MAP", "RealmConfig", "RealmTokenizer"],
"models.reformer": ["REFORMER_PRETRAINED_CONFIG_ARCHIVE_MAP", "ReformerConfig"],
"models.rembert": ["REMBERT_PRETRAINED_CONFIG_ARCHIVE_MAP", "RemBertConfig"],
"models.retribert": ["RETRIBERT_PRETRAINED_CONFIG_ARCHIVE_MAP", "RetriBertConfig", "RetriBertTokenizer"],
@@ -1199,6 +1200,19 @@
_import_structure["models.rag"].extend(
    ["RagModel", "RagPreTrainedModel", "RagSequenceForGeneration", "RagTokenForGeneration"]
)
_import_structure["models.realm"].extend(
    [
        "REALM_PRETRAINED_MODEL_ARCHIVE_LIST",
        "RealmEmbedder",
        "RealmForOpenQA",
        "RealmKnowledgeAugEncoder",
        "RealmPreTrainedModel",
        "RealmReader",
        "RealmRetriever",
        "RealmScorer",
        "load_tf_weights_in_realm",
    ]
)
_import_structure["models.reformer"].extend(
    [
        "REFORMER_PRETRAINED_MODEL_ARCHIVE_LIST",
@@ -2353,6 +2367,7 @@
from .models.prophetnet import PROPHETNET_PRETRAINED_CONFIG_ARCHIVE_MAP, ProphetNetConfig, ProphetNetTokenizer
from .models.qdqbert import QDQBERT_PRETRAINED_CONFIG_ARCHIVE_MAP, QDQBertConfig
from .models.rag import RagConfig, RagRetriever, RagTokenizer
from .models.realm import REALM_PRETRAINED_CONFIG_ARCHIVE_MAP, RealmConfig, RealmTokenizer
from .models.reformer import REFORMER_PRETRAINED_CONFIG_ARCHIVE_MAP, ReformerConfig
from .models.rembert import REMBERT_PRETRAINED_CONFIG_ARCHIVE_MAP, RemBertConfig
from .models.retribert import RETRIBERT_PRETRAINED_CONFIG_ARCHIVE_MAP, RetriBertConfig, RetriBertTokenizer
@@ -3128,6 +3143,17 @@
    ProphetNetPreTrainedModel,
)
from .models.rag import RagModel, RagPreTrainedModel, RagSequenceForGeneration, RagTokenForGeneration
from .models.realm import (
    REALM_PRETRAINED_MODEL_ARCHIVE_LIST,
    RealmEmbedder,
    RealmForOpenQA,
    RealmKnowledgeAugEncoder,
    RealmPreTrainedModel,
    RealmReader,
    RealmRetriever,
    RealmScorer,
    load_tf_weights_in_realm,
)
from .models.reformer import (
    REFORMER_PRETRAINED_MODEL_ARCHIVE_LIST,
    ReformerAttention,
1 change: 1 addition & 0 deletions src/transformers/models/__init__.py
@@ -84,6 +84,7 @@
prophetnet,
qdqbert,
rag,
realm,
reformer,
rembert,
retribert,
3 changes: 3 additions & 0 deletions src/transformers/models/auto/configuration_auto.py
@@ -30,6 +30,7 @@
CONFIG_MAPPING_NAMES = OrderedDict(
    [
        # Add configs here
        ("realm", "RealmConfig"),
        ("nystromformer", "NystromformerConfig"),
        ("imagegpt", "ImageGPTConfig"),
        ("qdqbert", "QDQBertConfig"),
@@ -117,6 +118,7 @@
CONFIG_ARCHIVE_MAP_MAPPING_NAMES = OrderedDict(
    [
        # Add archive maps here
        ("realm", "REALM_PRETRAINED_CONFIG_ARCHIVE_MAP"),
        ("nystromformer", "NYSTROMFORMER_PRETRAINED_CONFIG_ARCHIVE_MAP"),
        ("imagegpt", "IMAGEGPT_PRETRAINED_CONFIG_ARCHIVE_MAP"),
        ("qdqbert", "QDQBERT_PRETRAINED_CONFIG_ARCHIVE_MAP"),
@@ -192,6 +194,7 @@
MODEL_NAMES_MAPPING = OrderedDict(
    [
        # Add full (and cased) model names here
        ("realm", "Realm"),
        ("nystromformer", "Nystromformer"),
        ("imagegpt", "ImageGPT"),
        ("qdqbert", "QDQBert"),
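
The three hunks above register the `realm` key in string-keyed lookup tables. The following is a minimal sketch of how such tables can be resolved by model type; it is not the library's actual lookup code. `describe_model_type` is a hypothetical helper, and the tables are trimmed to the entries visible in the diff.

```python
from collections import OrderedDict

# Trimmed copies of two tables from the diff; only entries shown there are included.
CONFIG_MAPPING_NAMES = OrderedDict(
    [
        ("realm", "RealmConfig"),
        ("nystromformer", "NystromformerConfig"),
        ("imagegpt", "ImageGPTConfig"),
        ("qdqbert", "QDQBertConfig"),
    ]
)
MODEL_NAMES_MAPPING = OrderedDict(
    [
        ("realm", "Realm"),
        ("nystromformer", "Nystromformer"),
        ("imagegpt", "ImageGPT"),
        ("qdqbert", "QDQBert"),
    ]
)


def describe_model_type(model_type):
    """Hypothetical helper: look up the config class name and display name for a model type."""
    if model_type not in CONFIG_MAPPING_NAMES:
        raise KeyError(f"Unknown model type: {model_type!r}")
    return {
        "config_class": CONFIG_MAPPING_NAMES[model_type],
        "display_name": MODEL_NAMES_MAPPING[model_type],
    }


info = describe_model_type("realm")
print(info)
```

Storing class names as strings rather than class objects lets a table like this be built without importing every model module up front.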