From 6a8b5f7ff6d37a5e1d4831467733493d0cc88dae Mon Sep 17 00:00:00 2001
From: Branden Chan
Date: Thu, 22 Oct 2020 11:55:31 +0200
Subject: [PATCH 1/3] add readme

---
 model_cards/deepset/gbert-base/README.md | 42 ++++++++++++++++++++++++
 1 file changed, 42 insertions(+)
 create mode 100644 model_cards/deepset/gbert-base/README.md

diff --git a/model_cards/deepset/gbert-base/README.md b/model_cards/deepset/gbert-base/README.md
new file mode 100644
index 00000000000000..5d87b5b4e3d02c
--- /dev/null
+++ b/model_cards/deepset/gbert-base/README.md
@@ -0,0 +1,42 @@
+# German BERT base
+
+Released in October 2020, this is a German BERT language model trained collaboratively by the makers of the original German BERT (aka "bert-base-german-cased") and the dbmdz BERT (aka "bert-base-german-dbmdz-cased"). In our [paper](https://arxiv.org/pdf/2010.10906.pdf), we outline the steps taken to train our model and show that it outperforms its predecessors.
+
+## Overview
+**Paper:** [here](https://arxiv.org/pdf/2010.10906.pdf)
+**Architecture:** BERT base
+**Language:** German
+
+## Performance
+```
+GermEval18 Coarse: 78.17
+GermEval18 Fine: 50.90
+GermEval14: 87.98
+```
+
+See also:
+deepset/gbert-base
+deepset/gbert-large
+deepset/gelectra-base
+deepset/gelectra-large
+deepset/gelectra-base-generator
+deepset/gelectra-large-generator
+
+## Authors
+Branden Chan: `branden.chan [at] deepset.ai`
+Stefan Schweter: `stefan [at] schweter.eu`
+Timo Möller: `timo.moeller [at] deepset.ai`
+
+## About us
+![deepset logo](https://raw.githubusercontent.com/deepset-ai/FARM/master/docs/img/deepset_logo.png)
+
+We bring NLP to the industry via open source!
+Our focus: Industry-specific language models & large-scale QA systems.
+
+Some of our work:
+- [German BERT (aka "bert-base-german-cased")](https://deepset.ai/german-bert)
+- [FARM](https://github.com/deepset-ai/FARM)
+- [Haystack](https://github.com/deepset-ai/haystack/)
+
+Get in touch:
+[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Website](https://deepset.ai)

From 083570a2c6d8c8f359ca3e5d88c2a2deaed891b5 Mon Sep 17 00:00:00 2001
From: Branden Chan
Date: Thu, 22 Oct 2020 11:59:01 +0200
Subject: [PATCH 2/3] add readmes

---
 model_cards/deepset/gbert-large/README.md     | 44 ++++++++++++++++++
 .../deepset/gelectra-base-generator/README.md | 37 +++++++++++++++
 model_cards/deepset/gelectra-base/README.md   | 42 +++++++++++++++++
 .../gelectra-large-generator/README.md        | 46 +++++++++++++++++++
 model_cards/deepset/gelectra-large/README.md  | 42 +++++++++++++++++
 5 files changed, 211 insertions(+)
 create mode 100644 model_cards/deepset/gbert-large/README.md
 create mode 100644 model_cards/deepset/gelectra-base-generator/README.md
 create mode 100644 model_cards/deepset/gelectra-base/README.md
 create mode 100644 model_cards/deepset/gelectra-large-generator/README.md
 create mode 100644 model_cards/deepset/gelectra-large/README.md

diff --git a/model_cards/deepset/gbert-large/README.md b/model_cards/deepset/gbert-large/README.md
new file mode 100644
index 00000000000000..61466ab39425c6
--- /dev/null
+++ b/model_cards/deepset/gbert-large/README.md
@@ -0,0 +1,44 @@
+# German BERT large
+
+Released in October 2020, this is a German BERT language model trained collaboratively by the makers of the original German BERT (aka "bert-base-german-cased") and the dbmdz BERT (aka "bert-base-german-dbmdz-cased"). In our [paper](https://arxiv.org/pdf/2010.10906.pdf), we outline the steps taken to train our model and show that it outperforms its predecessors.
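+
+As a quick usage sketch (illustrative only, not part of our evaluation setup), the model can be queried through the Hugging Face `transformers` fill-mask pipeline; the example sentence is an assumption for demonstration:
+
+```python
+from transformers import pipeline
+
+# Load the masked-language-modeling head of German BERT large
+fill_mask = pipeline("fill-mask", model="deepset/gbert-large")
+
+# Print the top predictions for the masked position
+print(fill_mask("Die Hauptstadt von Deutschland ist [MASK]."))
+```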
+
+## Overview
+**Paper:** [here](https://arxiv.org/pdf/2010.10906.pdf)
+**Architecture:** BERT large
+**Language:** German
+
+## Performance
+```
+GermEval18 Coarse: 80.08
+GermEval18 Fine: 52.48
+GermEval14: 88.16
+```
+
+See also:
+deepset/gbert-base
+deepset/gbert-large
+deepset/gelectra-base
+deepset/gelectra-large
+deepset/gelectra-base-generator
+deepset/gelectra-large-generator
+
+## Authors
+Branden Chan: `branden.chan [at] deepset.ai`
+Stefan Schweter: `stefan [at] schweter.eu`
+Timo Möller: `timo.moeller [at] deepset.ai`
+
+## About us
+![deepset logo](https://raw.githubusercontent.com/deepset-ai/FARM/master/docs/img/deepset_logo.png)
+
+We bring NLP to the industry via open source!
+Our focus: Industry-specific language models & large-scale QA systems.
+
+Some of our work:
+- [German BERT (aka "bert-base-german-cased")](https://deepset.ai/german-bert)
+- [FARM](https://github.com/deepset-ai/FARM)
+- [Haystack](https://github.com/deepset-ai/haystack/)
+
+Get in touch:
+[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Website](https://deepset.ai)
+
+
diff --git a/model_cards/deepset/gelectra-base-generator/README.md b/model_cards/deepset/gelectra-base-generator/README.md
new file mode 100644
index 00000000000000..54b0119abd8977
--- /dev/null
+++ b/model_cards/deepset/gelectra-base-generator/README.md
@@ -0,0 +1,37 @@
+# German ELECTRA base generator
+
+Released in October 2020, this is the generator component of the German ELECTRA language model trained collaboratively by the makers of the original German BERT (aka "bert-base-german-cased") and the dbmdz BERT (aka "bert-base-german-dbmdz-cased"). In our [paper](https://arxiv.org/pdf/2010.10906.pdf), we outline the steps taken to train our model.
+
+The generator is useful for performing masking experiments. If you are looking for a regular language model for embedding extraction or downstream tasks like NER, classification or QA, please use deepset/gelectra-base.
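+
+A minimal masking sketch with the `transformers` fill-mask pipeline (the example sentence is illustrative):
+
+```python
+from transformers import pipeline
+
+# The generator is a small masked language model, so it works with fill-mask directly
+fill_mask = pipeline("fill-mask", model="deepset/gelectra-base-generator")
+
+print(fill_mask("Heute ist ein schöner [MASK]."))
+```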
+
+## Overview
+**Paper:** [here](https://arxiv.org/pdf/2010.10906.pdf)
+**Architecture:** ELECTRA base (generator)
+**Language:** German
+
+See also:
+deepset/gbert-base
+deepset/gbert-large
+deepset/gelectra-base
+deepset/gelectra-large
+deepset/gelectra-base-generator
+deepset/gelectra-large-generator
+
+## Authors
+Branden Chan: `branden.chan [at] deepset.ai`
+Stefan Schweter: `stefan [at] schweter.eu`
+Timo Möller: `timo.moeller [at] deepset.ai`
+
+## About us
+![deepset logo](https://raw.githubusercontent.com/deepset-ai/FARM/master/docs/img/deepset_logo.png)
+
+We bring NLP to the industry via open source!
+Our focus: Industry-specific language models & large-scale QA systems.
+
+Some of our work:
+- [German BERT (aka "bert-base-german-cased")](https://deepset.ai/german-bert)
+- [FARM](https://github.com/deepset-ai/FARM)
+- [Haystack](https://github.com/deepset-ai/haystack/)
+
+Get in touch:
+[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Website](https://deepset.ai)
diff --git a/model_cards/deepset/gelectra-base/README.md b/model_cards/deepset/gelectra-base/README.md
new file mode 100644
index 00000000000000..ed4b196bb2362a
--- /dev/null
+++ b/model_cards/deepset/gelectra-base/README.md
@@ -0,0 +1,42 @@
+# German ELECTRA base
+
+Released in October 2020, this is a German ELECTRA language model trained collaboratively by the makers of the original German BERT (aka "bert-base-german-cased") and the dbmdz BERT (aka "bert-base-german-dbmdz-cased"). In our [paper](https://arxiv.org/pdf/2010.10906.pdf), we outline the steps taken to train our model. Our evaluation suggests that this model is somewhat undertrained. For best performance from a base-sized model, we recommend deepset/gbert-base.
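+
+For embedding extraction, a minimal sketch (mean pooling over tokens is one common choice, not a recommendation from our paper):
+
+```python
+import torch
+from transformers import AutoModel, AutoTokenizer
+
+tokenizer = AutoTokenizer.from_pretrained("deepset/gelectra-base")
+model = AutoModel.from_pretrained("deepset/gelectra-base")
+
+inputs = tokenizer("Ein deutscher Beispielsatz.", return_tensors="pt")
+with torch.no_grad():
+    outputs = model(**inputs)
+
+# Mean-pool the token embeddings into a single sentence vector
+embedding = outputs.last_hidden_state.mean(dim=1)
+```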
+
+## Overview
+**Paper:** [here](https://arxiv.org/pdf/2010.10906.pdf)
+**Architecture:** ELECTRA base (discriminator)
+**Language:** German
+
+## Performance
+```
+GermEval18 Coarse: 76.02
+GermEval18 Fine: 42.22
+GermEval14: 86.02
+```
+
+See also:
+deepset/gbert-base
+deepset/gbert-large
+deepset/gelectra-base
+deepset/gelectra-large
+deepset/gelectra-base-generator
+deepset/gelectra-large-generator
+
+## Authors
+Branden Chan: `branden.chan [at] deepset.ai`
+Stefan Schweter: `stefan [at] schweter.eu`
+Timo Möller: `timo.moeller [at] deepset.ai`
+
+## About us
+![deepset logo](https://raw.githubusercontent.com/deepset-ai/FARM/master/docs/img/deepset_logo.png)
+
+We bring NLP to the industry via open source!
+Our focus: Industry-specific language models & large-scale QA systems.
+
+Some of our work:
+- [German BERT (aka "bert-base-german-cased")](https://deepset.ai/german-bert)
+- [FARM](https://github.com/deepset-ai/FARM)
+- [Haystack](https://github.com/deepset-ai/haystack/)
+
+Get in touch:
+[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Website](https://deepset.ai)
diff --git a/model_cards/deepset/gelectra-large-generator/README.md b/model_cards/deepset/gelectra-large-generator/README.md
new file mode 100644
index 00000000000000..7b4b7a3b19a6c5
--- /dev/null
+++ b/model_cards/deepset/gelectra-large-generator/README.md
@@ -0,0 +1,46 @@
+# German ELECTRA large generator
+
+Released in October 2020, this is the generator component of the German ELECTRA language model trained collaboratively by the makers of the original German BERT (aka "bert-base-german-cased") and the dbmdz BERT (aka "bert-base-german-dbmdz-cased"). In our [paper](https://arxiv.org/pdf/2010.10906.pdf), we outline the steps taken to train our model.
+
+The generator is useful for performing masking experiments. If you are looking for a regular language model for embedding extraction or downstream tasks like NER, classification or QA, please use deepset/gelectra-large.
+
+## Overview
+**Paper:** [here](https://arxiv.org/pdf/2010.10906.pdf)
+**Architecture:** ELECTRA large (generator)
+**Language:** German
+
+## Performance
+```
+GermEval18 Coarse: 80.70
+GermEval18 Fine: 55.16
+GermEval14: 88.95
+```
+
+See also:
+deepset/gbert-base
+deepset/gbert-large
+deepset/gelectra-base
+deepset/gelectra-large
+deepset/gelectra-base-generator
+deepset/gelectra-large-generator
+
+## Authors
+Branden Chan: `branden.chan [at] deepset.ai`
+Stefan Schweter: `stefan [at] schweter.eu`
+Timo Möller: `timo.moeller [at] deepset.ai`
+
+## About us
+![deepset logo](https://raw.githubusercontent.com/deepset-ai/FARM/master/docs/img/deepset_logo.png)
+
+We bring NLP to the industry via open source!
+Our focus: Industry-specific language models & large-scale QA systems.
+
+Some of our work:
+- [German BERT (aka "bert-base-german-cased")](https://deepset.ai/german-bert)
+- [FARM](https://github.com/deepset-ai/FARM)
+- [Haystack](https://github.com/deepset-ai/haystack/)
+
+Get in touch:
+[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Website](https://deepset.ai)
+
+
diff --git a/model_cards/deepset/gelectra-large/README.md b/model_cards/deepset/gelectra-large/README.md
new file mode 100644
index 00000000000000..cf14f8e320c2f4
--- /dev/null
+++ b/model_cards/deepset/gelectra-large/README.md
@@ -0,0 +1,42 @@
+# German ELECTRA large
+
+Released in October 2020, this is a German ELECTRA language model trained collaboratively by the makers of the original German BERT (aka "bert-base-german-cased") and the dbmdz BERT (aka "bert-base-german-dbmdz-cased"). In our [paper](https://arxiv.org/pdf/2010.10906.pdf), we outline the steps taken to train our model and show that it is the state-of-the-art German language model.
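+
+A minimal downstream fine-tuning sketch (the two-label setup and task are illustrative assumptions, not our evaluation configuration):
+
+```python
+from transformers import AutoModelForSequenceClassification, AutoTokenizer
+
+tokenizer = AutoTokenizer.from_pretrained("deepset/gelectra-large")
+model = AutoModelForSequenceClassification.from_pretrained(
+    "deepset/gelectra-large", num_labels=2  # e.g. a binary classification task
+)
+# From here, train with your framework of choice, e.g. the transformers Trainer.
+```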
diff --git a/model_cards/deepset/gbert-large/README.md b/model_cards/deepset/gbert-large/README.md index 61466ab39425c6..a8aea0d6c20630 100644 --- a/model_cards/deepset/gbert-large/README.md +++ b/model_cards/deepset/gbert-large/README.md @@ -1,3 +1,13 @@ +--- +language: de +license: mit +datasets: +- wikipedia +- OPUS +- OpenLegalData +- OSCAR +--- + # German BERT large Released, Oct 2020, this is a German BERT language model trained collaboratively by the makers of the original German BERT (aka "bert-base-german-cased") and the dbmdz BERT (aka bert-base-german-dbmdz-cased). In our [paper](https://arxiv.org/pdf/2010.10906.pdf), we outline the steps taken to train our model and show that it outperforms its predecessors. diff --git a/model_cards/deepset/gelectra-base-generator/README.md b/model_cards/deepset/gelectra-base-generator/README.md index 54b0119abd8977..ed7ee78e51fb53 100644 --- a/model_cards/deepset/gelectra-base-generator/README.md +++ b/model_cards/deepset/gelectra-base-generator/README.md @@ -1,3 +1,12 @@ +--- +language: de +license: mit +datasets: +- wikipedia +- OPUS +- OpenLegalData +--- + # German ELECTRA base generator Released, Oct 2020, this is the generator component of the German ELECTRA language model trained collaboratively by the makers of the original German BERT (aka "bert-base-german-cased") and the dbmdz BERT (aka bert-base-german-dbmdz-cased). In our [paper](https://arxiv.org/pdf/2010.10906.pdf), we outline the steps taken to train our model. diff --git a/model_cards/deepset/gelectra-base/README.md b/model_cards/deepset/gelectra-base/README.md index ed4b196bb2362a..a0b2e2f0ed8dd4 100644 --- a/model_cards/deepset/gelectra-base/README.md +++ b/model_cards/deepset/gelectra-base/README.md @@ -1,3 +1,12 @@ +--- +language: de +license: mit +datasets: +- wikipedia +- OPUS +- OpenLegalData +--- + # German ELECTRA base Released, Oct 2020, this is a German ELECTRA language model trained collaboratively by the makers of the original German BERT (aka "bert-base-german-cased") and the dbmdz BERT (aka bert-base-german-dbmdz-cased). In our [paper](https://arxiv.org/pdf/2010.10906.pdf), we outline the steps taken to train our model. Our evaluation suggests that this model is somewhat undertrained. For best performance from a base sized model, we recommend deepset/gbert-base diff --git a/model_cards/deepset/gelectra-large-generator/README.md b/model_cards/deepset/gelectra-large-generator/README.md index 7b4b7a3b19a6c5..606e332547aa13 100644 --- a/model_cards/deepset/gelectra-large-generator/README.md +++ b/model_cards/deepset/gelectra-large-generator/README.md @@ -1,3 +1,13 @@ +--- +language: de +license: mit +datasets: +- wikipedia +- OPUS +- OpenLegalData +- OSCAR +--- + # German ELECTRA large generator Released, Oct 2020, this is the generator component of the German ELECTRA language model trained collaboratively by the makers of the original German BERT (aka "bert-base-german-cased") and the dbmdz BERT (aka bert-base-german-dbmdz-cased). In our [paper](https://arxiv.org/pdf/2010.10906.pdf), we outline the steps taken to train our model. 
diff --git a/model_cards/deepset/gelectra-large/README.md b/model_cards/deepset/gelectra-large/README.md index cf14f8e320c2f4..a76f8a928daccf 100644 --- a/model_cards/deepset/gelectra-large/README.md +++ b/model_cards/deepset/gelectra-large/README.md @@ -1,3 +1,13 @@ +--- +language: de +license: mit +datasets: +- wikipedia +- OPUS +- OpenLegalData +- OSCAR +--- + # German ELECTRA large Released, Oct 2020, this is a German ELECTRA language model trained collaboratively by the makers of the original German BERT (aka "bert-base-german-cased") and the dbmdz BERT (aka bert-base-german-dbmdz-cased). In our [paper](https://arxiv.org/pdf/2010.10906.pdf), we outline the steps taken to train our model and show that this is the state of the art German language model.