Revert "[Model Card] new cross lingual sentence model for German and English (huggingface#8026)"

This reverts commit 7d615ad.
fabiocapsouza · Nov 15, 2020 · 40292f8
1 parent 3c50138
commit 40292f8

Showing 3 changed files with 13 additions and 116 deletions.
diff --git a/model_cards/T-Systems-onsite/bert-german-dbmdz-uncased-sentence-stsb/README.md b/model_cards/T-Systems-onsite/bert-german-dbmdz-uncased-sentence-stsb/README.md
@@ -4,6 +4,4 @@ license: mit
 ---
 
 # bert-german-dbmdz-uncased-sentence-stsb
-**This model is outdated!**
-
-The new [T-Systems-onsite/cross-en-de-roberta-sentence-transformer](https://huggingface.co/T-Systems-onsite/cross-en-de-roberta-sentence-transformer) model is better for German language. It is also the current best model for English language and works cross-lingually. Please consider using that model.
+**This model is outdated! Please use this improved version: <https://huggingface.co/T-Systems-onsite/german-roberta-sentence-transformer-v2>**
diff --git a/model_cards/T-Systems-onsite/cross-en-de-roberta-sentence-transformer/README.md b/model_cards/T-Systems-onsite/cross-en-de-roberta-sentence-transformer/README.md
diff --git a/model_cards/T-Systems-onsite/german-roberta-sentence-transformer-v2/README.md b/model_cards/T-Systems-onsite/german-roberta-sentence-transformer-v2/README.md
@@ -1,31 +1,16 @@
 ---
 language: de
 license: mit
-tags:
-- sentence_embedding
-- search
-- pytorch 
-- xlm-roberta 
-- roberta
-- xlm-r-distilroberta-base-paraphrase-v1
-- paraphrase
-datasets:
-- STSbenchmark
-metrics:
-- Spearman’s rank correlation
-- cosine similarity
 ---
 
 # German RoBERTa for Sentence Embeddings V2
-**The new [T-Systems-onsite/cross-en-de-roberta-sentence-transformer](https://huggingface.co/T-Systems-onsite/cross-en-de-roberta-sentence-transformer) model is slightly better for German language. It is also the current best model for English language and works cross-lingually. Please consider using that model.**
-
 This model is intended to [compute sentence (text embeddings)](https://www.sbert.net/docs/usage/computing_sentence_embeddings.html) for German text. These embeddings can then be compared with [cosine-similarity](https://en.wikipedia.org/wiki/Cosine_similarity) to find sentences with a similar semantic meaning. For example this can be useful for [semantic textual similarity](https://www.sbert.net/docs/usage/semantic_textual_similarity.html), [semantic search](https://www.sbert.net/docs/usage/semantic_search.html), or [paraphrase mining](https://www.sbert.net/docs/usage/paraphrase_mining.html). To do this you have to use the [Sentence Transformers Python framework](https://github.com/UKPLab/sentence-transformers).
 
-> Sentence-BERT (SBERT) is a  modification  of  the  pretrained BERT network that use siamese and triplet network structures to derive semantically meaningful sentence embeddings that can be compared using cosine-similarity. This reduces the effort for finding the most similar pair from 65hours with BERT / RoBERTa to about 5 seconds with SBERT, while maintaining the accuracy from BERT.
+> Sentence-BERT (SBERT) is a  modification  of  the  pretrained BERT network that use siamese and triplet network structures to derive semantically mean-ingful sentence embeddings that can be compared using cosine-similarity. This reduces the effort for finding the most similar pair from 65hours with BERT / RoBERTa to about 5 seconds with SBERT, while maintaining the accuracy from BERT.
 
 Source: [Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks](https://arxiv.org/abs/1908.10084)
 
-This model is fine-tuned from [Philip May](https://eniak.de/) and open-sourced by [T-Systems-onsite](https://www.t-systems-onsite.de/). Special thanks to [Nils Reimers](https://www.nils-reimers.de/) for your awesome open-source work, the Sentence Transformers, the models and your help on GitHub.
+This model is fine-tuned from [Philip May](https://eniak.de/) and open-sourced by [T-Systems-onsite](https://www.t-systems-onsite.de/). Special thanks to [Nils Reimers](https://www.nils-reimers.de/) for your awesome open-source work, the Sentence Transformers, the models and all your help on GitHub.
 
 ## How to use
 **The usage description above - provided by Hugging Face - is wrong for sentence embeddings! Please use this:**
@@ -43,7 +28,7 @@ For details of usage and examples see here:
 - [Paraphrase Mining](https://www.sbert.net/docs/usage/paraphrase_mining.html)
 - [Semantic Search](https://www.sbert.net/docs/usage/semantic_search.html)
 - [Cross-Encoders](https://www.sbert.net/docs/usage/cross-encoder.html)
-- [Examples on GitHub](https://github.com/UKPLab/sentence-transformers/tree/master/examples)
+- [Examples on GitHub](https://github.com/UKPLab/sentence-transformers/tree/master/examples/applications)
 
 ## Training
 The base model is [xlm-roberta-base](https://huggingface.co/xlm-roberta-base). This model has been further trained by [Nils Reimers](https://www.nils-reimers.de/) on a large scale paraphrase dataset for 50+ languages. [Nils Reimers](https://www.nils-reimers.de/) about this [on GitHub](https://github.com/UKPLab/sentence-transformers/issues/509#issuecomment-712243280):
@@ -58,25 +43,24 @@ The base model is [xlm-roberta-base](https://huggingface.co/xlm-roberta-base). T
 
 The resulting model called `xlm-r-distilroberta-base-paraphrase-v1` has been released here: <https://github.com/UKPLab/sentence-transformers/releases/tag/v0.3.8>
 
-Building on this cross language model we fine-tuned it for German language on the [deepl.com](https://www.deepl.com/translator) dataset of our [German STSbenchmark dataset](https://github.com/t-systems-on-site-services-gmbh/german-STSbenchmark).
+Building on this cross language model we fine-tuned it for German language on the deepl.com dataset of our [German STSbenchmark dataset](https://github.com/t-systems-on-site-services-gmbh/german-STSbenchmark).
 
-We did an automatic hyperparameter search for 102 trials with [Optuna](https://github.com/optuna/optuna). Using 10-fold crossvalidation on the deepl.com test and dev dataset we found the following best hyperparameters:
+We did an automatic hyperprameter search for 102 trials with [Optuna](https://github.com/optuna/optuna). Using crossvalidation on the deepl.com test and dev dataset we found the following best hyperprameters:
 - batch_size = 15
 - num_epochs = 4
 - lr = 2.2995320905210864e-05
 - eps = 1.8979875906303792e-06
 - weight_decay = 0.003314045812507563
 - warmup_steps_proportion = 0.46141685205829014
 
-The final model was trained with these hyperparameters on the combination of `sts_de_train.csv` and `sts_de_dev.csv`. The `sts_de_test.csv` was left for testing.
+The final model was trained with these hyperparameters on the combination of `sts_de_train.csv` and `sts_de_dev.csv`. The `sts_de_test.csv` was left for testing. The AWS dataset has not been used.
 
 # Evaluation
 The evaluation has been done on the test set of our [German STSbenchmark dataset](https://github.com/t-systems-on-site-services-gmbh/german-STSbenchmark). The code is available on [Colab](https://colab.research.google.com/drive/1aCWOqDQx953kEnQ5k4Qn7uiixokocOHv?usp=sharing). As the metric for evaluation we use the Spearman’s rank correlation between the  cosine-similarity of the sentence embeddings and STSbenchmark labels.
 
-| Model Name                           | Spearman rank correlation<br/>(German)           |
-|--------------------------------------|-------------------------------------|
-| xlm-r-distilroberta-base-paraphrase-v1                        | 0.8079     |
-| xlm-r-100langs-bert-base-nli-stsb-mean-tokens                 | 0.8194     |
-| xlm-r-bert-base-nli-stsb-mean-tokens                          | 0.8194     |
-| **T-Systems-onsite/<br/>german-roberta-sentence-transformer-v2**   | **0.8529** |
-| **[T-Systems-onsite/<br/>cross-en-de-roberta-sentence-transformer](https://huggingface.co/T-Systems-onsite/cross-en-de-roberta-sentence-transformer)** | **0.8550** |
+| Model Name                           | Spearman rank correlation         |
+|--------------------------------------|-----------------------------------|
+| xlm-r-distilroberta-base-paraphrase-v1                      | 0.8079     |
+| xlm-r-100langs-bert-base-nli-stsb-mean-tokens               | 0.8194     |
+| xlm-r-bert-base-nli-stsb-mean-tokens                        | 0.8194     |
+| **T-Systems-onsite/german-roberta-sentence-transformer-v2** | **0.8529** |
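
The Evaluation section of the restored model card scores each model by the Spearman rank correlation between the cosine similarity of the sentence embeddings and the STSbenchmark labels. The following is a minimal sketch of that metric only, not the code from the linked Colab notebook. It assumes short hand-written vectors in place of real model embeddings, and the `pairs` and `gold` values are hypothetical toy data chosen for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]]
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data: one (embedding_a, embedding_b) tuple per sentence
# pair, with made-up gold similarity labels on a 0-5 scale.
pairs = [
    ([1.0, 0.0], [1.0, 0.1]),
    ([1.0, 0.0], [0.5, 0.5]),
    ([1.0, 0.0], [0.0, 1.0]),
    ([1.0, 0.0], [-1.0, 0.2]),
]
gold = [4.8, 3.0, 1.5, 0.2]

predicted = [cosine_similarity(a, b) for a, b in pairs]
rho = spearman(predicted, gold)
print(f"Spearman rank correlation: {rho:.4f}")
```

Because Spearman's correlation depends only on rank order, the toy cosine similarities and labels above, which fall in the same order, give a correlation of 1.0. The values in the table come from real embeddings on the German STSbenchmark test set, where the two orderings agree only partially.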