Merge branch 'embeddings-benchmark:main' into main
imenelydiaker authored May 14, 2024
2 parents dfcf815 + d87a920 commit 3c9748d
Showing 65 changed files with 3,224 additions and 182 deletions.
3 changes: 2 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
@@ -242,6 +242,7 @@ You may also want to read and cite the amazing work that has extended MTEB & int
- Shitao Xiao, Zheng Liu, Peitian Zhang, Niklas Muennighoff. "[C-Pack: Packaged Resources To Advance General Chinese Embedding](https://arxiv.org/abs/2309.07597)" arXiv 2023
- Michael Günther, Jackmin Ong, Isabelle Mohr, Alaeddine Abdessalem, Tanguy Abel, Mohammad Kalim Akram, Susana Guzman, Georgios Mastrapas, Saba Sturua, Bo Wang, Maximilian Werk, Nan Wang, Han Xiao. "[Jina Embeddings 2: 8192-Token General-Purpose Text Embeddings for Long Documents](https://arxiv.org/abs/2310.19923)" arXiv 2023
- Silvan Wehrli, Bert Arnrich, Christopher Irrgang. "[German Text Embedding Clustering Benchmark](https://arxiv.org/abs/2401.02709)" arXiv 2024
- Dawei Zhu, Liang Wang, Nan Yang, Yifan Song, Wenhao Wu, Furu Wei, and Sujian Li. "[LongEmbed: Extending Embedding Models for Long Context Retrieval](https://arxiv.org/abs/2404.12096)" arXiv 2024
- Orion Weller, Benjamin Chang, Sean MacAvaney, Kyle Lo, Arman Cohan, Benjamin Van Durme, Dawn Lawrie, Luca Soldaini. "[FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions](https://arxiv.org/abs/2403.15246)" arXiv 2024
- Dawei Zhu, Liang Wang, Nan Yang, Yifan Song, Wenhao Wu, Furu Wei, Sujian Li. "[LongEmbed: Extending Embedding Models for Long Context Retrieval](https://arxiv.org/abs/2404.12096)" arXiv 2024

For works that have used MTEB for benchmarking, you can find them on the [leaderboard](https://huggingface.co/spaces/mteb/leaderboard).
3 changes: 3 additions & 0 deletions docs/mmteb/points.md
@@ -46,6 +46,7 @@ Please also add your first name and last name are as you want them to appear in
| MathieuCiancone | Mathieu | Ciancone | | | Wikit, Lyon, France |
| MartinBernstorff | Martin | Bernstorff | martinbernstorff@gmail.com | ~Martin_Bernstorff1 | Aarhus University, Denmark |
| staoxiao | Shitao | Xiao | 2906698981@qq.com | ~Shitao_Xiao1 | Beijing Academy of Artificial Intelligence |
| ZhengLiu101 | Zheng | Liu | zhengliu1026@gmail.com | ~Zheng_Liu4 | Beijing Academy of Artificial Intelligence |
| achibb | Aaron | Chibb | | | N/A |
| cassanof | Federico | Cassano | federico.cassanno@federico.codes | ~Federico_Cassano1 | Northeastern University, Boston, USA |
| taidnguyen | Nguyen | Tai | taing@seas.upenn.edu | ~Nguyen_Tai1 | University of Pennsylvania |
@@ -75,5 +76,7 @@ Please also add your first name and last name are as you want them to appear in
| SaitejaUtpala | Saiteja | Utpala | saitejautpala@gmail.com | ~Saiteja_Utpala1 | Microsoft Research|
| mmhamdy | Mohammed | Hamdy | mhamdy.res@gmail.com | ~Mohammed_Hamdy1 | Cohere For AI Community|
| jupyterjazz | Saba | Sturua | saba.sturua@jina.ai | ~Saba_Sturua1 | Jina AI |
| kranthigv | Kranthi Kiran | GV | kranthi.gv@nyu.edu | ~Kranthi_Kiran_GV1 | New York University|
| jupyterjazz | Saba | Sturua | saba.sturua@jina.ai | ~Saba_Sturua1 | Jina AI
| shreeya-dhakal | Shreeya | Dhakal | ssdhakal57@gmail.com | | Individual Contributor |
| dipam7 | Dipam | Vasani | dipam44@gmail.com | ~Dipam_Vasani1 | Individual Contributor |
2 changes: 2 additions & 0 deletions docs/mmteb/points/682.jsonl
@@ -0,0 +1,2 @@
{"GitHub": "awinml", "New dataset": 20}
{"GitHub": "KennethEnevoldsen", "Review PR": 2}
3 changes: 3 additions & 0 deletions docs/mmteb/points/690.jsonl
@@ -0,0 +1,3 @@
{"GitHub": "kranthigv", "New dataset": 20}
{"GitHub": "awinml", "Review PR": 2}
{"GitHub": "KennethEnevoldsen", "Review PR": 2}
2 changes: 2 additions & 0 deletions docs/mmteb/points/693.jsonl
@@ -0,0 +1,2 @@
{"GitHub": "awinml", "New dataset": 10}
{"GitHub": "KennethEnevoldsen", "Review PR": 2}
2 changes: 2 additions & 0 deletions docs/mmteb/points/712.jsonl
@@ -0,0 +1,2 @@
{"GitHub": "imenelydiaker", "Bug fixes": 4}
{"GitHub": "Muennighoff", "Review PR": 2}
3 changes: 2 additions & 1 deletion docs/mmteb/points/scores_from_old_system.jsonl
@@ -12,7 +12,8 @@
{"GitHub": "rasdani", "New dataset": 4}
{"GitHub": "PhilipMay", "Review PR": 2}
{"GitHub": "slvnwhrl", "New dataset": 12}
{"GitHub": "staoxiao", "New dataset": 50}
{"GitHub": "staoxiao", "New dataset": 40}
{"GitHub": "ZhengLiu101", "New dataset": 10}
{"GitHub": "NouamaneTazi", "Review PR": 2}
{"GitHub": "rafalposwiata", "New dataset": 32}
{"GitHub": "violenil", "New dataset": 26}
18 changes: 10 additions & 8 deletions docs/mmteb/points_table.md
@@ -4,11 +4,11 @@ _Note_: this table is **autogenerated** and should not be edited. It is intended

| GitHub | New dataset | Review PR | New task | Bug fixes | Dataset annotations | Coordination | Running Models | Total |
|:------------------|--------------:|------------:|-----------:|------------:|----------------------:|---------------:|-----------------:|--------:|
| KennethEnevoldsen | 68 | 172 | 0 | 46 | 9 | 11 | 0 | 306 |
| KennethEnevoldsen | 68 | 178 | 0 | 46 | 9 | 11 | 0 | 312 |
| isaac-chung | 102 | 138 | 0 | 8 | 0 | 4 | 0 | 252 |
| imenelydiaker | 120 | 112 | 0 | 8 | 0 | 0 | 0 | 240 |
| imenelydiaker | 120 | 112 | 0 | 12 | 0 | 0 | 0 | 244 |
| awinml | 202 | 2 | 0 | 0 | 0 | 0 | 0 | 204 |
| davidstap | 176 | 0 | 0 | 0 | 0 | 0 | 0 | 176 |
| awinml | 172 | 0 | 0 | 0 | 0 | 0 | 0 | 172 |
| x-tabdeveloping | 144 | 8 | 12 | 2 | 0 | 1 | 0 | 167 |
| jaygala24 | 149 | 0 | 0 | 0 | 0 | 0 | 0 | 149 |
| jupyterjazz | 108 | 0 | 0 | 0 | 0 | 0 | 0 | 108 |
@@ -21,17 +21,18 @@ _Note_: this table is **autogenerated** and should not be edited. It is intended
| digantamisra98 | 71 | 0 | 0 | 0 | 0 | 0 | 0 | 71 |
| Rysias | 58 | 0 | 0 | 0 | 0 | 0 | 0 | 58 |
| shreeya-dhakal | 46 | 8 | 0 | 0 | 0 | 0 | 0 | 54 |
| staoxiao | 50 | 0 | 0 | 0 | 0 | 0 | 0 | 50 |
| asparius | 34 | 12 | 0 | 0 | 0 | 0 | 0 | 46 |
| Akash190104 | 46 | 0 | 0 | 0 | 0 | 0 | 0 | 46 |
| asparius | 34 | 12 | 0 | 0 | 0 | 0 | 0 | 46 |
| staoxiao | 40 | 0 | 0 | 0 | 0 | 0 | 0 | 40 |
| rafalposwiata | 36 | 0 | 0 | 0 | 0 | 0 | 0 | 36 |
| orionw | 0 | 4 | 10 | 20 | 0 | 0 | 0 | 34 |
| Muennighoff | 0 | 26 | 0 | 0 | 0 | 0 | 0 | 26 |
| Muennighoff | 0 | 28 | 0 | 0 | 0 | 0 | 0 | 28 |
| violenil | 26 | 0 | 0 | 0 | 0 | 0 | 0 | 26 |
| dwzhu-pku | 24 | 0 | 0 | 0 | 0 | 0 | 0 | 24 |
| taeminlee | 22 | 0 | 0 | 0 | 0 | 0 | 0 | 22 |
| rbroc | 20 | 0 | 0 | 0 | 0 | 0 | 0 | 20 |
| mmhamdy | 20 | 0 | 0 | 0 | 0 | 0 | 0 | 20 |
| kranthigv | 20 | 0 | 0 | 0 | 0 | 0 | 0 | 20 |
| Andrian0s | 14 | 4 | 0 | 2 | 0 | 0 | 0 | 20 |
| manandey | 18 | 0 | 0 | 0 | 0 | 0 | 0 | 18 |
| MartinBernstorff | 2 | 8 | 0 | 7 | 0 | 0 | 0 | 17 |
@@ -46,13 +47,14 @@ _Note_: this table is **autogenerated** and should not be edited. It is intended
| ABorghini | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 10 |
| xu3kev | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 10 |
| guangyusong | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 10 |
| ljvmiranda921 | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 10 |
| HLasse | 0 | 0 | 0 | 5 | 5 | 0 | 0 | 10 |
| ZhengLiu101 | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 10 |
| ljvmiranda921 | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 10 |
| bp-high | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 10 |
| cassanof | 8 | 0 | 0 | 1 | 0 | 0 | 1 | 10 |
| loicmagne | 0 | 0 | 0 | 8 | 0 | 0 | 0 | 8 |
| marcobellagente93 | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 6 |
| izhx | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 6 |
| marcobellagente93 | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 6 |
| rasdani | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 4 |
| hanhainebula | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 2 |
| isaac-chung | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 2 |
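
The changes to `points_table.md` follow from the new `points/*.jsonl` files added in this commit. As a rough illustration only, the sketch below sums the JSONL records from files 682, 690, 693 and 712 per contributor and category. It is not the repository's actual table generator; the function name `aggregate_points` and the inlined `NEW_POINT_LINES` string are assumptions made for this example.

```python
import json
from collections import defaultdict

# Records copied from the four points files added in this commit
# (682.jsonl, 690.jsonl, 693.jsonl, 712.jsonl), inlined for illustration.
NEW_POINT_LINES = """\
{"GitHub": "awinml", "New dataset": 20}
{"GitHub": "KennethEnevoldsen", "Review PR": 2}
{"GitHub": "kranthigv", "New dataset": 20}
{"GitHub": "awinml", "Review PR": 2}
{"GitHub": "KennethEnevoldsen", "Review PR": 2}
{"GitHub": "awinml", "New dataset": 10}
{"GitHub": "KennethEnevoldsen", "Review PR": 2}
{"GitHub": "imenelydiaker", "Bug fixes": 4}
{"GitHub": "Muennighoff", "Review PR": 2}
"""


def aggregate_points(lines):
    """Sum points per GitHub user and category (hypothetical helper)."""
    totals = defaultdict(lambda: defaultdict(int))
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        user = record.pop("GitHub")
        # Every remaining key is treated as a points category.
        for category, points in record.items():
            totals[user][category] += points
    return {user: dict(categories) for user, categories in totals.items()}


deltas = aggregate_points(NEW_POINT_LINES.splitlines())
for user, categories in sorted(deltas.items(), key=lambda kv: -sum(kv[1].values())):
    print(user, categories, "total:", sum(categories.values()))
```

These sums line up with the table changes in this diff: `awinml` gains 30 "New dataset" and 2 "Review PR" points (172 to 202, 0 to 2), `KennethEnevoldsen` gains 6 "Review PR" points (172 to 178), `imenelydiaker` gains 4 "Bug fixes" points (8 to 12), `Muennighoff` gains 2 "Review PR" points (26 to 28), and `kranthigv` appears as a new row with 20 points.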