Seems like the implementation in benchmarks.py uses the STSBenchmarkMultilingualSTS task, but the version on the current MTEB leaderboard does not use this task. This makes the scores between v1 and v2 incompatible.
I see two solutions:
1. removing it from benchmarks.py
2. declaring the old leaderboard incorrect and changing nothing (this will lead to a few cases where top models in v1 do not appear at the top in v2, as they don't have the score for this task)
@rafalposwiata I will leave it to you to decide between the two. I would probably prefer option 1.
The STSBenchmarkMultilingualSTS task was not previously included in PL-MTEB, so it can be removed.
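For concreteness, a minimal sketch of what the removal might look like in benchmarks.py, assuming the PL-MTEB benchmark is built with mteb's Benchmark dataclass and the get_tasks helper (the surrounding task names are illustrative, not the full PL-MTEB task list):

```python
from mteb import Benchmark, get_tasks

# Illustrative sketch only: the real definition lists all PL-MTEB tasks.
PL_MTEB = Benchmark(
    name="PL-MTEB",
    tasks=get_tasks(
        languages=["pol"],
        tasks=[
            "SICK-R-PL",  # illustrative neighbours in the task list
            "CDSC-R",
            # "STSBenchmarkMultilingualSTS",  # dropped: never part of the original PL-MTEB
        ],
    ),
)
```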
I think the results for clustering tasks need to be verified, as v2 versions of the tasks have appeared. There may be an incompatibility here. I will check it out.
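As a quick way to spot the v2 variants, one could list the Polish clustering tasks registered in mteb and compare the names by eye; a rough sketch, assuming get_tasks accepts the task_types and languages filters as below:

```python
import mteb

# List Polish clustering tasks so v1 names and any ".v2" variants are visible side by side.
tasks = mteb.get_tasks(task_types=["Clustering"], languages=["pol"])
for task in sorted(tasks, key=lambda t: t.metadata.name):
    print(task.metadata.name)
```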
Related to #1867