Polish leaderboard and benchmark does not match #1917

Closed
KennethEnevoldsen opened this issue Feb 1, 2025 · 2 comments · Fixed by #1956
Labels
leaderboard issues related to the leaderboard

Comments

@KennethEnevoldsen
Contributor

KennethEnevoldsen commented Feb 1, 2025

Seems like the implementation in benchmarks.py uses the STSBenchmarkMultilingualSTS task, but the version on the current MTEB leaderboard does not use this task. This makes the scores between v1 and v2 incompatible.

I see two solutions:

  1. Remove the task from benchmarks.py.
  2. Treat the old leaderboard as incorrect and change nothing (this will lead to a few cases where top models in v1 appear at the top in v2 because they don't have a score for this task).

@rafalposwiata I will leave it to you to decide between the two. I would probably prefer 1.
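A mismatch like this can be spotted by diffing the two task lists. The snippet below is a minimal sketch, not the mteb API: `diff_task_lists` is a hypothetical helper, and the task lists are placeholders (only STSBenchmarkMultilingualSTS comes from this issue), so the real lists from benchmarks.py and the v1 leaderboard would need to be substituted.

```python
def diff_task_lists(benchmark_tasks, leaderboard_tasks):
    """Return the task names that appear in only one of the two lists."""
    benchmark = set(benchmark_tasks)
    leaderboard = set(leaderboard_tasks)
    return {
        "only_in_benchmark": sorted(benchmark - leaderboard),
        "only_in_leaderboard": sorted(leaderboard - benchmark),
    }


# Placeholder task lists (assumption): NOT the real PL-MTEB definitions.
benchmark_tasks = ["ExampleTaskA", "ExampleTaskB", "STSBenchmarkMultilingualSTS"]
leaderboard_tasks = ["ExampleTaskA", "ExampleTaskB"]

result = diff_task_lists(benchmark_tasks, leaderboard_tasks)
print(result)
```

Any name listed under `only_in_benchmark` is a task that would make v2 averages incomparable with v1 unless it is removed or the old scores are backfilled.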

Related to #1867.

@KennethEnevoldsen
Contributor Author

I will assume that the previous leaderboard is correct and remove STSBenchmarkMultilingualSTS from benchmarks.py.

@rafalposwiata
Contributor

The STSBenchmarkMultilingualSTS task was not previously included in PL-MTEB, so it can be removed.

I think the results for clustering tasks need to be verified, as v2 versions of the tasks have appeared. There may be an incompatibility here. I will check it out.
