
Enable analyzing evaluators/annotators on data without multiple generator models #293

Merged (6 commits) on Apr 28, 2024

Conversation

@rdnfn (Contributor) commented Apr 27, 2024

Currently, alpaca_eval.main.analyze_evaluators can only analyze evaluators/annotators on data (like the original AlpacaEval dataset) that contains outputs from more than one generator model. If a dataset contains only a single generator model, computing the (Spearman/Pearson) correlation between the winrates of the generator models under different annotators fails and throws an error, because there are not enough values to correlate.
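To illustrate the failure mode (assuming a scipy.stats backend, which may not match the library's actual call site): with one generator model there is only one winrate per annotator, and a correlation over a single pair of values is undefined, so scipy raises rather than returning a result.

```python
import scipy.stats

# A single generator model yields one winrate per annotator; correlating
# length-1 sequences is undefined and scipy refuses them outright.
scipy.stats.pearsonr([0.5], [0.5])
# raises ValueError: x and y must have length at least 2
```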

This PR makes the correlation computation optional: if the winrate correlation computation fails, np.nan values are returned instead and a warning is logged. All other computed metrics are still returned, and no error is thrown. This allows analyzing evaluators on new kinds of data without multiple generator models. Other metrics, such as human agreement, can still be computed correctly in this case (correct me if I am wrong about this).
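For illustration, here is a minimal sketch of the fail-soft pattern (hypothetical helper name and structure; the actual change lives inside alpaca_eval's analysis code and may be organized differently):

```python
import logging

import numpy as np
import scipy.stats


def safe_winrate_correlations(winrates_a, winrates_b):
    """Return (spearman, pearson) winrate correlations, or nans if undefined.

    Hypothetical helper illustrating the PR's behavior: if the correlation
    cannot be computed (e.g. only one generator model in the data), log a
    warning and return np.nan instead of raising.
    """
    try:
        spearman = scipy.stats.spearmanr(winrates_a, winrates_b).correlation
        pearson = scipy.stats.pearsonr(winrates_a, winrates_b)[0]
    except ValueError as err:
        logging.warning(
            "Could not compute winrate correlation (likely a single "
            "generator model in the data): %s. Returning np.nan.", err
        )
        spearman = pearson = np.nan
    return spearman, pearson
```

With too few values, scipy raises a ValueError, which the except clause converts into the nan-plus-warning behavior described above, so the remaining metrics can still be computed and returned.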

@YannDubs (Collaborator) commented: LGTM, thanks @rdnfn!

@YannDubs merged commit c4a4ca7 into tatsu-lab:main on Apr 28, 2024
1 check passed