Add openbuddy-llama-30b-v7.1 to AlpacaEval #108

44670 · 2023-08-02T09:52:35Z

We are pleased to submit our results to the leaderboard, with a win_rate of 81.55.

However, we encountered a problem: the run did not update the leaderboard CSV file in the repo. Instead, there is a one-sample CSV file in the results/openbuddy-llama-30b-v7.1 directory.

The program printed the following before finishing:

INFO:root:drop 3 outputs that are not[0, 1, 2]
INFO:root:Saving all results to results/openbuddy-llama-30b-v7.1
INFO:root:Not saving the result to the cached leaderboard because precomputed_leaderboard is not a path but <class 'NoneType'>.
                          win_rate  standard_error  n_total  avg_length
openbuddy-llama-30b-v7.1     81.55            1.37      802         968
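
As a rough sanity check on the table above, the reported standard_error can be compared against a plain binomial estimate. This is only a sketch: it assumes every comparison is an independent binary win or loss (no ties), which may not match how alpaca_eval actually computes the value, and the function name is hypothetical.

```python
import math


def binomial_standard_error(win_rate_pct: float, n_total: int) -> float:
    """Standard error of a win rate, in percentage points.

    Assumption (not from the thread): each of the n_total comparisons is an
    independent binary win/loss, so the win rate is a binomial proportion.
    """
    p = win_rate_pct / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / n_total)


# Values taken from the printed leaderboard row.
se = binomial_standard_error(81.55, 802)
print(f"{se:.2f}")
```

Under that assumption the estimate lands close to the 1.37 shown in the log.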

For your reference, the attachment below contains all the files we found in the results/openbuddy-llama-30b-v7.1 directory.

openbuddy-llama-30b-v7.1.zip

YannDubs · 2023-08-02T12:41:17Z

Thanks @44670, those are some cool results, especially for this length!

Sorry for the leaderboard issue. It was a known issue (#77), which is now solved.

Can you please run

alpaca_eval --model_outputs="results/openbuddy-llama-30b-v7.1/outputs.json"

That will generate the cached leaderboard, which you should add to the PR. Note that annotations are cached, so you will not actually reannotate anything.

44670 · 2023-08-02T13:13:51Z

Hi, looks like the file "results/openbuddy-llama-30b-v7.1/outputs.json" does not exist.

ls results/openbuddy-llama-30b-v7.1
annotations.json  leaderboard.csv  model_outputs.json  reference_outputs.json

YannDubs · 2023-08-02T13:16:29Z

Sorry, I meant:
alpaca_eval --model_outputs="results/openbuddy-llama-30b-v7.1/model_outputs.json"
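
The filename mix-up above is easy to catch before launching the command. A minimal sketch, assuming the results directory layout shown in the `ls` output; `find_model_outputs` is a hypothetical helper, not part of alpaca_eval:

```python
from pathlib import Path


def find_model_outputs(results_dir: str, filename: str = "model_outputs.json") -> Path:
    """Return the path to the model outputs file, or fail with a helpful message.

    Hypothetical helper: the default filename is the one listed in the
    results directory in this thread.
    """
    directory = Path(results_dir)
    path = directory / filename
    if not path.is_file():
        # List what is actually there so a wrong filename is obvious.
        present = sorted(p.name for p in directory.iterdir()) if directory.is_dir() else []
        raise FileNotFoundError(f"{path} not found; directory contains: {present}")
    return path
```

The returned path can then be passed to `--model_outputs`.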

44670 · 2023-08-02T13:21:03Z

It works!

I have just pushed the updated alpaca_eval_gpt4_leaderboard.csv file.

44670 · 2023-08-02T13:40:18Z

I pushed a commit to https://github.com/OpenBuddy/alpaca_eval, but it looks like it doesn't show up on this page.

Should I do anything more on my side in GitHub?

EDIT: It seems to have been a GitHub bug. I have added one more commit and things work now.

YannDubs · 2023-08-02T15:36:47Z

src/alpaca_eval/models_configs/openbuddy-falcon-7b-v6/configs.yaml

@@ -0,0 +1,13 @@
+openbuddy-falcon-7b-v6:


I don't see the evaluations for this model, can you remove this config from the PR or evaluate the model?

YannDubs · 2023-08-02T15:37:00Z

src/alpaca_eval/models_configs/openbuddy-llama-65b-v8/configs.yaml

@@ -0,0 +1,13 @@
+openbuddy-llama-65b-v8:


I don't see the evaluations for this model, can you remove this config from the PR or evaluate the model?

YannDubs · 2023-08-02T15:39:24Z

thanks @44670
small comment: either evaluate the model configs you added or remove them. Once that's done I can merge the PR and it will show up on the leaderboard!
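
One way to spot configs that were added without results is to compare the config directory names against the leaderboard rows. This is a sketch under stated assumptions: that each model has a directory under models_configs (as in the paths quoted in this review), that the leaderboard CSV stores the model name in its first column (as the printed table suggests), and that the two names match exactly. `configs_missing_results` is a hypothetical helper.

```python
import csv
from pathlib import Path


def configs_missing_results(configs_dir: str, leaderboard_csv: str) -> list[str]:
    """List model config directories that have no row in the leaderboard CSV.

    Assumptions (not confirmed by the thread): the first CSV column holds the
    model name and it matches the config directory name exactly.
    """
    config_names = {p.name for p in Path(configs_dir).iterdir() if p.is_dir()}
    with open(leaderboard_csv, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row
        evaluated = {row[0] for row in reader if row}
    return sorted(config_names - evaluated)
```

An empty list would mean every added config has a matching leaderboard entry.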

44670 · 2023-08-02T15:42:11Z

Thanks!
For these two models, the evaluations are still running. I will let you know when they are done.

44670 · 2023-08-02T23:22:39Z

Hi! All the results have been pushed.

Please let me know if you need further assistance.

YannDubs · 2023-08-03T00:39:41Z

Great, thanks @44670!

Add openbuddy-llama-30b-v7.1 to AlpacaEval

c8e6702

update leaderboard for openbuddy

f8a9616

Add cfg files for other openbuddy models

166291a

YannDubs reviewed Aug 2, 2023

44670 added 2 commits August 3, 2023 05:13

Add results for openbuddy-falcon

461f0dc

Add openbuddy llama 65b result

59a1c5f

YannDubs merged commit ce1123b into tatsu-lab:main Aug 3, 2023
