refresh Cohere #141

sanderland · 2023-09-30T13:39:00Z

Refreshes Cohere outputs to reflect the most recent model.
- Removes the 'chat' entry, since the two entries were close before and this inference method is more stable.
- Corrects pricing estimates.
- Fixes some tests that were throwing warnings.

I see there are various images and results that are derived from the old results. If you need anything else to update these, please let me know.


YannDubs · 2023-09-30T22:29:04Z

src/alpaca_eval/models_configs/cohere-chat/configs.yaml

-    model_name: "command"
-    mode: "chat"
-    max_tokens: 2048
-  pretty_name: "Cohere Chat"


Why remove the chat model?
If it's because it was updated, can we get the updated results on the leaderboard instead?

sanderland · 2023-10-01

The chat mode of the command model was not really a different model before, just a different prompting style. These have been further unified (essentially turning plain instructions into single-turn chats), so they are not worth listing as separate entries.
Having a single entry will also make it easier to give you more regular updates.

YannDubs · 2023-10-01

I see, so client.chat is being deprecated?
Then let's remove that from the code as well?

sanderland · 2023-10-01

It's more for multi-turn chats, and indeed less appropriate for tests like this. Removed in eb52df3.
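
For illustration only, here is a minimal sketch of what "turning plain instructions into single-turn chats" could look like. It is not taken from this PR or from the Cohere SDK: `ChatTurn`, `instruction_to_single_turn_chat`, `build_request`, and the payload field names are hypothetical, and nothing is sent over the network. The defaults `"command"` and `2048` are copied from the removed config above.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ChatTurn:
    role: str  # hypothetical role name, e.g. "user"
    text: str


def instruction_to_single_turn_chat(instruction: str) -> List[ChatTurn]:
    """Wrap a plain instruction as a chat that has exactly one user turn."""
    return [ChatTurn(role="user", text=instruction.strip())]


def build_request(
    instruction: str, model_name: str = "command", max_tokens: int = 2048
) -> Dict:
    """Build a hypothetical request payload (assumed field names, no API call)."""
    turns = instruction_to_single_turn_chat(instruction)
    return {
        "model": model_name,
        "max_tokens": max_tokens,
        "messages": [{"role": t.role, "text": t.text} for t in turns],
    }
```

Under this assumption an instruction and its single-turn chat carry the same content, which is consistent with the point above that separate leaderboard entries add little.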

YannDubs · 2023-09-30T22:36:30Z

Nice improvement @sanderland! Congrats!

Updating the command results with the latest model makes sense, thanks for doing that!

However, let's refrain from removing models like cohere-chat unless there are updates. Considering that some papers reference the leaderboard, I think it's important to maintain consistency by keeping all models listed. Let me know your thoughts.

YannDubs · 2023-10-01T21:06:36Z

Merged, thanks @sanderland!

sanderland added 11 commits September 22, 2023 07:17

edit src/alpaca_eval/decoders/cohere.py, remove src/alpaca_eval/models_configs/cohere-chat/configs.yaml and 3 other changes

8cb2b7c

edit src/alpaca_eval/decoders/cohere.py, edit src/alpaca_eval/leaderboards/evaluators/evaluators_leaderboard.csv

b2f3585

updated results and leaderboard

84293a2

clean

cc29216

fix tests

b034925

fix tests

2a20826

fix tests more

6552e8f

r

99a9afc

restore val

d047756

restore val

22a235f

restore val

b774cb5

YannDubs reviewed Sep 30, 2023

remove chat mode

eb52df3

YannDubs merged commit 0ac9b14 into tatsu-lab:main Oct 1, 2023
