Releases: tatsu-lab/alpaca_eval
Release v0.5.1
Release v0.5.0
What's Changed
- Fix mssg check by @Muennighoff in #174
- Add MiniChat-1.5-3B to AlpacaEval and Fix MiniChat-3B by @GeneZC in #176
- Add 01-ai/Yi-34B-Chat to AlpacaEval by @HyperdriveHustle in #175
- feat: add way to verify results by @YannDubs in #177
- show img in readme by @YannDubs in #178
- Add PairRM best-of-16 to AlpacaEval by @jdf-prog in #181
- Verify Yi by @YannDubs in #182
- chore: add phi-2 sft by @lxuechen in #184
- add cut-13b by @wwxu21 in #186
- chore: add phi-2 dpo by @lxuechen in #185
- Support phi2, Support SOLAR 10.7B LMCocktail by @yhyu13 in #183
- Update openai.py by @Muennighoff in #188
- chore: add link for phi-2-sft by @lxuechen in #190
- chore: fix links by @lxuechen in #191
- Add deita-7b-v1.0 model by @VPeterV in #192
- [ENH] Azure OAI client & more general way of switching between client configs by @YannDubs in #193
- [ENH] Weighted win rates by @YannDubs in #189
- [ENH] new models: Gemini / claude2.1 / mistral / mixtral / .. by @YannDubs in #195
- [ENH] alpaca_eval 2.0 by @YannDubs in #196
New Contributors
- @Muennighoff made their first contribution in #174
- @HyperdriveHustle made their first contribution in #175
- @jdf-prog made their first contribution in #181
- @lxuechen made their first contribution in #184
- @wwxu21 made their first contribution in #186
- @yhyu13 made their first contribution in #183
- @VPeterV made their first contribution in #192
Full Changelog: v0.3.6...v0.5.0
Release v0.3.6
What's Changed
- feat: verify all the cohere model & use it as eval by @YannDubs in #170
- Add Tulu 2 models to AlpacaEval by @hamishivi in #171
New Contributors
- @hamishivi made their first contribution in #171
Full Changelog: v0.3.5...v0.3.6
Release v0.3.5
Release v0.3.4
Release v0.3.3
Release v0.3.2
What's Changed
- add UltraLM-13b-V2.0/UltraLM-13b-V2.0-best-of-16/UltraLM-13b-best-of-16 to AlpacaEval by @lifan-yuan in #139
- Add annotations & fix leaderboard by @YannDubs in #142
- refresh Cohere by @sanderland in #141
- Add PlatoLM-7B to AlpacaEval by @renatz in #143
- Add evo-7b to AlpacaEval by @zfang in #144
- Add NEFTune models to AlpacaEval by @neelsjain in #146
- Add claude2-alpaca-13b, recycled-wizardlm-7b-v1.0, recycled-wizardlm-… by @MingLiiii in #147
- Add CausalLM/14B to AlpacaEval by @CausalLM in #148
- Add Zephyr 7B evals by @lewtun in #152
- Add Evo v2 7B by @zfang in #153
- Add decoder for calling Anthropic models via Amazon Bedrock by @billcai in #151
- cohere update by @sanderland in #155
- feat: upgrade to openai 1.0.0 by @YannDubs in #157
New Contributors
- @lifan-yuan made their first contribution in #139
- @renatz made their first contribution in #143
- @zfang made their first contribution in #144
- @neelsjain made their first contribution in #146
- @MingLiiii made their first contribution in #147
- @CausalLM made their first contribution in #148
- @lewtun made their first contribution in #152
- @billcai made their first contribution in #151
Full Changelog: v0.3.1...v0.3.2
Release v0.3.1
Release v0.3.0
Release v0.2.9
What's Changed
- Ensure primary keys are string & decrease processes for OpenAI by @YannDubs in #116
- Add JinaChat to the leaderboards by @jupyterjazz in #117
- [BUG] jina chat error in configs by @YannDubs in #118
- Add Humpback to AlpacaEval by @xianxl in #120
- update Humpback results by @xianxl in #121
- add link to Humpback paper by @xianxl in #122
- Add vllm decoder for model inference by @44670 in #124
- [ENH] return completions_all and allow sequence of max_tokens by @YannDubs in #125
New Contributors
- @jupyterjazz made their first contribution in #117
- @xianxl made their first contribution in #120
Full Changelog: v0.2.8...v0.2.9