Releases · tatsu-lab/alpaca_eval

10 Jan 06:16

github-actions

v0.5.1

91a903f

Release v0.5.1

What's Changed

[BUG] fix no OAI org id set by @YannDubs in #200

Full Changelog: v0.5.0...v0.5.1

Contributors

YannDubs

10 Jan 02:32

github-actions

v0.5.0

0c14d6f

Release v0.5.0

What's Changed

Fix mssg check by @Muennighoff in #174
Add MiniChat-1.5-3B to AlpacaEval and Fix MiniChat-3B by @GeneZC in #176
Add 01-ai/Yi-34B-Chat to AlpacaEval by @HyperdriveHustle in #175
feat: add way to verify results by @YannDubs in #177
show img in readme by @YannDubs in #178
Add PairRM best-of-16 to AlpacaEval by @jdf-prog in #181
Verify Yi by @YannDubs in #182
chore: add phi-2 sft by @lxuechen in #184
add cut-13b by @wwxu21 in #186
chore: add phi-2 dpo by @lxuechen in #185
Support phi2, Support SOLAR 10.7B LMCocktail by @yhyu13 in #183
Update openai.py by @Muennighoff in #188
chore: add link for phi-2-sft by @lxuechen in #190
chore: fix links by @lxuechen in #191
Add deita-7b-v1.0 model by @VPeterV in #192
[ENH] Azure OAI client & more general way of switching between client configs by @YannDubs in #193
[ENH] Weighted win rates by @YannDubs in #189
[ENH] new models: Gemini / claude2.1 / mistral / mixtral / .. by @YannDubs in #195
[ENH] alpaca_eval 2.0 by @YannDubs in #196

New Contributors

@Muennighoff made their first contribution in #174
@HyperdriveHustle made their first contribution in #175
@jdf-prog made their first contribution in #181
@lxuechen made their first contribution in #184
@wwxu21 made their first contribution in #186
@yhyu13 made their first contribution in #183
@VPeterV made their first contribution in #192

Full Changelog: v0.3.6...v0.5.0

Contributors

lxuechen, yhyu13, and 7 other contributors

24 Nov 22:50

github-actions

v0.3.6

9e8e898

Release v0.3.6

What's Changed

feat: verify all the cohere model & use it as eval by @YannDubs in #170
Add Tulu 2 models to AlpacaEval by @hamishivi in #171

New Contributors

@hamishivi made their first contribution in #171

Full Changelog: v0.3.5...v0.3.6

Contributors

hamishivi and YannDubs

16 Nov 23:19

github-actions

v0.3.5

ba9e449

Release v0.3.5

What's Changed

[WIP] GPT4 turbo as evaluator by @YannDubs in #160
[ENH] add GPT4 turbo as evaluator in README by @YannDubs in #165
Add minichat-3b to AlpacaEval by @GeneZC in #167
fix: filter openai spam filter by @YannDubs in #169

New Contributors

@GeneZC made their first contribution in #167

Full Changelog: v0.3.3...v0.3.5

Contributors

GeneZC and YannDubs

16 Nov 23:14

github-actions

vv0.3.4

ca0f1f6

Release v0.3.4

What's Changed

[WIP] GPT4 turbo as evaluator by @YannDubs in #160
[ENH] add GPT4 turbo as evaluator in README by @YannDubs in #165
Add minichat-3b to AlpacaEval by @GeneZC in #167
fix: filter openai spam filter by @YannDubs in #169

New Contributors

@GeneZC made their first contribution in #167

Full Changelog: v0.3.3...vv0.3.4

Contributors

GeneZC and YannDubs

08 Nov 08:25

github-actions

v0.3.3

b4ae018

Release v0.3.3

What's Changed

Gpt4 turbo by @YannDubs in #159

Full Changelog: v0.3.2...v0.3.3

Contributors

YannDubs

08 Nov 07:18

github-actions

v0.3.2

c206a98

Release v0.3.2

What's Changed

add UltraLM-13b-V2.0/UltraLM-13b-V2.0-best-of-16/UltraLM-13b-best-of-16 to AlpacaEval by @lifan-yuan in #139
Add annotations & fix leaderboard by @YannDubs in #142
refresh Cohere by @sanderland in #141
Add PlatoLM-7B to AlpacaEval by @renatz in #143
Add evo-7b to AlpacaEval by @zfang in #144
Add NEFTune models to AlpacaEval by @neelsjain in #146
Add claude2-alpaca-13b, recycled-wizardlm-7b-v1.0, recycled-wizardlm-… by @MingLiiii in #147
Add CausalLM/14B to AlpacaEval by @CausalLM in #148
Add Zephyr 7B evals by @lewtun in #152
Add Evo v2 7B by @zfang in #153
Add decoder for calling Anthropic models via Amazon Bedrock by @billcai in #151
cohere update by @sanderland in #155
feat: upgrade to openai 1.0.0 by @YannDubs in #157

New Contributors

@lifan-yuan made their first contribution in #139
@renatz made their first contribution in #143
@zfang made their first contribution in #144
@neelsjain made their first contribution in #146
@MingLiiii made their first contribution in #147
@CausalLM made their first contribution in #148
@lewtun made their first contribution in #152
@billcai made their first contribution in #151

Full Changelog: v0.3.1...v0.3.2

Contributors

zfang, billcai, and 8 other contributors

19 Sep 20:58

github-actions

v0.3.1

1dc5011

Release v0.3.1

What's Changed

Add results of Xwin-LM by @nbl97 in #135
[ENH] add gpt 3.5 instruct by @YannDubs in #137

New Contributors

@nbl97 made their first contribution in #135

Full Changelog: v0.3.0...v0.3.1

Contributors

YannDubs and nbl97

01 Sep 05:30

github-actions

v0.3.0

e8d151d

Release v0.3.0

What's Changed

[ENH] add fixed gpt4 version annotator by @YannDubs in #127
Add openbuddy-llama2-13b-v11.1 by @44670 in #129
[ENH] add max concurrency oai by @YannDubs in #131

Full Changelog: v0.2.9...v0.3.0

Contributors

44670 and YannDubs

23 Aug 02:46

github-actions

v0.2.9

be2aae3

Release v0.2.9

What's Changed

Ensure primary keys are string & decrease processes for OpenAI by @YannDubs in #116
Add JinaChat to the leaderboards by @jupyterjazz in #117
[BUG] jina chat error in configs by @YannDubs in #118
Add Humpback to AlpacaEval by @xianxl in #120
update Humpback results by @xianxl in #121
add link to Humpback paper by @xianxl in #122
Add vllm decoder for model inference by @44670 in #124
[ENH] return completions_all and allow sequence of max_tokens by @YannDubs in #125

New Contributors

@jupyterjazz made their first contribution in #117
@xianxl made their first contribution in #120

Full Changelog: v0.2.8...v0.2.9

Contributors

44670, YannDubs, and 2 other contributors
