Commit

Automated leaderboard update
actions-user committed Aug 8, 2024
1 parent f49b647 commit 804c245
Showing 1 changed file with 3 additions and 0 deletions.
@@ -15,13 +15,15 @@ Claude 3.5 Sonnet (06/20),52.36675427146999,40.56021409682828,1488,,https://gith
Yi-Large Preview,51.894415134099546,57.46724251946292,2335,,https://github.com/tatsu-lab/alpaca_eval/blob/main/results/yi-large-preview/model_outputs.json,verified
Storm-7B,50.45110959343775,50.26886905528583,2045,https://huggingface.co/jieliu/Storm-7B,https://github.com/tatsu-lab/alpaca_eval/blob/main/results/Storm-7B/model_outputs.json,community
GPT-4 Preview (11/06),50.0,50.0,2049,,https://github.com/tatsu-lab/alpaca_eval/blob/main/results/gpt4_1106_preview/model_outputs.json,minimal
+Infinity-Instruct-7M-0729-Llama3_1-70B,46.10043331712677,37.46327383827497,1654,https://huggingface.co/BAAI/Infinity-Instruct-7M-0729-Llama3_1-70B,https://github.com/tatsu-lab/alpaca_eval/blob/main/results/Infinity-Instruct-7M-0729-Llama3_1-70B/model_outputs.json,community
ExPO + Llama-3-Instruct-8B-SimPO,45.78021783946177,40.63285400856655,1765,https://huggingface.co/chujiezheng/Llama-3-Instruct-8B-SimPO-ExPO,https://github.com/tatsu-lab/alpaca_eval/blob/main/results/Llama-3-Instruct-8B-SimPO-ExPO/model_outputs.json,community
Llama-3-Instruct-8B-SimPO,44.65131348921881,40.52977498461182,1825,https://huggingface.co/princeton-nlp/Llama-3-Instruct-8B-SimPO,https://github.com/tatsu-lab/alpaca_eval/blob/main/results/Llama-3-Instruct-8B-SimPO/model_outputs.json,community
Nanbeige Plus Chat v0.1,44.45966240337981,56.70300973017392,2587,https://huggingface.co/spaces/Nanbeige/Nanbeige-Plus-Chat-v0.1,https://github.com/tatsu-lab/alpaca_eval/blob/main/results/Nanbeige-Plus-Chat-v0.1/model_outputs.json,community
Qwen1.5 110B Chat,43.90555221078692,33.77709527565118,1631,https://huggingface.co/Qwen/Qwen1.5-110B-Chat,https://github.com/tatsu-lab/alpaca_eval/blob/main/results/Qwen1.5-110B-Chat/model_outputs.json,community
Aligner 2B+Claude 3 Opus,41.823071715247664,34.46337362321739,1669,https://github.com/AlignInc/aligner-replication,https://github.com/tatsu-lab/alpaca_eval/blob/main/results/aligner-2b_claude-3-opus-20240229/model_outputs.json,community
Nanbeige2 16B Chat,40.591286349562864,37.03608605005168,1867,https://huggingface.co/Nanbeige/Nanbeige2-16B-Chat,https://github.com/tatsu-lab/alpaca_eval/blob/main/results/Nanbeige2-16B-Chat/model_outputs.json,community
Claude 3 Opus (02/29),40.5095080124761,29.10526953334248,1388,,https://github.com/tatsu-lab/alpaca_eval/blob/main/results/claude-3-opus-20240229/model_outputs.json,minimal
+Infinity-Instruct-7M-0729-mistral-7B,39.66949964831439,34.347412485016434,1742,https://huggingface.co/BAAI/Infinity-Instruct-7M-0729-mistral-7B,https://github.com/tatsu-lab/alpaca_eval/blob/main/results/Infinity-Instruct-7M-0729-mistral-7B/model_outputs.json,community
Llama 3.1 405B Instruct,39.25732749961743,39.10666895419877,1988,https://huggingface.co/meta-llama/Meta-Llama-3.1-405B-Instruct,https://github.com/tatsu-lab/alpaca_eval/blob/main/results/Meta-Llama-3.1-405B-Instruct-Turbo/model_outputs.json,minimal
SPPO-Llama-3-Instruct-8B-PairRM,38.56280663670214,39.67286090605648,2066,https://huggingface.co/UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3,https://github.com/tatsu-lab/alpaca_eval/blob/main/results/SPPO-Llama-3-Instruct-8B-PairRM/model_outputs.json,community
GPT-4,38.12808974440021,23.576789314782605,1365,,https://github.com/tatsu-lab/alpaca_eval/blob/main/results/gpt4/model_outputs.json,verified
@@ -34,6 +36,7 @@ Ein 70B v0.1,35.029054008520646,24.84472049689441,1467,https://huggingface.co/SF
Claude 3 Sonnet (02/29),34.87247436243302,25.556325292273296,1420,,https://github.com/tatsu-lab/alpaca_eval/blob/main/results/claude-3-sonnet-20240229/model_outputs.json,minimal
FsfairX-Zephyr-Chat-v0.1,34.78744762311656,35.94648644102434,2275,https://huggingface.co/sfairXC/FsfairX-Zephyr-Chat-v0.1,https://github.com/tatsu-lab/alpaca_eval/blob/main/results/FsfairX-Zephyr-Chat-v0.1/model_outputs.json,community
Llama 3 70B Instruct,34.42459717459881,33.17785695886864,1919,https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct,https://github.com/tatsu-lab/alpaca_eval/blob/main/results/Meta-Llama-3-70B-Instruct/model_outputs.json,minimal
+Infinity-Instruct-7M-0729-Llama3_1-8B,33.918371039899895,28.525177213928405,1640,https://huggingface.co/BAAI/Infinity-Instruct-7M-0729-Llama3_1-8B,https://github.com/tatsu-lab/alpaca_eval/blob/main/results/Infinity-Instruct-7M-0729-Llama3_1-8B/model_outputs.json,community
Mistral Large (24/02),32.65207998531868,21.43877598137888,1362,https://mistral.ai/news/la-plateforme/,https://github.com/tatsu-lab/alpaca_eval/blob/main/results/mistral-large-2402/model_outputs.json,verified
ExPO + SPPO-Mistral7B-PairRM,31.822321960655582,35.4431306716895,2288,https://huggingface.co/chujiezheng/Mistral7B-PairRM-SPPO-ExPO,https://github.com/tatsu-lab/alpaca_eval/blob/main/results/SPPO-Mistral7B-PairRM-ExPO/model_outputs.json,community
merlinite-7B-AOT,31.721885287042845,29.89635084070223,1855,,https://github.com/tatsu-lab/alpaca_eval/blob/main/results/merlinite-7B-AOT/model_outputs.json,community
