Commit

Merge pull request #78 from tatsu-lab/claude2
[ENH] add claude v2
rtaori authored Jul 12, 2023
2 parents 9f16f4d + 5021224 commit 6a1d42d
Showing 7 changed files with 4,855 additions and 3 deletions.
4,832 changes: 4,832 additions & 0 deletions results/claude-2/model_outputs.json

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion src/alpaca_eval/decoders/anthropic.py
@@ -116,7 +116,7 @@ def _anthropic_completion_helper(
 def _get_price_per_token(model):
     """Returns the price per token for a given model"""
     # https://cdn2.assets-servd.host/anthropic-website/production/images/model_pricing_may2023.pdf
-    if "claude-v1" in model:
+    if "claude-v1" in model or "claude-2" in model:
         return (
             11.02 / 1e6
         )  # that's not completely true because decoding is 32.68 but close enough given that most is context
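Note on the flat rate: the inline comment says the function returns the context price (11.02 per million tokens) for every token even though decoding costs 32.68 per million. The sketch below compares the two estimates. The helper names are hypothetical and not part of alpaca_eval; the only figures used are the two prices quoted in the comment.

```python
# Prices in USD per token, taken from the comment in the hunk above.
PROMPT_PRICE_PER_TOKEN = 11.02 / 1e6
COMPLETION_PRICE_PER_TOKEN = 32.68 / 1e6


def split_cost(n_prompt_tokens: int, n_completion_tokens: int) -> float:
    """Hypothetical helper: bill prompt and completion tokens at their own rates."""
    return (
        n_prompt_tokens * PROMPT_PRICE_PER_TOKEN
        + n_completion_tokens * COMPLETION_PRICE_PER_TOKEN
    )


def flat_cost(n_prompt_tokens: int, n_completion_tokens: int) -> float:
    """Hypothetical helper: bill every token at the prompt rate (the commit's approximation)."""
    return (n_prompt_tokens + n_completion_tokens) * PROMPT_PRICE_PER_TOKEN


if __name__ == "__main__":
    # Example sizes are illustrative only: a long prompt with a short completion.
    print(split_cost(1000, 50), flat_cost(1000, 50))
```

When the prompt dominates, as the comment assumes, the two estimates stay close; the gap grows with the share of completion tokens.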
12 changes: 12 additions & 0 deletions src/alpaca_eval/evaluators_configs/claude_2/configs.yaml
@@ -0,0 +1,12 @@
+claude_2:
+  prompt_template: "claude/basic_prompt.txt"
+  fn_completions: "anthropic_completions"
+  completions_kwargs:
+    model_name: "claude_2"
+    max_tokens_to_sample: 50
+    temperature: 0
+  completion_parser_kwargs:
+    outputs_to_match:
+      1: '(?:^|\n) ?Output \(a\)'
+      2: '(?:^|\n) ?Output \(b\)'
+  batch_size: 1
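Note on `outputs_to_match`: each pattern only fires when `Output (a)` or `Output (b)` starts the completion or starts a new line, optionally after one space. The sketch below shows that behaviour with Python's `re` module. The function `parse_preference` is a hypothetical stand-in, not the parser alpaca_eval actually uses.

```python
import re

# Patterns copied from the config above; YAML single quotes keep the backslashes literal.
OUTPUTS_TO_MATCH = {
    1: r"(?:^|\n) ?Output \(a\)",
    2: r"(?:^|\n) ?Output \(b\)",
}


def parse_preference(completion: str) -> list:
    """Hypothetical helper: return every key whose pattern is found in the completion."""
    return [
        key
        for key, pattern in OUTPUTS_TO_MATCH.items()
        if re.search(pattern, completion)
    ]


if __name__ == "__main__":
    print(parse_preference("Output (a)"))
    print(parse_preference("Sure.\n Output (b) is better."))
    print(parse_preference("I prefer Output (a)"))  # mid-line mention, no match
```

A mention in the middle of a line is deliberately not matched, so a completion that merely refers to an output is not counted as a preference.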
@@ -3,7 +3,7 @@ claude_ranking:
   fn_completions: "anthropic_completions"
   completions_kwargs:
     model_name: "claude-v1"
-    max_tokens: 100
+    max_tokens_to_sample: 100
     temperature: 0
   fn_completion_parser: "ranking_parser"
   batch_size: 1
@@ -1,5 +1,6 @@
 ,win_rate,standard_error,mode,n_draws,n_total,n_wins,n_wins_base
 gpt4,95.27950310559004,0.716281440286153,minimal,12,805,761,32
+claude-2,91.35572139303483,0.9897323784630048,minimal,1,804,734,69
 vicuna-33b-v1.3,88.99253731343283,1.095692216068168,verified,5,804,713,86
 claude,88.38509316770187,1.1144875403283188,minimal,9,805,707,89
 openchat-v2-w-13b,87.1268656716418,1.1769197439396015,community,3,804,699,102
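Note on `win_rate`: the rows above are consistent with counting each draw as half a win, so for claude-2 the figure is (734 + 0.5 × 1) / 804 × 100 ≈ 91.356. The sketch below reproduces that arithmetic. The function name is hypothetical, and the formula is inferred from the numbers in the table rather than taken from the alpaca_eval source.

```python
def win_rate(n_wins: int, n_draws: int, n_total: int) -> float:
    """Hypothetical helper: win rate in percent, counting each draw as half a win."""
    return 100.0 * (n_wins + 0.5 * n_draws) / n_total


if __name__ == "__main__":
    # (n_wins, n_draws, n_total) copied from the leaderboard rows above.
    rows = {
        "gpt4": (761, 12, 805),
        "claude-2": (734, 1, 804),
        "vicuna-33b-v1.3": (713, 5, 804),
    }
    for name, (wins, draws, total) in rows.items():
        print(name, win_rate(wins, draws, total))
```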
7 changes: 7 additions & 0 deletions src/alpaca_eval/models_configs/claude-2/configs.yaml
@@ -0,0 +1,7 @@
+claude-2:
+  prompt_template: "claude/prompt.txt"
+  fn_completions: "anthropic_completions"
+  completions_kwargs:
+    model_name: "claude-2"
+    max_tokens_to_sample: 2048
+  pretty_name: "Claude 2"
2 changes: 1 addition & 1 deletion src/alpaca_eval/models_configs/claude/configs.yaml
@@ -3,5 +3,5 @@ claude:
   fn_completions: "anthropic_completions"
   completions_kwargs:
     model_name: "claude-v1"
-    max_tokens: 2048
+    max_tokens_to_sample: 2048
   pretty_name: "Claude"
