Add JinaChat to the leaderboards #117
Signed-off-by: jupyterjazz <saba.sturua@jina.ai>
src/alpaca_eval/leaderboards/data_AlpacaEval/alpaca_eval_gpt4_leaderboard.csv
src/alpaca_eval/decoders/jinachat.py
logging.info(f"Completed {n_examples} examples in {t}.")

# refer to https://chat.jina.ai/billing
price = 0
price is not 0. Please either:
- use [np.nan] * len(completions) to say that the price is not given, or
- (better) use the estimated price, which seems to be approximately [0 if len(c) < 100 else 0.08 for c in completions]
Following the standard package, I calculate pricing in the following way: if a message has more than 300 tokens, the price is 0.08; otherwise it is 0.
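The two pricing options discussed above could be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the function names, the `count_tokens` parameter, and the whitespace-based token count are all assumptions made for the example, and `math.nan` stands in for `np.nan` only to keep the snippet dependency-free.

```python
import math
from typing import Callable, List, Optional


def estimate_prices(
    completions: List[str],
    count_tokens: Optional[Callable[[str], int]] = None,
    token_threshold: int = 300,
    price_per_long_message: float = 0.08,
) -> List[float]:
    """Return one estimated price per completion.

    A completion with more than `token_threshold` tokens is charged
    `price_per_long_message`; anything shorter is treated as free.
    """
    if count_tokens is None:
        # Assumption: whitespace splitting as a crude stand-in for a real tokenizer.
        count_tokens = lambda text: len(text.split())
    return [
        price_per_long_message if count_tokens(c) > token_threshold else 0.0
        for c in completions
    ]


def unknown_prices(completions: List[str]) -> List[float]:
    """Return NaN per completion to signal that the price is not given."""
    return [math.nan] * len(completions)
```

Returning a list with one entry per completion, rather than a single scalar, matches the shape of both expressions suggested in the review.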
Thanks for the contribution, JinaChat seems like a cool project/product! Please make the few small changes above and I'll merge.
Signed-off-by: jupyterjazz <saba.sturua@jina.ai>
@YannDubs suggestions applied! can you take a look again? thx
LGTM, although I don't have access to the API key so I can't test.
JinaChat Evaluation
PR includes JinaChat evaluation on AlpacaEval dataset using both gpt4 and claude evaluators.
Instructions to evaluate the model
Set JinaChat api key as an env variable and run