BFCL May 14th Release (GPT-4o and Gemini) #426

Fanjia-Yan · 2024-05-15T04:27:02Z

This PR makes 3 models (4 entries) available for inference on BFCL:

- gpt-4o-2024-05-13 (Function Calling Mode and Prompting Mode)
- gemini-1.5-pro-preview-0514 (Function Calling Mode)
- gemini-1.5-flash-preview-0514 (Function Calling Mode)

You can start the evaluation by running `python openfunctions_evaluation.py --model MODEL_NAME` and get the score by running `python ./eval_runner.py --model MODEL_NAME`. For more detail, refer to the README under the BFCL page.
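As a rough illustration of the two-step workflow above, the sketch below only builds and prints the two commands for each model entry; it does not run them. The helper name `build_bfcl_commands` is hypothetical. The exact `--model` string each mode expects and the directory each script must be run from are not stated in this PR, so treat both as assumptions and check the README.

```python
import shlex
import sys


def build_bfcl_commands(model_name, python_exe=sys.executable):
    """Return (inference_cmd, scoring_cmd) as argv lists for one model entry.

    Hypothetical helper: it mirrors the two commands quoted in the PR
    description and does not execute anything.
    """
    generate = [python_exe, "openfunctions_evaluation.py", "--model", model_name]
    score = [python_exe, "./eval_runner.py", "--model", model_name]
    return generate, score


if __name__ == "__main__":
    # Identifiers copied from the PR description. Whether each mode
    # (Function Calling vs. Prompting) uses a different --model string
    # is an assumption to verify against the README.
    models = [
        "gpt-4o-2024-05-13",
        "gemini-1.5-pro-preview-0514",
        "gemini-1.5-flash-preview-0514",
    ]
    for name in models:
        for cmd in build_bfcl_commands(name, python_exe="python"):
            print(shlex.join(cmd))
```

Printing the commands first makes it easy to review them before passing them to a shell or to `subprocess.run`.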

Score changes are reflected in #428.

This PR also updates the pricing for several models:

- For Gemini, when prompts are less than 128K tokens, the new Gemini series' prices are lowered by around half (https://ai.google.dev/pricing). All the BFCL test cases are less than 128K tokens.
- For Anthropic models, the prices have decreased for claude-2.1 and Claude-instant-1.2, and have been updated accordingly.
- For Mistral models, the prices have been halved for Mistral-large and Mistral-Small.
- For OpenAI models, we have corrected the price of GPT-3.5-turbo-0125.
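To make the pricing notes concrete, here is a minimal sketch of how a per-million-token price turns into a cost estimate, including a tier that depends on prompt length. The function names are hypothetical, the prices in the usage example are placeholders rather than any provider's real rates, and reading "128K" as 128,000 tokens is an assumption. This is not the cost logic BFCL itself uses.

```python
TOKENS_PER_MILLION = 1_000_000
# Assumption: "128K" is read as 128,000 tokens; the PR does not say
# whether the provider means 128,000 or 131,072.
LONG_PROMPT_THRESHOLD = 128_000


def pick_input_price(prompt_tokens, short_price, long_price):
    """Choose the per-million input price based on prompt length."""
    if prompt_tokens < LONG_PROMPT_THRESHOLD:
        return short_price
    return long_price


def estimate_cost(input_tokens, output_tokens, input_price, output_price):
    """Estimate the cost of one call given per-million-token prices."""
    total = input_tokens * input_price + output_tokens * output_price
    return total / TOKENS_PER_MILLION


if __name__ == "__main__":
    # Placeholder prices (USD per million tokens), purely illustrative.
    price = pick_input_price(1_000, short_price=2.0, long_price=4.0)
    print(estimate_cost(1_000, 200, input_price=price, output_price=6.0))
```

Because all BFCL test cases are under 128K tokens, only the lower tier would apply in a calculation like this.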

CharlieJCJ · 2024-05-15T10:40:52Z

Reviewed costs and synced the most up-to-date costs with various model providers.

HuanzhiMao

Ready to ship

…mini) (#428) As mentioned in #426, this PR adds 4 new models to the leaderboard. The model costs are also updated accordingly. This PR **DOES** change the leaderboard ranking. This PR **DOES NOT** change the leaderboard scores of any models other than the newly added ones.

This PR makes 3 models (4 entries) available for inference on BFCL:

- gpt-4o-2024-05-13 (Function Calling Mode and Prompting Mode)
- gemini-1.5-pro-preview-0514 (Function Calling Mode)
- gemini-1.5-flash-preview-0514 (Function Calling Mode)

You can start the evaluation by running `python openfunctions_evaluation.py --model MODEL_NAME` and get score by running `python ./eval_runner.py --model MODEL_NAME`. For more detail, refer to Readme under the BFCL page.

Score changes are reflected in ShishirPatil#428.

This PR also updated different models' pricing:

- For Gemini, when prompts are less than 128K tokens, the new Gemini series' prices are lowered by around half (https://ai.google.dev/pricing). All the BFCL test cases are less than 128K tokens.
- For Anthropic Models, the prices have decreased for claude-2.1 and Claude-instant-1.2 which have updated accordingly
- For Mistral Models, the prices have been halved for Mistral-large and Mistral-Small
- For OpenAI Models, we have corrected GPT-3.5-turbo-0125 to the price it should have

---------

Co-authored-by: Huanzhi Mao <huanzhimao@gmail.com>

Fanjia-Yan and others added 6 commits May 14, 2024 20:41

add support for gemini-0514 and gpt-4o

716c1d4

remove gcp project name

update gemini series price per mil tokens

a8676d6

gpt-4o pricing

17465c7

update change log

94141f6

update change log

439c709

Merge remote-tracking branch 'upstream/main' into gemini_0514_gpt_4o

dc4cc4c

HuanzhiMao mentioned this pull request May 15, 2024

Leaderboard Update, in sync with BFCL May 14th Release (GPT-4o and Gemini) #428

Merged

update model metadata mapping to distinguish gemini 1.5 pro series

3be7e1b

Fanjia-Yan marked this pull request as ready for review May 15, 2024 10:05

Fanjia-Yan added 2 commits May 15, 2024 03:24

update gpt-3.5-turbo-0125 pricing

4c5b8ca

update legacy pricing

22ae6a8

Fanjia-Yan and others added 3 commits May 15, 2024 04:07

update available models

da05d22

update change log

bf8e95f

Merge branch 'gemini_0514_gpt_4o' of https://github.com/Fanjia-Yan/go…

7df9ffe

…rilla into gemini_0514_gpt_4o

HuanzhiMao approved these changes May 15, 2024

update available models

54c9c39

ShishirPatil approved these changes May 15, 2024

ShishirPatil merged commit 5da8835 into ShishirPatil:main May 15, 2024
