BFCL April 8th Release #330

HuanzhiMao · 2024-04-09T06:50:39Z

This PR is for the leaderboard April 8th release:

1. Fixed an oversight introduced in Leaderboard Update April 1 #299. For function-calling (FC) models that cannot take a `float` type in input, when a parameter's type is `float`, the evaluation procedure converts that type to `number` in the model input and notes in the parameter description that `This is a float type value.` An additional field, `format: float`, is also included in the model input to make the type explicit.
2. Updated the model handler for Claude, Mistral, and OSS to better parse the model output. This patches the handler released in Leaderboard Update April 1 #299, which sometimes failed to parse the model output even when that output was valid. This affects only the prompting models; the FC models are unaffected.
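
For illustration, the float handling described in this PR could look roughly like the sketch below. This is not the code from this PR: the helper name `convert_float_params` and the shape of the `properties` mapping are assumptions made for the example. Only the `number` type, the `format: float` field, and the description note come from the description above.

```python
import copy

# Note appended to the parameter description, as quoted in the PR description.
FLOAT_NOTE = "This is a float type value."


def convert_float_params(properties: dict) -> dict:
    """Return a copy of `properties` with float parameters rewritten.

    Hypothetical helper: `properties` is assumed to map parameter names to
    dicts with "type" and "description" keys (JSON-schema style).
    """
    converted = copy.deepcopy(properties)  # leave the caller's schema untouched
    for spec in converted.values():
        if spec.get("type") == "float":
            spec["type"] = "number"      # type the FC model can accept
            spec["format"] = "float"     # extra field that keeps the intent explicit
            description = spec.get("description", "")
            spec["description"] = f"{description} {FLOAT_NOTE}".strip()
    return converted


params = {
    "radius": {"type": "float", "description": "Radius of the circle."},
    "label": {"type": "string", "description": "Name of the shape."},
}
converted = convert_float_params(params)
print(converted["radius"])
```

Parameters whose type is not `float` pass through unchanged, and the input mapping is copied rather than mutated so the original test data stays intact.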

This PR **DOES** change the leaderboard score. We will update the leaderboard website shortly, in a different PR.

Co-authored-by: Charlie Cheng-Jie Ji <charliechengjieji@berkeley.edu>
Co-authored-by: Fanjia Yan <fanjiayan@berkeley.edu>

Fanjia-Yan · 2024-04-11T05:56:34Z

LGTM

CharlieJCJ

LGTM

HuanzhiMao added 5 commits April 8, 2024 23:38

update utils to support special handle for float type

0253f2e

update change log

f6fa8c5

Update the model handler for Claude, Mistral, and OSS to better parse the model output.

e566b6f

typo fix

cd5736e

update change log

cc0d8d0

HuanzhiMao marked this pull request as ready for review April 11, 2024 01:04

Merge branch 'main' into main

81efcf9

HuanzhiMao changed the title ~~Leaderboard April 8th Update~~ BFCL April 8th Release Apr 11, 2024

HuanzhiMao changed the title ~~BFCL April 8th Release~~ [WIP] BFCL April 8th Release Apr 11, 2024

HuanzhiMao changed the title ~~[WIP] BFCL April 8th Release~~ BFCL April 8th Release Apr 11, 2024

CharlieJCJ approved these changes Apr 11, 2024

ShishirPatil approved these changes Apr 11, 2024

ShishirPatil merged commit ace82af into ShishirPatil:main Apr 11, 2024
