BFCL April 8th Release #330

Merged
merged 6 commits into from
Apr 11, 2024

HuanzhiMao
Collaborator

@HuanzhiMao HuanzhiMao commented Apr 9, 2024

This PR is for the leaderboard April 8th release:

  1. Fixed an oversight introduced in Leaderboard Update April 1 #299. For function-calling (FC) models that cannot take the `float` type in their input, when a parameter's type is `float`, the evaluation procedure now converts that type to `number` in the model input and notes in the parameter description that `This is a float type value.` An additional field, `format: float`, is also included in the model input to make the type explicit.
  2. Updated the model handler for Claude, Mistral, and OSS to better parse the model output. This patches the handler released in Leaderboard Update April 1 #299, which sometimes failed to parse even though the model output was valid. This affects only the prompting models; the FC models are unaffected.
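
The float handling in item 1 can be illustrated with a minimal sketch. This is not the code from this PR: the function name `convert_float_params`, the constant `FLOAT_NOTE`, and the schema layout (a `properties` mapping of parameter specs) are assumptions made for illustration, and nested parameters are not handled.

```python
import copy

# Note appended to the description, as quoted in the PR text.
FLOAT_NOTE = "This is a float type value."


def convert_float_params(parameters: dict) -> dict:
    """Return a copy of a parameter schema with `float` types rewritten.

    Hypothetical helper: each top-level property whose type is "float"
    becomes type "number", gains `format: float`, and has the note
    appended to its description. The input is left unmodified.
    """
    converted = copy.deepcopy(parameters)
    for spec in converted.get("properties", {}).values():
        if spec.get("type") == "float":
            spec["type"] = "number"
            spec["format"] = "float"
            description = spec.get("description", "").strip()
            spec["description"] = f"{description} {FLOAT_NOTE}".strip()
    return converted


# Illustrative schema; the parameter names are made up.
example = {
    "type": "dict",
    "properties": {
        "radius": {"type": "float", "description": "Radius of the circle."},
        "label": {"type": "string", "description": "Name of the shape."},
    },
    "required": ["radius"],
}
result = convert_float_params(example)
```

Working on a deep copy keeps the original test entry untouched, so the same entry can still be passed unchanged to models that accept `float` directly.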

This PR DOES change the leaderboard score. We will update the leaderboard website shortly, in a different PR.


Co-authored-by: Charlie Cheng-Jie Ji <charliechengjieji@berkeley.edu>
Co-authored-by: Fanjia Yan <fanjiayan@berkeley.edu>

@HuanzhiMao HuanzhiMao marked this pull request as ready for review April 11, 2024 01:04
@HuanzhiMao HuanzhiMao changed the title from "Leaderboard April 8th Update" to "BFCL April 8th Release" Apr 11, 2024
@HuanzhiMao HuanzhiMao changed the title from "BFCL April 8th Release" to "[WIP] BFCL April 8th Release" Apr 11, 2024
@HuanzhiMao HuanzhiMao changed the title from "[WIP] BFCL April 8th Release" to "BFCL April 8th Release" Apr 11, 2024
@Fanjia-Yan
Collaborator

LGTM

Collaborator

@CharlieJCJ CharlieJCJ left a comment

LGTM

@ShishirPatil ShishirPatil merged commit ace82af into ShishirPatil:main Apr 11, 2024
devanshamin pushed a commit to devanshamin/gorilla that referenced this pull request Jul 9, 2024
aw632 pushed a commit to vinaybagade/gorilla that referenced this pull request Aug 22, 2024