Allow rerun selected test cases #743

Open: wants to merge 18 commits into base: main


@Raymond112514 Raymond112514 commented Nov 8, 2024

This PR lets users rerun a selected subset of test cases from the BFCLv3 dataset. List the test case IDs, one per line, in a .txt file within the Berkeley Function Calling Leaderboard directory, then run them by adding the --rerun argument:

bfcl generate --model MODEL_NAME --test-category TEST_CATEGORY --num-threads 1 --rerun FILE.txt

The same argument applies to locally hosted models. For instance, FILE.txt can contain:

simple_0
simple_23
simple_45
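
For reviewers, here is a minimal sketch of how a file in this format could be read. It is not the code in this PR: the helper name load_rerun_ids is hypothetical, and it assumes one test case ID per line, with blank lines and duplicate IDs ignored.

```python
from pathlib import Path


def load_rerun_ids(path):
    """Return the test case IDs listed in a text file, one ID per line.

    Hypothetical helper for illustration only. Blank lines are skipped and
    duplicate IDs are dropped while the original order is preserved.
    """
    ids = []
    seen = set()
    for line in Path(path).read_text().splitlines():
        test_id = line.strip()
        if not test_id or test_id in seen:
            continue
        seen.add(test_id)
        ids.append(test_id)
    return ids
```

Preserving file order keeps the rerun deterministic, which makes it easier to compare two runs over the same selection.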

Main change: added the overwrite function in base_handler, which takes a newly generated result and replaces the old result for the same test case in place.
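
To make the in-place replacement concrete, here is a rough sketch of the idea. It is not the overwrite function from this PR: the name overwrite_results is hypothetical, and it assumes results are stored as JSON Lines where every entry carries an "id" field.

```python
import json


def overwrite_results(existing_lines, new_entries):
    """Merge new results into existing JSON Lines results.

    Hypothetical sketch. An existing entry whose "id" matches a new entry is
    replaced at the same position; new entries with unseen IDs are appended.
    """
    new_by_id = {entry["id"]: entry for entry in new_entries}
    merged = []
    for line in existing_lines:
        old_entry = json.loads(line)
        # pop() returns the replacement if there is one, otherwise the old entry
        merged.append(new_by_id.pop(old_entry["id"], old_entry))
    # Whatever remains had no previous result, so it goes at the end
    merged.extend(new_by_id.values())
    return [json.dumps(entry) for entry in merged]
```

Replacing at the original position keeps the result file ordering stable, so a rerun of a few cases produces a small diff against the previous file.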

@Raymond112514
Author

Added the --rerun-all flag. When this flag is present, the results are overwritten.
Changed the logic of collect_test_case slightly.
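
One possible reading of the --rerun-all behavior, sketched for discussion. This is not the actual collect_test_case logic: the function name select_cases is hypothetical, and it assumes that without the flag, test cases that already have a result are skipped.

```python
def select_cases(all_ids, done_ids, rerun_all=False):
    """Pick the test case IDs to execute.

    Hypothetical sketch. With rerun_all=True every case is selected, so
    existing results get overwritten. Otherwise only cases without an
    existing result are selected.
    """
    if rerun_all:
        return list(all_ids)
    done = set(done_ids)
    return [test_id for test_id in all_ids if test_id not in done]
```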

@Raymond112514
Author

Added the --result-dir and --score-dir options.
