Open
Labels
enhancement (New feature or request), help wanted (Extra attention is needed)
Description
Currently, our evaluation follows the answer-parsing logic from the qwen-math GitHub repository. However, many of the failures it reports are false negatives: the responses are mostly correct but are parsed incorrectly.
We should use an LLM to check the responses.
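
A minimal sketch of what such a check could look like, assuming the existing rule-based parser stays as a cheap first pass and the LLM is consulted only when that parser reports a mismatch. All names here (`build_judge_prompt`, `parse_verdict`, `check_answer`, the prompt wording) are hypothetical and not part of this repository or of qwen-math. The `judge` argument is any callable that sends a prompt to a model and returns its text reply, so the sketch does not depend on a specific provider.

```python
from typing import Callable, Optional

# Hypothetical prompt; the wording is an assumption and would need tuning.
JUDGE_TEMPLATE = (
    "You are grading a math answer.\n"
    "Question: {question}\n"
    "Reference answer: {reference}\n"
    "Model response: {response}\n"
    "Is the final answer in the model response mathematically equivalent "
    "to the reference answer? Reply with exactly one word: "
    "CORRECT or INCORRECT."
)


def build_judge_prompt(question: str, reference: str, response: str) -> str:
    """Fill the grading prompt with the question, reference and response."""
    return JUDGE_TEMPLATE.format(
        question=question, reference=reference, response=response
    )


def parse_verdict(reply: str) -> Optional[bool]:
    """Map the judge reply to True/False, or None if it is unrecognized."""
    text = reply.strip().upper()
    if text.startswith("INCORRECT"):
        return False
    if text.startswith("CORRECT"):
        return True
    return None


def check_answer(
    question: str,
    reference: str,
    response: str,
    rule_based: Callable[[str, str], bool],
    judge: Callable[[str], str],
) -> bool:
    """Accept if the rule-based parser accepts; otherwise ask the LLM judge."""
    if rule_based(response, reference):
        return True  # the parser already agrees, no LLM call needed
    verdict = parse_verdict(judge(build_judge_prompt(question, reference, response)))
    # An unrecognized reply is treated as incorrect (a conservative choice).
    return verdict is True
```

Usage would look roughly like `check_answer(q, ref, resp, rule_based=existing_parser, judge=call_model)`, where `existing_parser` and `call_model` are placeholders for the current parsing function and a model client. Running the parser first keeps the LLM cost limited to the cases it currently rejects, which is where the false negatives are.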
tyler-griggs