fix: correct regex group extraction in browsecomp_eval.py #93

Neph0s · 2025-07-04T10:17:36Z

Problem

The grade_sample method in browsecomp_eval.py returns the wrong regex group, so a successfully
parsed grade never equals the values the metrics code compares it against.

Current code:

match = re.search(r"correct: (yes|no)", grading_response)
return match.group(0) if match else "no"  # Returns "correct: yes" or "correct: no"

...
grade_result = self.grade_sample(problem, answer, response_text)

# Metrics based on grading response
is_correct = grade_result == "yes"
is_incorrect = grade_result == "no"

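Issue:
- match.group(0) returns the entire match ("correct: yes" or "correct: no"), not the captured group
- The code later compares grade_result == "yes" and grade_result == "no"; neither comparison can succeed against the full match
- As a result, is_correct and is_incorrect are both False whenever the regex matches (only the no-match fallback "no" is counted)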
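The difference between the two groups can be reproduced in isolation. The snippet below is a minimal sketch using only the standard re module; the sample grading_response string is an assumption made for illustration, not real grader output.

```python
import re

# Hypothetical grader output, used only to illustrate the behavior.
grading_response = "The answer is correct: yes"
match = re.search(r"correct: (yes|no)", grading_response)

whole_match = match.group(0)  # the entire matched text
captured = match.group(1)     # only the first parenthesized group

print(whole_match)            # correct: yes
print(captured)               # yes
print(whole_match == "yes")   # False
print(captured == "yes")      # True
```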

Solution

Change match.group(0) to match.group(1) to extract the captured group:

match = re.search(r"correct: (yes|no)", grading_response)
return match.group(1) if match else "no"  # Returns "yes" or "no"

Impact

- Fixes broken grading logic that never marked a response as correct, because the returned string could not equal "yes"

Testing

The fix ensures that:
- Input "The answer is correct: yes" → returns "yes" (was "correct: yes")
- Input "The answer is correct: no" → returns "no" (was "correct: no")
- Comparisons grade_result == "yes" now work correctly
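The cases above could be checked with a small standalone script. This is a sketch under stated assumptions: extract_grade is a hypothetical helper that mirrors only the two lines changed in grade_sample, so it runs without the rest of browsecomp_eval.py. The third assertion covers the existing fallback to "no" when the grader output contains no verdict.

```python
import re


def extract_grade(grading_response: str) -> str:
    """Hypothetical stand-in for the return logic of grade_sample."""
    match = re.search(r"correct: (yes|no)", grading_response)
    # group(1) is the captured "yes"/"no"; fall back to "no" when nothing matches.
    return match.group(1) if match else "no"


assert extract_grade("The answer is correct: yes") == "yes"
assert extract_grade("The answer is correct: no") == "no"
assert extract_grade("no verdict here") == "no"
```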

fix: correct regex group extraction in browsecomp_eval.py

0ebbb0d
