Skip to content

increase the min number of test cases passed #259

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jun 2, 2025

Conversation

misrasaurabh1
Copy link
Contributor

@misrasaurabh1 misrasaurabh1 commented May 31, 2025

User description

I have never seen any good optimization with 5 passed test cases.


PR Type

Enhancement


Description

  • Introduce MIN_TESTCASE_PASSED_THRESHOLD constant

  • Raise minimum test pass count to 6

  • Replace hardcoded threshold in critic


Changes walkthrough 📝

Relevant files
Configuration changes
config_consts.py
Add minimum test-passed threshold constant                             

codeflash/code_utils/config_consts.py

  • Added MIN_TESTCASE_PASSED_THRESHOLD = 6
+1/-0     
Enhancement
critic.py
Use constant for test pass threshold                                         

codeflash/result/critic.py

  • Imported MIN_TESTCASE_PASSED_THRESHOLD
  • Updated quantity_of_tests_critic to use new constant
  • Removed hardcoded threshold (4)
  • +2/-2     

    Need help?
  • Type /help how to ... in the comments thread for any questions about PR-Agent usage.
  • Check out the documentation for more information.
  • I have never seen any good optimization with 5 passed test cases.
    @misrasaurabh1 misrasaurabh1 requested a review from KRRT7 May 31, 2025 02:09
    Copy link

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Constant Organization

    Consider grouping or ordering constants in a logical or alphabetical way for consistency and easier maintenance.

    MIN_TESTCASE_PASSED_THRESHOLD = 6
    Missing Tests

    No tests were added or updated to verify the behavior around the new MIN_TESTCASE_PASSED_THRESHOLD; please add tests to cover edge and typical cases.

    def quantity_of_tests_critic(candidate_result: OptimizedCandidateResult) -> bool:
        test_results = candidate_result.behavior_test_results
        report = test_results.get_test_pass_fail_report_by_type()
    
        pass_count = 0
        for test_type in report:
            pass_count += report[test_type]["passed"]
    
        if pass_count >= MIN_TESTCASE_PASSED_THRESHOLD:
            return True

    Copy link

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Impact
    Possible issue
    Cap threshold to available tests

    Cap the threshold to the total number of available tests so the condition remains
    reachable even if the configured threshold exceeds actual tests. This prevents
    always failing the check when fewer tests exist.

    codeflash/result/critic.py [53-54]

    -if pass_count >= MIN_TESTCASE_PASSED_THRESHOLD:
    +total_tests = sum(report[t]["passed"] + report[t]["failed"] for t in report)
    +effective_threshold = min(MIN_TESTCASE_PASSED_THRESHOLD, total_tests)
    +if pass_count >= effective_threshold:
         return True
    Suggestion importance[1-10]: 6

    __

    Why: Capping MIN_TESTCASE_PASSED_THRESHOLD to total_tests prevents unreachable conditions when fewer tests exist, improving robustness.

    Low
    Prevent KeyError on replay test

    Guard against missing REPLAY_TEST entries to avoid KeyError by using get with
    default values.

    codeflash/result/critic.py [56]

    -return bool(pass_count == 1 and report[TestType.REPLAY_TEST]["passed"] == 1)
    +return pass_count == 1 and report.get(TestType.REPLAY_TEST, {}).get("passed", 0) == 1
    Suggestion importance[1-10]: 5

    __

    Why: Using get guards against missing REPLAY_TEST entries and avoids potential KeyError, improving reliability.

    Low
    General
    Use sum comprehension for clarity

    Replace the explicit loop with a comprehension and sum for conciseness and
    readability.

    codeflash/result/critic.py [49-51]

    -pass_count = 0
    -for test_type in report:
    -    pass_count += report[test_type]["passed"]
    +pass_count = sum(r["passed"] for r in report.values())
    Suggestion importance[1-10]: 4

    __

    Why: Replacing the loop with a comprehension enhances readability and conciseness without altering functionality.

    Low

    KRRT7
    KRRT7 previously approved these changes May 31, 2025
    Copy link
    Contributor

    @KRRT7 KRRT7 left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    I think we should double it to 10 but LGTM

    KRRT7
    KRRT7 previously approved these changes May 31, 2025
    Copy link

    openhands-ai bot commented May 31, 2025

    Looks like there are a few issues preventing this PR from being merged!

    • GitHub Actions are failing:
      • end-to-end-test

    If you'd like me to help, just leave a comment, like

    @OpenHands please fix the failing actions on PR #259

    Feel free to include any additional details that might help me get this PR into a better state.

    You can manage your notification settings

    @misrasaurabh1 misrasaurabh1 merged commit e1d8fe0 into main Jun 2, 2025
    16 checks passed
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    2 participants