Conversation


@DipayanDasgupta (Contributor) commented Oct 3, 2025

Description

This Pull Request addresses a gap in test coverage for the abstract Reasoning base class, located in mesa_llm/reasoning/reasoning.py. By increasing the test coverage for this foundational module, we improve the overall robustness of the library and ensure that core methods shared by all reasoning strategies (like CoT, ReAct, etc.) are reliable.

Key Changes

  • Added New Test Class: Introduced a new class, TestReasoningBase, to tests/test_reasoning/test_reasoning.py.
  • Tested Core Logic: The primary test, test_execute_tool_call_generates_plan, verifies the logic of the execute_tool_call method on the base class.
  • Isolated for Accuracy: To test the abstract Reasoning class, a minimal ConcreteReasoning subclass was created within the test scope. The agent, its llm.generate method, and tool_manager.get_all_tools_schema were mocked to isolate the function's behavior and assert its correctness.
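The mocking pattern described above can be sketched roughly as follows. This is an illustrative stand-in, not the project's actual code: the `Reasoning` body below is a hypothetical replica of the base class in mesa_llm/reasoning/reasoning.py, and the `llm.generate` signature (including the `tool_schema` keyword) is an assumption based on the description.

```python
from abc import ABC, abstractmethod
from unittest.mock import Mock

# Hypothetical stand-in for mesa_llm's abstract Reasoning base class; the real
# implementation lives in mesa_llm/reasoning/reasoning.py and may differ.
class Reasoning(ABC):
    def __init__(self, agent):
        self.agent = agent

    @abstractmethod
    def plan(self, prompt, obs=None, ttl=1, selected_tools=None): ...

    # Simplified sketch of the shared base-class method exercised by the test.
    def execute_tool_call(self, prompt, selected_tools=None):
        schema = self.agent.tool_manager.get_all_tools_schema(selected_tools)
        rsp = self.agent.llm.generate(prompt, tool_schema=schema)
        return rsp.choices[0].message

# Minimal concrete subclass, mirroring the one created in the test scope.
class ConcreteReasoning(Reasoning):
    def plan(self, prompt, obs=None, ttl=1, selected_tools=None):
        pass

def test_execute_tool_call_generates_plan():
    # Mock the agent and its collaborators to isolate the base-class logic.
    mock_agent = Mock()
    mock_agent.tool_manager.get_all_tools_schema.return_value = [{"schema": "example"}]
    expected_message = Mock()
    mock_agent.llm.generate.return_value.choices = [Mock(message=expected_message)]

    reasoning = ConcreteReasoning(agent=mock_agent)
    result = reasoning.execute_tool_call("Execute the plan.", selected_tools=["tool1"])

    # The method should delegate to the mocked LLM and surface its response.
    mock_agent.llm.generate.assert_called_once()
    assert result is expected_message
```

Because every collaborator is a `Mock`, the test asserts only on the base class's orchestration (fetch schema, call the LLM, unwrap the response), not on any real LLM behavior.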

Impact: Before and After

  • Before: Test coverage for mesa_llm/reasoning/reasoning.py was 63%.
  • After: Test coverage has increased to 78%.

This 15-point increase makes the core reasoning logic more resilient to future changes.


How to Verify

  1. Check out this branch.
  2. Ensure development dependencies are installed with pip install -e ".[dev]".
  3. Run the test suite:
    pytest --cov=mesa_llm tests/
  4. Confirm that all 156 tests pass and that the coverage report shows the increase for mesa_llm/reasoning/reasoning.py to 78%.

Summary by CodeRabbit

  • Tests
    • Added new test coverage for the core reasoning flow, verifying tool-call execution produces a plan and expected behavior.
    • Introduced a minimal concrete reasoning class within tests to exercise base logic safely.
    • Enhances reliability and guards against regressions.
    • No functional or API changes; users should see unchanged behavior with improved stability over time.

coderabbitai bot commented Oct 3, 2025

Walkthrough

Adds a new test in tests/test_reasoning/test_reasoning.py to validate Reasoning.execute_tool_call behavior using a minimal ConcreteReasoning subclass defined within the test. Updates imports to include Reasoning.

Changes

Cohort: Tests — Reasoning execute_tool_call
File(s): tests/test_reasoning/test_reasoning.py
Summary: Added TestReasoningBase with test_execute_tool_call_generates_plan; introduced an inline ConcreteReasoning subclass to exercise the base method; updated imports to include Reasoning.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

I thump my paw at test-time’s call,
A plan springs forth, concise and small.
In burrowed mocks, the tools align,
Reasoning hops, the checks all shine.
Carrot-green ticks across the log—
Another day, a happy dev-rabbit’s jog. 🥕🐇

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: Docstring coverage is 66.67%, below the required threshold of 80.00%. Run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
  • Description Check ✅ Passed: Check skipped because CodeRabbit’s high-level summary is enabled.
  • Title Check ✅ Passed: The title accurately and succinctly describes the main change, indicating that tests are being added for the abstract Reasoning base class, matching the PR’s objective to improve coverage for that class.


@DipayanDasgupta (Contributor, Author) commented:

@coderabbitai review

coderabbitai bot commented Oct 3, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

codecov bot commented Oct 3, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.30%. Comparing base (ae0877e) to head (d6b7cc3).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #21      +/-   ##
==========================================
+ Coverage   85.81%   86.30%   +0.48%     
==========================================
  Files          17       17              
  Lines        1234     1234              
==========================================
+ Hits         1059     1065       +6     
+ Misses        175      169       -6     

☔ View full report in Codecov by Sentry.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
tests/test_reasoning/test_reasoning.py (1)

68-90: Test logic is sound; consider additional edge case coverage.

The test correctly validates the happy path for execute_tool_call:

  • Mock interactions are properly verified
  • Plan object structure is correctly asserted
  • Test isolation is maintained

However, consider adding tests for:

  • Error handling (e.g., LLM generation failure)
  • Edge cases for selected_tools (None, empty list, multiple tools)
  • Behavior when tool_manager.get_all_tools_schema returns empty/malformed data

Example test for error handling:

# Assumes the test file's module-level imports:
#   import pytest
#   from unittest.mock import Mock
#   from mesa_llm.reasoning.reasoning import Reasoning
def test_execute_tool_call_handles_llm_failure(self):
    """Test that execute_tool_call handles LLM generation failures gracefully."""
    mock_agent = Mock()
    mock_agent.model.steps = 5
    mock_agent.llm.generate.side_effect = Exception("LLM failure")
    mock_agent.tool_manager.get_all_tools_schema.return_value = [{"schema": "example"}]

    class ConcreteReasoning(Reasoning):
        def plan(self, prompt, obs=None, ttl=1, selected_tools=None):
            pass

    reasoning = ConcreteReasoning(agent=mock_agent)

    # Assert appropriate error handling or exception propagation
    with pytest.raises(Exception, match="LLM failure"):
        reasoning.execute_tool_call("Execute the plan.", selected_tools=["tool1"])
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ae0877e and d6b7cc3.

📒 Files selected for processing (1)
  • tests/test_reasoning/test_reasoning.py (2 hunks)
🔇 Additional comments (3)
tests/test_reasoning/test_reasoning.py (3)

6-6: LGTM!

The import addition is necessary for testing the base Reasoning class.


64-66: LGTM!

The minimal ConcreteReasoning subclass appropriately satisfies the abstract base class requirement without introducing unnecessary test complexity.


53-56: No changes needed to LLM response mock
The execute_tool_call method reads rsp.choices[0].message directly, so the test’s mock setup correctly matches the implementation.

@sanika-n (Collaborator) left a comment


Thanks for noticing that execute_tool_call hadn’t been covered in the tests; the code looks good to me!

@sanika-n sanika-n merged commit 096bbf3 into projectmesa:main Oct 6, 2025
14 checks passed