Clean up default temperature handling and Kimi top_p override by enyst · Pull Request #1994 · OpenHands/software-agent-sdk

enyst · 2026-02-11T03:10:37Z

Summary

Remove get_default_temperature and always defer to the provider's default temperature (temperature stays None unless explicitly set).
Adjust top_p defaults for Moonshot Kimi-K2.5 (requires 0.95) in the LLM initializer.
Update tests to reflect the new temperature behavior.

Testing

uv run pre-commit run --files openhands-sdk/openhands/sdk/llm/utils/model_features.py openhands-sdk/openhands/sdk/llm/llm.py tests/sdk/llm/test_model_features.py
Manual example run: uv run python examples/01_standalone_sdk/29_llm_streaming.py with LLM_BASE_URL=https://llm-proxy.eval.all-hands.dev, LLM_API_KEY=$LITELLM_API_KEY, LLM_MODEL=moonshot/kimi-k2.5 (top_p auto-overridden to 0.95)


Real-world tests

bedrock/moonshot.kimi-k2-thinking via examples/01_standalone_sdk/22_anthropic_thinking.py → Failed (404 Not Found from Bedrock converse endpoint).
moonshot/kimi-k2.5 via examples/01_standalone_sdk/29_llm_streaming.py with LLM_BASE_URL=https://llm-proxy.eval.all-hands.dev and LLM_API_KEY=$LITELLM_API_KEY → Succeeded (story file created and then deleted by the example).
moonshot/kimi-k2-thinking via examples/01_standalone_sdk/22_anthropic_thinking.py → Failed (no healthy deployments for the model).

Behavior changes

Kimi models no longer receive an implicit temperature override; temperature remains None unless set by the caller, letting the provider default apply.
top_p defaults are now centralized in get_default_top_p, which applies only when the caller leaves top_p at 1.0 (e.g., Moonshot Kimi-K2.5 then defaults to 0.95).
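
The two behavior changes above can be sketched as follows. This is a minimal illustration, not the SDK's code: the name get_default_top_p comes from this PR, but its signature, the DEFAULT_TOP_P_MODELS table, the substring matching rule, and the resolve_sampling_params helper are all assumptions.

```python
from __future__ import annotations

# Hypothetical table: model-name fragment -> top_p the provider requires.
# Only the Kimi-K2.5 value (0.95) is taken from this PR.
DEFAULT_TOP_P_MODELS: dict[str, float] = {"kimi-k2.5": 0.95}

GENERIC_TOP_P = 1.0  # value treated as "the caller did not choose a top_p"


def get_default_top_p(model: str) -> float | None:
    """Return a model-specific top_p default, or None if the model has none."""
    name = model.lower()
    for fragment, top_p in DEFAULT_TOP_P_MODELS.items():
        if fragment in name:
            return top_p
    return None


def resolve_sampling_params(model, temperature=None, top_p=GENERIC_TOP_P):
    """Leave temperature untouched; adjust top_p only if it was left at 1.0."""
    if top_p == GENERIC_TOP_P:
        override = get_default_top_p(model)
        if override is not None:
            top_p = override
    return {"temperature": temperature, "top_p": top_p}


kimi = resolve_sampling_params("moonshot/kimi-k2.5")
explicit = resolve_sampling_params("moonshot/kimi-k2.5", top_p=0.8)
```

One consequence of using 1.0 as the "unset" marker in this sketch is that a caller who deliberately passes top_p=1.0 for a Kimi-K2.5 model cannot be distinguished from one who passed nothing.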

Co-authored-by: openhands <openhands@all-hands.dev>

github-actions · 2026-02-11T03:13:07Z

Coverage Report

File	Stmts	Miss	Cover	Missing
openhands-sdk/openhands/sdk/llm
llm.py	459	84	81%	405, 463, 637, 738, 740–741, 769, 819, 830–832, 836–840, 848–850, 860–862, 865–866, 870, 872–873, 875, 898–903, 1026, 1031–1032, 1229–1230, 1239, 1252, 1254–1259, 1261–1278, 1281–1285, 1287–1288, 1294–1303, 1354, 1356
openhands-sdk/openhands/sdk/llm/utils
model_features.py	46	1	97%	32
TOTAL	17196	5039	70%

all-hands-bot

🟡 Acceptable - Works but violates "good taste" by accumulating special cases instead of using a data-driven approach.

Verdict: ❌ Needs rework - Inconsistent approach and missing test coverage

Key Insight: You removed a data structure (DEFAULT_TEMPERATURE_MODELS) but added hardcoded special cases (kimi-k2.5 top_p). This creates technical debt and will lead to a growing pile of if-statements.
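
For context, the data-driven approach the reviewer asks for might look like the sketch below. The registry name MODEL_PARAM_DEFAULTS, the glob patterns, and the defaults_for helper are hypothetical and do not appear in the PR; only the Kimi-K2.5 top_p value is taken from it.

```python
from fnmatch import fnmatchcase

# Hypothetical registry: (glob pattern, parameter defaults).
# Supporting another model means adding a row, not another if-statement.
MODEL_PARAM_DEFAULTS = [
    ("*kimi-k2.5*", {"top_p": 0.95}),
]


def defaults_for(model: str) -> dict:
    """Merge the defaults of every pattern that matches the model name."""
    merged = {}
    name = model.lower()
    for pattern, params in MODEL_PARAM_DEFAULTS:
        if fnmatchcase(name, pattern):
            merged.update(params)
    return merged
```

Later rows override earlier ones in this sketch, so more specific patterns would go last.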

openhands-sdk/openhands/sdk/llm/llm.py

tests/sdk/llm/test_model_features.py

Co-authored-by: openhands <openhands@all-hands.dev>

all-hands-bot

🟡 Acceptable - Breaking Change Risk

Taste Rating: 🟡 Acceptable - Works but violates "never break userspace"

This PR simplifies temperature handling but introduces a breaking change for existing users. The removal of default temperature settings for kimi models changes their runtime behavior without a clear deprecation path.

KEY INSIGHT: Removing default temperature while adding model-specific top_p overrides creates inconsistent patterns that will multiply as more models need special handling.

See inline comments for specific issues.

openhands-sdk/openhands/sdk/llm/llm.py

tests/sdk/llm/test_llm.py

openhands-sdk/openhands/sdk/llm/llm.py

Co-authored-by: openhands <openhands@all-hands.dev>

all-hands-bot

🟡 Acceptable - Clean refactoring with solid simplification, but test coverage regression and undocumented breaking change need attention.

Key Insight: Good taste in eliminating special cases, but temperature removal is a breaking change hidden in a "cleanup" PR.

openhands-sdk/openhands/sdk/llm/llm.py

openhands-sdk/openhands/sdk/llm/utils/model_features.py

tests/sdk/llm/test_model_features.py

tests/sdk/llm/test_llm.py

enyst · 2026-02-11T18:27:29Z

@OpenHands Address the review comments and inline review threads. Think deeply before doing so: it's an AI reviewer, so it could be wrong. Decide first whether each comment is valid; if it is, fix it, and if not, reply to the thread with your assessment.

Note that you should be careful and use gh correctly when you reply: do not replace their comment; reply to it in the same thread.

openhands-ai · 2026-02-11T18:27:39Z

I'm on it! enyst can track my progress at all-hands.dev

openhands-ai · 2026-02-11T18:29:45Z

Final summary (new work since last update)

What changed

No further code changes were needed after the previous update; git diff against origin/cleanup-default-temperature-models...HEAD is empty and the working tree is clean.

Checklist against your request

Addressed inline review threads by fixing valid comments (expanded get_default_top_p test coverage).
For the “breaking change” concern, responded in-thread with an assessment (intentional behavior change; provider defaults should not be encoded; set temperature=1.0 explicitly to preserve old behavior).
Replied as replies in the same threads (did not replace/edit the reviewer’s comments).
Changes were concise (only expanded the relevant test parametrization).
Changes were committed and already pushed to cleanup-default-temperature-models (commit cd9dc3c9).

No additional push is required.
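
The in-thread assessment above (provider defaults apply unless the caller pins temperature=1.0) can be illustrated with a small sketch. It assumes that unset parameters are simply left out of the request; build_request_kwargs is a hypothetical helper, not an SDK function.

```python
def build_request_kwargs(model, temperature=None, top_p=None):
    """Drop unset (None) sampling params so the provider's defaults apply."""
    candidates = {"temperature": temperature, "top_p": top_p}
    kwargs = {"model": model}
    kwargs.update({k: v for k, v in candidates.items() if v is not None})
    return kwargs


# New behavior: no temperature is sent, so the provider decides.
new_default = build_request_kwargs("moonshot/kimi-k2.5")

# Old behavior, preserved by pinning the value explicitly.
pinned = build_request_kwargs("moonshot/kimi-k2.5", temperature=1.0)
```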


all-hands-bot

Taste Rating: 🟡 Acceptable - Clean refactor that simplifies temperature/top_p handling

The core logic is sound: removing special-case temperature overrides and generalizing top_p defaults is pragmatic. The behavior change (temperature stays None) lets providers handle their own defaults, which is the right approach. Tests actually verify behavior rather than just mocking calls.

One improvement needed: stale comment that no longer reflects the generalized logic.

openhands-sdk/openhands/sdk/llm/llm.py

openhands-sdk/openhands/sdk/llm/utils/model_features.py

Co-authored-by: OpenHands Bot <contact@all-hands.dev>

neubig · 2026-02-12T00:17:55Z

[automated message] @neubig assigned for review according to git blame

enyst and others added 2 commits February 11, 2026 03:10

Clean up default temperature logic and adjust top_p for kimi-k2.5

689bff4

Co-authored-by: openhands <openhands@all-hands.dev>

Remove unused default temperature helper

a09cf11

Co-authored-by: openhands <openhands@all-hands.dev>

enyst marked this pull request as ready for review February 11, 2026 03:17

all-hands-bot reviewed Feb 11, 2026

openhands-sdk/openhands/sdk/llm/llm.py (outdated, resolved)

openhands-sdk/openhands/sdk/llm/llm.py (outdated, resolved)

openhands-sdk/openhands/sdk/llm/llm.py (outdated, resolved)

tests/sdk/llm/test_model_features.py (resolved)

Refine kimi top_p override and add coverage

11da009

Co-authored-by: openhands <openhands@all-hands.dev>

enyst requested a review from all-hands-bot February 11, 2026 03:35

all-hands-bot reviewed Feb 11, 2026


Centralize top_p defaults for specific models

c978152

Co-authored-by: openhands <openhands@all-hands.dev>

enyst requested a review from all-hands-bot February 11, 2026 04:31

all-hands-bot reviewed Feb 11, 2026


enyst mentioned this pull request Feb 11, 2026

Set default temperature to None instead of 0.0 #1989

Open

Expand get_default_top_p test coverage

cd9dc3c

enyst added 2 commits February 11, 2026 18:52

Add test for Kimi default temperature remaining None

19aa501

Merge branch 'fix-default-temperature-to-none' into cleanup-default-t…

493ec23

…emperature-models

enyst requested a review from all-hands-bot February 11, 2026 18:54

all-hands-bot reviewed Feb 11, 2026

openhands-sdk/openhands/sdk/llm/llm.py (resolved)

openhands-sdk/openhands/sdk/llm/utils/model_features.py (outdated, resolved)

Update openhands-sdk/openhands/sdk/llm/utils/model_features.py

b8f3179

Co-authored-by: OpenHands Bot <contact@all-hands.dev>

OpenHands deleted a comment from openhands-ai bot Feb 11, 2026

neubig self-requested a review February 12, 2026 00:17

enyst added and then removed the behavior-initiative label ("This is related to the system prompt sections and LLM steering.") Feb 14, 2026
