Remove unsupported models by simonrosenberg · Pull Request #1734 · OpenHands/software-agent-sdk

simonrosenberg · 2026-01-15T16:44:10Z

Summary

Remove models we don't want to evaluate, and delete the useless test: it isn't even run by CI and it doesn't test anything important.

Checklist

If the PR is changing/adding functionality, are there tests to reflect this?
If there is an example, have you run the example to make sure that it works?
If there are instructions on how to run the code, have you followed the instructions and made sure that it works?
If the feature is significant enough to require documentation, is there a PR open on the OpenHands/docs repository with the same branch name?
Is the GitHub CI passing?

Agent Server images for this PR

• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant	Architectures	Base Image	Docs / Tags
java	amd64, arm64	`eclipse-temurin:17-jdk`	Link
python	amd64, arm64	`nikolaik/python-nodejs:python3.12-nodejs22`	Link
golang	amd64, arm64	`golang:1.21-bookworm`	Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:4d1fad5-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-4d1fad5-python \
  ghcr.io/openhands/agent-server:4d1fad5-python

All tags pushed for this build

ghcr.io/openhands/agent-server:4d1fad5-golang-amd64
ghcr.io/openhands/agent-server:4d1fad5-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:4d1fad5-golang-arm64
ghcr.io/openhands/agent-server:4d1fad5-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:4d1fad5-java-amd64
ghcr.io/openhands/agent-server:4d1fad5-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:4d1fad5-java-arm64
ghcr.io/openhands/agent-server:4d1fad5-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:4d1fad5-python-amd64
ghcr.io/openhands/agent-server:4d1fad5-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:4d1fad5-python-arm64
ghcr.io/openhands/agent-server:4d1fad5-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:4d1fad5-golang
ghcr.io/openhands/agent-server:4d1fad5-java
ghcr.io/openhands/agent-server:4d1fad5-python

About Multi-Architecture Support

Each variant tag (e.g., 4d1fad5-python) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., 4d1fad5-python-amd64) are also available if needed
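
The tag list above appears to follow a regular naming scheme, which can be sketched as follows. This is an inference from the 15 tags shown for this build, not the actual build script: the function names are hypothetical, and the slug rule (`/` becomes `_s_`, `:` becomes `_tag_`) is read off the listed tags rather than confirmed by the source.

```python
from itertools import product

REGISTRY = "ghcr.io/openhands/agent-server"

# Variant -> base image, copied from the "Variants & Base Images" table.
VARIANTS = {
    "golang": "golang:1.21-bookworm",
    "java": "eclipse-temurin:17-jdk",
    "python": "nikolaik/python-nodejs:python3.12-nodejs22",
}
ARCHES = ("amd64", "arm64")


def base_image_slug(base_image: str) -> str:
    """Assumed slug rule inferred from the tag list: '/' -> '_s_', ':' -> '_tag_'."""
    return base_image.replace("/", "_s_").replace(":", "_tag_")


def tags_for_build(short_sha: str) -> list[str]:
    """Return the tags expected for one build: per-arch tags, then manifests."""
    tags = []
    for (variant, base_image), arch in product(VARIANTS.items(), ARCHES):
        # Short variant tag and base-image tag for a single architecture.
        tags.append(f"{REGISTRY}:{short_sha}-{variant}-{arch}")
        tags.append(f"{REGISTRY}:{short_sha}-{base_image_slug(base_image)}-{arch}")
    # Multi-arch manifest tags carry no architecture suffix.
    tags.extend(f"{REGISTRY}:{short_sha}-{variant}" for variant in VARIANTS)
    return tags
```

Calling `tags_for_build("4d1fad5")` should reproduce the list under "All tags pushed for this build", which makes it a quick way to check whether a given tag is expected to exist before pulling it.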

all-hands-bot

Review Summary

This PR removes two model configurations (claude-haiku-4-5-20251001 and deepseek-chat) from .github/run-eval/resolve_model_config.py, but leaves numerous references to these models throughout the codebase. This will cause test failures and CI failures.

🔴 Critical Issues (Must Fix)

1. Test Failure

File: tests/github_workflows/test_resolve_model_config.py
Line: 49
Issue: Test explicitly asserts result[1]["id"] == "deepseek-chat", but this model no longer exists in MODELS.
Fix: Update the test assertion to reference a model that still exists.
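
One way to make an assertion like this less brittle is to derive the expected IDs from the model table instead of hard-coding them. The sketch below is hypothetical: the `MODELS` entries and the `find_models_by_id` helper are stand-ins for illustration, not the actual contents of resolve_model_config.py, whose fields and function names may differ.

```python
# Hypothetical stand-in for the MODELS table; the real entries and their
# fields in resolve_model_config.py may differ.
MODELS = {
    "claude-sonnet-4-5-20250929": {"id": "claude-sonnet-4-5-20250929"},
    "example-model": {"id": "example-model"},  # placeholder, not a real model
}


def find_models_by_id(model_ids):
    """Hypothetical helper: return configs in request order, rejecting unknown IDs."""
    missing = [m for m in model_ids if m not in MODELS]
    if missing:
        raise ValueError(f"Unknown model IDs: {missing}")
    return [MODELS[m] for m in model_ids]


def test_resolves_models_in_order():
    # Take the IDs from MODELS itself so removing a model cannot break the test.
    wanted = list(MODELS)[:2]
    result = find_models_by_id(wanted)
    assert [cfg["id"] for cfg in result] == wanted


def test_removed_model_is_rejected():
    # A model that was removed from the table should fail loudly.
    try:
        find_models_by_id(["deepseek-chat"])
    except ValueError:
        return
    raise AssertionError("expected ValueError for a removed model")
```

The trade-off is that the test no longer pins a specific model, so it checks lookup behavior rather than the contents of the table.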

2. CI Workflow Failure - run-examples.yml

File: .github/workflows/run-examples.yml
Line: 66
Issue: Uses LLM_MODEL: openhands/claude-haiku-4-5-20251001 which is no longer in supported configurations.
Fix: Update to a supported model like claude-sonnet-4-5-20250929.

3. CI Workflow Failure - integration-runner.yml

File: .github/workflows/integration-runner.yml
Line: 92
Issue: Uses model: litellm_proxy/deepseek/deepseek-chat which is no longer supported.
Fix: Update to a supported model or remove this test configuration.

🟠 Important Issues (Should Address)

4. Verified Models List - claude-haiku

File: openhands-sdk/openhands/sdk/llm/utils/verified_models.py
Line: 23
Issue: claude-haiku-4-5-20251001 still listed in VERIFIED_ANTHROPIC_MODELS.
Fix: Remove for consistency, or clarify in PR description why it remains verified for general use but removed from eval configs.

5. Verified Models List - deepseek-chat

File: openhands-sdk/openhands/sdk/llm/utils/verified_models.py
Line: 53
Issue: deepseek-chat still listed in VERIFIED_OPENHANDS_MODELS.
Fix: Remove for consistency, or clarify in PR description why it remains verified for general use but removed from eval configs.

6. Test Default Model

File: tests/integration/utils/llm_judge.py
Line: 105
Issue: Default LLM_JUDGE_MODEL is litellm_proxy/claude-haiku-4-5-20251001.
Fix: Update to a currently supported model to avoid runtime failures.

Additional References Found

These may also need updates:

tests/sdk/llm/test_model_features.py - Contains test cases referencing both models
tests/fixtures/llm_data/ - Test data files and README referencing deepseek-chat
examples/03_github_workflows/*/README.md - Documentation mentioning claude-haiku-4-5-20251001

Recommendation

Before merging, please:

Fix all Critical issues (items 1-3) to prevent CI/test failures
Address Important issues (items 4-6) for consistency
Search for and update any remaining references in tests and documentation
Run the test suite to verify no failures: pytest tests/github_workflows/test_resolve_model_config.py
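
The search step above can be sketched as a small script. This is a minimal illustration, not tooling from the repository: the function name and the skipped directory names are assumptions, and the two model IDs are the ones named in this review.

```python
import os
from pathlib import Path

# Model IDs removed by this PR, as named in the review summary.
REMOVED_MODELS = ("claude-haiku-4-5-20251001", "deepseek-chat")
# Directories assumed not worth scanning; adjust for the actual checkout.
SKIP_DIRS = {".git", "node_modules", ".venv", "__pycache__"}


def find_references(root, needles=REMOVED_MODELS):
    """Yield (path, line_number, stripped_line) for each line mentioning a needle."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in sorted(filenames):
            path = Path(dirpath) / name
            try:
                text = path.read_text(encoding="utf-8")
            except (UnicodeDecodeError, OSError):
                continue  # skip binary or unreadable files
            for lineno, line in enumerate(text.splitlines(), start=1):
                if any(needle in line for needle in needles):
                    yield path, lineno, line.strip()
```

Iterating over `find_references(".")` from the repository root and printing each tuple gives a list of files and lines to review. Substring matching will also flag longer identifiers that merely contain a removed model ID, so each hit still needs a human look.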

openhands-ai · 2026-01-15T23:44:32Z

Looks like there are a few issues preventing this PR from being merged!

GitHub Actions are failing:
- Run Eval (swebench) test cross branch launching
There are merge conflicts

If you'd like me to help, just leave a comment, like

@OpenHands please fix the merge conflicts on PR #1734 at branch `sr/remove-nonsupported-models`

or

@OpenHands please fix the failing actions on PR #1734 at branch `sr/remove-nonsupported-models`

Feel free to include any additional details that might help me get this PR into a better state.

enyst

I understand. Please give me a moment though - DeepSeek chat has been discussed before.

enyst · 2026-01-16T00:00:41Z

My bad, spoke too fast. I see these are only the models for evals! I thought they were available to people e.g. in proxies or such.

Edited to add: as long as it has conflicts, I don't see the dismiss review option. Maybe PEBKAC! 😅

enyst · 2026-01-16T00:19:24Z

@OpenHands Merge main into this PR and fix the conflicts. Commit and push here.

openhands-ai · 2026-01-16T00:19:34Z

I'm on it! enyst can track my progress at all-hands.dev

openhands-ai · 2026-01-16T00:21:05Z

Final summary (new since my last update)

What I verified

Request addressed: main is merged into sr/remove-nonsupported-models, conflicts are resolved, changes are committed, and the branch is pushed.
Working tree clean: git status shows no uncommitted changes.
Branch is pushed: git push reports everything is up-to-date.

Conciseness / extra changes check

Compared to origin/main, the PR now contains only the intended model removals in:
- .github/run-eval/resolve_model_config.py (13 deletions)
No other diffs vs origin/main remain.

Conflict resolution performed

Resolved the single merge conflict in tests/github_workflows/test_resolve_model_config.py by keeping the main version during the merge (the PR had deleted it, but main modified it).

Commit(s)

a252fd37 Merge main into sr/remove-nonsupported-models (already pushed to origin/sr/remove-nonsupported-models).

Misunderstood the change.

remove unsupported models

b2f4458

simonrosenberg requested review from juanmichelini and neubig January 15, 2026 16:44

simonrosenberg marked this pull request as ready for review January 15, 2026 16:46

all-hands-bot reviewed Jan 15, 2026

simonrosenberg and others added 2 commits January 15, 2026 18:05

delete useless test

39b356b

Merge branch 'main' into sr/remove-nonsupported-models

31e6bd7

neubig approved these changes Jan 15, 2026

enyst previously requested changes Jan 15, 2026

Merge main into sr/remove-nonsupported-models

a252fd3

simonrosenberg merged commit 426cc26 into main Jan 16, 2026
21 checks passed

simonrosenberg deleted the sr/remove-nonsupported-models branch January 16, 2026 13:54

This was referenced Jan 16, 2026

Remove gpt-mini model from the model lists #1747

Closed

Remove gpt-mini model from model lists #1748

Closed

simonrosenberg merged 4 commits into main from sr/remove-nonsupported-models

4 participants
