Remove unsupported models #1734

Merged
simonrosenberg merged 4 commits into main from sr/remove-nonsupported-models
Jan 16, 2026


@simonrosenberg
Collaborator

@simonrosenberg simonrosenberg commented Jan 15, 2026

Summary

Remove models we don't want to evaluate.
Also delete the useless test: it isn't even run by CI, and it doesn't test anything important.

Checklist

  • If the PR is changing/adding functionality, are there tests to reflect this?
  • If there is an example, have you run the example to make sure that it works?
  • If there are instructions on how to run the code, have you followed the instructions and made sure that it works?
  • If the feature is significant enough to require documentation, is there a PR open on the OpenHands/docs repository with the same branch name?
  • Is the GitHub CI passing?

Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

| Variant | Architectures | Base Image | Docs / Tags |
|---------|---------------|------------|-------------|
| java | amd64, arm64 | eclipse-temurin:17-jdk | Link |
| python | amd64, arm64 | nikolaik/python-nodejs:python3.12-nodejs22 | Link |
| golang | amd64, arm64 | golang:1.21-bookworm | Link |

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:4d1fad5-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-4d1fad5-python \
  ghcr.io/openhands/agent-server:4d1fad5-python

All tags pushed for this build

ghcr.io/openhands/agent-server:4d1fad5-golang-amd64
ghcr.io/openhands/agent-server:4d1fad5-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:4d1fad5-golang-arm64
ghcr.io/openhands/agent-server:4d1fad5-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:4d1fad5-java-amd64
ghcr.io/openhands/agent-server:4d1fad5-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:4d1fad5-java-arm64
ghcr.io/openhands/agent-server:4d1fad5-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:4d1fad5-python-amd64
ghcr.io/openhands/agent-server:4d1fad5-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:4d1fad5-python-arm64
ghcr.io/openhands/agent-server:4d1fad5-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:4d1fad5-golang
ghcr.io/openhands/agent-server:4d1fad5-java
ghcr.io/openhands/agent-server:4d1fad5-python

About Multi-Architecture Support

  • Each variant tag (e.g., 4d1fad5-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 4d1fad5-python-amd64) are also available if needed
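
The tag list above appears to follow a regular naming pattern: a short commit SHA, then either the variant name or an encoded form of the base image, then an optional architecture suffix. The sketch below reproduces that pattern as inferred from the listed tags only. The function names `slug` and `tags_for` are hypothetical, and the actual build script may construct tags differently.

```python
# Registry path and architectures as they appear in the tag list above.
REGISTRY = "ghcr.io/openhands/agent-server"
ARCHES = ("amd64", "arm64")


def slug(base_image):
    """Encode a base image reference the way the listed tags appear to.

    Inferred from the list, not from the build script:
    '/' becomes '_s_' and ':' becomes '_tag_'.
    """
    return base_image.replace("/", "_s_").replace(":", "_tag_")


def tags_for(short_sha, variant, base_image):
    """Return the per-arch tags plus the multi-arch tag for one variant."""
    tags = []
    for arch in ARCHES:
        tags.append(f"{REGISTRY}:{short_sha}-{variant}-{arch}")
        tags.append(f"{REGISTRY}:{short_sha}-{slug(base_image)}-{arch}")
    # The bare variant tag is the multi-arch manifest.
    tags.append(f"{REGISTRY}:{short_sha}-{variant}")
    return tags


if __name__ == "__main__":
    for tag in tags_for("4d1fad5", "golang", "golang:1.21-bookworm"):
        print(tag)
```

Each variant yields five tags under this scheme, which matches the fifteen tags listed for the three variants.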

@simonrosenberg simonrosenberg marked this pull request as ready for review January 15, 2026 16:46
Collaborator

@all-hands-bot all-hands-bot left a comment

Review Summary

This PR removes two model configurations (claude-haiku-4-5-20251001 and deepseek-chat) from .github/run-eval/resolve_model_config.py, but leaves numerous references to these models throughout the codebase. This will cause test failures and CI failures.


🔴 Critical Issues (Must Fix)

1. Test Failure

File: tests/github_workflows/test_resolve_model_config.py
Line: 49
Issue: Test explicitly asserts result[1]["id"] == "deepseek-chat", but this model no longer exists in MODELS.
Fix: Update the test assertion to reference a model that still exists.
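
One way to keep such an assertion from breaking on future removals is to derive the expected IDs from the configuration itself rather than hard-coding a model name. The following is a minimal sketch under stated assumptions: the `MODELS` contents, the field names, and the `find_models_by_id` helper are all hypothetical stand-ins, not the actual contents or API of resolve_model_config.py.

```python
# Hypothetical stand-in for the MODELS mapping; entries and field names
# are illustrative assumptions, not the real eval configuration.
MODELS = {
    "model-a": {"id": "model-a", "display_name": "Model A"},
    "model-b": {"id": "model-b", "display_name": "Model B"},
}


def find_models_by_id(model_ids):
    """Hypothetical resolver: return the config for each requested ID, in order."""
    missing = [m for m in model_ids if m not in MODELS]
    if missing:
        raise KeyError(f"Unknown model IDs: {missing}")
    return [MODELS[m] for m in model_ids]


def test_resolves_in_request_order():
    # Take the IDs from MODELS itself, so removing a model from the
    # configuration cannot leave a stale name behind in the test.
    requested = list(MODELS)[:2]
    result = find_models_by_id(requested)
    assert [entry["id"] for entry in result] == requested
```

The trade-off is that the test no longer pins a specific model, so it checks resolver behavior rather than the presence of any particular entry.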

2. CI Workflow Failure - run-examples.yml

File: .github/workflows/run-examples.yml
Line: 66
Issue: Uses LLM_MODEL: openhands/claude-haiku-4-5-20251001 which is no longer in supported configurations.
Fix: Update to a supported model like claude-sonnet-4-5-20250929.

3. CI Workflow Failure - integration-runner.yml

File: .github/workflows/integration-runner.yml
Line: 92
Issue: Uses model: litellm_proxy/deepseek/deepseek-chat which is no longer supported.
Fix: Update to a supported model or remove this test configuration.


🟠 Important Issues (Should Address)

4. Verified Models List - claude-haiku

File: openhands-sdk/openhands/sdk/llm/utils/verified_models.py
Line: 23
Issue: claude-haiku-4-5-20251001 still listed in VERIFIED_ANTHROPIC_MODELS.
Fix: Remove for consistency, or clarify in PR description why it remains verified for general use but removed from eval configs.

5. Verified Models List - deepseek-chat

File: openhands-sdk/openhands/sdk/llm/utils/verified_models.py
Line: 53
Issue: deepseek-chat still listed in VERIFIED_OPENHANDS_MODELS.
Fix: Remove for consistency, or clarify in PR description why it remains verified for general use but removed from eval configs.

6. Test Default Model

File: tests/integration/utils/llm_judge.py
Line: 105
Issue: Default LLM_JUDGE_MODEL is litellm_proxy/claude-haiku-4-5-20251001.
Fix: Update to a currently supported model to avoid runtime failures.


Additional References Found

These may also need updates:

  • tests/sdk/llm/test_model_features.py - Contains test cases referencing both models
  • tests/fixtures/llm_data/ - Test data files and README referencing deepseek-chat
  • examples/03_github_workflows/*/README.md - Documentation mentioning claude-haiku-4-5-20251001

Recommendation

Before merging, please:

  1. Fix all Critical issues (items 1-3) to prevent CI/test failures
  2. Address Important issues (items 4-6) for consistency
  3. Search for and update any remaining references in tests and documentation
  4. Run the test suite to verify no failures: pytest tests/github_workflows/test_resolve_model_config.py
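
Step 3 can be sketched as a small script that walks the repository and reports every line still mentioning a removed model ID. This is an illustrative sketch, not part of the PR: the two IDs come from the review above, while the set of file suffixes and the function name `find_stale_references` are assumptions.

```python
from pathlib import Path

# Model IDs named in the review as removed from the eval config.
REMOVED_MODEL_IDS = ("claude-haiku-4-5-20251001", "deepseek-chat")

# File suffixes to scan; an assumption, adjust for the repository at hand.
TEXT_SUFFIXES = {".py", ".yml", ".yaml", ".md", ".json"}


def find_stale_references(root, needles=REMOVED_MODEL_IDS):
    """Return (path, line_number, needle) for each line mentioning a removed ID."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in TEXT_SUFFIXES:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip files that cannot be read as UTF-8 text
        for lineno, line in enumerate(text.splitlines(), start=1):
            for needle in needles:
                if needle in line:
                    hits.append((str(path), lineno, needle))
    return hits


if __name__ == "__main__":
    for path, lineno, needle in find_stale_references("."):
        print(f"{path}:{lineno}: {needle}")
```

Run from the repository root, it prints one `path:line: id` entry per match. Whether a given match needs changing still depends on whether the file concerns the eval configuration or general model support.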

@openhands-ai

openhands-ai bot commented Jan 15, 2026

Looks like there are a few issues preventing this PR from being merged!

  • GitHub Actions are failing:
    • Run Eval (swebench) test cross branch launching
  • There are merge conflicts

If you'd like me to help, just leave a comment, like

@OpenHands please fix the merge conflicts on PR #1734 at branch `sr/remove-nonsupported-models`

or

@OpenHands please fix the failing actions on PR #1734 at branch `sr/remove-nonsupported-models`

Feel free to include any additional details that might help me get this PR into a better state.

enyst
enyst previously requested changes Jan 15, 2026
Collaborator

@enyst enyst left a comment

I understand. Please give me a moment though - Deepseek chat has been discussed before

@enyst
Collaborator

enyst commented Jan 16, 2026

My bad, spoke too fast. I see these are only the models for evals! I thought they were available to people e.g. in proxies or such.

Edited to add: as long as it has conflicts, I don't see the dismiss review option. Maybe PEBKAC! 😅

@enyst
Collaborator

enyst commented Jan 16, 2026

@OpenHands Merge main into this PR and fix the conflicts. Commit and push here.

@openhands-ai

openhands-ai bot commented Jan 16, 2026

I'm on it! enyst can track my progress at all-hands.dev

@openhands-ai

openhands-ai bot commented Jan 16, 2026

Final summary (new since my last update)

What I verified

  • Request addressed: main is merged into sr/remove-nonsupported-models, conflicts are resolved, changes are committed, and the branch is pushed.
  • Working tree clean: git status shows no uncommitted changes.
  • Branch is pushed: git push reports everything is up-to-date.

Conciseness / extra changes check

  • Compared to origin/main, the PR now contains only the intended model removals in:
    • .github/run-eval/resolve_model_config.py (13 deletions)
  • No other diffs vs origin/main remain.

Conflict resolution performed

  • Resolved the single merge conflict in tests/github_workflows/test_resolve_model_config.py by keeping the main version during the merge (the PR had deleted it, but main modified it).

Commit(s)

  • a252fd37 Merge main into sr/remove-nonsupported-models (already pushed to origin/sr/remove-nonsupported-models).

@enyst enyst dismissed their stale review January 16, 2026 00:35

Misunderstood the change.

@simonrosenberg simonrosenberg merged commit 426cc26 into main Jan 16, 2026
21 checks passed
@simonrosenberg simonrosenberg deleted the sr/remove-nonsupported-models branch January 16, 2026 13:54

4 participants