Add OpenHands Cloud API V1 skill and minimal Python client #48

Open
enyst wants to merge 36 commits into main from openhands/openhands-api-v1-skill

Conversation

@enyst
Collaborator

@enyst enyst commented Feb 14, 2026

Summary

Adds a new openhands-api-v1 skill that documents the OpenHands Cloud V1 API (app server + sandbox agent-server) and provides minimal, copyable clients for common operations.

Included:

  • skills/openhands_api_v1/SKILL.md
    • Auth model and key concepts (app server Bearer token vs agent-server X-Session-API-Key)
    • How to obtain agent_server_url + session_api_key
    • Common endpoints + usage notes
    • Clear section on counting events with 3 options (app-server count, agent-server count, trajectory zip fallback)
  • Clients:
    • Python (httpx): skills/openhands_api_v1/scripts/openhands_api_v1.py
    • TypeScript (fetch): skills/openhands_api_v1/scripts/openhands_api_v1.ts
  • References:
    • skills/openhands_api_v1/references/README.md (also points to route locations in OpenHands/OpenHands)
  • Skill README:
    • skills/openhands_api_v1/README.md
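The two auth schemes listed above can be sketched as two small header builders. This is a minimal illustration, not the code from the PR's client: the function names are hypothetical, and only the `Authorization: Bearer ...` and `X-Session-API-Key` header names come from the description.

```python
def app_server_headers(api_key: str) -> dict[str, str]:
    """Headers for the app server (/api/v1), which uses Bearer auth."""
    return {"Authorization": f"Bearer {api_key}"}


def agent_server_headers(session_api_key: str) -> dict[str, str]:
    """Headers for the sandbox agent server, which uses a per-session key."""
    return {"X-Session-API-Key": session_api_key}


if __name__ == "__main__":
    # Placeholder values; real keys would come from your environment.
    print(app_server_headers("APP_KEY"))
    print(agent_server_headers("SESSION_KEY"))
```

Keeping the two builders separate makes it harder to send the long-lived app-server key to a sandbox by accident.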

What’s covered by the minimal clients

App server (/api/v1, Bearer auth)

  • GET /users/me
  • GET /app-conversations/search
  • GET /app-conversations?ids=...
  • POST /app-conversations (start conversation; may create a sandbox)
  • GET /app-conversations/start-tasks?ids=...
  • GET /conversation/{id}/events/search
  • GET /conversation/{id}/events/count (may be flaky in some deployments)
  • GET /app-conversations/{id}/download (trajectory zip)

Agent server ({agent_server_url}/api, session auth)

  • GET /conversations/{id}/events/search (supports optional filters and sort_order=TIMESTAMP(_DESC))
  • GET /conversations/{id}/events/count (reliable count alternative)
  • POST /bash/execute_bash_command
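As a rough sketch of how the two endpoint families above map to URLs: the host names below are placeholders (assumptions, not real deployment URLs), `build_url` is a hypothetical helper, and only the paths and the `limit` / `sort_order` parameters are taken from this PR. Note that, as listed above, the app-server event routes use `/conversation/{id}` while the agent-server routes use `/conversations/{id}`.

```python
from typing import Optional
from urllib.parse import urlencode


def build_url(base: str, path: str, params: Optional[dict] = None) -> str:
    """Join a base URL and a path, appending query parameters if given."""
    url = base.rstrip("/") + "/" + path.lstrip("/")
    if params:
        url += "?" + urlencode(params)
    return url


# Hypothetical bases: the app server root plus /api/v1, and
# {agent_server_url} plus /api for the sandbox agent server.
APP_BASE = "https://app.example.invalid/api/v1"
AGENT_BASE = "https://sandbox.example.invalid/api"

conv_id = "CONVERSATION_ID"  # placeholder conversation id

app_search = build_url(APP_BASE, "/app-conversations/search", {"limit": 1})
app_count = build_url(APP_BASE, f"/conversation/{conv_id}/events/count")
agent_events = build_url(
    AGENT_BASE,
    f"/conversations/{conv_id}/events/search",
    {"sort_order": "TIMESTAMP_DESC"},
)
```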

Notes

  • Modeled after the reference clients in https://github.com/enyst/llm-playground (particularly openhands-api-client-v1).
  • Starting a conversation (POST /api/v1/app-conversations) can create a sandbox and may incur cost; the skill calls this out.
  • Agent-server event IDs are UUIDs; don’t infer counts from the “last event id”. Use count endpoints.
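One way to combine the three counting options is an ordered fallback: try each source once and return the first count that succeeds. The sketch below is an assumption about how a caller might wire this up, not the skill's implementation; `count_events` and the stand-in fetchers are hypothetical, and each strategy is called at most once (no polling).

```python
from typing import Callable


def count_events(strategies: list[tuple[str, Callable[[], int]]]) -> tuple[str, int]:
    """Try each (name, fetch) pair in order; return the first successful count."""
    errors = []
    for name, fetch in strategies:
        try:
            return name, int(fetch())
        except Exception as exc:  # a failing source just moves us to the next one
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all count strategies failed: " + "; ".join(errors))


def flaky_app_server_count() -> int:
    """Stand-in for the app-server count endpoint failing in some deployment."""
    raise RuntimeError("count endpoint unavailable")


if __name__ == "__main__":
    source, total = count_events(
        [
            ("app-server", flaky_app_server_count),
            ("agent-server", lambda: 42),  # stand-in for the agent-server count
            ("trajectory-zip", lambda: 42),  # stand-in for counting events in the zip
        ]
    )
    print(source, total)
```

In real use each callable would wrap a single HTTP request; the order shown is only an example.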

Co-authored-by: openhands <openhands@all-hands.dev>

@openhands-ai

openhands-ai bot commented Feb 14, 2026

Looks like there are a few issues preventing this PR from being merged!

  • GitHub Actions are failing:
    • Check README.md in Skills

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #48 at branch `openhands/openhands-api-v1-skill`

Feel free to include any additional details that might help me get this PR into a better state.

@enyst enyst marked this pull request as ready for review February 14, 2026 22:52
@juanmichelini
Collaborator

@OpenHands please compare this one to #12. Are they tackling the same or different problems? Are they different enough that they won't step on each other's toes when triggered?

@openhands-ai

openhands-ai bot commented Feb 18, 2026

I'm on it! juanmichelini can track my progress at all-hands.dev

@enyst
Collaborator Author

enyst commented Feb 18, 2026

Good point! I checked: their descriptions say V0 and V1 respectively, and they have different triggers.

Idk… Idk if it’s too late for V0 actually - I was thinking that for the moment, GitHub/Slack resolvers still use it.

Also, we have a V0->V1 migration issue somewhere. It asks for docs comparing the two and documenting how people should migrate, so idk, teaching the agent about both and then letting it test how you execute a thing with V0 and the same thing with V1 could be part of writing or checking the docs we will publish.

That said, I’m fine with removing V0… We really should use only V1 in normal use, and more importantly, I think maybe we should recommend to users only V1… 🤔

@enyst
Collaborator Author

enyst commented Feb 18, 2026

@OpenHands Read all comments in this PR and investigate what you need to answer juanmichelini's question. That includes looking at the linked PR diff and this one.

Answer it directly on github in a comment. Note that it will be rendered as markdown.

@openhands-ai

openhands-ai bot commented Feb 18, 2026

I'm on it! enyst can track my progress at all-hands.dev

Collaborator Author

enyst commented Feb 18, 2026

@juanmichelini

PR #12 (merged) adds the openhands-api skill for the legacy V0 OpenHands routes (/api/...), with Bearer auth and minimal clients. It’s primarily useful for older/legacy integrations (e.g. some current resolvers).

This PR (#48) adds openhands-api-v1 for the Cloud V1 routes (/api/v1/...) plus the sandbox agent-server API ({agent_server_url}/api/... with X-Session-API-Key). It focuses on the newer resources/flows (app-conversations, start-tasks, sandboxes, reliable event counting via agent-server, trajectory zip fallback, etc.).

Will they step on each other’s toes?

In practice, they shouldn’t:

  • Different skill ids/dirs: skills/openhands_api/ vs skills/openhands_api_v1/ (no code/path collision).
  • Different trigger keywords (versioned and intentionally disjoint):
    • V0 skill triggers: openhands-api-v0, openhands-cloud-v0, oh-cloud-v0
    • V1 skill triggers: openhands-api-v1 (plus oh-cloud-api, oh-api, agent-server, agent-server-api)
    • So V0 won’t auto-trigger on V1 requests (and vice versa) unless someone explicitly uses the other version’s keywords.

Net: same general domain (programmatic OpenHands API usage), but different API generations. They can coexist; V1 should be the default recommendation going forward, while V0 stays available for legacy deployments/integrations until migration is complete.
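The "intentionally disjoint" claim is easy to check mechanically. The snippet below copies the trigger keywords quoted in this comment into two sets and intersects them; it assumes triggers are compared as exact keywords, which is an assumption about the matching rule rather than something stated in this PR.

```python
# Trigger keywords as quoted in the comment above.
V0_TRIGGERS = {"openhands-api-v0", "openhands-cloud-v0", "oh-cloud-v0"}
V1_TRIGGERS = {
    "openhands-api-v1",
    "oh-cloud-api",
    "oh-api",
    "agent-server",
    "agent-server-api",
}

# An empty intersection means no keyword activates both skills.
overlap = V0_TRIGGERS & V1_TRIGGERS
print(sorted(overlap))
```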

@openhands-ai

This comment was marked as duplicate.

@all-hands-bot all-hands-bot left a comment

Taste Rating: 🟡 Acceptable

This is pragmatic code solving a real problem. The design is straightforward, but there are some issues that need attention—particularly a path typo and missing tests for reference code.

@enyst
Collaborator Author

enyst commented Feb 19, 2026

@OpenHands Use gh to read all comments in this PR, particularly inline review threads. They are AI reviews, so they could be hallucinated; so verify first if they are valid. Then fix them.

Note: we will not add tests with auth and stuff here, I don't think. But the rest of comments are worth checking!

You have the relevant key in your env, so you can test live what you modify. Do that and fix the client as appropriate.

IMPORTANT: we are nice. Think deeply before you make an API call, check that it's correct and then make it; do not create scripts that use loops and do not poll heavily, so as not to stress the API. Single requests, do the necessary, report at the end.

When you are done, push, and post a comment on top-level with all you did. Also, reply to the inline review threads with your resolution. Note that all will be rendered as markdown.

@openhands-ai

openhands-ai bot commented Feb 19, 2026

I'm on it! enyst can track my progress at all-hands.dev

Collaborator Author

enyst commented Feb 19, 2026

Addressed the inline review feedback + validated the reference client with minimal live calls.

Changes

  • Fixed broken CLI example path in skills/openhands_api_v1/SKILL.md (it pointed at skills/openhands_api/..., which does not exist in this repo).
  • Added skills/openhands_api_v1/references/example_prompt.md so the CLI example is copy/paste runnable.
  • TypeScript client refactor: introduced a shared baseRequest() helper and added parseAs: "json" | "blob" so appConversationDownloadZip() reuses the same error handling + auth logic.
  • Python client: removed the poll_start_task_until_ready() helper (avoids encouraging API polling/loops in the minimal reference code) and cleaned up the now-unused import.

Validation

  • Ran GET /api/v1/users/me and GET /api/v1/app-conversations/search?limit=1 using the Python client (single requests; no loops / polling).
  • python -m compileall on the Python script passes locally.

Replies are posted in each inline thread with the specific resolution.

@openhands-ai

openhands-ai bot commented Feb 19, 2026

Since my last summary, there have been no additional code changes or commits (the PR branch remains at commit 28a6d5a), and no further modifications were needed.

Final checklist against your instructions

  • Read all PR comments, including inline review threads, using gh
  • Verified the inline feedback was real/valid (not hallucinated), then fixed the valid items
  • Made only minimal, relevant changes (docs/example path + small client refactor/cleanup)
  • Performed live validation with the available key using single requests only (no loops/polling)
  • Pushed the changes to openhands/openhands-api-v1-skill
  • Posted a top-level PR comment summarizing work
  • Replied in each inline review thread with the resolution
  • Confirmed CI is green

No extraneous changes were introduced after the last update, and the request has been fully addressed.

Collaborator Author

@enyst enyst left a comment

Done.

enyst and others added 2 commits February 20, 2026 00:06
- Fix broken prompt-file path in SKILL.md and add example prompt file
- Refactor TS client to reuse request logic and support blob downloads
- Remove polling helper from Python reference client to avoid API loops

Co-authored-by: openhands <openhands@all-hands.dev>
Includes exponential backoff, sleeps between requests, and optional max_polls cap.

Co-authored-by: openhands <openhands@all-hands.dev>
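A polling helper with the properties named in this commit message (exponential backoff, a sleep between requests, an optional `max_polls` cap) could look roughly like the sketch below. It is an illustration only: the function name, the default delays, and the status strings are assumptions, not the code in the PR.

```python
import time
from typing import Callable, Optional

# Hypothetical terminal statuses; the real start-task values may differ.
TERMINAL_STATUSES = frozenset({"READY", "ERROR"})


def poll_until_terminal(
    fetch_status: Callable[[], str],
    *,
    initial_delay: float = 1.0,
    factor: float = 2.0,
    max_delay: float = 30.0,
    max_polls: Optional[int] = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status until it returns a terminal status, backing off between calls."""
    delay = initial_delay
    polls = 0
    while True:
        status = fetch_status()
        polls += 1
        if status in TERMINAL_STATUSES:
            return status
        if max_polls is not None and polls >= max_polls:
            raise TimeoutError(f"no terminal status after {polls} polls")
        sleep(delay)  # always wait before the next request
        delay = min(delay * factor, max_delay)  # exponential backoff, capped
```

Passing `sleep` as a parameter keeps the helper testable without real waiting; `max_polls=None` removes the cap, which is best avoided against a shared API.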
@all-hands-bot all-hands-bot left a comment

Taste Rating: 🟡 Acceptable

This is pragmatic reference code solving a real problem. The Python client is solid after addressing previous feedback. TypeScript needs the same DRY fix that was applied to Python.

What works well:

  • Smart data structure choice: raw dicts/lists let users add their own typing
  • Clean separation of concerns (app server vs agent server)
  • Previous refactoring cleaned up Python DRY violations and complexity
  • Polling logic is appropriately simple (2 levels max) with exponential backoff
  • No over-engineering; stays focused on being copyable reference code

Issues: See inline comments.

Collaborator Author

@enyst enyst left a comment

Addressed new inline review comments (commit 96b0a8b).

openhands-agent and others added 12 commits February 25, 2026 21:29
Co-authored-by: openhands <openhands@all-hands.dev>
- Add plugins/ directory for extensions with executable code
- Update all references from OpenHands/skills to OpenHands/extensions
- Update README.md and AGENTS.md to reflect new structure
- Update add-skill documentation and scripts

This PR prepares the repository for renaming from OpenHands/skills to
OpenHands/extensions. After this PR is merged, the repo should be renamed
via GitHub settings.

Related PRs:
- OpenHands/software-agent-sdk: Update PUBLIC_SKILLS_REPO URL (pending)
- OpenHands/docs: Update documentation URLs (pending)

Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
* Add openhands-api skill with minimal Python/TS clients

* Ignore python caches

* Add a few more OpenHands API endpoints

* Add OpenHands Cloud API V1 skill and minimal Python client

* Add README.md for openhands API skills

* Improve README for OpenHands API skills

* Remove V1 skill from V0-only PR

* Revise triggers in SKILL.md for OpenHands API

Updated triggers for the OpenHands API skill to reflect new naming conventions.

---------

Co-authored-by: openhands <openhands@all-hands.dev>
* feat: Package PR review as a plugin

This commit creates the plugins/pr-review plugin with:

1. Symbolic links to the related skills:
   - skills/codereview-roasted (Linus Torvalds style review)
   - skills/github-pr-review (GitHub API comment posting)

2. GitHub workflows:
   - pr-review-by-openhands.yml: Main review workflow
   - pr-review-evaluation.yml: Evaluation workflow for merged/closed PRs

3. GitHub composite action:
   - .github/actions/pr-review/action.yml: Reusable action for PR reviews

4. Python scripts:
   - agent_script.py: Main agent script for running reviews
   - prompt.py: Prompt template for reviews
   - evaluate_review.py: Script for evaluating review effectiveness

5. Comprehensive README explaining:
   - Installation and setup instructions
   - How to configure secrets and customize workflows
   - Laminar observability setup for tracing and A/B testing
   - Migration guide from software-agent-sdk

Fixes #53

Co-authored-by: openhands <openhands@all-hands.dev>

* refactor: Move action.yml into plugin directory

Moved the action.yml from .github/actions/pr-review/ to plugins/pr-review/
for better organization. GitHub Actions can reference actions from any path.

Updated references:
- plugins/pr-review/workflows/pr-review-by-openhands.yml
- plugins/pr-review/README.md

Action is now referenced as:
  uses: OpenHands/extensions/plugins/pr-review@main

Co-authored-by: openhands <openhands@all-hands.dev>

* refactor: Address PR review comments

Code quality improvements based on review feedback:

agent_script.py:
- Extract GraphQL queries (REVIEWS_QUERY, THREADS_QUERY) as module-level constants
- Add generic _paginate_graphql() helper to deduplicate pagination logic
- Refactor main() into focused functions:
  - validate_environment(): Validate env vars and return config
  - fetch_pr_context(): Fetch diff, commit SHA, and review context
  - create_agent(): Create and configure the review agent
  - run_review(): Execute the PR review
  - log_cost_summary(): Print cost info for CI
  - save_trace_context(): Capture Laminar trace for evaluation

evaluate_review.py:
- Rename placeholder score to engagement_score with clear semantics
- Remove apologetic comments about placeholder functionality
- Refactor main() into focused functions:
  - load_trace_info(): Load trace info from artifact
  - fetch_pr_data(): Fetch all PR data from GitHub
  - calculate_engagement_score(): Calculate score from metrics
  - create_evaluation_span(): Create Laminar evaluation span

action.yml:
- Use bash array for model selection instead of tr/shuf pipeline

Co-authored-by: openhands <openhands@all-hands.dev>
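For readers unfamiliar with the pagination pattern this commit deduplicates, here is a generic cursor-pagination sketch. It is not the `_paginate_graphql()` helper from agent_script.py (whose signature is not shown here); `paginate` and `fetch_page` are hypothetical names, and the page shape assumed is the usual GraphQL connection form with `nodes` and `pageInfo { hasNextPage endCursor }`.

```python
from typing import Any, Callable, Iterator, Optional


def paginate(fetch_page: Callable[[Optional[str]], dict[str, Any]]) -> Iterator[Any]:
    """Yield nodes from every page, following endCursor until hasNextPage is false."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)  # one request per page
        yield from page["nodes"]
        info = page["pageInfo"]
        if not info["hasNextPage"]:
            return
        cursor = info["endCursor"]


if __name__ == "__main__":
    # Fake two-page result standing in for real GraphQL responses.
    fake_pages = {
        None: {"nodes": ["r1", "r2"], "pageInfo": {"hasNextPage": True, "endCursor": "c1"}},
        "c1": {"nodes": ["r3"], "pageInfo": {"hasNextPage": False, "endCursor": None}},
    }
    print(list(paginate(lambda cursor: fake_pages[cursor])))
```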

---------

Co-authored-by: openhands <openhands@all-hands.dev>
* Update PR review workflows to use extensions plugin

Changes:
1. pr-review-by-openhands.yml: Use plugins/pr-review@main instead of
   software-agent-sdk/.github/actions/pr-review@main
2. pr-review-evaluation.yml: Add stub workflow that calls the reusable
   workflow for evaluating PR review effectiveness
3. pr-review-evaluation-reusable.yml: Add reusable workflow containing
   all evaluation logic, which other repos can call with a minimal stub

This centralizes all PR review logic in the extensions repo for easier
maintenance across the organization.

Co-authored-by: openhands <openhands@all-hands.dev>

* Address review feedback

- Add security documentation explaining why pull_request_target is safe here
- Replace secrets: inherit with explicit secret passing
- Use Python 3.12 instead of 3.13 for better compatibility
- Use pip directly instead of uv for simplicity

Co-authored-by: openhands <openhands@all-hands.dev>

* Remove reusable workflow, use full workflow directly

Simplify by having the complete evaluation workflow in each repo instead
of using a reusable workflow pattern. The workflow still references
extensions/plugins/pr-review/scripts/evaluate_review.py for the actual
evaluation logic.

Co-authored-by: openhands <openhands@all-hands.dev>

* Address review feedback: accept trace file path as argument

- Add --trace-file argument to evaluate_review.py to accept path directly
- Remove unnecessary file copy in workflow
- Only upload *.log files (not *.json) to avoid redundant artifact uploads

Co-authored-by: openhands <openhands@all-hands.dev>

* Address review feedback: add comments and require logs

- Document why checkout always uses main (security)
- Document LMNR_PROJECT_API_KEY vs LMNR_SKILLS_API_KEY naming
- Remove if-no-files-found: ignore (missing logs should fail)

Co-authored-by: openhands <openhands@all-hands.dev>

* Use symlinks for workflow templates in plugins directory

The actual workflows live in .github/workflows/ (required by GitHub Actions).
The plugins/pr-review/workflows/ directory now contains symlinks for easy
reference and documentation purposes.

Also updated README with Quick Start section.

Co-authored-by: openhands <openhands@all-hands.dev>

* Fix README: use correct URLs for workflow files

- Point curl commands to .github/workflows/ (symlinks don't work with raw URLs)
- Clarify LMNR secret naming in secrets table
- Simplify installation instructions

Co-authored-by: openhands <openhands@all-hands.dev>

* Replace symlinks with copies + add sync test

- Replace symlinks in plugins/pr-review/workflows/ with actual file copies
  (symlinks don't work with GitHub raw URLs)
- Add test to ensure workflow copies stay in sync with .github/workflows/
- Add CI workflow to run the sync check

Co-authored-by: openhands <openhands@all-hands.dev>
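A sync check of the kind this commit describes can be sketched as a byte-for-byte comparison between each copy under plugins/*/workflows/ and the file of the same name in .github/workflows/. The function name and the `*.yml` glob are assumptions; the actual test in the repository may be structured differently.

```python
from pathlib import Path


def find_out_of_sync_workflows(repo_root: Path) -> list[str]:
    """Return plugin workflow copies that differ from (or lack) their source file."""
    mismatches = []
    for copy in sorted(repo_root.glob("plugins/*/workflows/*.yml")):
        source = repo_root / ".github" / "workflows" / copy.name
        if not source.is_file() or source.read_bytes() != copy.read_bytes():
            mismatches.append(copy.relative_to(repo_root).as_posix())
    return mismatches


if __name__ == "__main__":
    # Run from the repository root; an empty list means everything is in sync.
    print(find_out_of_sync_workflows(Path(".")))
```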

* Merge workflow sync check into validation workflow

No need for a separate workflow - just add step to existing checks.

Co-authored-by: openhands <openhands@all-hands.dev>

* Replace validation checks with proper tests workflow

- Remove check-readme.yml
- Add tests.yml that runs pytest on tests/

Co-authored-by: openhands <openhands@all-hands.dev>

* Add PYTHONPATH for tests to find skills module

Co-authored-by: openhands <openhands@all-hands.dev>

* Add requests to test dependencies

Co-authored-by: openhands <openhands@all-hands.dev>

* Use uv for tests workflow

Co-authored-by: openhands <openhands@all-hands.dev>

* Fix workflow sync test to only check known copy locations

Only check plugins/*/workflows/ directories, not arbitrary yml files.

Co-authored-by: openhands <openhands@all-hands.dev>

* Auto-discover all plugins/*/workflows directories

Co-authored-by: openhands <openhands@all-hands.dev>

* Add pyproject.toml with test dependencies

Pin pytest and requests versions for reproducible test runs.

Co-authored-by: openhands <openhands@all-hands.dev>

* Fix PR review action: use uv run --with for dependencies

The previous approach used 'uv pip install --system' then 'uv run python'
which created a new venv without the installed dependencies.

Co-authored-by: openhands <openhands@all-hands.dev>

---------

Co-authored-by: openhands <openhands@all-hands.dev>

The PR review action was failing for repositories that use git submodules
because the checkout step didn't include 'submodules: recursive'. This
caused build failures when the project depends on workspace members from
submodules (e.g., vendor/software-agent-sdk).

This fix adds 'submodules: recursive' to the PR repository checkout step,
ensuring all submodules are properly initialized before building.

Fixes OpenHands/benchmarks#444

Co-authored-by: openhands <openhands@all-hands.dev>
* Create 'default' marketplace with all skills

- Create marketplaces/default.json as the official default marketplace
- Remove .plugin/marketplace.json in favor of the new location
- Add missing skills: discord, openhands-api
- Add openhands-sdk to test dependencies
- Add tests/test_sdk_loading.py using SDK pydantic datamodels to verify:
  - Marketplace can be loaded with SDK's Marketplace model
  - All plugin entries can be validated as MarketplacePluginEntry
  - All skill directories are included in the marketplace
  - All SKILL.md files can be loaded using SDK's Skill.load()
  - All skills have valid metadata (name, description)

This ensures the marketplace includes all 33 skills currently in the repo
and provides SDK-based validation tests.

Closes #72

Co-authored-by: openhands <openhands@all-hands.dev>

* Fix skill directory naming to match declared names

- Rename openhands_api/ -> openhands-api/ (matches name: openhands-api)
- Rename codereview/ -> code-review/ (matches name: code-review)
- Update marketplaces/default.json to reference new paths
- Remove strict=False from tests (SDK now validates properly)

Fixes naming inconsistency raised in PR review.

Co-authored-by: openhands <openhands@all-hands.dev>

* Fix test imports for renamed openhands-api directory

Update test_openhands_api_python.py to use importlib for loading
the module from the hyphenated directory name, since Python doesn't
allow hyphens in regular import statements.

Co-authored-by: openhands <openhands@all-hands.dev>
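Loading a module from a hyphenated directory with importlib typically looks like the sketch below. The helper name and the example path are hypothetical; `importlib.util.spec_from_file_location` and `module_from_spec` are the standard-library calls for importing a file by path.

```python
import importlib.util
from pathlib import Path
from types import ModuleType


def load_module_from_path(name: str, path: Path) -> ModuleType:
    """Import a Python file by path, regardless of its directory name."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load module from {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file's top-level code
    return module


# Example (hypothetical path inside a hyphenated skill directory):
# client = load_module_from_path(
#     "openhands_api_client", Path("skills/openhands-api/scripts/client.py")
# )
```

The `name` argument only sets the module's `__name__`, so it can use underscores even when the directory on disk uses hyphens.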

---------

Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
@openhands-ai

openhands-ai bot commented Feb 25, 2026

Final status (since my last summary)

No further code changes were made after the last pushed update; the PR branch is now stabilized and green.

What I did (end-to-end, as requested)

  • Used gh (GraphQL) to enumerate all inline review threads on PR #48 and extracted each comment body to validate it against the actual code.
  • For each valid inline suggestion, I implemented the fix:
    • TypeScript: deduplicated agent-server event filter query params into a shared helper (buildAgentEventFilterParams()), and kept a clear client-side limit cap.
    • Python: centralized agent-server event filter param building (_agent_event_filter_params()), added/used shared constants for terminal start-task statuses, and kept the client-side limit cap consistent/documented.
    • Added the requested short clarifying comments about the V1 content: [{type,text}] message-part structure.
  • Then I handled new CI failures introduced by upstream changes on main:
    • Renamed the skill directory to the required hyphenated form: skills/openhands-api-v1/
    • Ensured SKILL.md name: matches directory and naming rules: openhands-api-v1
    • Registered the skill in marketplaces/default.json
    • Updated doc/CLI example paths to reflect the directory rename
    • Merged latest main into the branch (resolving a marketplace JSON conflict cleanly)
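The centralized filter-param building with a client-side limit cap, mentioned in the Python bullet above, might look something like the following. This is a sketch under assumptions: the cap value, the function name, and the pass-through `**filters` are hypothetical; only `sort_order` with `TIMESTAMP` / `TIMESTAMP_DESC` and the idea of converting and capping `limit` come from this PR.

```python
from typing import Any, Optional

MAX_LIMIT = 100  # hypothetical client-side cap
ALLOWED_SORT_ORDERS = {"TIMESTAMP", "TIMESTAMP_DESC"}


def agent_event_filter_params(
    sort_order: Optional[str] = None,
    limit: Optional[Any] = None,
    **filters: Any,
) -> dict[str, Any]:
    """Build query params for agent-server event search/count, dropping unset values."""
    params = {key: value for key, value in filters.items() if value is not None}
    if sort_order is not None:
        if sort_order not in ALLOWED_SORT_ORDERS:
            raise ValueError(f"unsupported sort_order: {sort_order!r}")
        params["sort_order"] = sort_order
    if limit is not None:
        limit = int(limit)  # accept "50" as well as 50
        if limit < 1:
            raise ValueError("limit must be >= 1")
        params["limit"] = min(limit, MAX_LIMIT)  # cap on the client side
    return params
```

Building the params in one place means the search and count calls cannot drift apart in how they treat filters.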

Checklist against your instructions

  • Read all inline review threads with gh
  • Validated each comment (treated AI feedback as potentially incorrect)
  • Implemented fixes where valid; avoided unnecessary changes elsewhere
  • Updated the PR branch with pushed commits
  • CI now passes (Tests/test is green)

Important limitation re: “reply under the same inline thread”

I cannot post GitHub review-thread replies directly from this environment, so I couldn’t literally reply under each inline thread. However, all threads are now resolved/obsolete because the requested changes have been applied and CI is green. If you still want per-thread reply text, tell me and I’ll format “reply-ready” messages keyed by each discussion URL for you (or a maintainer) to paste in.

@enyst enyst requested a review from all-hands-bot February 25, 2026 22:14
@all-hands-bot all-hands-bot left a comment

Taste Rating: 🟡 Acceptable

This is pragmatic reference code solving a real problem. The Python client is solid after addressing previous feedback. TypeScript needs the same DRY fix that was applied to Python.

What works well:

  • Smart data structure choice: raw dicts/lists let users add their own typing
  • Clean separation of concerns (app server vs agent server)
  • Previous refactoring cleaned up Python DRY violations and complexity
  • Polling logic is appropriately simple (2 levels max) with good backoff behavior
  • Auth is properly externalized via env vars

Minor inconsistencies to consider:

  • TypeScript limit params don't convert to integers like Python does
  • Feature asymmetry: Python has polling + file operations, TypeScript doesn't (intentional?)

Co-authored-by: openhands <openhands@all-hands.dev>
@enyst enyst requested a review from all-hands-bot February 26, 2026 00:19
@all-hands-bot all-hands-bot left a comment

Taste Rating: 🟡 Acceptable

This is pragmatic reference code solving a real problem. The Python client is solid after addressing previous feedback. TypeScript is consistent with Python.

What works well:

  • Smart data structure choice: raw dicts/lists let users add their own typing
  • Clean separation of concerns (app server vs agent server)
  • Previous refactoring cleaned up DRY violations and complexity
  • Polling logic is appropriately simple (2 levels max) given async reality
  • Defensive programming: limit validation, backoff, timeouts

Verdict: ✅ Worth merging

The code has been through multiple rounds of improvements and the fundamentals are sound. No critical issues.

Removed deprecated openhands-api entry from default.json.