@leonkaushikdeka

## Summary
Adds comprehensive browser automation using Playwright, transforming OpenCode from a local-only coding agent into one that can interact with web applications.

## Browser Control Capabilities
- **Navigation**: navigate, back, forward, refresh, open, close tabs
- **Element Interaction**: click, fill, hover, right-click, double-click, drag-drop
- **Scrolling**: scroll, scrollTo, scrollTop, scrollBottom
- **Form Handling**: check/uncheck, select dropdown, clear, get value
- **Content Extraction**: get text, attributes, CSS, page source, screenshots
- **JavaScript**: execute arbitrary JS in page context
- **Waiting**: wait for elements, wait for URL patterns
- **Storage**: cookies, localStorage, sessionStorage (get/set/delete)
- **Configuration**: viewport, userAgent, geolocation, timezone
- **Assertions**: assert text, visibility, URL patterns
- **Multi-Browser**: Chromium (default), Firefox, WebKit
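
The URL-pattern waits and assertions listed above do not say how a pattern is matched. A minimal sketch, assuming a pattern without `*` is a substring test and a pattern with `*` is a whole-URL glob; `urlMatchesPattern` is a hypothetical helper, not a function confirmed to exist in this PR:

```typescript
// Hypothetical helper: the PR does not document its matching rules.
// Assumption: no "*" means substring match, "*" means whole-URL glob.

/** Escapes regex metacharacters so literal pattern text matches itself. */
function escapeRegExp(literal: string): string {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

/** Returns true when `url` satisfies `pattern` under the assumed rules. */
function urlMatchesPattern(url: string, pattern: string): boolean {
  if (!pattern.includes("*")) {
    return url.includes(pattern);
  }
  // Each "*" becomes ".*"; everything else is matched literally.
  const source = pattern.split("*").map(escapeRegExp).join(".*");
  return new RegExp(`^${source}$`).test(url);
}
```

Under these assumptions, `urlMatchesPattern("https://example.com/dashboard", "/dashboard")` is true because the pattern is treated as a substring.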

## Features
- Headed mode by default for visual feedback
- Resource limits (10 pages max, 512 MB RAM, 30 min idle timeout)
- Retry logic for element staleness
- Permission system requiring user approval for browser actions

## Example Usage
```
browser_navigate({"url": "https://example.com/login"})
browser_fill({"selector": "#email", "value": "user@example.com"})
browser_fill({"selector": "#password", "value": "secret123"})
browser_click({"selector": "button[type=submit]"})
browser_assertURL({"pattern": "/dashboard"})
```
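
As an illustration of how the permission system might gate the calls above, here is a sketch in which every tool call passes through an approval callback before it runs. `ToolCall`, `runWithApproval`, and the stop-at-first-denial policy are assumptions for illustration, not the PR's actual design.

```typescript
// Hypothetical shapes: the PR does not show its internal types.
type ToolCall = { tool: string; args: Record<string, string> };
type Approve = (call: ToolCall) => boolean;

/**
 * Runs calls in order, asking for approval before each one.
 * Assumption: the first denied call stops the rest of the sequence.
 * Returns the calls that actually executed.
 */
function runWithApproval(
  calls: ToolCall[],
  approve: Approve,
  execute: (call: ToolCall) => void,
): ToolCall[] {
  const executed: ToolCall[] = [];
  for (const call of calls) {
    if (!approve(call)) break;
    execute(call);
    executed.push(call);
  }
  return executed;
}

// The login flow from the example, expressed as data.
const loginFlow: ToolCall[] = [
  { tool: "browser_navigate", args: { url: "https://example.com/login" } },
  { tool: "browser_fill", args: { selector: "#email", value: "user@example.com" } },
  { tool: "browser_fill", args: { selector: "#password", value: "secret123" } },
  { tool: "browser_click", args: { selector: "button[type=submit]" } },
  { tool: "browser_assertURL", args: { pattern: "/dashboard" } },
];
```

With an `approve` callback that rejects `browser_click`, only the navigation and the two fills would run.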
@github-actions
Contributor

Thanks for your contribution!

This PR doesn't have a linked issue. All PRs must reference an existing issue.

Please:

  1. Open an issue describing the bug/feature (if one doesn't exist)
  2. Add Fixes #<number> or Closes #<number> to this PR description

See CONTRIBUTING.md for details.

@github-actions
Contributor

The following comment was made by an LLM and may be inaccurate:

Potential Duplicate Found

PR #7302: [FEATURE] Added in-built browser tools using playwright and a parallel playwright node process using spawn for bun-playwright issues

Why it's related: This PR also implements browser automation using Playwright with tools for browser control. It appears to address the same core feature of adding browser interaction capabilities to OpenCode, potentially handling similar functionality for controlling browsers with Playwright.

You may want to review if PR #7302 is still open/active, and whether PR #8489 is a newer implementation or addresses additional gaps.

@leonkaushikdeka leonkaushikdeka closed this by deleting the head repository Jan 14, 2026
@ForLoopCodes

im sorry bro 😭😭
