Browser Use 2.0 #8941

hannesrudolph · 2025-10-31T01:29:47Z

Summary
This PR upgrades the in‑chat browsing experience with persistent sessions, clearer feedback, a dedicated browser panel, and more natural action descriptions.

Extension.Development.Host.rc10.2025-11-05.13-47-15.mp4

What's new

Persistent Browser Sessions
- The browser stays open across steps so you can send follow‑ups without relaunching.
- You’ll see a "Browser Session" header and a "Session started" note when active.
Dedicated Browser Session panel
- Open a full‑size view when you need more space, while keeping the chat context in view.
Live, readable action feed
- Actions are presented in plain language: Launch, Click, Type, Press, Hover, Scroll.
- Keyboard events now appear as "Press Enter" or "Press Esc" for easier scanning.
- Broader keyboard coverage: navigation keys and common shortcuts are supported for more natural control.
Inline console logs
- Console output is surfaced inline with a clear "No new logs" state.
- Noise-reduced by default: only entries added since the previous step are shown.
- Filter by type (Errors, Warnings, Logs) so you can focus on what matters.
Clear session controls
- A prominent Disconnect/Close control makes it easy to end a session when you’re done.
Interactive in-session controls
- Follow-ups attach to the active session so you can guide the assistant mid-flow without restarting.
- Suggested follow-ups appear inline to keep momentum.
More accurate interactions
- Improved click, scroll, and hover reliability across screen sizes with a consistent preview aspect ratio.
Seamless follow‑ups
- Keep chatting while the session is open; the assistant continues from the same context.
Fully localized
- New labels and action text are translated across all supported languages.
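
The inline console log behavior described above (only new entries since the previous step, type filters, and a "No new logs" state) could be sketched roughly as follows. This is an illustration, not the PR's implementation: `ConsoleEntry`, `ConsoleLogFeed`, `filterByType`, and `renderLogs` are hypothetical names, and the cursor-based approach is an assumption about how "new since the last step" might be tracked.

```typescript
// Hypothetical data model; the PR description does not show the real one.
type LogType = "error" | "warning" | "log";

interface ConsoleEntry {
  type: LogType;
  text: string;
}

// Remembers how many entries were already shown so each step reports only new ones.
class ConsoleLogFeed {
  private shownCount = 0;

  // Returns the entries appended since the previous call, then advances the cursor.
  takeNew(allEntries: readonly ConsoleEntry[]): ConsoleEntry[] {
    const fresh = allEntries.slice(this.shownCount);
    this.shownCount = allEntries.length;
    return fresh;
  }
}

// "all" keeps every entry; any other value narrows the list to one entry type.
function filterByType(entries: readonly ConsoleEntry[], filter: LogType | "all"): ConsoleEntry[] {
  return filter === "all" ? [...entries] : entries.filter((entry) => entry.type === filter);
}

// Text for the inline view, including the empty state named in the PR.
function renderLogs(entries: readonly ConsoleEntry[]): string {
  if (entries.length === 0) return "No new logs";
  return entries.map((entry) => `[${entry.type}] ${entry.text}`).join("\n");
}
```

In this sketch, calling `takeNew` twice without any new console output yields an empty list on the second call, which `renderLogs` turns into the "No new logs" state.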

What you'll notice in the UI

- "Browser Session" appears in chat when a session is active.
- A "Session started" status line confirms the start.
- Follow-up suggestions appear inside the Browser Session row when active.
- Keyboard actions are summarized clearly (e.g., "Press Tab", "Shift+Tab", "Arrow keys").
- New action wording like "Press Enter" or "Hover (x, y)".
- Console Logs are visible inline, with a "No new logs" indicator and a noise‑reduced view that shows only new entries since the last step.
- Type filters (All, Errors, Warnings, Logs) above the log list to quickly narrow the feed.
- A quick Disconnect button to end the session.
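
To make the action wording concrete, here is a minimal sketch of how actions might be turned into plain-language labels. Only "Press Enter" and "Hover (x, y)" come from this description; the `BrowserAction` shape, the `describeAction` helper, and the wording of the other labels are assumptions for illustration and may not match the PR's `getBrowserActionText`.

```typescript
// Hypothetical action shape; the real tool's message format is not shown here.
type BrowserAction =
  | { kind: "launch"; url: string }
  | { kind: "click"; x: number; y: number }
  | { kind: "hover"; x: number; y: number }
  | { kind: "type"; text: string }
  | { kind: "press"; key: string }
  | { kind: "scroll"; direction: "up" | "down" };

// Maps one action to a short label for the action feed.
function describeAction(action: BrowserAction): string {
  switch (action.kind) {
    case "launch":
      return `Launch ${action.url}`;
    case "click":
      return `Click (${action.x}, ${action.y})`;
    case "hover":
      return `Hover (${action.x}, ${action.y})`;
    case "type":
      return `Type "${action.text}"`;
    case "press":
      // Key combinations such as "Shift+Tab" are passed through unchanged in this sketch.
      return `Press ${action.key}`;
    case "scroll":
      return `Scroll ${action.direction}`;
  }
}
```

A discriminated union keeps the mapping exhaustive: adding a new action kind without a matching label becomes a compile-time error rather than a blank row in the feed.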

Important

This PR enhances the in-chat browsing experience with persistent sessions, a dedicated browser panel, improved action descriptions, and various UI and backend updates to support these features.

Behavior:
- Introduces persistent browser sessions, allowing the browser to stay open across steps.
- Adds a dedicated browser session panel for a full-size view while keeping chat context visible.
- Implements a live, readable action feed with plain language descriptions for actions like Launch, Click, Type, etc.
- Adds inline console logs with filtering options by type (Errors, Warnings, Logs).
- Provides clear session controls with a prominent Disconnect/Close button.
- Supports interactive in-session controls for seamless follow-ups.
- Improves interaction accuracy with consistent preview aspect ratio.
- Fully localizes new labels and action text.
UI Changes:
- Displays "Browser Session" in chat when active, with a "Session started" status line.
- Shows follow-up suggestions inside the Browser Session row.
- Summarizes keyboard actions clearly (e.g., "Press Tab", "Shift+Tab").
- Displays console logs inline with a "No new logs" indicator.
- Adds type filters above the log list for quick narrowing.
- Includes a quick Disconnect button to end the session.
Code Changes:
- Adds browser-panel.tsx to vite.config.ts input.
- Updates presentAssistantMessage() in presentAssistantMessage.ts to handle browser session logic.
- Modifies getEnvironmentDetails.ts to include browser session status.
- Adds BrowserSessionPanelManager.ts for managing the browser session panel.
- Updates BrowserSession.ts to handle new session behaviors and interactions.
- Adds tests for new browser session functionalities in BrowserSession.spec.ts and BrowserActionTool.coordinateScaling.spec.ts.
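
As a rough illustration of the persistent-session behavior listed above, the sketch below keeps one browser handle open across steps, shares a single in-flight launch between concurrent callers, and reports state changes through a callback. Everything here is hypothetical: `BrowserHandle`, `PersistentSession`, `ensureOpen`, and `disconnect` are illustrative names, not the API of `BrowserSession.ts` or `BrowserSessionPanelManager.ts`.

```typescript
// Hypothetical stand-in for a real browser handle; not the PR's BrowserSession API.
interface BrowserHandle {
  close(): Promise<void>;
}

class PersistentSession {
  private browser: BrowserHandle | undefined;
  private pending: Promise<BrowserHandle> | undefined;
  private readonly launch: () => Promise<BrowserHandle>;
  private readonly onStateChange: (active: boolean) => void;

  constructor(launch: () => Promise<BrowserHandle>, onStateChange: (active: boolean) => void) {
    this.launch = launch;
    this.onStateChange = onStateChange;
  }

  get isActive(): boolean {
    return this.browser !== undefined;
  }

  // Reuses the open browser; concurrent callers share one in-flight launch.
  ensureOpen(): Promise<BrowserHandle> {
    if (this.browser) return Promise.resolve(this.browser);
    if (!this.pending) {
      this.pending = this.launch().then(
        (browser) => {
          this.pending = undefined;
          this.browser = browser;
          this.onStateChange(true);
          return browser;
        },
        (err) => {
          // Clear the failed attempt so a later step can retry the launch.
          this.pending = undefined;
          throw err;
        },
      );
    }
    return this.pending;
  }

  // Ends the session; calling it with no open browser is a no-op.
  async disconnect(): Promise<void> {
    const browser = this.browser;
    if (!browser) return;
    this.browser = undefined;
    await browser.close();
    this.onStateChange(false);
  }
}
```

The sketch does not handle a `disconnect` call that arrives while a launch is still pending; a real implementation would need to decide how to treat that case.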

roomote · 2025-10-31T01:30:08Z

Review status: Reviewed latest changes at commit 20bdf53. Found one new issue around coordinate fallback in getViewportCoordinate; existing browser session TODOs remain open.

- Race condition in browser session state callback
- Missing size parameter in launch action
- Coordinate fallback in getViewportCoordinate when viewport dimensions are unavailable
- Coordinate format mismatch
- ResizeObserver cleanup missing mounted check
- Wrong parameter passed to getBrowserActionText

Previous reviews

20bdf53: Review #1

3ed1783: Review #2

311e194: Review #3

c1f8044: Review #4

8a289d9: Review #5

8a289d9: Review #6

ff44f0b: Review #7

791544f: Review #8

b8e5d6b: Review #9

b810920: Review #10

c72b812: Review #11

9cb7afa: Review #12

4e547af: Review #13

8f229ae: Review #14

eee8d51: Review #15

5a581b9: Review #16

d8a4bbd: Review #17

d9df0df: Review #18

ded2cfa: Review #19

50f5abb: Review #20

74e1251: Review #21

be85b08: Review #22

984b53f: Review #23

f430445: Review #24

99069c0: Review #25

46578e0: Review #26

d9df0df: Review #27

9311f10: Review #28

622f43a: Review #29

src/core/task/Task.ts

src/core/tools/BrowserActionTool.ts

webview-ui/src/components/chat/BrowserSessionRow.tsx

Copilot

Pull Request Overview

This PR adds a new Browser Session panel feature that provides a dedicated UI for viewing and controlling browser automation sessions. Key improvements include:

- New standalone browser session panel with navigation controls
- Enhanced coordinate scaling for accurate click/hover actions on downscaled screenshots
- New keyboard press action support
- Improved browser session lifecycle management
- Real-time browser session status tracking
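
The coordinate scaling mentioned above can be illustrated with a small sketch: a point chosen on a downscaled screenshot is mapped proportionally to viewport pixels. This is an assumption-laden example, not the code in the PR. `scaleToViewport`, `Size`, and `Point` are hypothetical names, and the pass-through fallback when dimensions are unavailable is only one possible choice (the review flags the fallback in `getViewportCoordinate` as an open issue without describing the intended behavior).

```typescript
interface Size {
  width: number;
  height: number;
}

interface Point {
  x: number;
  y: number;
}

// Keeps a value inside the inclusive range [min, max].
function clamp(value: number, min: number, max: number): number {
  return Math.min(Math.max(value, min), max);
}

// Maps a point picked on a (possibly downscaled) screenshot to viewport pixels.
function scaleToViewport(point: Point, screenshot: Size, viewport: Size | undefined): Point {
  const usable =
    viewport !== undefined &&
    viewport.width > 0 &&
    viewport.height > 0 &&
    screenshot.width > 0 &&
    screenshot.height > 0;
  // Assumed fallback: without usable dimensions, pass the point through unchanged.
  if (!usable || viewport === undefined) return { x: point.x, y: point.y };

  const x = Math.round((point.x / screenshot.width) * viewport.width);
  const y = Math.round((point.y / screenshot.height) * viewport.height);
  // Clamp so a point on the screenshot's edge still lands inside the viewport.
  return {
    x: clamp(x, 0, viewport.width - 1),
    y: clamp(y, 0, viewport.height - 1),
  };
}
```

For example, with a 640x400 screenshot of a 1280x800 viewport, the point (320, 200) maps to (640, 400). Scaling each axis independently only preserves positions when the screenshot and viewport share an aspect ratio, which is consistent with the PR's emphasis on a consistent preview aspect ratio.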

Reviewed Changes

Copilot reviewed 54 out of 54 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| webview-ui/vite.config.ts | Adds browser-panel.html as a new build entry point |
| webview-ui/src/i18n/locales/*/chat.json | Adds translations for new browser session UI labels (session, press, hover actions) |
| webview-ui/src/components/chat/BrowserSessionRow.tsx | Major refactor: adds full browser-like UI with navigation, toolbar, and improved screenshot display |
| webview-ui/src/components/chat/BrowserActionRow.tsx | New component to display browser actions inline in chat with auto-panel-opening logic |
| webview-ui/src/components/browser-session/* | New components for standalone browser session panel |
| src/services/browser/BrowserSession.ts | Adds press() method, cursor visualization, viewport tracking, and state change callbacks |
| src/core/tools/browserActionTool.ts | Implements coordinate scaling from screenshot to viewport dimensions |
| src/core/webview/BrowserSessionPanelManager.ts | New manager for browser panel lifecycle and communication |
| src/shared/*Message.ts | Adds new message types for browser panel communication |

webview-ui/src/components/chat/BrowserActionRow.tsx

webview-ui/src/components/chat/BrowserSessionRow.tsx

src/core/webview/BrowserSessionPanelManager.ts

src/services/browser/BrowserSession.ts

src/core/tools/BrowserActionTool.ts

src/core/tools/__tests__/BrowserActionTool.coordinateScaling.spec.ts

webview-ui/src/components/chat/BrowserSessionRow.tsx

src/core/tools/browserActionTool.ts

roomote

Review complete. I found 3 issues that should be addressed before approval. Please see the inline comments and checklist above.

src/shared/browserUtils.ts

hannesrudolph requested a review from mrubens as a code owner October 31, 2025 01:29

Copilot AI review requested due to automatic review settings October 31, 2025 01:29

hannesrudolph requested review from cte and jr as code owners October 31, 2025 01:29

github-project-automation bot moved this to Triage in Roo Code Roadmap Oct 31, 2025

github-project-automation bot added this to Roo Code Roadmap and Roo Code Roadmap Oct 31, 2025

github-project-automation bot moved this to New in Roo Code Roadmap Oct 31, 2025

dosubot bot added the size:XXL (This PR changes 1000+ lines, ignoring generated files.) and UI/UX (UI/UX related or focused) labels Oct 31, 2025

hannesrudolph added the Issue/PR - Triage (New issue. Needs quick review to confirm validity and assign labels.) label Oct 31, 2025