@hannesrudolph hannesrudolph commented Oct 31, 2025

Summary
This upgrades the in‑chat browsing experience with persistent sessions, clearer feedback, a dedicated browser panel, and more natural action descriptions.

Extension.Development.Host.rc10.2025-11-05.13-47-15.mp4

What's new

  • Persistent Browser Sessions
    • The browser stays open across steps so you can send follow‑ups without relaunching.
    • You’ll see a "Browser Session" header and a "Session started" note when active.
  • Dedicated Browser Session panel
    • Open a full‑size view when you need more space, while keeping the chat context in view.
  • Live, readable action feed
    • Actions are presented in plain language: Launch, Click, Type, Press, Hover, Scroll.
    • Keyboard events now appear as "Press Enter" or "Press Esc" for easier scanning.
    • Broader keyboard coverage: navigation keys and common shortcuts are supported for more natural control.
  • Inline console logs
    • Console output is surfaced inline with a clear "No new logs" state.
    • Noise-reduced by default: only entries logged since the previous step are shown.
    • Filter by type (Errors, Warnings, Logs) so you can focus on what matters.
  • Clear session controls
    • A prominent Disconnect/Close control makes it easy to end a session when you’re done.
  • Interactive in-session controls
    • Follow-ups attach to the active session so you can guide the assistant mid-flow without restarting.
    • Suggested follow-ups appear inline to keep momentum.
  • More accurate interactions
    • Improved click, scroll, and hover reliability across screen sizes with a consistent preview aspect ratio.
  • Seamless follow‑ups
    • Keep chatting while the session is open; the assistant continues from the same context.
  • Fully localized
    • New labels and action text are translated across all supported languages.
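The "only new entries since the previous step" behavior and the type filters described above can be sketched as a cursor over an append-only log. This is a minimal illustration, not the PR's implementation: `ConsoleLogFeed`, `ConsoleEntry`, `takeNew`, and `renderStep` are hypothetical names, and the three log levels are assumed from the filter labels (Errors, Warnings, Logs).

```typescript
// Hypothetical types and names; the PR's actual code is not shown in this conversation.
type LogLevel = "error" | "warning" | "log";

interface ConsoleEntry {
  level: LogLevel;
  text: string;
}

class ConsoleLogFeed {
  private entries: ConsoleEntry[] = [];
  private cursor = 0; // index of the first entry that has not been shown yet

  record(entry: ConsoleEntry): void {
    this.entries.push(entry);
  }

  // Returns entries recorded since the previous call and advances the cursor.
  // The optional filter only narrows the returned view; filtered-out entries
  // still count as "seen" and are not returned by a later call.
  takeNew(filter?: LogLevel): ConsoleEntry[] {
    const fresh = this.entries.slice(this.cursor);
    this.cursor = this.entries.length;
    return filter ? fresh.filter((e) => e.level === filter) : fresh;
  }
}

// Renders one step's worth of logs, using the empty-state text from the PR description.
function renderStep(fresh: ConsoleEntry[]): string {
  if (fresh.length === 0) {
    return "No new logs";
  }
  return fresh.map((e) => `[${e.level}] ${e.text}`).join("\n");
}
```

Keeping the cursor separate from the filter means switching filters never resurfaces entries from earlier steps, which is one plausible reading of "noise-reduced by default".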

What you'll notice in the UI

  • "Browser Session" appears in chat when a session is active.
  • A "Session started" status line confirms the start.
  • Follow-up suggestions appear inside the Browser Session row when active.
  • Keyboard actions are summarized clearly (e.g., "Press Tab", "Shift+Tab", "Arrow keys").
  • New action wording like "Press Enter" or "Hover (x, y)".
  • Console Logs are visible inline, with a "No new logs" indicator and a noise‑reduced view that shows only new entries since the last step.
  • Type filters (All, Errors, Warnings, Logs) above the log list to quickly narrow the feed.
  • A quick Disconnect button to end the session.
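As a rough illustration of the plain-language wording listed above ("Press Enter", "Hover (x, y)"), the sketch below maps a structured action to a display string. The `BrowserAction` shape and `describeAction` are assumptions made for this example; the PR's own helper and its localized strings may look different.

```typescript
// Hypothetical action shape; field names are assumptions for illustration only.
type BrowserAction =
  | { kind: "launch"; url: string }
  | { kind: "click"; x: number; y: number }
  | { kind: "hover"; x: number; y: number }
  | { kind: "type"; text: string }
  | { kind: "press"; key: string }
  | { kind: "scroll"; direction: "up" | "down" };

// Turns an action into the short label shown in the action feed.
// In a localized UI these literals would come from translation resources.
function describeAction(action: BrowserAction): string {
  switch (action.kind) {
    case "launch":
      return `Launch ${action.url}`;
    case "click":
      return `Click (${action.x}, ${action.y})`;
    case "hover":
      return `Hover (${action.x}, ${action.y})`;
    case "type":
      return `Type "${action.text}"`;
    case "press":
      return `Press ${action.key}`;
    case "scroll":
      return `Scroll ${action.direction}`;
  }
}
```

A discriminated union lets the compiler flag any action kind that lacks a label, which helps when new actions such as key presses are added.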

Important

This PR enhances the in-chat browsing experience with persistent sessions, a dedicated browser panel, improved action descriptions, and various UI and backend updates to support these features.

  • Behavior:
    • Introduces persistent browser sessions, allowing the browser to stay open across steps.
    • Adds a dedicated browser session panel for a full-size view while keeping chat context visible.
    • Implements a live, readable action feed with plain language descriptions for actions like Launch, Click, Type, etc.
    • Adds inline console logs with filtering options by type (Errors, Warnings, Logs).
    • Provides clear session controls with a prominent Disconnect/Close button.
    • Supports interactive in-session controls for seamless follow-ups.
    • Improves interaction accuracy with consistent preview aspect ratio.
    • Fully localizes new labels and action text.
  • UI Changes:
    • Displays "Browser Session" in chat when active, with a "Session started" status line.
    • Shows follow-up suggestions inside the Browser Session row.
    • Summarizes keyboard actions clearly (e.g., "Press Tab", "Shift+Tab").
    • Displays console logs inline with a "No new logs" indicator.
    • Adds type filters above the log list for quick narrowing.
    • Includes a quick Disconnect button to end the session.
  • Code Changes:
    • Adds browser-panel.tsx to vite.config.ts input.
    • Updates presentAssistantMessage() in presentAssistantMessage.ts to handle browser session logic.
    • Modifies getEnvironmentDetails.ts to include browser session status.
    • Adds BrowserSessionPanelManager.ts for managing the browser session panel.
    • Updates BrowserSession.ts to handle new session behaviors and interactions.
    • Adds tests for new browser session functionalities in BrowserSession.spec.ts and BrowserActionTool.coordinateScaling.spec.ts.

This description was created by Ellipsis for 2dde656.

Copilot AI review requested due to automatic review settings October 31, 2025 01:29
@hannesrudolph hannesrudolph requested review from cte and jr as code owners October 31, 2025 01:29
@dosubot dosubot bot added the size:XXL (This PR changes 1000+ lines, ignoring generated files.) and UI/UX (UI/UX related or focused) labels Oct 31, 2025
roomote bot commented Oct 31, 2025

Review status: Reviewed latest changes at commit 20bdf53. Found one new issue around coordinate fallback in getViewportCoordinate; existing browser session TODOs remain open.

  • Race condition in browser session state callback
  • Missing size parameter in launch action
  • Coordinate fallback in getViewportCoordinate when viewport dimensions are unavailable
  • Coordinate format mismatch
  • ResizeObserver cleanup missing mounted check
  • Wrong parameter passed to getBrowserActionText
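The checklist item about the coordinate fallback does not say what the fix was. One way to make such a fallback explicit, sketched here under assumptions, is to return a tagged result so callers can tell a scaled coordinate from an unscaled one. `resolveCoordinate`, `Size`, `Point`, and `Resolved` are hypothetical names and do not reflect the actual `getViewportCoordinate` signature.

```typescript
// Hypothetical types; not taken from the PR.
interface Size { width: number; height: number }
interface Point { x: number; y: number }
interface Resolved { point: Point; scaled: boolean }

// True only for a size with positive, finite dimensions.
function isUsable(size: Size | null | undefined): size is Size {
  return (
    size != null &&
    Number.isFinite(size.width) && size.width > 0 &&
    Number.isFinite(size.height) && size.height > 0
  );
}

// Scales a screenshot-space point into viewport space when both sizes are known.
// Otherwise returns the point unchanged and marks it as not scaled, so the
// caller can decide whether to proceed, retry, or report the problem.
function resolveCoordinate(
  point: Point,
  screenshot: Size | null | undefined,
  viewport: Size | null | undefined,
): Resolved {
  if (!isUsable(screenshot) || !isUsable(viewport)) {
    return { point: { x: point.x, y: point.y }, scaled: false };
  }
  return {
    point: {
      x: Math.round((point.x * viewport.width) / screenshot.width),
      y: Math.round((point.y * viewport.height) / screenshot.height),
    },
    scaled: true,
  };
}
```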

@hannesrudolph hannesrudolph added the Issue/PR - Triage (New issue. Needs quick review to confirm validity and assign labels.) label Oct 31, 2025
Copilot AI left a comment

Pull Request Overview

This PR adds a new Browser Session panel feature that provides a dedicated UI for viewing and controlling browser automation sessions. Key improvements include:

  • New standalone browser session panel with navigation controls
  • Enhanced coordinate scaling for accurate click/hover actions on downscaled screenshots
  • New keyboard press action support
  • Improved browser session lifecycle management
  • Real-time browser session status tracking
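The coordinate scaling mentioned in the overview can be illustrated with a proportional mapping from screenshot pixels to viewport pixels. This is a sketch under assumptions: `scaleToViewport`, `Size`, and `Point` are hypothetical names, and the clamping step is an added safeguard rather than something the PR is known to do.

```typescript
// Hypothetical types and function; not the PR's actual implementation.
interface Size { width: number; height: number }
interface Point { x: number; y: number }

// Maps a point picked on a (possibly downscaled) screenshot to the
// corresponding pixel in the real browser viewport.
function scaleToViewport(point: Point, screenshot: Size, viewport: Size): Point {
  if (screenshot.width <= 0 || screenshot.height <= 0) {
    throw new RangeError("screenshot dimensions must be positive");
  }
  // Multiply before dividing to keep integer inputs exact where possible.
  const x = Math.round((point.x * viewport.width) / screenshot.width);
  const y = Math.round((point.y * viewport.height) / screenshot.height);
  // Clamp so a click on the screenshot's far edge stays inside the viewport.
  return {
    x: Math.min(Math.max(x, 0), viewport.width - 1),
    y: Math.min(Math.max(y, 0), viewport.height - 1),
  };
}

// Example: the center of a 900x600 screenshot maps to the center of a 1280x800 viewport.
const center = scaleToViewport(
  { x: 450, y: 300 },
  { width: 900, height: 600 },
  { width: 1280, height: 800 },
);
console.log(center); // { x: 640, y: 400 }
```

Scaling each axis independently also handles the case where the screenshot and viewport aspect ratios differ slightly, although the PR description suggests the preview keeps a consistent aspect ratio.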

Reviewed Changes

Copilot reviewed 54 out of 54 changed files in this pull request and generated 10 comments.

File | Description
--- | ---
webview-ui/vite.config.ts | Adds browser-panel.html as a new build entry point
webview-ui/src/i18n/locales/*/chat.json | Adds translations for new browser session UI labels (session, press, hover actions)
webview-ui/src/components/chat/BrowserSessionRow.tsx | Major refactor: adds full browser-like UI with navigation, toolbar, and improved screenshot display
webview-ui/src/components/chat/BrowserActionRow.tsx | New component to display browser actions inline in chat with auto-panel-opening logic
webview-ui/src/components/browser-session/* | New components for standalone browser session panel
src/services/browser/BrowserSession.ts | Adds press() method, cursor visualization, viewport tracking, and state change callbacks
src/core/tools/browserActionTool.ts | Implements coordinate scaling from screenshot to viewport dimensions
src/core/webview/BrowserSessionPanelManager.ts | New manager for browser panel lifecycle and communication
src/shared/*Message.ts | Adds new message types for browser panel communication

@roomote roomote bot left a comment

Review complete. I found 3 issues that should be addressed before approval. Please see the inline comments and checklist above.

@daniel-lxs daniel-lxs moved this from Triage to PR [Needs Prelim Review] in Roo Code Roadmap Nov 4, 2025
@hannesrudolph hannesrudolph added the PR - Needs Preliminary Review label and removed the Issue/PR - Triage (New issue. Needs quick review to confirm validity and assign labels.) label Nov 4, 2025
@hannesrudolph hannesrudolph moved this from PR [Needs Prelim Review] to PR [Draft / In Progress] in Roo Code Roadmap Nov 5, 2025
@hannesrudolph hannesrudolph moved this from PR [Draft / In Progress] to PR [Needs Prelim Review] in Roo Code Roadmap Nov 6, 2025
@mrubens mrubens merged commit ee93530 into main Nov 21, 2025
10 checks passed
@mrubens mrubens deleted the Browser-Use-2.0 branch November 21, 2025 22:26
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Nov 21, 2025
@github-project-automation github-project-automation bot moved this from PR [Needs Prelim Review] to Done in Roo Code Roadmap Nov 21, 2025

Labels

  • lgtm: This PR has been approved by a maintainer
  • PR - Needs Preliminary Review
  • size:XXL: This PR changes 1000+ lines, ignoring generated files.
  • UI/UX: UI/UX related or focused

Projects

Status: Done

5 participants