Merged
73 commits
0cf7696
feat: add script injection support for browser session recording
openhands-agent Jan 14, 2026
816e51b
feat: add browser_start_recording and browser_stop_recording tools
openhands-agent Jan 14, 2026
11aedc3
test: add unit and E2E tests for browser session recording
openhands-agent Jan 14, 2026
68c6b80
docs: add browser session recording example
openhands-agent Jan 14, 2026
2ed2f9b
fix: add retry mechanism to browser recording and improve stub
openhands-agent Jan 14, 2026
8aec4f9
Update 34_browser_session_recording.py
malhotra5 Jan 14, 2026
cd118f7
fix: use unpkg CDN for rrweb to fix MIME type issue
openhands-agent Jan 14, 2026
0a7d3ee
Update 34_browser_session_recording.py
malhotra5 Jan 14, 2026
0f84145
fix: persist browser recording across page navigations
openhands-agent Jan 14, 2026
c9370c3
feat: auto-save browser recordings to file, return concise summary
openhands-agent Jan 15, 2026
87c36a0
Update 34_browser_session_recording.py
malhotra5 Jan 15, 2026
b89df77
docs: update browser recording example with persistence_dir
openhands-agent Jan 15, 2026
de8481a
Update 34_browser_session_recording.py
malhotra5 Jan 15, 2026
d227c50
Merge branch 'feat/browser-session-recording' of https://github.com/O…
malhotra5 Jan 15, 2026
49e360a
fix persistence path check
malhotra5 Jan 15, 2026
326251f
Update 33_browser_session_recording.py
malhotra5 Jan 15, 2026
167def2
Remove fallback stub from browser recording; report failures directly
openhands-agent Jan 17, 2026
4e15af4
Improve recording event flushing: periodic saves to numbered files
openhands-agent Jan 17, 2026
bba1480
Refactor: Extract injected JavaScript to constants at top of file
openhands-agent Jan 17, 2026
9852b34
Fix: Check for existing files before saving recording events
openhands-agent Jan 17, 2026
c115d52
Fix: session recording periodic flush and CancelledError handling
openhands-agent Feb 10, 2026
5b5b44a
Fix: start periodic flush task when recording is already active
openhands-agent Feb 10, 2026
5108e3c
Fix: only start recording when agent explicitly requests it
openhands-agent Feb 10, 2026
73e8480
Add unit tests for recording flush behavior
openhands-agent Feb 10, 2026
1d28599
Potential fix for pull request finding 'Empty except'
malhotra5 Feb 10, 2026
b1adb11
Merge branch 'main' into feat/browser-session-recording
malhotra5 Feb 10, 2026
bf09d97
rename file
malhotra5 Feb 10, 2026
4ffb097
Fix unreachable except clause in test_browser_executor_e2e.py
openhands-agent Feb 10, 2026
ac4a6e5
Trigger CI re-run for docs check
openhands-agent Feb 10, 2026
f792a81
Refactor: Encapsulate recording state in RecordingSession class
openhands-agent Feb 10, 2026
a71051d
Fix concurrency race condition in browser recording flush
openhands-agent Feb 10, 2026
8838b52
Fix browser_stop_recording API documentation contract
openhands-agent Feb 10, 2026
8888ecd
Merge branch 'main' into feat/browser-session-recording
malhotra5 Feb 10, 2026
f4b7cae
Refactor RecordingSession to use EventBuffer and RecordingState
openhands-agent Feb 10, 2026
50278b6
Remove backward compatibility code and update tests to use new API
openhands-agent Feb 10, 2026
924453c
Fix file count reporting with existing files in save_dir
openhands-agent Feb 10, 2026
e423730
Refactor polling anti-pattern to use event-driven Promise-based waiting
openhands-agent Feb 10, 2026
66cc91d
Remove size-based flushing for recording events
openhands-agent Feb 10, 2026
da2f6cf
Refactor: Move JavaScript code to separate files for better maintaina…
openhands-agent Feb 10, 2026
1f74b0b
Replace mock-only recording tests with real behavior tests
openhands-agent Feb 10, 2026
db82d62
Merge branch 'main' into feat/browser-session-recording
malhotra5 Feb 10, 2026
9fe1520
feat: separate recordings into timestamped subfolders
openhands-agent Feb 10, 2026
8f67fd6
Fix decorator exception handling to be more specific
openhands-agent Feb 11, 2026
23e6404
Fix readOnlyHint for browser_stop_recording tool
openhands-agent Feb 11, 2026
eee270d
Remove unused save_dir parameter from _stop_recording
openhands-agent Feb 11, 2026
e9cfa9d
Document CDN dependency risk in RecordingConfig
openhands-agent Feb 11, 2026
0409976
Optimize file numbering with one-time directory scan
openhands-agent Feb 11, 2026
14edf83
Document removal of size-based flushing in EventBuffer
openhands-agent Feb 11, 2026
f143e0c
Remove size-based flushing documentation from EventBuffer
openhands-agent Feb 11, 2026
947a9c1
Simplify directory naming: base_save_dir -> output_dir, save_dir -> s…
openhands-agent Feb 11, 2026
ec23c36
Clarify lock documentation: rename to _event_buffer_lock and fix term…
openhands-agent Feb 11, 2026
f3119ad
Simplify RecordingState enum to boolean _is_recording
openhands-agent Feb 11, 2026
f1b081c
Simplify error handling and improve logging for recording
openhands-agent Feb 11, 2026
5b9be07
Refactor: Extract EventStorage from RecordingSession
openhands-agent Feb 11, 2026
a9b85b8
Fix: Look for recording files in timestamped subdirectory
openhands-agent Feb 11, 2026
6ca3d4a
Merge branch 'main' into feat/browser-session-recording
xingyaoww Feb 11, 2026
cdbc830
Use .agent_tmp for persistence_dir in browser session recording example
openhands-agent Feb 11, 2026
243b8a7
Revert persistence_dir change, save recordings to .agent_tmp instead
openhands-agent Feb 11, 2026
d45e701
Merge branch 'main' into feat/browser-session-recording
xingyaoww Feb 11, 2026
53dface
refactor(recording): extract helper methods from start() to reduce co…
openhands-agent Feb 11, 2026
1bae49e
fix(recording): cleanup recording session when browser session closes
openhands-agent Feb 11, 2026
8fa4ab9
refactor(recording): move recording output dir to global constant and…
openhands-agent Feb 11, 2026
3e2bfbf
docs(recording): add error handling policy documentation and inline c…
openhands-agent Feb 11, 2026
91a3b7e
fix(async_executor): remove atexit handler to fix cleanup ordering
xingyaoww Feb 11, 2026
0b0c7e3
fix: skip recording test when browser initialization fails
openhands-agent Feb 11, 2026
8e1a6e0
Merge branch 'main' into feat/browser-session-recording
malhotra5 Feb 11, 2026
70ca3b2
fix: update test to check correct recording output directory
openhands-agent Feb 11, 2026
f283e00
Merge branch 'main' into feat/browser-session-recording
xingyaoww Feb 11, 2026
fe6f705
Revert "fix(async_executor): remove atexit handler to fix cleanup ord…
openhands-agent Feb 11, 2026
0fcb494
feat: track consecutive flush failures and warn user
openhands-agent Feb 11, 2026
80ba4b5
Merge branch 'main' into feat/browser-session-recording
xingyaoww Feb 11, 2026
a25e8c5
Enforce approval when PR is deemed worth merging
openhands-agent Feb 11, 2026
98626e6
Merge branch 'main' into feat/browser-session-recording
xingyaoww Feb 11, 2026
2 changes: 2 additions & 0 deletions .agents/skills/code-review.md
@@ -15,6 +15,8 @@ You have permission to **APPROVE** or **COMMENT** on PRs. Do not use REQUEST_CHA

**Default to APPROVE**: If your review finds no issues at "important" level or higher, approve the PR. Minor suggestions or nitpicks alone are not sufficient reason to withhold approval.

**IMPORTANT: If you determine a PR is worth merging, you MUST approve it.** Do not just say a PR is "worth merging" or "ready to merge" without actually submitting an approval. Your words and actions must be consistent.

### When to APPROVE

Approve PRs that are straightforward and low-risk:
15 changes: 15 additions & 0 deletions AGENTS.md
@@ -280,6 +280,21 @@ git push -u origin <feature-name>
```
</DOCUMENTATION_WORKFLOW>

<AGENT_TMP_DIRECTORY>
# Agent Temporary Directory Convention

When tools need to store observation files (e.g., browser session recordings, task tracker data), use `.agent_tmp` as the directory name for consistency.

The browser session recording tool saves recordings to `.agent_tmp/browser_observations/recording-{timestamp}/`.

This convention ensures tool-generated observation files are stored in a predictable location that can be easily:
- Added to `.gitignore`
- Cleaned up after agent sessions
- Identified as agent-generated artifacts

Note: This is separate from `persistence_dir`, which is used for conversation state persistence.
</AGENT_TMP_DIRECTORY>
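
The convention above lists `.gitignore` handling and post-session cleanup as its main benefits. A minimal sketch of both follows. The helper names `ensure_gitignored` and `cleanup_agent_tmp` are hypothetical and are not part of this PR or the SDK; only the `.agent_tmp` directory name comes from the text above.

```python
import os
import shutil

AGENT_TMP_DIR = ".agent_tmp"  # directory name from the convention above


def ensure_gitignored(repo_root: str, entry: str = AGENT_TMP_DIR + "/") -> bool:
    """Append `entry` to <repo_root>/.gitignore if missing; return True if changed."""
    path = os.path.join(repo_root, ".gitignore")
    content = ""
    if os.path.exists(path):
        with open(path) as f:
            content = f.read()
    if entry in content.splitlines():
        return False
    with open(path, "a") as f:
        # Start on a fresh line if the existing file lacks a trailing newline.
        if content and not content.endswith("\n"):
            f.write("\n")
        f.write(entry + "\n")
    return True


def cleanup_agent_tmp(workspace: str) -> bool:
    """Delete <workspace>/.agent_tmp after a session; return True if it existed."""
    target = os.path.join(workspace, AGENT_TMP_DIR)
    if not os.path.isdir(target):
        return False
    shutil.rmtree(target)
    return True
```

Because every tool writes under the same top-level directory, one ignore entry and one `rmtree` call cover all agent-generated artifacts.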

<REPO>
<PROJECT_STRUCTURE>
- `openhands-sdk/` core SDK; `openhands-tools/` built-in tools; `openhands-workspace/` workspace management; `openhands-agent-server/` server runtime; `examples/` runnable patterns; `tests/` split by domain (`tests/sdk`, `tests/tools`, `tests/agent_server`, etc.).
178 changes: 178 additions & 0 deletions examples/01_standalone_sdk/38_browser_session_recording.py
@@ -0,0 +1,178 @@
"""Browser Session Recording Example

This example demonstrates how to use the browser session recording feature
to capture and save a recording of the agent's browser interactions using rrweb.

The recording can be replayed later using rrweb-player to visualize the agent's
browsing session.

The recording is saved automatically to BROWSER_RECORDING_OUTPUT_DIR
(.agent_tmp/browser_observations) when browser_stop_recording is called.
You can replay it with:
- rrweb-player: https://github.com/rrweb-io/rrweb/tree/master/packages/rrweb-player
- Online viewer: https://www.rrweb.io/demo/
"""

import json
import os

from pydantic import SecretStr

from openhands.sdk import (
    LLM,
    Agent,
    Conversation,
    Event,
    LLMConvertibleEvent,
    get_logger,
)
from openhands.sdk.tool import Tool
from openhands.tools.browser_use import BrowserToolSet
from openhands.tools.browser_use.definition import BROWSER_RECORDING_OUTPUT_DIR


logger = get_logger(__name__)

# Configure LLM
api_key = os.getenv("LLM_API_KEY")
assert api_key is not None, "LLM_API_KEY environment variable is not set."
model = os.getenv("LLM_MODEL", "anthropic/claude-sonnet-4-5-20250929")
base_url = os.getenv("LLM_BASE_URL")
llm = LLM(
    usage_id="agent",
    model=model,
    base_url=base_url,
    api_key=SecretStr(api_key),
)

# Tools - including browser tools with recording capability
cwd = os.getcwd()
tools = [
    Tool(name=BrowserToolSet.name),
]

# Agent
agent = Agent(llm=llm, tools=tools)

llm_messages = [] # collect raw LLM messages


def conversation_callback(event: Event):
    if isinstance(event, LLMConvertibleEvent):
        llm_messages.append(event.to_llm_message())


# Create the conversation. persistence_dir stores conversation state; browser
# recordings are saved separately under BROWSER_RECORDING_OUTPUT_DIR.
conversation = Conversation(
    agent=agent,
    callbacks=[conversation_callback],
    workspace=cwd,
    persistence_dir="./.conversations",
)

# The prompt instructs the agent to:
# 1. Start recording the browser session
# 2. Browse to a website and perform some actions
# 3. Stop recording (auto-saves to file)
PROMPT = """
Please complete the following task to demonstrate browser session recording:

1. First, use `browser_start_recording` to begin recording the browser session.

2. Then navigate to https://docs.openhands.dev/ and:
- Get the page content
- Scroll down the page
- Get the browser state to see interactive elements

3. Next, navigate to https://docs.openhands.dev/openhands/usage/cli/installation and:
- Get the page content
- Scroll down to see more content

4. Finally, use `browser_stop_recording` to stop the recording.
Events are automatically saved.
"""

print("=" * 80)
print("Browser Session Recording Example")
print("=" * 80)
print("\nTask: Record an agent's browser session and save it for replay")
print("\nStarting conversation with agent...\n")

conversation.send_message(PROMPT)
conversation.run()

print("\n" + "=" * 80)
print("Conversation finished!")
print("=" * 80)

# Check if the recording files were created
# Recordings are saved in BROWSER_RECORDING_OUTPUT_DIR/recording-{timestamp}/
if os.path.exists(BROWSER_RECORDING_OUTPUT_DIR):
    # Find recording subdirectories (they start with "recording-")
    recording_dirs = sorted(
        [
            d
            for d in os.listdir(BROWSER_RECORDING_OUTPUT_DIR)
            if d.startswith("recording-")
            and os.path.isdir(os.path.join(BROWSER_RECORDING_OUTPUT_DIR, d))
        ]
    )

    if recording_dirs:
        # Process the most recent recording directory
        latest_recording = recording_dirs[-1]
        recording_path = os.path.join(BROWSER_RECORDING_OUTPUT_DIR, latest_recording)
        json_files = sorted(
            [f for f in os.listdir(recording_path) if f.endswith(".json")]
        )

        print(f"\n✓ Recording saved to: {recording_path}")
        print(f"✓ Number of files: {len(json_files)}")

        # Count total events across all files
        total_events = 0
        all_event_types: dict[int | str, int] = {}
        total_size = 0

        for json_file in json_files:
            filepath = os.path.join(recording_path, json_file)
            file_size = os.path.getsize(filepath)
            total_size += file_size

            with open(filepath) as f:
                events = json.load(f)

            # Events are stored as a list in each file
            if isinstance(events, list):
                total_events += len(events)
                for event in events:
                    event_type = event.get("type", "unknown")
                    all_event_types[event_type] = all_event_types.get(event_type, 0) + 1

                print(f" - {json_file}: {len(events)} events, {file_size} bytes")

        print(f"✓ Total events: {total_events}")
        print(f"✓ Total size: {total_size} bytes")
        if all_event_types:
            print(f"✓ Event types: {all_event_types}")

        print("\nTo replay this recording, you can use:")
        print(
            " - rrweb-player: "
            "https://github.com/rrweb-io/rrweb/tree/master/packages/rrweb-player"
        )
    else:
        print(f"\n✗ No recording directories found in: {BROWSER_RECORDING_OUTPUT_DIR}")
        print(" The agent may not have completed the recording task.")
else:
    print(f"\n✗ Observations directory not found: {BROWSER_RECORDING_OUTPUT_DIR}")
    print(" The agent may not have completed the recording task.")

print("\n" + "=" * 100)
print("Conversation finished.")
print(f"Total LLM messages: {len(llm_messages)}")
print("=" * 100)

# Report cost
cost = conversation.conversation_stats.get_combined_metrics().accumulated_cost
print(f"Conversation ID: {conversation.id}")
print(f"EXAMPLE_COST: {cost}")
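
The example above inspects the numbered JSON files one at a time, and its lexicographic `sorted()` would place `10.json` before `2.json`. For replay, the chunks presumably need to be concatenated in numeric order. A minimal sketch of that step, assuming each file holds a JSON array of rrweb events as the tool description states (`load_recording_events` is a hypothetical helper, not part of this PR):

```python
import json
import os


def load_recording_events(recording_path: str) -> list[dict]:
    """Concatenate the event arrays from 1.json, 2.json, ... in numeric order."""
    numbered = []
    for name in os.listdir(recording_path):
        stem, ext = os.path.splitext(name)
        # Only consider files named <integer>.json; ignore anything else.
        if ext == ".json" and stem.isdigit():
            numbered.append((int(stem), name))

    events: list[dict] = []
    for _, name in sorted(numbered):  # numeric sort, so 2.json precedes 10.json
        with open(os.path.join(recording_path, name)) as f:
            chunk = json.load(f)
        if isinstance(chunk, list):
            events.extend(chunk)
    return events
```

The merged list can then be written to a single file and handed to rrweb-player as its events array.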
103 changes: 103 additions & 0 deletions openhands-tools/openhands/tools/browser_use/definition.py
@@ -2,6 +2,7 @@

import base64
import hashlib
import os
from collections.abc import Sequence
from pathlib import Path
from typing import TYPE_CHECKING, Literal, Self
@@ -25,6 +26,9 @@
from openhands.tools.browser_use.impl import BrowserToolExecutor


# Directory where browser session recordings are saved
BROWSER_RECORDING_OUTPUT_DIR = os.path.join(".agent_tmp", "browser_observations")

# Mapping of base64 prefixes to MIME types for image detection
BASE64_IMAGE_PREFIXES = {
"/9j/": "image/jpeg",
@@ -668,6 +672,103 @@ def create(cls, executor: "BrowserToolExecutor") -> Sequence[Self]:
        ]


# ============================================
# `browser_start_recording`
# ============================================
class BrowserStartRecordingAction(BrowserAction):
    """Schema for starting browser session recording."""

    pass


BROWSER_START_RECORDING_DESCRIPTION = f"""Start recording the browser session.

This tool starts recording all browser interactions using rrweb. The recording
captures DOM mutations, mouse movements, clicks, scrolls, and other user interactions.

Output Location: {BROWSER_RECORDING_OUTPUT_DIR}/recording-<timestamp>/
Format: Recording events are saved as numbered JSON files (1.json, 2.json, etc.)
containing rrweb event arrays. Events are flushed to disk every 5 seconds.
These files can be replayed using rrweb-player.

Call browser_stop_recording to stop recording and save any remaining events.

Note: Recording persists across page navigations - the recording will automatically
restart on new pages.
"""


class BrowserStartRecordingTool(
    ToolDefinition[BrowserStartRecordingAction, BrowserObservation]
):
    """Tool for starting browser session recording."""

    @classmethod
    def create(cls, executor: "BrowserToolExecutor") -> Sequence[Self]:
        return [
            cls(
                description=BROWSER_START_RECORDING_DESCRIPTION,
                action_type=BrowserStartRecordingAction,
                observation_type=BrowserObservation,
                annotations=ToolAnnotations(
                    title="browser_start_recording",
                    readOnlyHint=False,
                    destructiveHint=False,
                    idempotentHint=False,
                    openWorldHint=False,
                ),
                executor=executor,
            )
        ]


# ============================================
# `browser_stop_recording`
# ============================================
class BrowserStopRecordingAction(BrowserAction):
    """Schema for stopping browser session recording."""

    pass


BROWSER_STOP_RECORDING_DESCRIPTION = f"""Stop recording the browser session.

This tool stops the current recording session and saves any remaining events to disk.

Output Location: {BROWSER_RECORDING_OUTPUT_DIR}/recording-<timestamp>/
Format: Events are saved as numbered JSON files (1.json, 2.json, etc.) containing
rrweb event arrays. These files can be replayed using rrweb-player to visualize
the recorded session.

Returns a summary message with the total event count, file count, and save directory.
"""


class BrowserStopRecordingTool(
    ToolDefinition[BrowserStopRecordingAction, BrowserObservation]
):
    """Tool for stopping browser session recording."""

    @classmethod
    def create(cls, executor: "BrowserToolExecutor") -> Sequence[Self]:
        return [
            cls(
                description=BROWSER_STOP_RECORDING_DESCRIPTION,
                action_type=BrowserStopRecordingAction,
                observation_type=BrowserObservation,
                annotations=ToolAnnotations(
                    title="browser_stop_recording",
                    # Modifies state: stops recording, flushes events to disk
                    readOnlyHint=False,
                    destructiveHint=False,
                    idempotentHint=False,
                    openWorldHint=False,
                ),
                executor=executor,
            )
        ]


class BrowserToolSet(ToolDefinition[BrowserAction, BrowserObservation]):
    """A set of all browser tools.

@@ -721,6 +822,8 @@ def create(
            BrowserCloseTabTool,
            BrowserGetStorageTool,
            BrowserSetStorageTool,
            BrowserStartRecordingTool,
            BrowserStopRecordingTool,
        ]:
            tools.extend(tool_class.create(executor))
        return tools