Added Teracrawl #13

Brenden2008 · 2025-12-03T02:44:13Z

Added benchmarks for Teracrawl, an open-source web scraping API powered by Browser.cash.

The Browser Cash team is happy to collaborate on maintaining datasets and creating new evals. Reach us at alex@megatera.ai.

We achieved a success score of 84.2% and an F1 score of 62.7%.
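For readers unfamiliar with the metric, here is a minimal sketch of how an F1 score combines precision and recall. It assumes the standard harmonic-mean definition; this PR does not spell out how scrape-evals computes precision and recall per page, and the inputs below are illustrative values, not benchmark data.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition).

    Returns 0.0 when both inputs are zero to avoid dividing by zero.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Illustrative inputs only; these are NOT the Teracrawl numbers.
    print(f"{f1_score(1.0, 0.5):.3f}")
```

Because the harmonic mean is pulled toward the lower of the two inputs, a scraper that extracts everything (high recall) but includes a lot of noise (low precision) still ends up with a modest F1.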

Added:

Teracrawl engine benchmarks
Edited README to include the benchmark results

Reproduction guide:

Here's the env config we used for Teracrawl:

# Browser.cash API Key (Required)
# Get one at https://browser.cash
BROWSER_API_KEY=

# Datalab.to API Key for PDF processing (Enabled for the benchmark)
DATALAB_API_KEY=

# Server Configuration
PORT=8085
HOST=0.0.0.0
DEBUG_LOG=false

# Services
SERP_SERVICE_URL=http://localhost:8080

# Session Pool Config
POOL_SIZE=10

# Crawler Tuning
CRAWL_TABS_PER_SESSION=5
CRAWL_MIN_CONTENT_LENGTH=200
CRAWL_NAVIGATION_TIMEOUT_MS=10000
CRAWL_SLOW_TIMEOUT_MS=20000
CRAWL_JITTER_MS=0

MAX_CONCURRENT_BATCHES=10

And the command we ran the scrape evals with:

uv run run_eval.py --scrape_engine teracrawl_api --suite quality --output-dir runs/results --dataset datasets/1-0-0.csv --max-workers 1

We ran Teracrawl and the benchmark on the same node, so you don't need to make any config changes to the env on the scrape-evals side.
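Before starting a run, it can help to check that the env config above has its API keys filled in, since both are left blank in the template. The following is a minimal sketch, not part of Teracrawl or scrape-evals: the helper names `parse_env` and `missing_keys` are hypothetical, and treating `DATALAB_API_KEY` as required is an assumption based on PDF processing being enabled for the benchmark.

```python
# Keys taken from the env config above. Only BROWSER_API_KEY is marked
# "Required" there; listing DATALAB_API_KEY as required is an assumption.
REQUIRED_KEYS = ["BROWSER_API_KEY", "DATALAB_API_KEY"]


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    env: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def missing_keys(env: dict[str, str], required=REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent or have an empty value."""
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    sample = """
    # Browser.cash API Key (Required)
    BROWSER_API_KEY=
    PORT=8085
    POOL_SIZE=10
    """
    env = parse_env(sample)
    print("missing:", missing_keys(env))
```

This parser handles only plain `KEY=VALUE` lines; quoted values or `export` prefixes would need a fuller dotenv parser.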

Brenden2008 added 2 commits December 2, 2025 05:39

Add Teracrawl support

6c7820d

Add Teracrawl results to README

1d4a3a4
