
Commit 1f3a152: Update README.md
1 parent 0358380

1 file changed: README.md (+36, -4 lines)

@@ -20,9 +20,27 @@ A production-ready [Model Context Protocol](https://modelcontextprotocol.io/intr

The server provides the following enterprise-ready tools:

### Core Scraping Tools

- `markdownify(website_url: str)`: Transform any webpage into clean, structured markdown format
- `smartscraper(user_prompt: str, website_url: str, number_of_scrolls: int = None, markdown_only: bool = None)`: Leverage AI to extract structured data from any webpage, with support for infinite scrolling
- `searchscraper(user_prompt: str, num_results: int = None, number_of_scrolls: int = None)`: Execute AI-powered web searches with structured, actionable results
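
All of these are standard MCP tools, so any MCP client can invoke them. As a minimal sketch (not from this README), here is how the core tools could be called through the official MCP Python SDK; the `uvx scrapegraph-mcp` launch command and the `SGAI_API_KEY` variable name are assumptions to adapt to your setup:

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main() -> None:
    # Assumed launch command and env var; substitute whatever the
    # Setup Instructions below specify for your installation.
    server = StdioServerParameters(
        command="uvx",
        args=["scrapegraph-mcp"],
        env={"SGAI_API_KEY": "your-api-key"},
    )
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Convert a page to clean markdown.
            md = await session.call_tool(
                "markdownify", arguments={"website_url": "https://example.com"}
            )

            # AI extraction, scrolling three times to load lazy content.
            data = await session.call_tool(
                "smartscraper",
                arguments={
                    "user_prompt": "List the product names and prices",
                    "website_url": "https://example.com/catalog",
                    "number_of_scrolls": 3,
                },
            )
            print(md.content, data.content)


asyncio.run(main())
```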

### Advanced Scraping Tools

- `scrape(website_url: str, render_heavy_js: bool = None)`: Basic scraping endpoint to fetch page content, with optional heavy JavaScript rendering
- `sitemap(website_url: str)`: Extract sitemap URLs and structure for any website
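
Continuing the client sketch above, the advanced endpoints take a URL plus optional flags. A short illustration (argument values are illustrative, and the session is assumed to be already initialized as shown earlier):

```python
from mcp import ClientSession


async def fetch_advanced(session: ClientSession) -> None:
    """Call the advanced endpoints on an already-initialized session
    (see the client sketch above)."""
    # Fetch a JavaScript-heavy page with full rendering enabled.
    page = await session.call_tool(
        "scrape",
        arguments={"website_url": "https://example.com/app", "render_heavy_js": True},
    )
    # Map out a site's structure before deciding what to crawl.
    urls = await session.call_tool(
        "sitemap", arguments={"website_url": "https://example.com"}
    )
    print(page.content, urls.content)
```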

### Multi-Page Crawling

- `smartcrawler_initiate(url: str, prompt: str = None, extraction_mode: str = "ai", depth: int = None, max_pages: int = None, same_domain_only: bool = None)`: Initiate intelligent multi-page web crawling with two modes:
  - **AI Extraction Mode** (10 credits per page): Extracts structured data based on your prompt
  - **Markdown Conversion Mode** (2 credits per page): Converts pages to clean markdown
- `smartcrawler_fetch_results(request_id: str)`: Retrieve results from asynchronous crawling operations
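
Crawling is asynchronous: `smartcrawler_initiate` returns a request id, and the results are collected later with `smartcrawler_fetch_results`. A hedged sketch of that polling loop, continuing the session above (the `"markdown"` mode value and the `request_id` and `status` response fields are assumptions, not documented in this excerpt):

```python
import asyncio
import json

from mcp import ClientSession


async def crawl_docs(session: ClientSession) -> dict:
    # Start a markdown-conversion crawl of a docs site (2 credits per page).
    start = await session.call_tool(
        "smartcrawler_initiate",
        arguments={
            "url": "https://docs.example.com",
            "extraction_mode": "markdown",  # assumed value for markdown mode
            "depth": 2,
            "max_pages": 20,
            "same_domain_only": True,
        },
    )
    request_id = json.loads(start.content[0].text)["request_id"]  # assumed field

    # Poll until the crawl finishes (the "status" field is an assumption).
    while True:
        res = await session.call_tool(
            "smartcrawler_fetch_results", arguments={"request_id": request_id}
        )
        payload = json.loads(res.content[0].text)
        if payload.get("status") != "processing":
            return payload
        await asyncio.sleep(5)
```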

### Intelligent Agent-Based Scraping

- `agentic_scrapper(url: str, user_prompt: str = None, output_schema: dict = None, steps: list = None, ai_extraction: bool = None, persistent_session: bool = None, timeout_seconds: float = None)`: Run advanced agentic scraping workflows with customizable steps and structured output schemas
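
Because `agentic_scrapper` accepts a step list and an output schema, a multi-step browser workflow can be expressed declaratively in a single call. A sketch under the same client assumptions (the step wording and the JSON-Schema-style `output_schema` format are illustrative, not documented formats):

```python
from mcp import ClientSession


async def run_agentic_workflow(session: ClientSession):
    # Step wording and schema format below are illustrative only.
    return await session.call_tool(
        "agentic_scrapper",
        arguments={
            "url": "https://example.com/search",
            "user_prompt": "Search for laptops and extract the first result",
            "steps": [
                "Type 'laptops' into the search box",
                "Click the search button",
                "Open the first result",
            ],
            "output_schema": {
                "type": "object",
                "properties": {
                    "title": {"type": "string"},
                    "price": {"type": "string"},
                },
            },
            "persistent_session": True,
            "timeout_seconds": 120.0,
        },
    )
```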

## Setup Instructions

@@ -77,11 +95,25 @@ Add the ScrapeGraphAI MCP server in the settings:

The server enables sophisticated queries such as:

### Single Page Scraping

- "Analyze and extract the main features of the ScrapeGraph API"
- "Generate a structured markdown version of the ScrapeGraph homepage"
- "Extract and analyze pricing information from the ScrapeGraph website with infinite scroll support"
- "Scrape this JavaScript-heavy page with full rendering"

### Search and Research

- "Research and summarize recent developments in AI-powered web scraping"
- "Search for the top 5 articles about machine learning frameworks and extract key points"

### Multi-Page Crawling

- "Crawl the entire documentation site and convert all pages to markdown"
- "Extract all product information from an e-commerce site up to 3 levels deep"
- "Crawl a blog and extract all article titles, authors, and summaries"

### Advanced Agentic Scraping

- "Navigate through a multi-step form and extract the final results"
- "Follow pagination links and compile a complete dataset"
- "Execute a complex workflow with custom extraction schema"

## Error Handling