A Model Context Protocol (MCP) server for web scraping, content searching, site crawling, and data extraction using the Firecrawl API.
- Web Scraping: Extract content from any webpage with customizable options
  - Mobile device emulation
  - Ad and popup blocking
  - Content filtering
  - Structured data extraction
  - Multiple output formats
- Content Search: Intelligent search capabilities
  - Multi-language support
  - Location-based results
  - Customizable result limits
  - Structured output formats
- Site Crawling: Advanced web crawling functionality
  - Depth control
  - Path filtering
  - Rate limiting
  - Progress tracking
  - Sitemap integration
- Site Mapping: Generate site structure maps
  - Subdomain support
  - Search filtering
  - Link analysis
  - Visual hierarchy
- Data Extraction: Extract structured data from multiple URLs
  - Schema validation
  - Batch processing
  - Web search enrichment
  - Custom extraction prompts
# Global installation
npm install -g @modelcontextprotocol/mcp-server-firecrawl
# Local project installation
npm install @modelcontextprotocol/mcp-server-firecrawl
- Get your Firecrawl API key from the developer portal.
- Set your API key:
Unix/Linux/macOS (bash/zsh):
export FIRECRAWL_API_KEY=your-api-key
Windows (Command Prompt):
set FIRECRAWL_API_KEY=your-api-key
Windows (PowerShell):
$env:FIRECRAWL_API_KEY = "your-api-key"
Alternative: Using .env file (recommended for development):
# Install dotenv
npm install dotenv

# Create .env file
echo "FIRECRAWL_API_KEY=your-api-key" > .env
Then in your code:
import dotenv from 'dotenv';
dotenv.config();
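If the variable might be missing, a fail-fast guard at startup keeps misconfiguration obvious. This is an illustrative sketch, not part of the package; it only assumes the `FIRECRAWL_API_KEY` variable name used throughout this README:

```typescript
import dotenv from 'dotenv';

dotenv.config();

// Fail fast if the key was not provided, rather than failing later
// on the first API call.
if (!process.env.FIRECRAWL_API_KEY) {
  throw new Error('FIRECRAWL_API_KEY is not set');
}
```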
- Run the server:
mcp-server-firecrawl
Add the server to your MCP client settings (as an entry under your client's existing `mcpServers` key):
{
  "firecrawl": {
    "command": "mcp-server-firecrawl",
    "env": {
      "FIRECRAWL_API_KEY": "your-api-key"
    }
  }
}
Or add it to a standalone MCP configuration file:
{
  "mcpServers": {
    "firecrawl": {
      "command": "mcp-server-firecrawl",
      "env": {
        "FIRECRAWL_API_KEY": "your-api-key"
      }
    }
  }
}
// Basic scraping
{
  name: "scrape_url",
  arguments: {
    url: "https://example.com",
    formats: ["markdown"],   // output as markdown
    onlyMainContent: true    // strip navigation, footers, and other page chrome
  }
}
// Advanced extraction
{
  name: "scrape_url",
  arguments: {
    url: "https://example.com/blog",
    jsonOptions: {
      prompt: "Extract article content",   // natural-language extraction instructions
      schema: {                            // shape of the structured result
        title: "string",
        content: "string"
      }
    },
    mobile: true,    // emulate a mobile device
    blockAds: true   // block ads and popups while rendering
  }
}
// Basic crawling
{
  name: "crawl",
  arguments: {
    url: "https://example.com",
    maxDepth: 2,   // follow links up to two levels deep
    limit: 100     // stop after 100 pages
  }
}
// Advanced crawling
{
  name: "crawl",
  arguments: {
    url: "https://example.com",
    maxDepth: 3,
    includePaths: ["/blog", "/products"],   // only crawl these path prefixes
    excludePaths: ["/admin"],               // skip administrative pages
    ignoreQueryParameters: true             // treat URLs differing only in query strings as one page
  }
}
// Generate site map
{
  name: "map",
  arguments: {
    url: "https://example.com",
    includeSubdomains: true,   // also map subdomains
    limit: 1000                // cap the number of discovered links
  }
}
// Extract structured data
{
  name: "extract",
  arguments: {
    urls: ["https://example.com/product1", "https://example.com/product2"],
    prompt: "Extract product details",   // natural-language extraction instructions
    schema: {                            // validated shape of each result
      name: "string",
      price: "number",
      description: "string"
    }
  }
}
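The payloads above are MCP tool calls, so any MCP client can send them. As a minimal sketch (not part of this package, and assuming the official `@modelcontextprotocol/sdk` TypeScript client; import paths and method names may differ across SDK versions), invoking the `scrape_url` tool over stdio might look like:

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Launch the server as a child process and talk to it over stdio.
const transport = new StdioClientTransport({
  command: "mcp-server-firecrawl",
  env: { FIRECRAWL_API_KEY: process.env.FIRECRAWL_API_KEY ?? "" },
});

const client = new Client({ name: "example-client", version: "1.0.0" });
await client.connect(transport);

// Same payload as the "Basic scraping" example above.
const result = await client.callTool({
  name: "scrape_url",
  arguments: {
    url: "https://example.com",
    formats: ["markdown"],
    onlyMainContent: true,
  },
});

console.log(result);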
See the configuration guide for detailed setup options and the API documentation for endpoint specifications.
# Install dependencies
npm install
# Build
npm run build
# Run tests
npm test
# Start in development mode
npm run dev
Check the examples directory for more usage examples:
- Basic scraping: scrape.ts
- Crawling and mapping: crawl-and-map.ts
The server implements robust error handling:
- Rate limiting with exponential backoff (see the sketch after this list)
- Automatic retries
- Detailed error messages
- Debug logging
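The exact retry policy is internal to the server, but the backoff pattern is the standard one. A minimal sketch, assuming a generic async operation; attempt counts, delays, and which errors are retried may differ in the actual implementation:

```typescript
// Retry with exponential backoff: delays double each attempt (1s, 2s, 4s, ...).
async function withRetry<T>(fn: () => Promise<T>, maxRetries = 3): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt === maxRetries) break;
      const delayMs = 1000 * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}
```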
Security measures include:
- API key protection
- Request validation
- Domain allowlisting (see the sketch after this list)
- Rate limiting
- Safe error messages
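Domain allowlisting amounts to checking a URL's host against a configured set before any request is made. The sketch below uses hypothetical names (`ALLOWED_HOSTS`, `isAllowedUrl`) to illustrate the idea; it is not the server's actual implementation:

```typescript
// Hypothetical allowlist check; the server's real validation is
// configurable and more thorough.
const ALLOWED_HOSTS = new Set(["example.com", "docs.example.com"]);

function isAllowedUrl(rawUrl: string): boolean {
  try {
    const { hostname } = new URL(rawUrl);
    return ALLOWED_HOSTS.has(hostname);
  } catch {
    return false; // malformed URLs are rejected outright
  }
}
```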
See CONTRIBUTING.md for contribution guidelines.
MIT License - see LICENSE for details.