Firecrawl MCP Server

A Model Context Protocol (MCP) server implementation that integrates with Firecrawl for web scraping capabilities.
Big thanks to @vrknetha and @cawstudios for the initial implementation! You can also play around with our MCP server on MCP.so's playground or on Klavis AI. Thanks to MCP.so and Klavis AI for hosting, and to @gstarwd and @xiangkaiz for integrating our server.

Features

  • Scrape, crawl, search, extract, deep research and batch scrape support
  • Web scraping with JS rendering
  • URL discovery and crawling
  • Web search with content extraction
  • Automatic retries with exponential backoff
  • Credit usage monitoring for cloud API
  • Comprehensive logging system
  • Support for cloud and self-hosted Firecrawl instances
  • Mobile/Desktop viewport support
  • Smart content filtering with tag inclusion/exclusion

Installation

Running with npx
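
A minimal invocation, assuming the package is published on npm as firecrawl-mcp (the name used in the Cursor instructions below); the API key placeholder is illustrative:

```bash
env FIRECRAWL_API_KEY=fc-YOUR_API_KEY npx -y firecrawl-mcp
```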

Manual Installation
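
Assuming the same npm package name, a global-install sketch:

```bash
npm install -g firecrawl-mcp
```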

Running on Cursor

Configuring Cursor

Note: Requires Cursor version 0.45.6+. For the most up-to-date configuration instructions, please refer to the official Cursor documentation on configuring MCP servers: Cursor MCP Server Configuration Guide.

To configure Firecrawl MCP in Cursor v0.45.6:
  1. Open Cursor Settings
  2. Go to Features > MCP Servers
  3. Click "+ Add New MCP Server"
  4. Enter the following:
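A sketch of the form fields (the server name is your choice; "firecrawl-mcp" is illustrative):
  • Name: "firecrawl-mcp"
  • Type: "command"
  • Command: `env FIRECRAWL_API_KEY=your-api-key npx -y firecrawl-mcp`
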
To configure Firecrawl MCP in Cursor v0.48.6:
  1. Open Cursor Settings
  2. Go to Features > MCP Servers
  3. Click "+ Add new global MCP server"
  4. Enter the following code:
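A configuration sketch (the server name "firecrawl-mcp" and key placeholder are illustrative):

```json
{
  "mcpServers": {
    "firecrawl-mcp": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "YOUR-API-KEY"
      }
    }
  }
}
```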
If you are using Windows and are running into issues, try: `cmd /c "set FIRECRAWL_API_KEY=your-api-key && npx -y firecrawl-mcp"`
Replace `your-api-key` with your Firecrawl API key. If you don't have one yet, you can create an account and get it from https://www.firecrawl.dev/app/api-keys.
After adding, refresh the MCP server list to see the new tools. The Composer Agent will automatically use Firecrawl MCP when appropriate, but you can explicitly request it by describing your web scraping needs. Access the Composer via Command+L (Mac), select "Agent" next to the submit button, and enter your query.

Running on Windsurf

Add this to your ./codeium/windsurf/model_config.json:
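A sketch of the entry (the server name is illustrative):

```json
{
  "mcpServers": {
    "mcp-server-firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "YOUR_API_KEY"
      }
    }
  }
}
```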

Running with SSE Local Mode

To run the server using Server-Sent Events (SSE) locally instead of the default stdio transport:
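A sketch, assuming the server reads an SSE_LOCAL flag from the environment (the flag name is an assumption; verify against the project docs):

```bash
# SSE_LOCAL switches the transport from stdio to a local SSE endpoint (assumed flag)
env SSE_LOCAL=true FIRECRAWL_API_KEY=fc-YOUR_API_KEY npx -y firecrawl-mcp
```

Then point your client at the SSE endpoint, e.g. http://localhost:3000/sse (the port is an assumption and may vary with your setup).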

Installing via Smithery (Legacy)

To install Firecrawl for Claude Desktop automatically via Smithery:
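A sketch following Smithery's CLI convention (the package identifier is an assumption based on the project's GitHub org):

```bash
npx -y @smithery/cli install @mendableai/mcp-server-firecrawl --client claude
```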

Configuration

Environment Variables

Required for Cloud API

  • FIRECRAWL_API_KEY: Your Firecrawl API key
  • FIRECRAWL_API_URL (Optional): Custom API endpoint for self-hosted instances

Optional Configuration

Retry Configuration
  • FIRECRAWL_RETRY_MAX_ATTEMPTS: Maximum number of retry attempts (default: 3)
  • FIRECRAWL_RETRY_INITIAL_DELAY: Initial delay in milliseconds before first retry (default: 1000)
  • FIRECRAWL_RETRY_MAX_DELAY: Maximum delay in milliseconds between retries (default: 10000)
  • FIRECRAWL_RETRY_BACKOFF_FACTOR: Exponential backoff multiplier (default: 2)
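With these defaults, successive retry delays are 1000 ms, 2000 ms, and 4000 ms (delay = initial delay × backoff factor^(attempt − 1), capped at the maximum delay).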
Credit Usage Monitoring
  • FIRECRAWL_CREDIT_WARNING_THRESHOLD: Credit usage warning threshold (default: 1000)
  • FIRECRAWL_CREDIT_CRITICAL_THRESHOLD: Credit usage critical threshold (default: 100)

Configuration Examples

For cloud API usage with custom retry and credit monitoring:
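A sketch using the environment variables documented above (the tuned values are illustrative):

```bash
# Required for cloud API
export FIRECRAWL_API_KEY=your-api-key

# Optional retry tuning
export FIRECRAWL_RETRY_MAX_ATTEMPTS=5
export FIRECRAWL_RETRY_INITIAL_DELAY=2000
export FIRECRAWL_RETRY_MAX_DELAY=30000
export FIRECRAWL_RETRY_BACKOFF_FACTOR=3

# Optional credit monitoring
export FIRECRAWL_CREDIT_WARNING_THRESHOLD=2000
export FIRECRAWL_CREDIT_CRITICAL_THRESHOLD=500
```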
For self-hosted instance:
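A self-hosted sketch (the URL is a placeholder):

```bash
export FIRECRAWL_API_URL=https://firecrawl.your-domain.com
# Whether a key is needed depends on your instance's auth setup (assumption)
export FIRECRAWL_API_KEY=your-api-key
```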

Usage with Claude Desktop

Add this to your claude_desktop_config.json:
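A minimal sketch (the server name and key placeholder are illustrative):

```json
{
  "mcpServers": {
    "mcp-server-firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "YOUR_API_KEY_HERE"
      }
    }
  }
}
```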

System Configuration

The server includes several configurable parameters that can be set via environment variables. Here are the default values if not configured:
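A sketch of the defaults, expressed as an equivalent configuration object (the field names are illustrative; the values mirror the environment-variable defaults listed above):

```json
{
  "retry": {
    "maxAttempts": 3,
    "initialDelay": 1000,
    "maxDelay": 10000,
    "backoffFactor": 2
  },
  "credit": {
    "warningThreshold": 1000,
    "criticalThreshold": 100
  }
}
```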
These configurations control:
  1. Retry Behavior
  2. Credit Usage Monitoring

Rate Limiting and Batch Processing

The server utilizes Firecrawl's built-in rate limiting and batch processing capabilities:
  • Automatic rate limit handling with exponential backoff
  • Efficient parallel processing for batch operations
  • Smart request queuing and throttling
  • Automatic retries for transient errors

Available Tools

1. Scrape Tool (`firecrawl_scrape`)

Scrape content from a single URL with advanced options.
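An example tool call; the option set is illustrative, drawn from the features listed above (JS rendering waits, viewport, and tag filtering):

```json
{
  "name": "firecrawl_scrape",
  "arguments": {
    "url": "https://example.com",
    "formats": ["markdown"],
    "onlyMainContent": true,
    "waitFor": 1000,
    "timeout": 30000,
    "mobile": false,
    "includeTags": ["article", "main"],
    "excludeTags": ["nav", "footer"]
  }
}
```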

2. Batch Scrape Tool (`firecrawl_batch_scrape`)

Scrape multiple URLs efficiently with built-in rate limiting and parallel processing.
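An example call (the URLs are placeholders):

```json
{
  "name": "firecrawl_batch_scrape",
  "arguments": {
    "urls": ["https://example1.com", "https://example2.com"],
    "options": {
      "formats": ["markdown"],
      "onlyMainContent": true
    }
  }
}
```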
Response includes operation ID for status checking:
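A response sketch (the exact wording and ID format may differ):

```json
{
  "content": [
    {
      "type": "text",
      "text": "Batch operation queued with ID: batch_1. Use firecrawl_check_batch_status to check progress."
    }
  ],
  "isError": false
}
```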

3. Check Batch Status (`firecrawl_check_batch_status`)

Check the status of a batch operation.
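An example call (the ID echoes the batch response sketch above):

```json
{
  "name": "firecrawl_check_batch_status",
  "arguments": {
    "id": "batch_1"
  }
}
```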

4. Search Tool (`firecrawl_search`)

Search the web and optionally extract content from search results.
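An example call (the query and options are illustrative):

```json
{
  "name": "firecrawl_search",
  "arguments": {
    "query": "latest AI research papers",
    "limit": 5,
    "scrapeOptions": {
      "formats": ["markdown"],
      "onlyMainContent": true
    }
  }
}
```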

5. Crawl Tool (`firecrawl_crawl`)

Start an asynchronous crawl with advanced options.
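An example call (the options are illustrative):

```json
{
  "name": "firecrawl_crawl",
  "arguments": {
    "url": "https://example.com",
    "maxDepth": 2,
    "limit": 100
  }
}
```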

6. Extract Tool (`firecrawl_extract`)

Extract structured information from web pages using LLM capabilities. Supports both cloud AI and self-hosted LLM extraction.
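An example call using the options documented below (the prompt and schema are illustrative):

```json
{
  "name": "firecrawl_extract",
  "arguments": {
    "urls": ["https://example.com/page1", "https://example.com/page2"],
    "prompt": "Extract product information including name, price, and description",
    "schema": {
      "type": "object",
      "properties": {
        "name": { "type": "string" },
        "price": { "type": "number" },
        "description": { "type": "string" }
      },
      "required": ["name", "price"]
    },
    "allowExternalLinks": false,
    "enableWebSearch": false
  }
}
```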
Example response:
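A response sketch (the values are illustrative):

```json
{
  "content": [
    {
      "type": "text",
      "text": "{\n  \"name\": \"Example Product\",\n  \"price\": 99.99,\n  \"description\": \"This is an example product description\"\n}"
    }
  ],
  "isError": false
}
```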

Extract Tool Options:

  • urls: Array of URLs to extract information from
  • prompt: Custom prompt for the LLM extraction
  • systemPrompt: System prompt to guide the LLM
  • schema: JSON schema for structured data extraction
  • allowExternalLinks: Allow extraction from external links
  • enableWebSearch: Enable web search for additional context
  • includeSubdomains: Include subdomains in extraction
When using a self-hosted instance, the extraction will use your configured LLM. For cloud API, it uses Firecrawl's managed LLM service.

7. Deep Research Tool (`firecrawl_deep_research`)

Conduct deep web research on a query using intelligent crawling, search, and LLM analysis.
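An example call (the query is illustrative; the arguments are documented below):

```json
{
  "name": "firecrawl_deep_research",
  "arguments": {
    "query": "How does carbon capture technology work?",
    "maxDepth": 3,
    "timeLimit": 120,
    "maxUrls": 50
  }
}
```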
Arguments:
  • query (string, required): The research question or topic to explore.
  • maxDepth (number, optional): Maximum recursive depth for crawling/search (default: 3).
  • timeLimit (number, optional): Time limit in seconds for the research session (default: 120).
  • maxUrls (number, optional): Maximum number of URLs to analyze (default: 50).
Returns:
  • Final analysis generated by an LLM based on the research (data.finalAnalysis).
  • May also include structured activities and sources used in the research process.

8. Generate LLMs.txt Tool (`firecrawl_generate_llmstxt`)

Generate a standardized llms.txt (and optionally llms-full.txt) file for a given domain. This file defines how large language models should interact with the site.
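An example call (the arguments are documented below):

```json
{
  "name": "firecrawl_generate_llmstxt",
  "arguments": {
    "url": "https://example.com",
    "maxUrls": 20,
    "showFullText": true
  }
}
```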
Arguments:
  • url (string, required): The base URL of the website to analyze.
  • maxUrls (number, optional): Max number of URLs to include (default: 10).
  • showFullText (boolean, optional): Whether to include llms-full.txt contents in the response.
Returns:
  • Generated llms.txt file contents and optionally the llms-full.txt (data.llmstxt and/or data.llmsfulltxt)

Logging System

The server includes comprehensive logging:
  • Operation status and progress
  • Performance metrics
  • Credit usage monitoring
  • Rate limit tracking
  • Error conditions
Example log messages:
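Illustrative log lines (the format is a sketch, not the server's verbatim output):

```
[INFO] Firecrawl MCP Server initialized successfully
[INFO] Starting scrape for URL: https://example.com
[INFO] Batch operation queued with ID: batch_1
[WARNING] Credit usage has reached warning threshold
[ERROR] Rate limit exceeded, retrying in 2s...
```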

Error Handling

The server provides robust error handling:
  • Automatic retries for transient errors
  • Rate limit handling with backoff
  • Detailed error messages
  • Credit usage warnings
  • Network resilience
Example error response:
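A sketch of an MCP error result (the message text is illustrative):

```json
{
  "content": [
    {
      "type": "text",
      "text": "Error: Rate limit exceeded. Retrying in 2 seconds..."
    }
  ],
  "isError": true
}
```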

Development

Contributing

  1. Fork the repository
  2. Create your feature branch
  3. Run tests: `npm test`
  4. Submit a pull request

License

MIT License - see LICENSE file for details
