
Firecrawl MCP Server

Browser Automation · TypeScript
Powerful web scraping and crawling for LLM clients
Available Tools

scrape

Fetches and returns the content of a specified URL. Useful for retrieving information from a single web page.

Parameters: url, options
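
A scrape call might look like the following; the option key shown (an output format) is illustrative rather than taken from the server's actual schema:

{
  "name": "scrape",
  "arguments": {
    "url": "https://example.com/article",
    "options": { "format": "markdown" }
  }
}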

map

Creates a structured map of a website by exploring its links to a specified depth. Returns the site structure with URLs and their relationships.

Parameters: url, depth, options
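
For example, mapping a site two levels deep; the page-limit option is an illustrative assumption, not a confirmed parameter name:

{
  "name": "map",
  "arguments": {
    "url": "https://example.com",
    "depth": 2,
    "options": { "limit": 100 }
  }
}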

crawl

Systematically visits and collects content from multiple pages on a website starting from a given URL. Can follow links to a specified depth.

Parameters: url, depth, options
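
A crawl restricted to a documentation subtree might be requested like this; the limit option is an assumption to verify against the tool's schema:

{
  "name": "crawl",
  "arguments": {
    "url": "https://example.com/docs",
    "depth": 2,
    "options": { "limit": 25 }
  }
}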

search

Performs a search query using a specified search engine and returns the results. Useful for finding relevant web pages on a topic.

Parameters: query, options
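
A search call could look like this; the result-limit key is illustrative:

{
  "name": "search",
  "arguments": {
    "query": "llms.txt specification",
    "options": { "limit": 5 }
  }
}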

extract

Extracts specific structured data from a web page based on provided selectors or patterns. Useful for pulling specific information like prices, dates, or contact details.

Parameters: url, schema, options
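
An extract request pairs a URL with a rough description of the fields to pull; the schema shape below is a sketch, not the server's exact format:

{
  "name": "extract",
  "arguments": {
    "url": "https://example.com/product/123",
    "schema": {
      "name": "string",
      "price": "string",
      "inStock": "boolean"
    }
  }
}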

deep_research

Conducts comprehensive research on a topic by crawling multiple relevant pages and extracting key information. Combines search, crawl, and extract capabilities.

Parameters: query, options
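
A deep_research call might bound the work with depth and time options; both option names here are assumptions to check against the tool's schema:

{
  "name": "deep_research",
  "arguments": {
    "query": "current browser support for WebGPU",
    "options": { "maxDepth": 2, "timeLimit": 120 }
  }
}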

generate_llmstxt

Generates an llms.txt file for a website, which contains guidelines for how LLMs should interact with the site. Helps ensure compliance with site policies.

Parameters: url
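
Since the tool takes only a URL, a call is minimal:

{
  "name": "generate_llmstxt",
  "arguments": {
    "url": "https://example.com"
  }
}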

Firecrawl is a robust web scraping and crawling tool designed specifically for LLM clients like Cursor and Claude. It enables AI assistants to access and process web content in real time, allowing them to retrieve up-to-date information, conduct research, and extract structured data from websites. With Firecrawl, LLMs can navigate the web autonomously, following links, mapping site structures, and performing targeted searches. The tool offers advanced capabilities such as deep research across multiple pages and automatic extraction of specific data points from web content, making it an essential extension for any AI assistant that needs to work with internet resources.

Installation

Prerequisites

  • Node.js (v16 or higher)
  • npm or pnpm

Setup Instructions

  1. Clone the repository:

    git clone https://github.com/mendableai/firecrawl-mcp-server.git
    cd firecrawl-mcp-server
    
  2. Install dependencies:

    npm install
    # or
    pnpm install
    
  3. Start the server:

    npm start
    # or
    pnpm start
    

The server will start on port 3000 by default.
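
A quick way to confirm the server is reachable on the default port (any HTTP response indicates it is up):

curl -i http://localhost:3000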

Configuration for LLM Clients

To use Firecrawl with your LLM client (like Claude or Cursor), you'll need to add the server configuration to your client's settings.

For Claude Desktop

Add the following to your Claude Desktop configuration:

{
  "mcpServers": {
    "firecrawl": {
      "url": "http://localhost:3000"
    }
  }
}
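
If you prefer Claude Desktop to launch the server itself rather than connect to an already-running instance, a command-based entry is also possible; the path and entry script below are illustrative and should be adjusted to your checkout and build output:

{
  "mcpServers": {
    "firecrawl": {
      "command": "node",
      "args": ["/absolute/path/to/firecrawl-mcp-server/dist/index.js"]
    }
  }
}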

For Self-Hosted Deployment

You can also deploy Firecrawl as a self-hosted service using Docker:

docker build -t firecrawl-mcp-server .
docker run -p 3000:3000 firecrawl-mcp-server
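
To expose the container on a different host port while keeping the container's default, adjust the port mapping and point your client configuration at the new port:

docker run -p 8080:3000 firecrawl-mcp-server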

Usage

Once installed and configured, your LLM client can use Firecrawl's tools to scrape and analyze web content. The tools are designed to be intuitive for LLMs to use, with clear parameters and responses.

Best Practices

  1. Start with specific URLs: When using the scrape or extract tools, provide specific URLs rather than general domains.

  2. Use appropriate tools for the task:

    • Use scrape for single-page content retrieval
    • Use crawl for exploring multiple linked pages
    • Use map for understanding site structure
    • Use search for finding specific information
    • Use extract for pulling structured data from pages
    • Use deep_research for comprehensive multi-page analysis
  3. Handle rate limiting: Be mindful of rate limits when crawling websites. Keep the crawl tool's depth (and any page limit) conservative (see the sketch after this list).

  4. Process large responses: Some web pages may return large amounts of content. Be prepared to process and summarize this information effectively.
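
As a sketch of points 3 and 4, a conservative crawl keeps depth and page count small and asks only for the main content of each page; both option keys are assumptions to verify against the tool's schema:

{
  "name": "crawl",
  "arguments": {
    "url": "https://example.com/blog",
    "depth": 1,
    "options": { "limit": 10, "onlyMainContent": true }
  }
}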

Troubleshooting

  • Connection issues: Ensure the MCP server is running and the port is accessible
  • Scraping failures: Some websites block automated requests. Check the site's robots.txt and terms of use, and try adjusting request headers such as the user agent
  • Timeout errors: For large sites or deep crawls, increase timeout settings or break the task into smaller chunks

About Model Context Protocol

Model Context Protocol (MCP) allows AI models to access external tools and services, extending their capabilities beyond their training data.
