Retrieves web page content from a specified URL using a Playwright headless browser with intelligent content extraction
Batch retrieves web page content from multiple URLs in parallel using multi-tab fetching for improved performance
Fetcher is a powerful web content retrieval tool that uses Playwright's headless browser to fetch and process web pages. Unlike traditional web scrapers, it fully executes JavaScript, making it capable of handling dynamic web content and modern web applications. It features intelligent content extraction with a built-in Readability algorithm that automatically removes ads, navigation, and other non-essential elements from web pages. The tool offers flexible output formats (HTML or Markdown), parallel processing for batch operations, and configurable parameters for fine-tuned control. It optimizes resource usage by blocking unnecessary elements like images and stylesheets, and includes robust error handling to ensure reliable operation even with problematic web pages.
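The resource blocking described above can be illustrated with a small sketch. It assumes a route handler in the shape used by Playwright's Python sync API (an object exposing `request.resource_type`, `abort()`, and `continue_()`). The helper names and the exact set of blocked types are illustrative and are not taken from Fetcher's own source.

```python
# Resource types to skip. "image" and "stylesheet" follow the description
# above; the real tool may block more (assumption, not confirmed here).
BLOCKED_RESOURCE_TYPES = frozenset({"image", "stylesheet"})


def should_block(resource_type: str) -> bool:
    """Return True when a request of this type can be skipped."""
    return resource_type in BLOCKED_RESOURCE_TYPES


def handle_route(route) -> None:
    """Route handler in the shape Playwright's sync API expects.

    Hypothetical wiring: page.route("**/*", handle_route)
    """
    if should_block(route.request.resource_type):
        route.abort()       # drop the request before it reaches the network
    else:
        route.continue_()   # let documents, scripts, and XHR through
```

Keeping the decision in a pure function (`should_block`) makes the policy easy to check without launching a browser.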
You can install and run Fetcher MCP in several ways:
Run directly with npx:
npx -y fetcher-mcp
For first-time setup, install the required browser:
npx playwright install chromium
Run with Docker:
docker run -p 3000:3000 ghcr.io/jae-jae/fetcher-mcp:latest
Or deploy with Docker Compose by creating a docker-compose.yml file:
version: "3.8"
services:
  fetcher-mcp:
    image: ghcr.io/jae-jae/fetcher-mcp:latest
    container_name: fetcher-mcp
    restart: unless-stopped
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
    volumes:
      - /tmp:/tmp
    healthcheck:
      test: ["CMD", "wget", "--spider", "-q", "http://localhost:3000"]
      interval: 30s
      timeout: 10s
      retries: 3
Then run:
docker-compose up -d
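After the container starts, a script may want to wait until the server answers before sending requests. The sketch below loosely mirrors the healthcheck settings from the Compose file (URL, 10-second timeout, 30-second interval, 3 tries). The function names are hypothetical, and it issues a GET where `wget --spider` only checks that the resource exists, so treat it as an approximation.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str = "http://localhost:3000", timeout: float = 10.0) -> bool:
    """Return True if the URL answers with a 2xx or 3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_until_healthy(
    probe: Callable[[], bool],
    retries: int = 3,
    interval: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe() up to `retries` times, pausing `interval` seconds between tries."""
    for attempt in range(retries):
        if probe():
            return True
        if attempt < retries - 1:
            sleep(interval)  # no pause after the final attempt
    return False
```

A typical call would be `wait_until_healthy(http_probe)`. The `probe` and `sleep` parameters are injectable so the retry logic can be exercised without a running server.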
Start the server with HTTP and SSE transport:
npx -y fetcher-mcp --log --transport=http --host=0.0.0.0 --port=3000
This provides two endpoints:
/mcp - Streamable HTTP endpoint (modern MCP protocol)
/sse - SSE endpoint (legacy MCP protocol)
To show the browser window for debugging:
npx -y fetcher-mcp --debug
Add this configuration to your Claude Desktop config file:
On macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
On Windows: %APPDATA%/Claude/claude_desktop_config.json
{
  "mcpServers": {
    "fetcher": {
      "command": "npx",
      "args": ["-y", "fetcher-mcp"]
    }
  }
}
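If the config file already lists other servers, the entry can be merged in rather than pasted by hand. This is a minimal sketch, assuming the file is plain JSON at one of the paths above; the function name `add_fetcher_server` is hypothetical.

```python
import json
from pathlib import Path


def add_fetcher_server(config_path: Path) -> dict:
    """Add the "fetcher" entry to a Claude Desktop config, keeping other servers."""
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():  # tolerate an empty file
            config = json.loads(text)

    # Create "mcpServers" if missing, then add or overwrite only "fetcher".
    servers = config.setdefault("mcpServers", {})
    servers["fetcher"] = {"command": "npx", "args": ["-y", "fetcher-mcp"]}

    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

On macOS this could be called as `add_fetcher_server(Path.home() / "Library/Application Support/Claude/claude_desktop_config.json")`. Claude Desktop likely needs a restart before it picks up the change.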
For websites with anti-crawler mechanisms, include in your prompt:
Please wait for the page to fully load
For slow-loading websites:
Please set the page loading timeout to 60 seconds
To preserve original HTML structure:
Please preserve the original HTML content
To fetch complete page content:
Please fetch the complete webpage content instead of just the main content
To return content as HTML:
Please return the content in HTML format
To display the browser window during a specific fetch operation:
Please enable debug mode for this fetch operation
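Prompts like these are turned into tool arguments by the client. As a rough illustration only, the sketch below builds an MCP `tools/call` JSON-RPC request of the kind a client might send. The tool name `fetch_url` and the argument names `timeout` (milliseconds) and `returnHtml` are assumptions not confirmed by this document; check the server's tool listing for the actual schema.

```python
import json


def build_fetch_request(url: str, request_id: int = 1, **arguments) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request for a single URL."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "fetch_url",  # assumed tool name
            "arguments": {"url": url, **arguments},
        },
    }
    return json.dumps(payload)


# Hypothetical equivalent of "set the page loading timeout to 60 seconds"
# combined with "return the content in HTML format".
request_body = build_fetch_request(
    "https://example.com", timeout=60000, returnHtml=True
)
```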