Scrapling Fetch MCP Server

Browser Automation, Python
Access text content from websites with bot protection
Available Tools

s-fetch-page

Retrieves complete web pages with pagination support. Parameters include 'url', 'mode' (basic/stealth/max-stealth), 'start_index', and 'max_length'.
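As a rough illustration of how these parameters fit together, the sketch below builds the JSON-RPC 'tools/call' request an MCP client would send for s-fetch-page. The parameter names come from the description above; the URL and the numeric values are placeholders, and passing all four parameters explicitly is an assumption rather than a documented requirement.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP JSON-RPC 2.0 'tools/call' request as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Placeholder values: the URL and sizes below are illustrative only.
page_request = build_tool_call(
    1,
    "s-fetch-page",
    {
        "url": "https://example.com/docs",
        "mode": "basic",
        "start_index": 0,
        "max_length": 5000,
    },
)

print(json.dumps(page_request, indent=2))
```

In practice an MCP client library constructs this envelope for you; the sketch only shows where each parameter ends up.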

s-fetch-pattern

Extracts content matching regex patterns with surrounding context. Parameters include 'url', 'mode' (basic/stealth/max-stealth), 'search_pattern', and 'context_chars'.
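To make "surrounding context" concrete, here is a minimal local sketch of the idea, not the server's actual implementation. It assumes 'context_chars' means a number of characters kept on each side of every match; the helper name find_with_context and the sample text are hypothetical.

```python
import re


def find_with_context(text, search_pattern, context_chars):
    """Return each regex match padded with up to context_chars on both sides."""
    snippets = []
    for match in re.finditer(search_pattern, text):
        # Clamp the window so it never runs past either end of the text.
        start = max(0, match.start() - context_chars)
        end = min(len(text), match.end() + context_chars)
        snippets.append(text[start:end])
    return snippets


sample = "Install with uv. Requires Python 3.10 or higher. Then run the server."
for snippet in find_with_context(sample, r"Python \d+\.\d+", 10):
    print(repr(snippet))
```

A larger 'context_chars' value returns more of the page around each hit, at the cost of a longer response.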

Scrapling Fetch bridges the gap between what you can see in your browser and what AI assistants can access. Built on the Scrapling library, it helps AI assistants retrieve text content from websites that use bot detection and anti-automation measures. It is optimized for low-volume retrieval of documentation, articles, and reference materials, and offers multiple protection levels that trade speed against success rate.

Overview

Scrapling Fetch is designed for retrieving documentation and reference materials from websites that use bot detection and anti-automation measures. It is not intended for general-purpose scraping or data harvesting.

Installation

To use Scrapling Fetch, you'll need:

  1. Python 3.10 or higher
  2. The uv package manager

Install the required dependencies with these commands:

uv tool install scrapling
scrapling install
uv tool install scrapling-fetch-mcp

Configuration

To configure your AI assistant to use Scrapling Fetch, add the following to your client's MCP server configuration:

{
  "mcpServers": {
    "Cyber-Chitta": {
      "command": "uvx",
      "args": ["scrapling-fetch-mcp"]
    }
  }
}
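If you would rather not edit the file by hand, the following sketch merges the entry above into an existing configuration without disturbing other servers. The function name add_scrapling_server and the file path are hypothetical, the location of the configuration file depends on your client, and treating the server key as a free-form label is an assumption based on common MCP client behavior.

```python
import json
import tempfile
from pathlib import Path


def add_scrapling_server(config_path, server_name="Cyber-Chitta"):
    """Merge a Scrapling Fetch entry into an MCP client config file."""
    path = Path(config_path)
    # Start from the existing config if there is one, so other servers survive.
    config = json.loads(path.read_text()) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers[server_name] = {"command": "uvx", "args": ["scrapling-fetch-mcp"]}
    path.write_text(json.dumps(config, indent=2))
    return config


if __name__ == "__main__":
    # Demonstrate against a throwaway file rather than a real client config.
    with tempfile.TemporaryDirectory() as tmp:
        demo_path = Path(tmp) / "mcp_config.json"
        print(json.dumps(add_scrapling_server(demo_path), indent=2))
```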

Usage Options

Protection Levels

Scrapling Fetch offers three protection levels that trade speed against success rate:

  • basic: Fast retrieval (1-2 seconds) but lower success with heavily protected sites
  • stealth: Balanced protection (3-8 seconds) that works with most sites
  • max-stealth: Maximum protection (10+ seconds) for heavily protected sites

It's recommended to start with basic mode and only escalate to higher protection levels if needed.
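The escalation strategy can be sketched as a simple loop over the three modes. This is an illustration under stated assumptions: the 'fetch' callable stands in for however your client invokes s-fetch-page, and treating an empty or missing result as "blocked" is a simplification. The names fetch_with_escalation and fake_fetch are hypothetical.

```python
from typing import Callable, Optional, Tuple

# Ordered from fastest to most thorough, as described above.
MODES = ("basic", "stealth", "max-stealth")


def fetch_with_escalation(
    url: str, fetch: Callable[[str, str], Optional[str]]
) -> Tuple[Optional[str], Optional[str]]:
    """Try each mode in order; return (mode, content) for the first success."""
    for mode in MODES:
        content = fetch(url, mode)
        if content:
            return mode, content
    return None, None


def fake_fetch(url: str, mode: str) -> Optional[str]:
    """Stand-in for a real tool call: pretends the site blocks 'basic' mode."""
    return None if mode == "basic" else f"content of {url} via {mode}"


mode_used, content = fetch_with_escalation("https://example.com/docs", fake_fetch)
print(mode_used, "->", content)
```

Starting with the cheapest mode keeps typical requests fast and reserves the slower modes for sites that actually need them.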

Content Targeting

Depending on your needs, you can use different approaches to retrieve content:

  • Use s-fetch-page to retrieve entire pages with pagination support
  • Use s-fetch-pattern to extract specific content using regular expressions

For large documents, page through the content with the 'start_index' and 'max_length' parameters of s-fetch-page, or use s-fetch-pattern when you only need specific information from a large page.
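Paging through a large document might look like the sketch below. It assumes each call returns at most 'max_length' characters of text beginning at 'start_index', and that a short or empty chunk signals the end; the real server may wrap its response in extra metadata, so treat this as an outline of the loop rather than a drop-in client. The names fetch_all and fake_fetch_page are hypothetical.

```python
def fetch_all(url, fetch_page, max_length=5000, mode="basic"):
    """Collect a long page chunk by chunk using start_index and max_length."""
    chunks = []
    start_index = 0
    while True:
        chunk = fetch_page(
            url=url, mode=mode, start_index=start_index, max_length=max_length
        )
        if not chunk:
            break  # nothing left to read
        chunks.append(chunk)
        start_index += len(chunk)
        if len(chunk) < max_length:
            break  # a short chunk means the end of the document was reached
    return "".join(chunks)


# Stand-in for a real tool call: serves slices of an in-memory document.
DOCUMENT = "x" * 12000


def fake_fetch_page(url, mode, start_index, max_length):
    return DOCUMENT[start_index:start_index + max_length]


full_text = fetch_all("https://example.com/docs", fake_fetch_page)
print(len(full_text))
```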

Limitations

  • Designed only for text content (documentation, articles, reference materials)
  • Not designed for high-volume scraping or data harvesting
  • May not work with sites requiring authentication
  • Performance varies by site complexity
