Browserbase MCP Server

Browser Automation, TypeScript

Control and automate browsers with AI-powered web interactions

Available Tools

navigate

Navigate to a specified URL in the browser

url

click

Click on an element in the webpage

selector

fill

Fill a form field with text

selector, text

screenshot

Take a screenshot of the current page or a specific element

selector

extract

Extract structured data from the webpage

query

executeJavaScript

Execute custom JavaScript in the browser context

script

act

Perform an action described in natural language (Stagehand MCP)

instruction
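The tool list above can be summarized as a typed table. This is only a sketch: the tool and parameter names come from the list, while the string types, the optional `selector` on `screenshot`, and the `toolCall` helper are assumptions made for illustration and are not part of the server's published API.

```typescript
// Argument shapes for the tools listed above. Parameter names come from the
// list; the types (all strings) and the optional `selector` are assumptions.
interface ToolArguments {
  navigate: { url: string };
  click: { selector: string };
  fill: { selector: string; text: string };
  screenshot: { selector?: string };
  extract: { query: string };
  executeJavaScript: { script: string };
  act: { instruction: string };
}

type ToolName = keyof ToolArguments;

interface ToolCall<N extends ToolName = ToolName> {
  name: N;
  arguments: ToolArguments[N];
}

// Hypothetical helper: pairs a tool name with arguments that the compiler
// checks against the table above.
function toolCall<N extends ToolName>(name: N, args: ToolArguments[N]): ToolCall<N> {
  return { name, arguments: args };
}

// Example values are placeholders.
const fillEmail = toolCall("fill", { selector: "#email", text: "user@example.com" });
console.log(JSON.stringify(fillEmail));
```

With this shape, passing `{ url: "..." }` to `fill` would be rejected at compile time, which is the main reason to type the table at all.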

Browserbase provides browser automation through the Model Context Protocol (MCP). It combines Browserbase's cloud browser infrastructure with Stagehand's browser automation so that LLMs can interact with web pages, take screenshots, extract data, and execute JavaScript in a controlled environment. The server offers two complementary approaches. The Browserbase MCP gives direct browser control, with features such as navigation, data extraction, and console monitoring. The Stagehand MCP provides natural-language browser automation and supports multiple AI models, including GPT-4 and Claude 3.7 Sonnet.

Introduction

Browserbase MCP Server enables AI models to control web browsers through the Model Context Protocol. This integration allows LLMs to perform complex web automation tasks including navigation, data extraction, form filling, and capturing screenshots.

The repository provides two distinct but complementary approaches:

Browserbase MCP: Direct browser control with specific commands
Stagehand MCP: Natural language-based browser automation

Installation

To install the Browserbase MCP Server from source, follow these steps:

Clone the repository:

git clone https://github.com/browserbase/mcp-server-browserbase.git
cd mcp-server-browserbase

Choose which MCP you want to use:

- For Browserbase MCP: cd browserbase
- For Stagehand MCP: cd stagehand

Install dependencies and start the server:

npm install
npm start

Configure your LLM client to use the MCP server by adding the appropriate configuration to your client settings.
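As a rough illustration of the client configuration step, the sketch below builds a configuration object and prints it as JSON. Everything specific here is an assumption: the `mcpServers` layout follows the convention used by clients such as Claude Desktop, the `dist/index.js` entry point and the environment variable names are guesses that should be checked against the repository, and `buildClientConfig` is a hypothetical helper.

```typescript
// One server entry in a client's MCP configuration (assumed layout).
interface McpServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

interface ClientConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Hypothetical helper: wraps a server script path and its environment
// variables in the assumed `mcpServers` structure.
function buildClientConfig(serverPath: string, env: Record<string, string>): ClientConfig {
  return {
    mcpServers: {
      browserbase: { command: "node", args: [serverPath], env },
    },
  };
}

// The path and variable names below are placeholders, not confirmed values.
const config = buildClientConfig(
  "/path/to/mcp-server-browserbase/browserbase/dist/index.js",
  {
    BROWSERBASE_API_KEY: "<your-api-key>",
    BROWSERBASE_PROJECT_ID: "<your-project-id>",
  },
);

console.log(JSON.stringify(config, null, 2));
```

The printed JSON can then be pasted into the client's settings file, adjusted to whatever format that client actually expects.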

Using Browserbase MCP

The Browserbase MCP provides direct control over browser automation with features like:

Browser Control: Navigate to URLs, click elements, and fill forms
Data Extraction: Extract structured data from any webpage
Console Monitoring: Track and analyze browser console logs
Screenshots: Capture full-page and element screenshots
JavaScript Execution: Run custom JavaScript in the browser context

To use Browserbase MCP, you'll need to:

Start the MCP server
Connect your LLM client to the server
Use the available tools to control the browser
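To make the last step more concrete, here is a minimal sketch of the messages a client might send. MCP is built on JSON-RPC 2.0 and invokes tools with the `tools/call` method; the tool and parameter names are taken from the tool list on this page, while the URL, selectors, and the `callTool` helper are hypothetical. In practice the LLM client constructs these messages for you.

```typescript
// Shape of an MCP tool invocation (JSON-RPC 2.0, method "tools/call").
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, string> };
}

let nextId = 1;

// Hypothetical helper: wraps a tool name and its arguments in a request
// with a fresh, increasing id.
function callTool(name: string, args: Record<string, string>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// A login flow expressed as direct, selector-based commands.
// The URL and selectors are placeholders.
const requests: ToolCallRequest[] = [
  callTool("navigate", { url: "https://example.com/login" }),
  callTool("fill", { selector: "#username", text: "demo" }),
  callTool("click", { selector: "button[type=submit]" }),
  callTool("screenshot", {}),
];

for (const request of requests) {
  console.log(JSON.stringify(request));
}
```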

Using Stagehand MCP

Stagehand MCP offers a more natural language approach to browser automation:

Use atomic instructions like act("click the login button") or extract("find the red shoes")
Supports multiple AI models, including OpenAI's GPT-4 and Anthropic's Claude 3.7 Sonnet
Leverages vision capabilities with annotated screenshots for complex DOM structures

To use Stagehand MCP:

Start the MCP server
Connect your LLM client to the server
Use natural language commands to control the browser
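The idea of atomic instructions can be sketched as a list of small steps, each describing one action or one extraction. The tool names `act` and `extract` and their parameters (`instruction`, `query`) follow the tool list on this page; the helper functions and the example workflow are hypothetical, and the tool names exposed by a running Stagehand MCP server may differ.

```typescript
// One natural-language step: either an action or a data extraction.
type StagehandStep =
  | { name: "act"; arguments: { instruction: string } }
  | { name: "extract"; arguments: { query: string } };

// Hypothetical helpers that keep each step to a single, atomic instruction.
const act = (instruction: string): StagehandStep => ({
  name: "act",
  arguments: { instruction },
});

const extract = (query: string): StagehandStep => ({
  name: "extract",
  arguments: { query },
});

// A workflow broken into atomic steps rather than one long instruction.
const steps: StagehandStep[] = [
  act("click the login button"),
  act("type 'demo' into the username field"),
  extract("find the red shoes"),
];

for (const step of steps) {
  console.log(JSON.stringify(step));
}
```

Keeping each instruction to one action makes a failed step easier to identify and retry than a single compound command.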

Alternative Installation

You can also install Browserbase MCP through Smithery, which provides a simplified setup process.

Community and Support

For support and community discussions:

Join the Stagehand Slack community
Check the GitHub repository for updates and issues
Visit stagehand.dev for additional documentation