MCP Azure OpenAI Web Browsing
Browser Automation · Python
A Model Context Protocol server for Browser Automation

About this MCP

MCP Azure OpenAI Web Browsing is a minimal implementation that connects Azure OpenAI to web browser automation through the Model Context Protocol (MCP). It uses Playwright for browser control and includes a bridge that converts MCP server responses to the OpenAI function calling format. This lets a model navigate websites, interact with page elements, and carry out automated tasks through a standardized protocol. The project includes both server and client components, so it can be integrated into existing AI applications that need web automation.

Documentation

Overview

MCP Azure OpenAI Web Browsing provides a Model Context Protocol server that lets AI models control a web browser through Azure OpenAI. The server component is built with FastMCP, and browser automation is handled by Playwright.

Installation

Prerequisites

  • Python 3.8+
  • Azure OpenAI API access

Setup Instructions

  1. Clone the repository:

    git clone https://github.com/kimtth/mcp-aoai-web-browsing.git
    cd mcp-aoai-web-browsing
    
  2. Set up environment variables:

    • Rename .env.template to .env
    • Fill in your Azure OpenAI credentials:
      AZURE_OPEN_AI_ENDPOINT=your_endpoint
      AZURE_OPEN_AI_API_KEY=your_api_key
      AZURE_OPEN_AI_DEPLOYMENT_MODEL=your_model
      AZURE_OPEN_AI_API_VERSION=your_api_version
      
  3. Install dependencies using uv (recommended):

    pip install uv
    uv sync
    
  4. Launch the application:

    python chatgui.py
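
    If the application fails at startup, a missing or blank credential is a likely cause. The sketch below checks the four variables named in step 2 before launch. It is a minimal illustration, not code from the repository: the helper name load_azure_settings is hypothetical, and it reads the process environment only, so it assumes the values from .env have already been loaded (for example by a dotenv loader).

```python
import os
from typing import Mapping

# Variable names as listed in the .env template in step 2.
REQUIRED_VARS = (
    "AZURE_OPEN_AI_ENDPOINT",
    "AZURE_OPEN_AI_API_KEY",
    "AZURE_OPEN_AI_DEPLOYMENT_MODEL",
    "AZURE_OPEN_AI_API_VERSION",
)


def load_azure_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Return the four settings, or raise if any is missing or blank.

    Hypothetical helper for illustration; the project may load its
    configuration differently.
    """
    missing = [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name].strip() for name in REQUIRED_VARS}
```

    Calling load_azure_settings() with no argument checks the real environment; passing a plain dict makes the check easy to test.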
    

Available Tools

The MCP server provides several Playwright-based tools for web automation:

Browser Navigation

  • playwright_navigate(url, timeout=30000, wait_until="load"): Navigate to a specified URL
  • playwright_go_back(): Navigate back in browser history
  • playwright_go_forward(): Navigate forward in browser history
  • playwright_reload(): Reload the current page

Page Interaction

  • playwright_click(selector): Click on an element matching the selector
  • playwright_fill(selector, value): Fill a form field with text
  • playwright_press(selector, key): Press a key on an element
  • playwright_get_text(selector): Get text content from an element
  • playwright_get_attribute(selector, name): Get attribute value from an element
  • playwright_extract_selectors(content): Extract selectors from page content

Browser Control

  • playwright_screenshot(): Take a screenshot of the current page
  • playwright_get_page_content(): Get the HTML content of the current page
  • playwright_close(): Close the browser
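
Each tool signature above doubles as the description a model sees: parameter names, types, and defaults become a JSON Schema for the tool's inputs. The sketch below imitates that idea with the standard library only. It is not FastMCP's actual implementation, the type hints on playwright_navigate are assumptions (the listing gives no types), and the function body is a stub that launches no browser.

```python
import inspect

# Assumed mapping from Python types to JSON Schema types (illustrative only).
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def playwright_navigate(url: str, timeout: int = 30000, wait_until: str = "load") -> str:
    """Navigate to a specified URL."""
    return f"Navigated to {url}"  # stub: no browser is involved here


def describe_tool(func) -> dict:
    """Build an MCP-style tool description from a function signature."""
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        # Unannotated parameters fall back to "string" in this sketch.
        schema = {"type": JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the caller must supply it
        else:
            schema["default"] = param.default
        properties[name] = schema
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "inputSchema": {"type": "object", "properties": properties, "required": required},
    }
```

Here describe_tool(playwright_navigate) marks url as the only required input and records 30000 and "load" as defaults for the other two.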

Usage Example

Once the application is running, you can interact with it through the provided GUI. Enter prompts that instruct the AI to perform web tasks, such as:

  • "Navigate to example.com and click on the first link"
  • "Go to a search engine and search for 'Model Context Protocol'"
  • "Fill out a contact form on a website"

The AI will use the Playwright tools to execute these commands in a controlled browser environment.
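
Behind a prompt such as the first one, the model typically replies with a sequence of tool calls, each carrying a tool name and a JSON string of arguments. The sketch below shows one way such calls could be dispatched. The two tool functions are stubs standing in for the real Playwright tools, the helper run_tool_call is hypothetical, and the call sequence and the selector "a" are invented for illustration; actual model output will vary.

```python
import json


def playwright_navigate(url, timeout=30000, wait_until="load"):
    return f"navigated to {url}"  # stub in place of the real tool


def playwright_click(selector):
    return f"clicked {selector}"  # stub in place of the real tool


# Registry keyed by tool name, as the model refers to tools by name.
TOOLS = {func.__name__: func for func in (playwright_navigate, playwright_click)}


def run_tool_call(name: str, arguments_json: str) -> str:
    """Look up a tool by name and call it with JSON-encoded keyword arguments."""
    if name not in TOOLS:
        return f"error: unknown tool {name!r}"
    try:
        kwargs = json.loads(arguments_json or "{}")
        return TOOLS[name](**kwargs)
    except (json.JSONDecodeError, TypeError) as exc:
        # Malformed JSON or wrong/missing arguments are reported, not raised,
        # so the error text can be sent back to the model.
        return f"error: {exc}"


# Hypothetical calls for "Navigate to example.com and click on the first link".
calls = [
    ("playwright_navigate", '{"url": "https://example.com"}'),
    ("playwright_click", '{"selector": "a"}'),
]
results = [run_tool_call(name, args) for name, args in calls]
```

Returning errors as strings rather than raising is a design choice in this sketch: it lets the model see what went wrong and retry with corrected arguments.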

Architecture

The implementation consists of three main components:

  1. MCP Server: Built with FastMCP, it exposes web automation capabilities through standardized tools.
  2. Client Bridge: Converts MCP server responses to OpenAI function calling format.
  3. GUI Client: Provides a user interface for interacting with the AI and visualizing browser automation.

Separating the server, bridge, and GUI is intended to keep the connection between the AI model and the browser automation tools stable and to make it straightforward to add further tools.
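
The bridge's job can be illustrated with the tool listing itself. An MCP server describes each tool with a name, a description, and an inputSchema, while the OpenAI function calling format wraps the same information in a "function" entry with "parameters". The sketch below shows that mapping under the assumption that tools arrive as plain dictionaries; the function name mcp_tool_to_openai is hypothetical and the project's own bridge may differ.

```python
def mcp_tool_to_openai(tool: dict) -> dict:
    """Convert one MCP tool description into an OpenAI function calling entry."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            # MCP's inputSchema is already JSON Schema, which is what
            # OpenAI expects under "parameters".
            "parameters": tool.get("inputSchema", {"type": "object", "properties": {}}),
        },
    }


# Example input shaped like an MCP tool listing entry (values are illustrative).
mcp_tool = {
    "name": "playwright_click",
    "description": "Click on an element matching the selector",
    "inputSchema": {
        "type": "object",
        "properties": {"selector": {"type": "string"}},
        "required": ["selector"],
    },
}
openai_tool = mcp_tool_to_openai(mcp_tool)
```

The resulting list of entries is what a client would pass as the tools argument of a chat completion request.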


About Model Context Protocol

Model Context Protocol (MCP) allows AI models to access external tools and services, extending their capabilities beyond their training data.
