Hugging Face Dataset Viewer MCP Server

Data Science Tools | Python

Browse and analyze datasets hosted on the Hugging Face Hub

Available Tools

validate

Check if a dataset exists and is accessible

Parameters: dataset, auth_token

get_info

Get detailed information about a dataset

Parameters: dataset, auth_token

get_rows

Get paginated contents of a dataset

Parameters: dataset, config, split, page, auth_token

get_first_rows

Get first rows from a dataset split

Parameters: dataset, config, split, auth_token

get_statistics

Get statistics about a dataset split

Parameters: dataset, config, split, auth_token

search_dataset

Search for text within a dataset

Parameters: dataset, config, split, query, auth_token

filter

Filter rows using SQL-like conditions

Parameters: dataset, config, split, where, orderby, page, auth_token

get_parquet

Download entire dataset in Parquet format

Parameters: dataset, auth_token
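The parameter names listed for each tool can be collected into a small lookup table, which is handy for catching a misspelled or misplaced argument before a tool call is sent. This is a minimal sketch: the table copies the names from the list above, while the helper `check_arguments` and the placeholder values `org/name`, `default`, and `train` are illustrative and not part of the server.

```python
# Parameter names per tool, copied from the tool list above.
TOOL_PARAMS = {
    "validate": ("dataset", "auth_token"),
    "get_info": ("dataset", "auth_token"),
    "get_rows": ("dataset", "config", "split", "page", "auth_token"),
    "get_first_rows": ("dataset", "config", "split", "auth_token"),
    "get_statistics": ("dataset", "config", "split", "auth_token"),
    "search_dataset": ("dataset", "config", "split", "query", "auth_token"),
    "filter": ("dataset", "config", "split", "where", "orderby", "page", "auth_token"),
    "get_parquet": ("dataset", "auth_token"),
}


def check_arguments(tool: str, arguments: dict) -> list:
    """Return the argument names that the given tool does not list."""
    if tool not in TOOL_PARAMS:
        raise KeyError(f"unknown tool: {tool}")
    allowed = set(TOOL_PARAMS[tool])
    return sorted(name for name in arguments if name not in allowed)


# Hypothetical calls: the dataset, config, and split values are placeholders.
print(check_arguments("get_rows", {"dataset": "org/name", "config": "default",
                                   "split": "train", "page": 0}))  # []
print(check_arguments("validate", {"dataset": "org/name", "split": "train"}))  # ['split']
```

Which parameters are required and which are optional is not stated in the list, so the helper only flags names a tool does not accept.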

The Hugging Face Dataset Viewer MCP server lets a client explore, search, and analyze datasets hosted on the Hugging Face Hub. Its tools validate that a dataset exists and is accessible, retrieve dataset information, page through rows, search for text, and filter rows with SQL-like conditions. Where relevant, tools accept a dataset configuration and split, plus an authentication token for private datasets. The server also returns statistics for a dataset split and can download an entire dataset in Parquet format.

Installation

Prerequisites

Python 3.12 or higher
uv - Fast Python package installer and resolver

Setup Instructions

Clone the repository:

git clone https://github.com/privetin/dataset-viewer.git
cd dataset-viewer

Create and activate a virtual environment:

# Create virtual environment
uv venv

# Activate virtual environment
# On Unix:
source .venv/bin/activate
# On Windows:
.venv\Scripts\activate

Install in development mode:

uv add -e .

Configuration

Environment Variables

Set the HUGGINGFACE_TOKEN environment variable to your Hugging Face API token to access private datasets.

Claude Desktop Integration

Add the MCP server configuration to your Claude Desktop config file:

Windows: %APPDATA%\Claude\claude_desktop_config.json
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "dataset-viewer": {
      "command": "uv",
      "args": [
        "--directory",
        "PATH_TO_YOUR_DATASET_VIEWER_DIRECTORY",
        "run",
        "dataset-viewer"
      ]
    }
  }
}

Replace PATH_TO_YOUR_DATASET_VIEWER_DIRECTORY with the actual path to where you cloned the repository.
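If the Claude Desktop config file already defines other MCP servers, pasting the snippet over it would drop them. The sketch below shows one way to merge the entry into an existing configuration. The helper name `add_server` and the path `/path/to/dataset-viewer` are illustrative; reading and writing the actual file is left out.

```python
import json


def add_server(config: dict, repo_dir: str) -> dict:
    """Return a copy of config with the dataset-viewer entry added or replaced."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["dataset-viewer"] = {
        "command": "uv",
        "args": ["--directory", repo_dir, "run", "dataset-viewer"],
    }
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing config with one unrelated server entry.
existing = {"mcpServers": {"other-server": {"command": "other", "args": []}}}
updated = add_server(existing, "/path/to/dataset-viewer")
print(json.dumps(updated, indent=2))
```

The function copies the dictionaries it changes, so the original configuration object is left untouched and other server entries are preserved.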

Usage

Once installed, you can use the Dataset Viewer MCP to interact with Hugging Face datasets. The MCP uses the dataset:// URI scheme for accessing datasets.

Basic Operations

Validate a dataset's existence and accessibility
Get detailed information about datasets
Browse dataset contents with pagination
View statistics and analyze dataset characteristics
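To make the pagination step concrete, here is a sketch of how a page number could map onto a request to the public Hugging Face dataset viewer REST API, whose `/rows` endpoint takes `offset` and `length` parameters. Whether this server uses exactly this mapping is an assumption: the zero-based page numbering, the page size of 100, and the helper name `rows_url` are illustrative, and `org/name` is a placeholder dataset.

```python
from urllib.parse import urlencode

API_BASE = "https://datasets-server.huggingface.co"
PAGE_SIZE = 100  # assumption: one page equals the largest slice /rows returns


def rows_url(dataset: str, config: str, split: str, page: int = 0) -> str:
    """Build a /rows URL for the given zero-based page (assumed numbering)."""
    if page < 0:
        raise ValueError("page must be non-negative")
    params = {
        "dataset": dataset,
        "config": config,
        "split": split,
        "offset": page * PAGE_SIZE,  # index of the first row on this page
        "length": PAGE_SIZE,         # number of rows requested
    }
    return f"{API_BASE}/rows?{urlencode(params)}"


# Placeholder dataset, config, and split names.
print(rows_url("org/name", "default", "train", page=2))
```

The function only builds the URL, so it can be checked without network access; fetching it is a single HTTP GET.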

Working with Private Datasets

For private datasets, provide an authentication token either through the HUGGINGFACE_TOKEN environment variable or through the auth_token parameter of the relevant tool.
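The two ways of supplying a token can be sketched as a small resolution step. The order of precedence is an assumption here (an explicit parameter wins over the environment variable), as are the helper names `resolve_token` and `auth_headers`; the `Authorization: Bearer` header is the standard form the Hugging Face Hub accepts.

```python
import os
from typing import Mapping, Optional


def resolve_token(auth_token: Optional[str] = None,
                  environ: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Pick the explicit token if given, else HUGGINGFACE_TOKEN, else None."""
    env = os.environ if environ is None else environ
    # Assumption: a token passed as a parameter overrides the environment.
    return auth_token or env.get("HUGGINGFACE_TOKEN") or None


def auth_headers(token: Optional[str]) -> dict:
    """Return request headers for the token, or no headers for public access."""
    return {"Authorization": f"Bearer {token}"} if token else {}


# Hypothetical token values, passed in a plain dict instead of the real environment.
print(auth_headers(resolve_token(None, {"HUGGINGFACE_TOKEN": "hf_from_env"})))
print(auth_headers(resolve_token("hf_from_param", {"HUGGINGFACE_TOKEN": "hf_from_env"})))
```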

Advanced Features

Search for specific text within datasets
Filter rows using SQL-like conditions
Sort results using ORDER BY clauses
Download entire datasets in Parquet format for offline analysis
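As an illustration of filtering and sorting, the sketch below assembles arguments for the filter tool. The exact condition syntax this server accepts is not documented here; the quoting follows the convention of the Hugging Face dataset viewer `/filter` endpoint (column names in double quotes, string values in single quotes), which is assumed to carry over. The helper names and the column names `label` and `text` are hypothetical.

```python
ALLOWED_OPS = {"=", "<>", "<", "<=", ">", ">="}


def quote_column(name: str) -> str:
    """Wrap a column name in double quotes, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value) -> str:
    """Render a Python value as a SQL-style literal."""
    if isinstance(value, bool):  # check bool first: it is a subclass of int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def filter_arguments(dataset, config, split, column, op, value,
                     order_column=None, descending=False) -> dict:
    """Build an argument dict for the filter tool from one simple condition."""
    if op not in ALLOWED_OPS:
        raise ValueError(f"unsupported operator: {op}")
    args = {
        "dataset": dataset,
        "config": config,
        "split": split,
        "where": f"{quote_column(column)} {op} {quote_literal(value)}",
    }
    if order_column is not None:
        args["orderby"] = quote_column(order_column) + (" DESC" if descending else "")
    return args


# Placeholder dataset and hypothetical columns.
print(filter_arguments("org/name", "default", "train", "label", "=", 1,
                       order_column="text", descending=True))
```

Restricting the operator to a fixed set and escaping quotes keeps a user-supplied value from changing the shape of the condition.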
