Execute SQL queries on your Databricks SQL warehouse
List all Databricks jobs in your workspace
Get the status of a specific Databricks job by ID
Get detailed information about a specific Databricks job
The Databricks Connector provides a seamless interface between language models and your Databricks environment. It enables natural language interaction with your data warehouses and job management systems, allowing you to execute SQL queries, list jobs, and monitor job statuses directly through conversational prompts. This connector bridges the gap between AI assistants and your Databricks infrastructure, making data analysis and workflow management more accessible through natural language. It's particularly useful for data scientists, analysts, and engineers who want to quickly access Databricks resources without switching contexts.
The Databricks Connector allows language models to interact directly with your Databricks environment, enabling SQL query execution and job management through natural language requests.
Before setting up the connector, ensure you have:
git clone https://github.com/JordiNeil/mcp-databricks-server.git
cd mcp-databricks-server
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
pip install -r requirements.txt
Create a .env file in the root directory with your Databricks credentials:

DATABRICKS_HOST=your-databricks-instance.cloud.databricks.com
DATABRICKS_TOKEN=your-personal-access-token
DATABRICKS_HTTP_PATH=/sql/1.0/warehouses/your-warehouse-id
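At startup the server needs all three of these values; a minimal sketch of how they might be read and validated (the function name load_databricks_config is a hypothetical illustration, not taken from the repository):

```python
import os

# The three settings the connector expects in the environment (or .env file).
REQUIRED_VARS = ("DATABRICKS_HOST", "DATABRICKS_TOKEN", "DATABRICKS_HTTP_PATH")

def load_databricks_config(env=None):
    """Return the required settings as a dict, raising if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing Databricks settings: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Failing fast like this turns a vague connection error later on into a clear message about which setting is absent.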
To set up the connector, you'll need to gather the following credentials:
Host: Your Databricks instance URL without the https:// prefix (e.g., your-instance.cloud.databricks.com)
Personal Access Token: A token generated in your Databricks workspace under User Settings > Developer > Access tokens
HTTP Path for SQL Warehouse: The warehouse's HTTP path, shown in the Connection details tab of your SQL warehouse (e.g., /sql/1.0/warehouses/your-warehouse-id)
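Because a host value pasted from the browser often carries the scheme, a small normalization helper can make the Host setting forgiving (this helper is an illustration, not part of the repository):

```python
def normalize_host(host: str) -> str:
    """Strip an accidental scheme prefix and trailing slash from a Databricks host."""
    for prefix in ("https://", "http://"):
        if host.startswith(prefix):
            host = host[len(prefix):]
            break
    return host.rstrip("/")
```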
Before running the server, you can verify your connection settings:
python test_connection.py
This will attempt to connect to your Databricks environment and report any issues.
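The test script presumably attempts a real connection and surfaces the failure reason; a rough sketch of that pattern, with the actual Databricks connect call abstracted behind a callable (this wrapper is hypothetical, not from the repository):

```python
def check_connection(connect):
    """Try the supplied connect() factory once and report the outcome as text.

    In a real script, connect might wrap something like
    databricks.sql.connect(server_hostname=..., http_path=..., access_token=...).
    """
    try:
        conn = connect()
        conn.close()
        return "Connection successful"
    except Exception as exc:
        return f"Connection failed: {exc}"
```

Returning the error text rather than letting the traceback escape makes misconfigured credentials easier to diagnose.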
Start the MCP server with:
python main.py
The server will start and listen for requests from language models.
Once the server is running, language models can interact with your Databricks environment through natural language: for example, running a SQL query, listing the jobs in your workspace, or checking the status of a specific job.
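When the model runs a query, the SQL tool has to hand the results back as plain text; a hypothetical formatting helper (the name and layout are illustrative, not taken from the repository) might look like:

```python
def rows_to_text(columns, rows):
    """Render query results as a simple pipe-separated text table for the model."""
    header = " | ".join(columns)
    lines = [header, "-" * len(header)]
    lines.extend(" | ".join(str(value) for value in row) for row in rows)
    return "\n".join(lines)
```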
If you encounter connection issues:
Verify that your host value does not include the https:// prefix.
Check that your .env file is properly formatted and in the correct location.

Never commit your .env file to version control.
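To keep the .env file out of version control, it can be listed in the project's .gitignore:

```
# .gitignore
.env
```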