Generate audio from plain text using default voice settings
Generate audio from a structured script with multiple voices and actors
Delete a job by its ID
Get the audio file by its ID
List all available voices
Get voiceover job history. Optionally specify a job ID for a specific job
ElevenLabs Voice Generator is a Model Context Protocol (MCP) server that integrates with ElevenLabs' text-to-speech API. It lets AI assistants generate high-quality voice audio from text, with support for multiple voices and multi-part scripts. Generation history is stored persistently in SQLite. Typical uses are simple text-to-speech conversion, managing multi-part scripts with different voices, and downloading the generated audio files.
ElevenLabs Voice Generator allows AI assistants to create high-quality voice audio from text using ElevenLabs' advanced text-to-speech technology. This MCP server provides a bridge between AI assistants and ElevenLabs' voice generation capabilities.
The simplest way to install is using uvx. Add the following to your MCP settings file (e.g., cline_mcp_settings.json for Claude Desktop):
{
  "mcpServers": {
    "elevenlabs": {
      "command": "uvx",
      "args": ["elevenlabs-mcp-server"],
      "env": {
        "ELEVENLABS_API_KEY": "your-api-key",
        "ELEVENLABS_VOICE_ID": "your-voice-id",
        "ELEVENLABS_MODEL_ID": "eleven_flash_v2",
        "ELEVENLABS_STABILITY": "0.5",
        "ELEVENLABS_SIMILARITY_BOOST": "0.75",
        "ELEVENLABS_STYLE": "0.1",
        "ELEVENLABS_OUTPUT_DIR": "output"
      }
    }
  }
}
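If the settings file already lists other MCP servers, the entry above has to be merged in rather than pasted over the whole file. The following is a minimal sketch of that merge, not part of the project: the helper name add_elevenlabs_server and its parameters are hypothetical, and the environment values simply mirror the configuration shown above.

```python
import json
from pathlib import Path


def add_elevenlabs_server(settings_path: Path, api_key: str, voice_id: str) -> dict:
    """Merge an "elevenlabs" entry into an MCP settings file, keeping other servers.

    Hypothetical helper for illustration; the values mirror the uvx configuration.
    """
    # Start from the existing settings if the file is present, else from an empty object.
    settings = json.loads(settings_path.read_text()) if settings_path.exists() else {}
    servers = settings.setdefault("mcpServers", {})
    servers["elevenlabs"] = {
        "command": "uvx",
        "args": ["elevenlabs-mcp-server"],
        "env": {
            "ELEVENLABS_API_KEY": api_key,
            "ELEVENLABS_VOICE_ID": voice_id,
            "ELEVENLABS_MODEL_ID": "eleven_flash_v2",
            "ELEVENLABS_STABILITY": "0.5",
            "ELEVENLABS_SIMILARITY_BOOST": "0.75",
            "ELEVENLABS_STYLE": "0.1",
            "ELEVENLABS_OUTPUT_DIR": "output",
        },
    }
    # Write the merged settings back with readable indentation.
    settings_path.write_text(json.dumps(settings, indent=2))
    return settings
```

Calling add_elevenlabs_server(Path("cline_mcp_settings.json"), "your-api-key", "your-voice-id") would leave any other entries under "mcpServers" untouched and replace only the "elevenlabs" one.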
For Claude Desktop users, you can install automatically via Smithery:
npx -y @smithery/cli install elevenlabs-mcp-server --client claude
For development or custom installations:
git clone https://github.com/mamertofabian/elevenlabs-mcp-server.git
cd elevenlabs-mcp-server
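uv venv
Then add the following to your MCP settings file, replacing path/to/elevenlabs-mcp-server with the location of your clone: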
{
  "mcpServers": {
    "elevenlabs": {
      "command": "uv",
      "args": [
        "--directory",
        "path/to/elevenlabs-mcp-server",
        "run",
        "elevenlabs-mcp-server"
      ],
      "env": {
        "ELEVENLABS_API_KEY": "your-api-key",
        "ELEVENLABS_VOICE_ID": "your-voice-id",
        "ELEVENLABS_MODEL_ID": "eleven_flash_v2",
        "ELEVENLABS_STABILITY": "0.5",
        "ELEVENLABS_SIMILARITY_BOOST": "0.75",
        "ELEVENLABS_STYLE": "0.1",
        "ELEVENLABS_OUTPUT_DIR": "output"
      }
    }
  }
}
The server can be configured with the following environment variables:
ELEVENLABS_API_KEY: Your ElevenLabs API key (required)
ELEVENLABS_VOICE_ID: Default voice ID to use (required)
ELEVENLABS_MODEL_ID: Model to use (default: "eleven_flash_v2")
ELEVENLABS_STABILITY: Voice stability setting (0.0-1.0, default: 0.5)
ELEVENLABS_SIMILARITY_BOOST: Voice similarity boost (0.0-1.0, default: 0.75)
ELEVENLABS_STYLE: Voice style setting (0.0-1.0, default: 0.1)
ELEVENLABS_OUTPUT_DIR: Directory to save audio files (default: "output")
Once installed and configured, the AI assistant can use the available tools to generate voice audio. The server supports both simple text-to-speech conversion and more complex multi-part scripts with different voices.
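To make the required and optional environment variables concrete, here is a minimal sketch of how they could be read and validated. It is an illustration only and not the server's actual loading code: the VoiceSettings class and the load_settings and _unit_float helpers are hypothetical names, while the variable names, defaults, and 0.0-1.0 ranges come from the list above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceSettings:
    """Hypothetical container for the documented environment variables."""
    api_key: str
    voice_id: str
    model_id: str
    stability: float
    similarity_boost: float
    style: float
    output_dir: str


def _unit_float(env, name: str, default: str) -> float:
    """Parse a setting that is documented as lying in the range 0.0-1.0."""
    value = float(env.get(name, default))
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")
    return value


def load_settings(env=None) -> VoiceSettings:
    """Read settings from a mapping (os.environ by default), applying documented defaults."""
    env = os.environ if env is None else env
    required = ("ELEVENLABS_API_KEY", "ELEVENLABS_VOICE_ID")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))
    return VoiceSettings(
        api_key=env["ELEVENLABS_API_KEY"],
        voice_id=env["ELEVENLABS_VOICE_ID"],
        model_id=env.get("ELEVENLABS_MODEL_ID", "eleven_flash_v2"),
        stability=_unit_float(env, "ELEVENLABS_STABILITY", "0.5"),
        similarity_boost=_unit_float(env, "ELEVENLABS_SIMILARITY_BOOST", "0.75"),
        style=_unit_float(env, "ELEVENLABS_STYLE", "0.1"),
        output_dir=env.get("ELEVENLABS_OUTPUT_DIR", "output"),
    )
```

Passing a plain dictionary instead of os.environ makes the defaults easy to check without touching the real environment.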
For basic voice generation, use the generate_audio_simple tool with your text.
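Under the hood, an MCP client invokes a tool by sending a JSON-RPC 2.0 request with the method tools/call. The sketch below builds such a request for generate_audio_simple. The build_tool_call helper is hypothetical, and the argument name "text" is an assumption; the server's tool schema is the authority on the actual parameter names.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })


# The "text" argument name is assumed for illustration.
request = build_tool_call(
    1,
    "generate_audio_simple",
    {"text": "Hello from the voice generator."},
)
```

In practice the AI assistant or MCP client library produces this message for you; the sketch only shows what travels over the wire.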
For more complex scenarios with multiple speakers, use the generate_audio_script tool with a structured script format.
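The exact script format is not spelled out here, so the following is only a sketch of what a structured multi-speaker script could look like. The ScriptPart class, the build_script helper, and the field names "script", "text", "voice_id", and "actor" are all assumptions made for illustration; check the tool's input schema before relying on them.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class ScriptPart:
    """One spoken segment; field names are assumed, not taken from the server."""
    text: str
    voice_id: Optional[str] = None  # assumed to fall back to the default voice when unset
    actor: Optional[str] = None     # assumed to be a human-readable speaker label


def build_script(parts: List[ScriptPart]) -> str:
    """Validate the parts and serialize them as a JSON script document."""
    if not parts:
        raise ValueError("a script needs at least one part")
    items = []
    for index, part in enumerate(parts):
        if not part.text.strip():
            raise ValueError(f"script part {index} has no text")
        # Drop unset optional fields so only explicit choices are sent.
        items.append({k: v for k, v in asdict(part).items() if v is not None})
    return json.dumps({"script": items}, indent=2)


script_json = build_script([
    ScriptPart("Welcome to the show.", voice_id="voice-a", actor="Host"),
    ScriptPart("Thanks for having me.", voice_id="voice-b", actor="Guest"),
    ScriptPart("Let's get started."),
])
```

Each part carries its own text, so different voices can be assigned segment by segment while the last part relies on the configured default.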
The server maintains a history of generated voice audio, which can be accessed and managed using the provided tools.
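As a rough picture of how a SQLite-backed job history can support listing, looking up, and deleting jobs, consider the sketch below. The table layout, column names, and function names are hypothetical and are not the server's real schema; only the use of SQLite and the list, lookup, and delete operations come from the description above.

```python
import sqlite3
from typing import List, Optional


def open_history(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a history database with a hypothetical jobs table."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row  # rows can then be converted to dicts
    conn.execute(
        """CREATE TABLE IF NOT EXISTS jobs (
               id TEXT PRIMARY KEY,
               status TEXT NOT NULL,
               output_file TEXT,
               created_at TEXT DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    return conn


def record_job(conn: sqlite3.Connection, job_id: str, status: str,
               output_file: Optional[str] = None) -> None:
    """Insert a job, or overwrite the row if the ID already exists."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO jobs (id, status, output_file) VALUES (?, ?, ?)",
            (job_id, status, output_file),
        )


def get_history(conn: sqlite3.Connection, job_id: Optional[str] = None) -> List[dict]:
    """Return all jobs, or only the job with the given ID when one is passed."""
    if job_id is None:
        rows = conn.execute("SELECT * FROM jobs ORDER BY created_at DESC, id").fetchall()
    else:
        rows = conn.execute("SELECT * FROM jobs WHERE id = ?", (job_id,)).fetchall()
    return [dict(row) for row in rows]


def delete_job(conn: sqlite3.Connection, job_id: str) -> bool:
    """Delete a job by ID; report whether a row was actually removed."""
    with conn:
        cursor = conn.execute("DELETE FROM jobs WHERE id = ?", (job_id,))
    return cursor.rowcount > 0
```

Because the history lives in a database file rather than in memory, it survives server restarts, which is what makes the history and delete tools useful across sessions.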
The repository also includes a sample SvelteKit MCP Client for testing and demonstration purposes. To use it:
cd clients/web-ui
pnpm install
Copy .env.example to .env and configure as needed
Run the web UI:
pnpm dev