Speech Engines

Chatty MCP supports multiple speech synthesis options, giving you flexibility to choose the best voice experience for your environment.

System Text-to-Speech

Chatty MCP leverages your operating system's native TTS capabilities for a lightweight solution that requires no additional downloads:

OS       Engine         Features
macOS    say command    Adjustable speed and volume
Linux    espeak         Adjustable speed and volume

System TTS is the default when no other engine is specified.
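
To get a feel for the native tools listed above, you can call them directly from a shell. The sketch below only builds and prints the command line for a given OS name; `tts_command` is a hypothetical helper written for this illustration and is not part of Chatty MCP, and the way Chatty MCP invokes these tools internally may differ. It relies on the `-r` option of `say` and the `-s` option of `espeak`, both of which set the speaking rate in words per minute.

```shell
#!/bin/sh
# tts_command OS TEXT [RATE]
# Prints the native TTS command for an OS name as reported by `uname -s`.
# Hypothetical helper for illustration; TEXT is assumed to contain no double quotes.
tts_command() {
    os="$1"
    text="$2"
    rate="${3:-175}"   # speaking rate in words per minute
    case "$os" in
        Darwin) printf 'say -r %s "%s"\n' "$rate" "$text" ;;
        Linux)  printf 'espeak -s %s "%s"\n' "$rate" "$text" ;;
        *)      echo "unsupported OS: $os" >&2; return 1 ;;
    esac
}

# Show the command that matches the current machine.
tts_command "$(uname -s)" "Hello from Chatty" 180
```

Running the printed command yourself is a quick way to confirm that the system engine is installed and audible before configuring Chatty MCP.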

Kokoro-ONNX

For higher quality, more natural-sounding speech, Chatty MCP integrates with Kokoro-ONNX, an optimized implementation of Kokoro-TTS.

Features

  • Multiple voices: Choose from various voice options
  • Streaming mode: Begin playback while audio is still being generated
  • Natural sound: High-quality, realistic speech synthesis
  • Adjustable parameters: Control speed and volume
  • Cross-platform: Works on macOS and Linux

Getting Started with Kokoro

  1. Download the model files:

    wget https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files-v1.0/kokoro-v1.0.onnx
    wget https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files-v1.0/voices-v1.0.bin
  2. Place the model files in one of these locations (in order of priority):

    • Current directory (where you run chatty-mcp)
    • $HOME/.kokoro_models/ directory
    • $HOME/.chatty-mcp/ directory
    • Custom path specified by environment variables:
      export CHATTY_MCP_KOKORO_MODEL_PATH=/path/to/kokoro-v1.0.onnx
      export CHATTY_MCP_KOKORO_VOICE_PATH=/path/to/voices-v1.0.bin
  3. Enable Kokoro in your Cursor MCP configuration:

    {
      "mcpServers": {
        "chatty": {
          "command": "chatty-mcp",
          "args": ["--engine", "kokoro", "--streaming", "--voice", "af_sarah"],
          "description": "Chatty MCP with Kokoro TTS"
        }
      }
    }
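
Before restarting Cursor, it can help to check that both model files are somewhere Chatty MCP will look. The following is a minimal sketch of such a check, not Chatty MCP's actual lookup code: `find_kokoro_file` is a hypothetical helper, and it walks the locations from step 2 in the order they are listed there, which puts the environment-variable path last. If the real implementation lets the environment variables take precedence, the order in the loop would need to change.

```shell
#!/bin/sh
# find_kokoro_file NAME OVERRIDE
# Prints the first existing path for NAME and returns 0, or returns 1 if none exists.
# Hypothetical helper; the search order mirrors the list in step 2 (an assumption).
find_kokoro_file() {
    name="$1"
    override="$2"   # value of the matching CHATTY_MCP_KOKORO_* variable, may be empty
    for candidate in \
        "./$name" \
        "$HOME/.kokoro_models/$name" \
        "$HOME/.chatty-mcp/$name" \
        "$override"
    do
        # Skip an unset override and any path that is not a regular file.
        if [ -n "$candidate" ] && [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Report where each file was found, or that it is missing.
find_kokoro_file kokoro-v1.0.onnx "${CHATTY_MCP_KOKORO_MODEL_PATH:-}" || echo "model file not found"
find_kokoro_file voices-v1.0.bin "${CHATTY_MCP_KOKORO_VOICE_PATH:-}" || echo "voices file not found"
```

If either line reports a missing file, repeat the download in step 1 or point the corresponding environment variable at the file's location.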

Performance Considerations

  • System TTS: Lightweight with low resource usage, but less natural-sounding
  • Kokoro standard mode: Better quality, but may have a slight delay before speech starts
  • Kokoro streaming mode: Best experience with natural sound and quick response time

For the most responsive and natural-sounding experience, we recommend using Kokoro with streaming mode enabled.