Local TTS with Qwen3-TTS
Privacy-First | Offline | High-Quality | Natural Real Voices
Local text-to-speech synthesis using Qwen3-TTS models. Your text never leaves your machine.
Why Local TTS?
Unlike cloud TTS (Google, AWS, Azure), local-tts ensures:
- Zero data transmission: 100% on-device processing
- Works offline: no network required
- No API keys: no external dependencies
- GDPR/HIPAA friendly: simplified compliance

See Privacy & Security details.
Platform Overview

| Platform | Backend | Installation | Best For |
|---|---|---|---|
| macOS (Apple Silicon) | mlx_audio | pip install mlx-audio | M1/M2/M3/M4 Macs |
| Linux/Windows | qwen-tts | pip install qwen-tts | CUDA GPUs |

Quick Start

macOS

    pip install mlx-audio
    brew install ffmpeg

    # Natural female voice
    python -m mlx_audio.tts.generate \
      --text "Hello world" \
      --model mlx-community/Qwen3-TTS-12Hz-1.7B-CustomVoice-8bit \
      --voice Chelsie

Linux/Windows

    pip install qwen-tts

    # With optimizations (FlashAttention, bfloat16, auto-device)
    python scripts/tts_linux.py "Hello world" --female
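The platform split above can be expressed as a small selection helper. This is a minimal sketch and not part of the skill's scripts: `pick_backend` and `pick_device` are hypothetical names, and the mapping assumes that only Apple Silicon Macs use mlx-audio while every other platform falls back to qwen-tts. The device check relies on `torch.cuda.is_available()` and degrades to CPU when PyTorch is not installed.

```python
import platform
from typing import Optional


def pick_backend(system: Optional[str] = None, machine: Optional[str] = None) -> str:
    """Return the pip package to install for the current (or given) platform.

    Assumption: Apple Silicon reports system "Darwin" and machine "arm64";
    anything else is treated as a qwen-tts target.
    """
    system = system or platform.system()
    machine = machine or platform.machine()
    if system == "Darwin" and machine == "arm64":
        return "mlx-audio"
    return "qwen-tts"


def pick_device() -> str:
    """Return "cuda" when PyTorch sees a CUDA GPU, otherwise "cpu"."""
    try:
        import torch  # only needed on the qwen-tts path
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"
```

Calling `pick_backend()` with no arguments inspects the running machine; passing explicit values is useful for checking the mapping on a different host.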
Key Concepts

--voice vs --instruct (Important)

| Model | --voice | --instruct | Notes |
|---|---|---|---|
| CustomVoice | Select preset voice | Add style/emotion | Can be used together: voice + style control |
| VoiceDesign | N/A | Create voice from description | --instruct only |
| Base | N/A | N/A | For voice cloning with --ref_audio |
CustomVoice with style control:
    python -m mlx_audio.tts.generate \
      --text "Hello there!" \
      --model mlx-community/Qwen3-TTS-12Hz-1.7B-CustomVoice-8bit \
      --voice Serena \
      --instruct "excited and enthusiastic"
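The flag rules in the table can be checked before a command is launched. The sketch below is illustrative only: `check_flags` and `model_family` are hypothetical helpers, and the family is inferred from the model ID by substring match, which is an assumption based on the model names used in this document.

```python
from typing import Dict, List, Optional, Set

# Which optional flags each model family accepts, per the table above.
ALLOWED_FLAGS: Dict[str, Set[str]] = {
    "CustomVoice": {"voice", "instruct"},
    "VoiceDesign": {"instruct"},
    "Base": {"ref_audio"},
}


def model_family(model_id: str) -> str:
    """Infer the family from a model ID such as '...-1.7B-Base-8bit'."""
    for family in ALLOWED_FLAGS:
        if f"-{family}" in model_id:
            return family
    raise ValueError(f"cannot infer model family from {model_id!r}")


def check_flags(
    model_id: str,
    *,
    voice: Optional[str] = None,
    instruct: Optional[str] = None,
    ref_audio: Optional[str] = None,
) -> List[str]:
    """Return a list of problems; an empty list means the combination is valid."""
    family = model_family(model_id)
    given = {"voice": voice, "instruct": instruct, "ref_audio": ref_audio}
    return [
        f"--{name} is not used by {family} models"
        for name, value in given.items()
        if value is not None and name not in ALLOWED_FLAGS[family]
    ]
```

For example, passing `voice="Serena"` together with a VoiceDesign model ID yields one problem message, while the same call against a CustomVoice model returns an empty list.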
9 Preset Voices (Open Source CustomVoice)

| Voice | Gender | Language | Character |
|---|---|---|---|
| Chelsie | Female | English (American) | Gentle, empathetic |
| Serena | Female | English | Warm, gentle |
| Ono Anna | Female | Japanese | Playful |
| Sohee | Female | Korean | Warm |
| Aiden | Male | English (American) | Sunny |
| Dylan | Male | English | Natural |
| Eric | Male | English | Real |
| Ryan | Male | English | Natural |
| Uncle Fu | Male | Chinese | Youthful Beijing |

Defaults: Female=Serena, Male=Aiden
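A lookup like the following can map a gender keyword or a case-insensitive name to one of the preset voices. It is a sketch built only from the table and defaults above; `resolve_voice` is a hypothetical helper, and the names are kept exactly as the table spells them, which may differ from the identifiers a given backend expects.

```python
# Preset voices as listed in the table above: name -> (gender, language).
PRESET_VOICES = {
    "Chelsie": ("female", "English (American)"),
    "Serena": ("female", "English"),
    "Ono Anna": ("female", "Japanese"),
    "Sohee": ("female", "Korean"),
    "Aiden": ("male", "English (American)"),
    "Dylan": ("male", "English"),
    "Eric": ("male", "English"),
    "Ryan": ("male", "English"),
    "Uncle Fu": ("male", "Chinese"),
}

# Documented defaults per gender.
DEFAULT_VOICE = {"female": "Serena", "male": "Aiden"}


def resolve_voice(choice: str) -> str:
    """Accept 'female', 'male', or a preset name in any letter case."""
    key = choice.strip().lower()
    if key in DEFAULT_VOICE:
        return DEFAULT_VOICE[key]
    by_lower = {name.lower(): name for name in PRESET_VOICES}
    if key in by_lower:
        return by_lower[key]
    raise ValueError(f"unknown voice {choice!r}; expected one of {sorted(PRESET_VOICES)}")
```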
Usage Examples

CustomVoice (Preset Voices)

    # Natural female
    python -m mlx_audio.tts.generate \
      --text "Your text" --voice Serena --lang_code en \
      --model mlx-community/Qwen3-TTS-12Hz-1.7B-CustomVoice-8bit

    # Real male
    python -m mlx_audio.tts.generate \
      --text "Your text" --voice Aiden --lang_code en \
      --model mlx-community/Qwen3-TTS-12Hz-1.7B-CustomVoice-8bit

VoiceDesign (Text-Based)

    python -m mlx_audio.tts.generate \
      --text "Hello" \
      --model mlx-community/Qwen3-TTS-12Hz-1.7B-VoiceDesign-8bit \
      --instruct "A warm female voice, professional and clear"
Long Text Generation
For long text, increase --max_tokens and enable --join_audio (macOS/MLX only):

    python -m mlx_audio.tts.generate \
      --text "Your very long text here..." \
      --model mlx-community/Qwen3-TTS-12Hz-1.7B-CustomVoice-8bit \
      --voice Serena \
      --max_tokens 4096 \
      --join_audio \
      --output long_audio.wav
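When the text comes from a file rather than the command line, building the argument list in Python avoids shell quoting problems. The sketch below only assembles the same flags as the command above; `long_text_command` is a hypothetical helper, and nothing is executed until the list is handed to `subprocess.run`.

```python
from typing import List

MODEL = "mlx-community/Qwen3-TTS-12Hz-1.7B-CustomVoice-8bit"


def long_text_command(
    text: str,
    output: str,
    voice: str = "Serena",
    max_tokens: int = 4096,
) -> List[str]:
    """Build the argv list for a long-text run (macOS/MLX backend)."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be a positive integer")
    return [
        "python", "-m", "mlx_audio.tts.generate",
        "--text", text,
        "--model", MODEL,
        "--voice", voice,
        "--max_tokens", str(max_tokens),
        "--join_audio",
        "--output", output,
    ]


# Usage (requires mlx-audio to be installed; chapter.txt is a placeholder):
#   import subprocess
#   from pathlib import Path
#   text = Path("chapter.txt").read_text(encoding="utf-8")
#   subprocess.run(long_text_command(text, "long_audio.wav"), check=True)
```

Passing a list to `subprocess.run` skips the shell entirely, so quotes and newlines inside the text reach the program unchanged.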
Voice Cloning

    python -m mlx_audio.tts.generate \
      --text "Cloned voice speaking" \
      --model mlx-community/Qwen3-TTS-12Hz-1.7B-Base-8bit \
      --ref_audio sample.wav --ref_text "Sample transcript"
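A cloning run depends on two inputs that are easy to get wrong: the reference file and its transcript. The following sketch checks both before assembling the command shown above. `clone_command` is a hypothetical helper, and the checks (file exists, transcript is non-empty) are assumptions about what is worth validating, not requirements stated by the backend.

```python
from pathlib import Path
from typing import List

BASE_MODEL = "mlx-community/Qwen3-TTS-12Hz-1.7B-Base-8bit"


def clone_command(text: str, ref_audio: str, ref_text: str) -> List[str]:
    """Validate the reference inputs, then build the voice-cloning argv list."""
    if not Path(ref_audio).is_file():
        raise FileNotFoundError(f"reference audio not found: {ref_audio}")
    if not ref_text.strip():
        raise ValueError("ref_text must contain the transcript of the reference audio")
    return [
        "python", "-m", "mlx_audio.tts.generate",
        "--text", text,
        "--model", BASE_MODEL,
        "--ref_audio", ref_audio,
        "--ref_text", ref_text,
    ]
```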
Parameters

| Parameter | Description | Values |
|---|---|---|
| --text | Text to speak | Required |
| --model | Model ID | See table below |
| --voice | Preset voice (CustomVoice) | Chelsie, Serena, Aiden, Ryan... |
| --instruct | Voice description (VoiceDesign) or style/emotion (CustomVoice) | e.g., "excited", "calm", "professional" |
| --speed | Speaking rate | 0.5-2.0 (default: 1.0) |
| --pitch | Voice pitch | 0.5-2.0 (default: 1.0) |
| --lang_code | Language | en, cn, ja, ko, de, fr... |
| --ref_audio | Reference for cloning | File path |
| --output | Output file | Path (auto-generated if omitted) |
| --max_tokens | Max generation tokens | Integer (default: 2048). Increase for long text |
| --join_audio | Merge audio segments | true (default) or false. Recommended for long text |

Models

| Model | Size | Purpose |
|---|---|---|
| Qwen3-TTS-12Hz-1.7B-CustomVoice | 1.7B | 9 preset voices + style control |
| Qwen3-TTS-12Hz-1.7B-VoiceDesign | 1.7B | Text-based voice creation |
| Qwen3-TTS-12Hz-1.7B-Base | 1.7B | Voice cloning |
| Qwen3-TTS-12Hz-0.6B-* | 0.6B | Lightweight versions |

macOS: Add the mlx-community/ prefix (e.g., mlx-community/Qwen3-TTS-12Hz-1.7B-Base-8bit)
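The prefix rule and the documented 0.5-2.0 range for --speed and --pitch are simple enough to enforce in code. This is a sketch with hypothetical helper names; the `-8bit` suffix is an assumption taken from the example IDs in this document, so other quantizations may use a different suffix.

```python
MLX_PREFIX = "mlx-community/"


def mlx_model_id(name: str, quant: str = "8bit") -> str:
    """Turn a name from the Models table into a macOS/MLX model ID.

    IDs that already carry the prefix are returned unchanged.
    """
    if name.startswith(MLX_PREFIX):
        return name
    return f"{MLX_PREFIX}{name}-{quant}"


def check_rate(flag: str, value: float) -> float:
    """Validate a --speed or --pitch value against the documented 0.5-2.0 range."""
    if not 0.5 <= value <= 2.0:
        raise ValueError(f"{flag} must be between 0.5 and 2.0, got {value}")
    return value
```

For example, `mlx_model_id("Qwen3-TTS-12Hz-1.7B-Base")` produces the ID used in the voice-cloning command.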
Scripts

- scripts/tts_macos.py: macOS wrapper
- scripts/tts_linux.py: Linux/Windows wrapper with optimizations

Optimizations (Linux/Windows)
tts_linux.py automatically enables:
- FlashAttention: faster, less memory
- bfloat16: 16-bit weights with the dynamic range of float32, so more numerically stable than float16
- Auto device: CUDA → CPU fallback
- Mixed precision: speed + quality

Troubleshooting

| Issue | Solution |
|---|---|
| macOS: Model not found | Use the mlx-community/ prefix |
| macOS: Audio format | brew install ffmpeg |
| Linux: CUDA OOM | Use the 0.6B models |
| Linux: Slow | Check CUDA: torch.cuda.is_available() |

References

- macOS Details
- Linux/Windows Details
- Privacy & Security

Version
1.0.0 - See VERSION and package.json