Azure Speech TTS
v1.0.2
Azure Speech TTS skill for generating local audio files from text or SSML with Azure Speech. Use it when the user asks for Azure Speech / Azure TTS / Microsoft TTS / speech synthesis / text-to-speech / SSML, wants to choose voices or control speaking rate/pitch/style, or wants to export MP3/WAV/OGG/PCM audio.
Azure Speech TTS
Use Azure Speech to turn text or SSML into a local audio file under downloads/.
What this skill does
- Synthesize plain text into speech
- Synthesize full SSML payloads directly
- Choose voice, output format, rate, pitch, style, and role
- Save the result as a local audio file and print a JSON summary
Configuration
This skill uses a small default config file plus environment variables.
Default config file
File:
config.json
Default values:
- default_voice: zh-CN-Yunqi:DragonHDOmniLatestNeural
- default_format: mp3
- default_output_dir: downloads
- default_timeout_seconds: 60
Secret values
Set these in the local shell environment:
- AZURE_SPEECH_KEY
- AZURE_SPEECH_REGION
Optional environment overrides
- AZURE_SPEECH_VOICE
- AZURE_SPEECH_FORMAT
Precedence
Use this order:
1. CLI flag
2. Environment variable
3. config.json
4. Built-in fallback
Quick start
    python3 scripts/azure_tts.py \
      --text "你好,这是一段测试语音。" \
      --voice zh-CN-Yunqi:DragonHDOmniLatestNeural \
      --format mp3 \
      --output downloads/test.mp3
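Because the command above passes the voice explicitly as a flag, that value wins over any environment variable or config.json entry. The precedence order can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: `resolve`, `load_config`, `SOURCES`, and `BUILTIN` are hypothetical names, and `AZURE_SPEECH_FORMAT` / `default_format` are assumed spellings of the format settings.

```python
import json
import os
from pathlib import Path

# Built-in fallbacks; the values mirror the documented defaults.
BUILTIN = {"voice": "zh-CN-Yunqi:DragonHDOmniLatestNeural", "format": "mp3"}

# Hypothetical mapping: setting name -> (environment variable, config.json key).
SOURCES = {
    "voice": ("AZURE_SPEECH_VOICE", "default_voice"),
    "format": ("AZURE_SPEECH_FORMAT", "default_format"),
}


def load_config(path="config.json"):
    """Return the config file as a dict, or {} when the file is missing."""
    p = Path(path)
    return json.loads(p.read_text(encoding="utf-8")) if p.is_file() else {}


def resolve(name, cli_value=None, env=None, config=None):
    """Pick a setting: CLI flag > environment variable > config.json > fallback."""
    env = os.environ if env is None else env
    config = {} if config is None else config
    env_key, config_key = SOURCES[name]
    for candidate in (cli_value, env.get(env_key), config.get(config_key)):
        if candidate:  # skip None and empty strings
            return candidate
    return BUILTIN[name]
```

Passing `env` and `config` as plain dicts keeps the lookup easy to exercise without touching the real shell environment.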
For SSML:
    python3 scripts/azure_tts.py \
      --ssml-file temp/input.ssml \
      --format wav \
      --output downloads/test.wav
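The SSML command above expects a file that already holds a complete SSML document. As a rough sketch of what such a document looks like, and of how plain-text settings such as rate, pitch, and style could be wrapped into one, consider the following. `build_ssml` and `is_full_ssml` are hypothetical helpers written for illustration; the script's own SSML generation may differ.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text, voice, language="zh-CN", rate=None, pitch=None, style=None):
    """Wrap plain text in a complete <speak> document (illustrative only)."""
    body = escape(text)  # escape &, <, > so the text is valid XML
    prosody = [(k, v) for k, v in (("rate", rate), ("pitch", pitch)) if v]
    if prosody:
        attrs = " ".join(f"{k}={quoteattr(v)}" for k, v in prosody)
        body = f"<prosody {attrs}>{body}</prosody>"
    if style:
        body = f"<mstts:express-as style={quoteattr(style)}>{body}</mstts:express-as>"
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" '
        f"xml:lang={quoteattr(language)}>"
        f"<voice name={quoteattr(voice)}>{body}</voice></speak>"
    )


def is_full_ssml(payload):
    """Return True when the payload parses as XML with a <speak> root element."""
    try:
        root = ET.fromstring(payload)
    except ET.ParseError:
        return False
    # Strip a "{namespace}" prefix, if present, before comparing the tag name.
    return root.tag.rsplit("}", 1)[-1] == "speak"
```

A check like `is_full_ssml` is one way to decide between the plain-text and SSML modes before calling the script.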
Workflow
1. Decide whether the input is plain text or full SSML.
2. Use --text / --text-file for normal narration.
3. Use --ssml / --ssml-file only when the payload already contains a complete <speak> document.
4. Pick the voice and output format, or let config.json supply the defaults.
5. Run scripts/azure_tts.py.
6. Return the generated audio path to the user.
Rules
- Prefer plain text unless the user needs pauses, emphasis, multi-voice content, or expressive styling.
- --ssml input must include a full <speak> root element.
- Default voice is zh-CN-Yunqi:DragonHDOmniLatestNeural if nothing else is set.
- Default output folder is downloads/.
- If the user does not specify a format, use the default MP3 output.
- Do not put secrets in config.json.
Common formats
See references/azure-speech-cheatsheet.md for the format map and examples.
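The cheatsheet itself is not reproduced here. As a rough illustration of what a format map does, the sketch below pairs short names with Azure output format identifiers. Only the mp3 identifier is taken from this document (it appears in the sample JSON output); the wav, pcm, and ogg entries are Azure format names picked as assumptions, and `FORMAT_ALIASES` and `resolve_format` are hypothetical names.

```python
# Assumed alias map. The mp3 entry matches the script's sample JSON output;
# the other three are Azure output format names chosen for illustration.
FORMAT_ALIASES = {
    "mp3": "audio-24khz-48kbitrate-mono-mp3",
    "wav": "riff-24khz-16bit-mono-pcm",
    "pcm": "raw-24khz-16bit-mono-pcm",
    "ogg": "ogg-24khz-16bit-mono-opus",
}


def resolve_format(name="mp3"):
    """Translate a short alias (or a full identifier from the map) to an Azure format."""
    key = name.strip().lower()
    if key in FORMAT_ALIASES:
        return FORMAT_ALIASES[key]
    if key in FORMAT_ALIASES.values():
        return key  # already a full identifier known to this map
    raise ValueError(f"unsupported format: {name!r}")
```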
Short aliases supported by the script:
- mp3
- wav
- pcm
- ogg
Useful options
- --voice: Azure voice name, for example en-US-AriaNeural
- --language: SSML xml:lang for plain-text mode
- --rate: speaking rate, for example +10%
- --pitch: pitch adjustment, for example +2st
- --style: expressive style such as cheerful, sad, chat
- --style-degree: strength of the expressive style
- --role: voice role when supported
- --save-ssml: write the generated SSML to a file for inspection
- --dry-run: print the generated SSML without calling Azure
Output
The helper script writes the audio file and prints JSON like:
    {
      "ok": true,
      "output_path": "downloads/test.mp3",
      "format": "audio-24khz-48kbitrate-mono-mp3",
      "voice": "zh-CN-Yunqi:DragonHDOmniLatestNeural",
      "language": "zh-CN",
      "bytes": 123456
    }
Use the printed output_path as the deliverable path.
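A caller that captures the script's standard output might extract the deliverable path along these lines. This is a sketch under assumptions: the key is taken to be spelled `output_path`, `bytes` is assumed to be the size of the written file, and `deliverable_path` is a hypothetical helper.

```python
import json
from pathlib import Path


def deliverable_path(stdout_text, check_file=False):
    """Parse the JSON summary and return the audio path to hand back to the user."""
    summary = json.loads(stdout_text)
    if summary.get("ok") is not True:
        raise RuntimeError(f"synthesis did not succeed: {summary}")
    path = Path(summary["output_path"])
    # Optional sanity check, assuming "bytes" reports the size of the written file.
    if check_file and path.stat().st_size != summary["bytes"]:
        raise RuntimeError(f"size mismatch for {path}")
    return path
```

The file check is off by default so the parser can be used on captured output even when the audio file lives on another machine.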