Azure Speech TTS

v1.0.2

Azure Speech TTS skill for generating local audio files from text or SSML with Azure Speech. Use it when the user asks for Azure Speech, Azure TTS, Microsoft TTS, speech synthesis, text-to-speech, or SSML; wants to choose voices or control speaking rate, pitch, or style; or wants to export MP3, WAV, OGG, or PCM audio.

by @conanwhf·MIT-0

Tags: file processing, cloud services, system tools, WeChat

License

MIT-0

Free to use, modify, and redistribute; no attribution required.

Runtime dependencies

No special dependencies

Install command

Official: npx clawhub@latest install azure-speech-tts

Mirror: npx clawhub@latest install azure-speech-tts --registry https://cn.longxiaskill.com

Skill documentation

Azure Speech TTS

Use Azure Speech to turn text or SSML into a local audio file under download/.

What this skill does

- Synthesize plain text into speech
- Synthesize full SSML payloads directly
- Choose voice, output format, rate, pitch, style, and role
- Save the result as a local audio file and print a JSON summary

Configuration

This skill uses a small default config file plus environment variables.

Default config file

File:

config.json

Default values:

- default_voice: zh-CN-Yunqi:DragonHDOmniLatestNeural
- default_format: mp3
- default_output_dir: download
- default_timeout_seconds: 60

Secret values

Set these in the local shell environment:

- AZURE_SPEECH_KEY
- AZURE_SPEECH_REGION

Optional environment overrides

- AZURE_SPEECH_VOICE
- AZURE_SPEECH_FORMAT

Precedence

Use this order:

1. CLI flag
2. Environment variable
3. config.json
4. Built-in fallback

Quick start

    python3 scripts/azure_tts.py \
      --text "你好，这是一段测试语音。" \
      --voice zh-CN-Yunqi:DragonHDOmniLatestNeural \
      --format mp3 \
      --output download/test.mp3
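The command above sets the voice and format explicitly. When a flag is omitted, the precedence order decides which value is used. The lookup can be sketched roughly as follows; this is a minimal illustration, not the script's actual code. The function names `load_config` and `resolve` are hypothetical, and the environment variable name `AZURE_SPEECH_FORMAT` and the config keys `default_voice` and `default_format` are assumed from the configuration section.

```python
import json
import os
from pathlib import Path

# Built-in fallbacks, mirroring the documented defaults.
FALLBACKS = {"voice": "zh-CN-Yunqi:DragonHDOmniLatestNeural", "format": "mp3"}

# Assumed mapping from a setting name to its env var and config.json key.
ENV_VARS = {"voice": "AZURE_SPEECH_VOICE", "format": "AZURE_SPEECH_FORMAT"}
CONFIG_KEYS = {"voice": "default_voice", "format": "default_format"}


def load_config(path="config.json"):
    """Return the config file as a dict, or {} if the file is missing."""
    p = Path(path)
    return json.loads(p.read_text(encoding="utf-8")) if p.is_file() else {}


def resolve(name, cli_value, env, config):
    """Pick a setting: CLI flag, then env var, then config.json, then fallback."""
    if cli_value:
        return cli_value
    env_value = env.get(ENV_VARS[name])
    if env_value:
        return env_value
    config_value = config.get(CONFIG_KEYS[name])
    if config_value:
        return config_value
    return FALLBACKS[name]


if __name__ == "__main__":
    config = load_config()
    # No CLI value given, so the environment, config file, or fallback decides.
    print(resolve("voice", None, os.environ, config))
    print(resolve("format", None, os.environ, config))
```

Passing the environment and config in as plain mappings keeps the lookup easy to check in isolation.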

For SSML:

    python3 scripts/azure_tts.py \
      --ssml-file temp/input.ssml \
      --format wav \
      --output download/test.wav
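A complete SSML document for the command above has a `<speak>` root wrapping a `<voice>` element. The sketch below shows one way to produce such a document from plain text. It is an illustration under assumptions, not the skill's own generator: the function name `build_ssml` is hypothetical, and only the standard `speak`, `voice`, and `prosody` elements are used.

```python
from xml.sax.saxutils import escape, quoteattr

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text, voice, language="zh-CN", rate=None, pitch=None):
    """Wrap plain text in a complete SSML document (hypothetical helper)."""
    # Escape &, <, and > so arbitrary text cannot break the XML.
    body = escape(text)

    # Only add <prosody> when a rate or pitch adjustment was requested.
    attrs = ""
    if rate:
        attrs += f" rate={quoteattr(rate)}"
    if pitch:
        attrs += f" pitch={quoteattr(pitch)}"
    if attrs:
        body = f"<prosody{attrs}>{body}</prosody>"

    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang={quoteattr(language)}>'
        f"<voice name={quoteattr(voice)}>{body}</voice>"
        "</speak>"
    )


if __name__ == "__main__":
    print(build_ssml("Hello, this is a test.", "en-US-AriaNeural",
                     language="en-US", rate="+10%", pitch="+2st"))
```

The returned string could be written to a file and passed with --ssml-file, or passed inline with --ssml.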

Workflow

1. Decide whether the input is plain text or full SSML.
   - Use --text / --text-file for normal narration.
   - Use --ssml / --ssml-file only when the payload already contains a complete <speak> document.
2. Pick the voice and output format, or let config.json supply the defaults.
3. Run scripts/azure_tts.py.
4. Return the generated audio path to the user.

Rules

- Prefer plain text unless the user needs pauses, emphasis, multi-voice content, or expressive styling.
- --ssml input must include a full <speak> root element.
- The default voice is zh-CN-Yunqi:DragonHDOmniLatestNeural if nothing else is set.
- The default output folder is download/.
- If the user does not specify a format, use the default MP3 output.
- Do not put secrets in config.json.

Common formats

See references/azure-speech-cheatsheet.md for the format map and examples.

Short aliases supported by the script:

- mp3
- wav
- pcm
- ogg

Useful options

- --voice: Azure voice name, for example en-US-AriaNeural
- --language: SSML xml:lang for plain-text mode
- --rate: speaking rate, for example +10%
- --pitch: pitch adjustment, for example +2st
- --style: expressive style such as cheerful, sad, chat
- --style-degree: strength of the expressive style
- --role: voice role when supported
- --save-ssml: write the generated SSML to a file for inspection
- --dry-run: print the generated SSML without calling Azure

Output

The helper script writes the audio file and prints JSON like:

    {
      "ok": true,
      "output_path": "download/test.mp3",
      "format": "audio-24khz-48kbitrate-mono-mp3",
      "voice": "zh-CN-Yunqi:DragonHDOmniLatestNeural",
      "language": "zh-CN",
      "bytes": 123456
    }

Use the printed output_path as the deliverable path.
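A caller that runs the script from another program can read the deliverable path out of the JSON summary. The sketch below is a minimal illustration under assumptions: it assumes the script prints only the JSON object on stdout, that the key is spelled `output_path`, and that the output flag is `--output`. The function names `output_path_from_summary` and `run_tts` are hypothetical, and the behaviour when `ok` is false is a guess, since the source shows only the success shape.

```python
import json
import subprocess
import sys


def output_path_from_summary(stdout_text):
    """Parse the JSON summary and return the audio path (hypothetical helper)."""
    summary = json.loads(stdout_text)
    # Assumption: a failed run reports "ok": false; the docs only show success.
    if not summary.get("ok"):
        raise RuntimeError(f"synthesis failed: {summary}")
    return summary["output_path"]


def run_tts(text, output):
    """Run the helper script on plain text and return the reported audio path."""
    proc = subprocess.run(
        [sys.executable, "scripts/azure_tts.py", "--text", text, "--output", output],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return output_path_from_summary(proc.stdout)


if __name__ == "__main__":
    sample = '{"ok": true, "output_path": "download/test.mp3", "bytes": 123456}'
    print(output_path_from_summary(sample))
```

Keeping the parsing separate from the subprocess call means it can be checked against a sample summary without Azure credentials.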

Data source: ClawHub · Chinese localization: 龙虾技能库