Volcengine TTS Audio Synthesis
v1.0.0
Text-to-speech generation with Volcengine (ByteDance) speech services. Use when users need narration, multi-language speech output, voice selection, or TTS troubleshooting. Supports the online one-shot HTTP API (openspeech.bytedance.com).
Category: provider
Volcengine Speech Synthesis (TTS)

Validation

    mkdir -p output/volcengine-AI-audio-tts
    python -m py_compile skills/AI/audio/volcengine-AI-audio-tts/scripts/generate_tts.py && echo "py_compile_ok" > output/volcengine-AI-audio-tts/validate.txt

Pass criteria: the command exits with status 0 and output/volcengine-AI-audio-tts/validate.txt is generated.
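
Output and Evidence
Save generated audio files, request payloads, and response metadata to output/volcengine-AI-audio-tts/. Keep one validation record per execution.

Prerequisites
- Python 3.8+. No extra SDK is required (the script uses requests and the standard library).
- Install: pip install requests
- Set the following environment variables (values come from the Volcengine Doubao Speech console, 豆包语音控制台):
  - VOLCENGINE_TTS_APP_ID: application ID
  - VOLCENGINE_TTS_TOKEN: application token (sent as Authorization: Bearer;${token})
  - VOLCENGINE_TTS_CLUSTER: business cluster, e.g. volcano_tts (standard voices)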
Optional: define these variables in a .env file in the repo root or the script directory; the script loads them automatically.
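
As a minimal sketch of how the three settings above might be read and turned into the Authorization header, assuming the variable names listed in the prerequisites: the helper names `load_tts_config` and `auth_header` are hypothetical and are not taken from the skill's script.

```python
import os

# Variable names as listed in the prerequisites above.
REQUIRED_VARS = (
    "VOLCENGINE_TTS_APP_ID",
    "VOLCENGINE_TTS_TOKEN",
    "VOLCENGINE_TTS_CLUSTER",
)


def load_tts_config(env=None):
    """Read the required settings; raise if any is missing or empty.

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "app_id": env["VOLCENGINE_TTS_APP_ID"],
        "token": env["VOLCENGINE_TTS_TOKEN"],
        "cluster": env["VOLCENGINE_TTS_CLUSTER"],
    }


def auth_header(token):
    """Build the header in the 'Bearer;${token}' form described above.

    Note the semicolon between 'Bearer' and the token instead of a space.
    """
    return {"Authorization": "Bearer;" + token}
```

Failing fast on a missing variable keeps the later "minimal request" check focused on credential and cluster problems rather than on configuration typos.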

Normalized interface (tts.generate)

Request
- text (string, required): text to synthesize, UTF-8; at most 1024 bytes per request is recommended
- voice_type (string, required): voice, see the voice parameter list (发音人参数列表), e.g. BV700_streaming
- encoding (string, optional): audio encoding, one of pcm | wav | mp3 | ogg_opus; default mp3
- rate (int, optional): sample rate, 8000/16000/24000; default 24000
- speed_ratio (float, optional): speaking speed, range [0.2, 3]; default 1.0
- volume_ratio (float, optional): volume, range [0.1, 3]; default 1.0
- pitch_ratio (float, optional): pitch, range [0.1, 3]; default 1.0
- language (string, optional): language, e.g. cn

Response
- audio_path (string): local path of the saved audio file
- sample_rate (int)
- format (string)
- duration_ms (string, when returned by the API)
- code (int): 3000 indicates success

Quick start (Python script)

    # Inline JSON request
    python skills/AI/audio/volcengine-AI-audio-tts/scripts/generate_tts.py \
      --request '{"text":"你好,这是一段测试语音。","voice_type":"BV700_streaming"}' \
      --output output/volcengine-AI-audio-tts/audio/out.mp3

    # Request read from a file
    python skills/AI/audio/volcengine-AI-audio-tts/scripts/generate_tts.py \
      --file request.json \
      --output output/volcengine-AI-audio-tts/audio/out.wav \
      --print-response
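
The normalized request fields above could be validated and mapped onto an API payload roughly as follows. This is a sketch under assumptions: the nested `app` / `user` / `audio` / `request` layout, the `text_type` and `operation` values, and the `uid` placeholder reflect my understanding of the public one-shot HTTP API and should be checked against references/API_reference.md. The function name `build_payload` is hypothetical, and only the defaults and ranges come from the parameter list.

```python
import uuid

ALLOWED_ENCODINGS = {"pcm", "wav", "mp3", "ogg_opus"}
ALLOWED_RATES = {8000, 16000, 24000}


def _check_range(name, value, low, high):
    """Return value as float if it lies in [low, high], else raise ValueError."""
    if not (low <= value <= high):
        raise ValueError(f"{name} must be in [{low}, {high}], got {value}")
    return float(value)


def build_payload(req, app_id, token, cluster):
    """Map a normalized tts.generate request onto an API payload.

    The nested layout below is an assumption, not confirmed by this document.
    """
    encoding = req.get("encoding", "mp3")
    if encoding not in ALLOWED_ENCODINGS:
        raise ValueError(f"unsupported encoding: {encoding}")
    rate = req.get("rate", 24000)
    if rate not in ALLOWED_RATES:
        raise ValueError(f"unsupported sample rate: {rate}")

    audio = {
        "voice_type": req["voice_type"],  # required
        "encoding": encoding,
        "rate": rate,
        "speed_ratio": _check_range("speed_ratio", req.get("speed_ratio", 1.0), 0.2, 3),
        "volume_ratio": _check_range("volume_ratio", req.get("volume_ratio", 1.0), 0.1, 3),
        "pitch_ratio": _check_range("pitch_ratio", req.get("pitch_ratio", 1.0), 0.1, 3),
    }
    if "language" in req:
        audio["language"] = req["language"]

    return {
        "app": {"appid": app_id, "token": token, "cluster": cluster},
        "user": {"uid": "tts-skill"},  # placeholder user id (assumption)
        "audio": audio,
        "request": {
            "reqid": str(uuid.uuid4()),  # must be unique per request
            "text": req["text"],  # required
            "text_type": "plain",  # assumed value
            "operation": "query",  # assumed value for one-shot synthesis
        },
    }
```

Validating ranges locally gives a clearer error than a rejected API call, and saving the returned dictionary alongside the audio satisfies the evidence requirement for request payloads.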
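
Operational guidance
- The reqid of every request must be unique; the script generates a UUID.
- For long text, split it into segments and call the API once per segment, or use the asynchronous long-text API.
- The voice and cluster must match what is configured in the console; for cloned voices, use the speaker id as voice_type.
- On a 429 response, reduce concurrency or increase the interval between requests.

Output location
- Default output: output/volcengine-AI-audio-tts/audio/
- Override the base directory with OUTPUT_DIR.

Workflow
1. Confirm user intent, text, voice, and output format.
2. Run one minimal request to verify credentials and cluster/voice_type.
3. Execute the target synthesis with explicit parameters.
4. Verify results and save output/evidence files.

References
- references/API_reference.md: request/response parameters and error codes
- Online speech synthesis API, HTTP one-shot synthesis (在线语音合成 API - HTTP 一次性合成)
- Basic parameter description (参数基本说明)
- Voice parameter list (发音人参数列表)
- Source list: references/sources.md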
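
The operational guidance on long text, unique reqid values, and 429 responses can be sketched as below. All three helper names are hypothetical. Cutting at sentence punctuation and the exponential backoff schedule are illustrative choices of mine; the document only asks for segmenting, a UUID per request, and a longer interval after a 429.

```python
import uuid

# Characters treated as sentence endings when choosing a cut point (assumption).
SENTENCE_ENDINGS = "。!?!?;;\n"


def split_text(text, max_bytes=1024):
    """Split text into chunks whose UTF-8 encoding is at most max_bytes each."""
    if max_bytes < 4:
        raise ValueError("max_bytes must be at least 4 to fit any UTF-8 character")
    chunks = []
    current = ""
    current_size = 0
    for ch in text:
        ch_size = len(ch.encode("utf-8"))
        while current and current_size + ch_size > max_bytes:
            # Prefer to cut right after the last sentence ending seen so far.
            cut = max(current.rfind(mark) for mark in SENTENCE_ENDINGS) + 1
            if cut <= 0:
                cut = len(current)  # no sentence ending: cut at the limit
            chunks.append(current[:cut])
            current = current[cut:]
            current_size = len(current.encode("utf-8"))
        current += ch
        current_size += ch_size
    if current:
        chunks.append(current)
    return chunks


def plan_requests(text, voice_type, max_bytes=1024):
    """Create one normalized request per chunk, each with its own UUID reqid."""
    return [
        {"reqid": str(uuid.uuid4()), "text": chunk, "voice_type": voice_type}
        for chunk in split_text(text, max_bytes)
    ]


def backoff_delay(attempt, base=1.0, cap=30.0):
    """Seconds to wait before retry number `attempt` (0-based) after a 429."""
    return min(cap, base * (2 ** attempt))
```

Measuring chunks in encoded bytes rather than characters matters here because the 1024 limit is stated in bytes and a typical Chinese character takes three bytes in UTF-8.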