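📦 audiopod: Audio AI Processing
v1.0.2
Calls the SkillBoss API Hub to handle AI music generation (text-to-music, instrumentals, samples), text-to-speech, speech-to-text, vocal separation, noise reduction, and other audio tasks in one place. Accepts local files or remote URLs for quick access to cloud audio intelligence.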
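Version
- Updated the API base URL from https://api.heybossai.com/v1 to https://api.skillboss.co/v1, with the documentation and code examples updated to match.
- All code and cURL examples now use the new API endpoint.
- No functional code or interface changes; only documentation and usage instructions are affected.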
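📖 Complete setup guide: https://skillboss.co/skill.md
# SkillBoss Audio Processing
Complete audio processing through the SkillBoss API Hub: music generation, TTS, STT, stem separation, noise reduction, and speaker diarization.
Setup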
Authentication: set the SKILLBOSS_API_KEY environment variable. All requests use Authorization: Bearer $SKILLBOSS_API_KEY.
import os
import requests

SKILLBOSS_API_KEY = os.environ["SKILLBOSS_API_KEY"]
API_BASE = "https://api.skillboss.co/v1"

def pilot(body: dict) -> dict:
    # POST the request body to the unified endpoint and return the parsed JSON.
    r = requests.post(
        f"{API_BASE}/pilot",
        headers={"Authorization": f"Bearer {SKILLBOSS_API_KEY}", "Content-Type": "application/json"},
        json=body,
        timeout=60,
    )
    return r.json()
---
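AI Music Generation
Generate songs, instrumentals, samples, and vocals from text prompts through the SkillBoss API Hub.
Features: music generation (full songs with vocals), instrumentals, rap, samples, audio style transfer
Python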
# Generate a full song
result = pilot({
    "type": "music",
    "inputs": {
        "prompt": "Upbeat pop, synth, drums, 120 bpm, female vocals, radio-ready",
        "lyrics": "Verse 1:\nWalking down the street on a sunny day\n\nChorus:\nWe're on fire tonight!",
        "duration": 60
    },
    "prefer": "quality"
})
audio_url = result["data"]["result"]["audio_url"]
print(audio_url)

# Generate an instrumental (no lyrics needed)
result = pilot({
    "type": "music",
    "inputs": {
        "prompt": "Atmospheric ambient soundscape, uplifting, driving mood",
        "duration": 30
    },
    "prefer": "balanced"
})
audio_url = result["data"]["result"]["audio_url"]

# Generate rap
result = pilot({
    "type": "music",
    "inputs": {
        "prompt": "Lo-Fi Hip Hop, 100 BPM, male rap, melancholy, keyboard chords",
        "lyrics": "Verse 1:\nStarted from the bottom, now we climbing...",
        "duration": 60,
        "style": "rap"
    },
    "prefer": "balanced"
})
audio_url = result["data"]["result"]["audio_url"]

# Generate a sample/loop
result = pilot({
    "type": "music",
    "inputs": {
        "prompt": "drum loop, sad mood",
        "duration": 15,
        "style": "samples"
    },
    "prefer": "balanced"
})
audio_url = result["data"]["result"]["audio_url"]
cURL
# Generate a full song
curl -X POST "https://api.skillboss.co/v1/pilot" \
-H "Authorization: Bearer $SKILLBOSS_API_KEY" \
-H "Content-Type: application/json" \
-d '{"type":"music","inputs":{"prompt":"upbeat pop, synth, 120bpm, female vocals","lyrics":"Walking down the street...","duration":60},"prefer":"quality"}'

# Generate an instrumental
curl -X POST "https://api.skillboss.co/v1/pilot" \
-H "Authorization: Bearer $SKILLBOSS_API_KEY" \
-H "Content-Type: application/json" \
-d '{"type":"music","inputs":{"prompt":"ambient soundscape, uplifting","duration":30},"prefer":"balanced"}'

# Generate rap
curl -X POST "https://api.skillboss.co/v1/pilot" \
-H "Authorization: Bearer $SKILLBOSS_API_KEY" \
-H "Content-Type: application/json" \
-d '{"type":"music","inputs":{"prompt":"Lo-Fi Hip Hop, male rap, 100 BPM","lyrics":"Started from the bottom...","duration":60,"style":"rap"}}'

# Generate a sample
curl -X POST "https://api.skillboss.co/v1/pilot" \
-H "Authorization: Bearer $SKILLBOSS_API_KEY" \
-H "Content-Type: application/json" \
-d '{"type":"music","inputs":{"prompt":"drum loop, sad mood","duration":15,"style":"samples"}}'
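Parameters
| Field | Required | Description |
|-------|----------|-------------|
| prompt | Yes | Style/genre description |
| lyrics | Required for songs/rap/vocals | Lyrics with verse/chorus structure |
| duration | No | Length in seconds (default 30) |
| style | No | rap, samples, instrumental, vocals (routing hint) |
Response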
audio_url = result["data"]["result"]["audio_url"]
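The parameter rules above can be captured in a small request builder. The following is a minimal sketch, not part of the SkillBoss API: `build_music_request` is a hypothetical helper name, and treating `lyrics` as mandatory whenever `style` is `rap` or `vocals` is one reading of the table's "required for songs/rap/vocals" rule.

```python
from typing import Optional

# Styles listed in the parameter table; treating them as the only valid
# routing hints is an assumption.
KNOWN_STYLES = {"rap", "samples", "instrumental", "vocals"}
# Styles for which this sketch insists on lyrics (assumption based on the table).
STYLES_NEEDING_LYRICS = {"rap", "vocals"}


def build_music_request(prompt: str, lyrics: Optional[str] = None,
                        duration: int = 30, style: Optional[str] = None,
                        prefer: str = "balanced") -> dict:
    """Build a request body for type "music" (hypothetical helper)."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    if style is not None and style not in KNOWN_STYLES:
        raise ValueError(f"unknown style: {style!r}")
    if style in STYLES_NEEDING_LYRICS and not lyrics:
        raise ValueError(f"style {style!r} needs lyrics")
    inputs = {"prompt": prompt, "duration": duration}
    if lyrics:
        inputs["lyrics"] = lyrics
    if style:
        inputs["style"] = style
    return {"type": "music", "inputs": inputs, "prefer": prefer}
```

The returned dictionary can be passed straight to `pilot(...)`. Validating locally surfaces a missing `lyrics` field before a request is sent.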
---
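Stem Separation
Split audio into individual instrument and vocal stems through the SkillBoss API Hub.
Modes
| Mode | Stems | Output | Use case |
|------|-------|--------|----------|
| single | 1 | Specified stem only | Vocal isolation, drum extraction |
| two | 2 | Vocals + instrumental | Karaoke |
| four | 4 | Vocals, drums, bass, other | Standard mixing (default) |
| six | 6 | Adds guitar, piano | Full instrument separation |
| producer | 8 | Adds kick, snare, hihat | Beat production |
| studio | 12 | Adds cymbals, sub_bass, synth | Professional mixing |
| mastering | 16 | Maximum detail | Forensic analysis |
Single-stem options: vocals, drums, bass, guitar, piano, other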
Python
# Extract stems from a URL
result = pilot({
    "type": "audio",
    "capability": "stem separation",
    "inputs": {
        "url": "https://youtube.com/watch?v=VIDEO_ID",
        "mode": "six"
    },
    "prefer": "quality"
})
download_urls = result["data"]["result"]["download_urls"]
for stem, url in download_urls.items():
    print(f"{stem}: {url}")

# Extract from a local file (base64-encoded)
import base64
audio_b64 = base64.b64encode(open("/path/to/song.mp3", "rb").read()).decode()
result = pilot({
    "type": "audio",
    "capability": "stem separation",
    "inputs": {
        "audio_data": audio_b64,
        "filename": "song.mp3",
        "mode": "four"
    },
    "prefer": "balanced"
})
download_urls = result["data"]["result"]["download_urls"]

# Extract a single stem
result = pilot({
    "type": "audio",
    "capability": "stem separation",
    "inputs": {
        "url": "https://youtube.com/watch?v=ID",
        "mode": "single",
        "stem": "vocals"
    },
    "prefer": "quality"
})
vocal_url = result["data"]["result"]["download_urls"]["vocals"]
cURL
# Extract stems from a URL
curl -X POST "https://api.skillboss.co/v1/pilot" \
-H "Authorization: Bearer $SKILLBOSS_API_KEY" \
-H "Content-Type: application/json" \
-d '{"type":"audio","capability":"stem separation","inputs":{"url":"https://youtube.com/watch?v=VIDEO_ID","mode":"six"},"prefer":"quality"}'

# Single stem
curl -X POST "https://api.skillboss.co/v1/pilot" \
-H "Authorization: Bearer $SKILLBOSS_API_KEY" \
-H "Content-Type: application/json" \
-d '{"type":"audio","capability":"stem separation","inputs":{"url":"URL","mode":"single","stem":"vocals"}}'
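Because the `mode` and `stem` values come from fixed lists, a request can be checked before it is sent. This is a rough sketch under stated assumptions: `build_stem_request` is a hypothetical helper, and rejecting `stem` outside `single` mode is a guess about intent, not documented API behavior.

```python
from typing import Optional

# Modes and single-stem options as listed in the stem separation section.
MODES = ("single", "two", "four", "six", "producer", "studio", "mastering")
SINGLE_STEMS = ("vocals", "drums", "bass", "guitar", "piano", "other")


def build_stem_request(url: str, mode: str = "four",
                       stem: Optional[str] = None,
                       prefer: str = "balanced") -> dict:
    """Build a stem-separation body for a remote URL (hypothetical helper)."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    inputs = {"url": url, "mode": mode}
    if mode == "single":
        if stem not in SINGLE_STEMS:
            raise ValueError(f"mode 'single' needs a stem from {SINGLE_STEMS}")
        inputs["stem"] = stem
    elif stem is not None:
        # Assumption: a stem name only makes sense in single mode.
        raise ValueError("stem is only used with mode 'single'")
    return {"type": "audio", "capability": "stem separation",
            "inputs": inputs, "prefer": prefer}
```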
Response format
{
  "data": {
    "result": {
      "download_urls": {
        "vocals": "https://...",
        "drums": "https://...",
        "bass": "https://...",
        "other": "https://..."
      },
      "quality_scores": {
        "vocals": 0.95,
        "drums": 0.88
      }
    }
  }
}
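The `quality_scores` field can be used to keep only the stems that separated cleanly. The sketch below assumes higher scores mean better separation on a 0 to 1 scale, which the sample values suggest but the source does not state; `stems_above` is a hypothetical helper, and the URLs are placeholders.

```python
def stems_above(response: dict, threshold: float = 0.9) -> dict:
    """Return {stem: url} for stems whose quality score meets the threshold.

    Stems without a score are skipped, since the sample response lists
    scores for only some of the stems.
    """
    result = response.get("data", {}).get("result", {})
    urls = result.get("download_urls", {})
    scores = result.get("quality_scores", {})
    return {stem: url for stem, url in urls.items()
            if scores.get(stem, 0.0) >= threshold}


# Placeholder response shaped like the documented format.
sample = {"data": {"result": {
    "download_urls": {
        "vocals": "https://example.com/vocals.wav",
        "drums": "https://example.com/drums.wav",
        "bass": "https://example.com/bass.wav",
        "other": "https://example.com/other.wav",
    },
    "quality_scores": {"vocals": 0.95, "drums": 0.88},
}}}

print(stems_above(sample))  # → {'vocals': 'https://example.com/vocals.wav'}
```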
---
Text-to-Speech
Generate speech through the SkillBoss API Hub with 50+ voices in 60+ languages.
Python
# Generate speech
result = pilot({
    "type": "tts",
    "inputs": {
        "text": "Hello, world! This is a test.",
        "voice": "alloy",
        "speed": 1.0
    },
    "prefer": "balanced"
})
audio_url = result["data"]["result"]["audio_url"]
print(audio_url)

# Specify a language
result = pilot({
    "type": "tts",
    "inputs": {
        "text": "Bonjour le monde",
        "voice": "alloy",
        "language": "fr",
        "speed": 1.0
    },
    "prefer": "quality"
})
audio_url = result["data"]["result"]["audio_url"]
cURL
# Generate speech
curl -X POST "https://api.skillboss.co/v1/pilot" \
-H "Authorization: Bearer $SKILLBOSS_API_KEY" \
-H "Content-Type: application/json" \
-d '{"type":"tts","inputs":{"text":"Hello world, this is a test","voice":"alloy","speed":1.0},"prefer":"balanced"}'
Parameters
| Field | Required | Description |
|-------|----------|-------------|
| text | Yes | Text to speak (up to 5000 characters) |
| voice | No | Voice name or ID (e.g. alloy, echo, fable) |
| speed | No | 0.25 - 4.0 (default 1.0) |
| language | No | ISO code; auto-detected when omitted |
Response
audio_url = result["data"]["result"]["audio_url"]
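Text longer than the 5000-character limit has to be split before it is sent. The following is one possible approach rather than anything the API provides: `split_for_tts` is a hypothetical helper that breaks on sentence-ending punctuation and hard-wraps only when a single sentence exceeds the limit. It counts Python string characters, which may differ from how the service measures length.

```python
import re

MAX_CHARS = 5000  # limit stated in the TTS parameter table


def split_for_tts(text: str, limit: int = MAX_CHARS) -> list:
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    # Split after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-wrap any single sentence that is itself over the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


print(split_for_tts("One. Two. Three.", limit=10))  # → ['One. Two.', 'Three.']
```

Each chunk can then be sent as its own `tts` request and the resulting audio files concatenated afterwards.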
---
Speaker Diarization
Split audio by speaker through the SkillBoss API Hub, with automatic segmentation.
Python
# Diarize a local file
import base64
audio_b64 = base64.b64encode(open("./meeting.mp3", "rb").read()).decode()
result = pilot({
    "type": "stt",
    "capability": "speaker diarization",
    "inputs": {
        "audio_data": audio_b64,
        "filename": "meeting.mp3",
        "num_speakers": 3
    },
    "prefer": "quality"
})
for segment in result["data"]["result"]["segments"]:
    print(f"Speaker {segment['speaker']}: {segment['text']} [{segment['start']:.1f}s - {segment['end']:.1f}s]")

# Diarize from a URL
result = pilot({
    "type": "stt",
    "capability": "speaker diarization",
    "inputs": {
        "url": "https://youtube.com/watch?v=VIDEO_ID",
        "num_speakers": 2
    },
    "prefer": "balanced"
})
segments = result["data"]["result"]["segments"]
cURL
# Diarize from a URL
curl -X POST "https://api.skillboss.co/v1/pilot" \
-H "Authorization: Bearer $SKILLBOSS_API_KEY" \
-H "Content-Type: application/json" \
-d '{"type":"stt","capability":"speaker diarization","inputs":{"url":"https://youtube.com/watch?v=VIDEO_ID","num_speakers":2},"prefer":"balanced"}'
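Once segments come back, a common next step is totalling how long each speaker talked. This is a minimal sketch that relies only on the `speaker`, `start`, and `end` fields used in the Python example above; `talk_time` is a hypothetical helper, and the integer speaker labels in the sample data are an assumption about the label type.

```python
from collections import defaultdict


def talk_time(segments: list) -> dict:
    """Sum the speaking time in seconds for each speaker label."""
    totals = defaultdict(float)
    for seg in segments:
        totals[seg["speaker"]] += seg["end"] - seg["start"]
    return dict(totals)


# Made-up segments shaped like the fields printed in the example above.
sample_segments = [
    {"speaker": 0, "text": "Hi.", "start": 0.0, "end": 1.5},
    {"speaker": 1, "text": "Hello.", "start": 1.5, "end": 3.0},
    {"speaker": 0, "text": "Shall we start?", "start": 3.0, "end": 5.0},
]

print(talk_time(sample_segments))  # → {0: 3.5, 1: 1.5}
```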
---
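Speech-to-Text (Transcription)
Transcribe audio or video through the SkillBoss API Hub, with speaker diarization, word-level timestamps, and multiple output formats.
Python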
# Transcribe from a URL
result = pilot({
    "type": "stt",
    "inputs": {
        "url": "https://youtube.com/watch?v=VIDEO_ID",
        "speaker_diarization": True,
        "word_timestamps": True
    },
    "prefer": "balanced"
})
text = result["data"]["result"]["text"]
segments = result["data"]["result"].get("segments", [])
print(f"Transcript: {text}")
for seg in segments:
    print(f"[{seg.get('start', 0):.1f}s] {seg.get('speaker','?')}: {seg['text']}")

# Transcribe a local file
import base64
audio_b64 = base64.b64encode(open("./recording.mp3", "rb").read()).decode()
result = pilot({
    "type": "stt",
    "inputs": {
        "audio_data": audio_b64,
        "filename": "recording.mp3",
        "language": "en",
        "speaker_diarization": True
    },
    "prefer": "balanced"
})
text = result["data"]["result"]["text"]
cURL
# Transcribe from a URL
curl -X POST "https://api.skillboss.co/v1/pilot" \
-H "Authorization: Bearer $SKILLBOSS_API_KEY" \
-H "Content-Type: application/json" \
-d '{"type":"stt","inputs":{"url":"https://youtube.com/watch?v=ID","speaker_diarization":true,"word_timestamps":true},"prefer":"balanced"}'
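Parameters
| Field | Required | Description |
|-------|----------|-------------|
| url | Yes (or audio_data) | URL to transcribe (YouTube, SoundCloud, direct link) |
| audio_data | Yes (or url) | Base64-encoded audio content |
| filename | When using audio_data | Original filename, used for format detection |
| language | No | ISO 639-1 code; auto-detected when omitted |
| speaker_diarization | No | Enable speaker identification (default false) |
| word_timestamps | No | Enable word-level timestamps (default true) |
Response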
text = result["data"]["result"]["text"]
segments = result["data"]["result"].get("segments", [])  # present when speaker_diarization is enabled
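The returned segments can be turned into subtitles on the client side. The sketch below renders them as SRT cues locally; it does not use any server-side output format option. The helper names are hypothetical, and it assumes `start` and `end` are in seconds, as the print statements in the examples suggest.

```python
def to_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: list) -> str:
    """Render segments (start, end, text, optional speaker) as SRT cues."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        speaker = seg.get("speaker")
        text = seg["text"] if speaker is None else f"{speaker}: {seg['text']}"
        cues.append(f"{index}\n{to_timestamp(seg['start'])} --> "
                    f"{to_timestamp(seg['end'])}\n{text}\n")
    return "\n".join(cues)


print(to_timestamp(3661.5))  # → 01:01:01,500
```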
---
Noise Reduction
Remove background noise from audio through the SkillBoss API Hub.
Python
# Denoise a local file
import base64
audio_b64 = base64.b64encode(open("./noisy-audio.mp3", "rb").read()).decode()
result = pilot({
    "type": "audio",
    "capability": "noise reduction",
    "inputs": {
        "audio_data": audio_b64,
        "filename": "noisy-audio.mp3"
    },
    "prefer": "quality"
})
clean_url = result["data"]["result"]["audio_url"]
print(f"Clean audio: {clean_url}")

# Denoise from a URL
result = pilot({
    "type": "audio",
    "capability": "noise reduction",
    "inputs": {
        "url": "https://example.com/noisy.mp3"
    },
    "prefer": "balanced"
})
clean_url = result["data"]["result"]["audio_url"]
cURL
# Denoise from a URL
curl -X POST "https://api.skillboss.co/v1/pilot" \
-H "Authorization: Bearer $SKILLBOSS_API_KEY" \
-H "Content-Type: application/json" \
-d '{"type":"audio","capability":"noise reduction","inputs":{"url":"https://example.com/noisy.mp3"},"prefer":"quality"}'
Response
clean_url = result["data"]["result"]["audio_url"]
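Every audio feature in this document accepts either a `url` or a base64 `audio_data` plus `filename` pair, so the choice can be factored into one helper. This is a sketch under the assumption that anything not starting with `http://` or `https://` is a local path; `audio_inputs` is a hypothetical name.

```python
import base64
import os


def audio_inputs(source: str) -> dict:
    """Return the `inputs` fields for a remote URL or a local file path."""
    if source.startswith(("http://", "https://")):
        return {"url": source}
    # The with-block closes the file, unlike a bare open(...).read().
    with open(source, "rb") as handle:
        encoded = base64.b64encode(handle.read()).decode("ascii")
    return {"audio_data": encoded, "filename": os.path.basename(source)}
```

Extra fields such as `mode` or `num_speakers` can be merged into the returned dictionary before it is placed under `"inputs"`.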
---
Chain Mode (Multi-Step Workflows)
Use SkillBoss Chain mode to string multiple audio steps together in a single call.
Python
# STT → Chat (summarize) → TTS pipeline
result = pilot({
    "chain": [
        {"type": "stt"},
        {"type": "chat", "capability": "summarize"},
        {"type": "tts"}
    ]
})

# Transcribe and translate
result = pilot({
    "chain": [
        {"type": "stt"},
        {"type": "chat", "capability": "translate to English"}
    ]
})
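Chain bodies like the ones above can be assembled from a compact step list. The helper below is a hypothetical sketch: it only checks step types against those that appear in this document, and it leaves out how input audio is attached to a chain because the examples do not show that.

```python
# Request types that appear in this document; whether a chain accepts
# other types is not stated in the source.
KNOWN_TYPES = {"music", "tts", "stt", "audio", "chat"}


def build_chain(*steps) -> dict:
    """Build a chain body from "type" strings or ("type", "capability") pairs."""
    chain = []
    for step in steps:
        step_type, capability = (step, None) if isinstance(step, str) else step
        if step_type not in KNOWN_TYPES:
            raise ValueError(f"unknown step type: {step_type!r}")
        entry = {"type": step_type}
        if capability:
            entry["capability"] = capability
        chain.append(entry)
    return {"chain": chain}


body = build_chain("stt", ("chat", "summarize"), "tts")
```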
---
API Endpoint Summary
All features route through one endpoint:
| Service | SkillBoss type | capability |
|---------|---------------|------------|
| Music generation | music | - |
| TTS | tts | - |
| STT / transcription | stt | - |
| Speaker diarization | stt | speaker diarization |
| Stem separation | audio | stem separation |
| Noise reduction | audio | noise reduction |
Unified endpoint: POST https://api.skillboss.co/v1/pilot
Authentication: Authorization: Bearer $SKILLBOSS_API_KEY
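Since every service differs only in its `type` and optional `capability`, the routing table can live in one dictionary. The sketch below is illustrative: the short keys (`"stems"`, `"denoise"`, and so on) and the `route` function are hypothetical names, while the type and capability values are copied from the table.

```python
# (type, capability) pairs from the summary table; the keys are made-up labels.
ROUTES = {
    "music": ("music", None),
    "tts": ("tts", None),
    "stt": ("stt", None),
    "diarization": ("stt", "speaker diarization"),
    "stems": ("audio", "stem separation"),
    "denoise": ("audio", "noise reduction"),
}


def route(service: str, inputs: dict, prefer: str = "balanced") -> dict:
    """Build the request body for one of the services in ROUTES."""
    type_, capability = ROUTES[service]
    body = {"type": type_, "inputs": inputs, "prefer": prefer}
    if capability is not None:
        body["capability"] = capability
    return body


body = route("denoise", {"url": "https://example.com/noisy.mp3"}, prefer="quality")
```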
Response Format Summary
| Feature | Result path |
|-----------|-------------|
| Music generation | data.result.audio_url |
| TTS | data.result.audio_url |
| STT (text) | data.result.text |
| STT (segments) | data.result.segments |
| Stem separation | data.result.download_urls |
| Noise reduction | data.result.audio_url |
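The dotted result paths in this table can be followed with a small lookup that tolerates missing keys. The source does not document what an error response looks like, so this sketch returns a default instead of raising `KeyError`; `get_path` is a hypothetical helper.

```python
def get_path(response: dict, dotted: str, default=None):
    """Follow a dotted path such as "data.result.audio_url" through nested dicts."""
    node = response
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


# Placeholder response shaped like the documented TTS result.
sample_response = {"data": {"result": {"audio_url": "https://example.com/out.mp3"}}}

print(get_path(sample_response, "data.result.audio_url"))  # → https://example.com/out.mp3
print(get_path(sample_response, "data.result.text"))       # → None
```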