📦 audiopod: Audio AI Processing

v1.0.2

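Uses the SkillBoss API Hub as a one-stop service for audio tasks: AI music generation (text-to-music, instrumentals, samples), text-to-speech, speech-to-text, vocal separation, and noise reduction. Accepts local files or remote URLs for quick access to cloud audio processing.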


by @modestyrichards (ModestyRichards)

Tags: AI model access, API tools, file processing, data processing, cloud services


Last updated: 2026/4/15

Security scan

VirusTotal: Pending

OpenClaw: Suspicious (medium confidence)

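The skill's runtime instructions are consistent with a SkillBoss audio-processing integration, but the registry metadata and provenance information are inconsistent (the required SKILLBOSS_API_KEY is not declared, and there is no homepage or source repository). Use with caution.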

Assessment recommendations

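The skill does appear to perform audio tasks by calling SkillBoss, but there are two warning signs: SKILL.md requires SKILLBOSS_API_KEY while the registry metadata lists no required environment variables, and there is no homepage or clearly identified owner. Before installing: (1) verify the SkillBoss API domains (https://skillboss.co and https://api.skillboss.co) and their privacy policy and terms; (2) provide only a narrowly scoped, revocable API key, never main-account or high-privilege credentials; (3) avoid letting the skill read sensitive local files, and upload only audio you control; (4) prefer running it in a sandbox or with a revocable key; (5) if you need stronger assurance, ask the publisher for an official homepage or source repository and for the registry metadata to be corrected to declare SKILLBOSS_API_KEY...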
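Recommendation (1) can be partly enforced in code. The following is a minimal sketch and not part of the skill: it accepts a request URL only when it uses https and targets one of the two hosts named in the recommendation. The names `ALLOWED_HOSTS` and `is_trusted_endpoint` are illustrative assumptions.

```python
from urllib.parse import urlparse

# Hosts named in the recommendation; extend only if the publisher documents others.
ALLOWED_HOSTS = {"skillboss.co", "api.skillboss.co"}


def is_trusted_endpoint(url: str) -> bool:
    """Return True only for https URLs whose exact host is on the allowlist."""
    parts = urlparse(url)
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS
```

Matching the exact hostname, rather than checking a substring, rejects lookalike hosts that merely start with an allowed name.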

Detailed analysis

ℹ Purpose and capabilities

SKILL.md describes audio tasks (music generation, TTS, STT, stem separation, noise reduction) and the runtime examples only call https://api.skillboss.co/v1/pilot with an API key — that is coherent with the stated purpose. However, the registry metadata claims no required environment variables or primary credential while the SKILL.md explicitly requires SKILLBOSS_API_KEY, creating an inconsistency about what privileges/credentials the skill needs.

✓ Instruction scope

Instructions are focused on calling the SkillBoss API and include examples for sending remote URLs or base64-encoding local audio files. The skill asks the agent to read files provided by the user (e.g., /path/to/song.mp3) and the SKILLBOSS_API_KEY env var. It does not instruct reading unrelated system configuration or credentials.

✓ Install mechanism

There is no install spec and no code files to install — the skill is instruction-only, which minimizes disk-write risk.

⚠ Credential requirements

The SKILL.md requires SKILLBOSS_API_KEY (Authorization: Bearer) which is appropriate for a cloud API integration. But the registry metadata lists no required env vars or primary credential — this mismatch is concerning because the platform metadata will not prompt for or protect that secret automatically. Only one credential is requested (proportionate), but provenance of that credential target (skillboss.co) is not verifiable from the package (no homepage/source).

✓ Persistence and privileges

always:false and no install actions are declared; the skill does not request permanent presence or modifications to other skills or system config.

Security is layered; review the code before running.

Runtime dependencies

No special dependencies

Versions

latest · v1.0.2 · 2026/4/13

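- Updated the API base URL from https://api.heybossai.com/v1 to https://api.skillboss.co/v1, with documentation and code examples updated to match.
- All code and cURL examples now use the new API endpoint.
- No functional code or interface changes; documentation and usage instructions only.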

● Pending

Install command

Official: npx clawhub@latest install modesty-audiopod

Mirror: npx clawhub@latest install modesty-audiopod --registry https://cn.longxiaskill.com

Skill documentation

📖 Complete setup guide: https://skillboss.co/skill.md

# SkillBoss Audio Processing

Complete audio processing through the SkillBoss API Hub: music generation, TTS, STT, stem separation, noise reduction, and speaker diarization.

Setup

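import os

import requests

# Read the API key from the environment; raises KeyError if it is unset.
SKILLBOSS_API_KEY = os.environ["SKILLBOSS_API_KEY"]
API_BASE = "https://api.skillboss.co/v1"

def pilot(body: dict) -> dict:
    # POST the request body to the unified /pilot endpoint and return the parsed JSON.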
    r = requests.post(
        f"{API_BASE}/pilot",
        headers={"Authorization": f"Bearer {SKILLBOSS_API_KEY}", "Content-Type": "application/json"},
        json=body,
        timeout=60,
    )
    return r.json()

Authentication: set the SKILLBOSS_API_KEY environment variable. Every request sends Authorization: Bearer $SKILLBOSS_API_KEY.
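Reading the key with `os.environ[...]` fails with a bare KeyError when the variable is missing. A small guard can give a clearer message. This is a minimal sketch, assuming a hypothetical helper named `auth_headers`; it builds only the two headers described above and makes no network call.

```python
import os


def auth_headers(env=None) -> dict:
    """Build the request headers, failing fast when the API key is absent."""
    env = os.environ if env is None else env
    key = env.get("SKILLBOSS_API_KEY", "").strip()
    if not key:
        raise RuntimeError("SKILLBOSS_API_KEY is not set")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

The returned dict can be passed as the `headers` argument of `requests.post` inside `pilot`.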

---

AI Music Generation

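Generate songs, instrumentals, samples, and vocals from text prompts through the SkillBoss API Hub.

Features: music generation (full songs with vocals), instrumentals, rap, samples, audio style transfer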

Python

# Generate a full song
result = pilot({
    "type": "music",
    "inputs": {
        "prompt": "Upbeat pop, synth, drums, 120 bpm, female vocals, radio-ready",
        "lyrics": "Verse 1:\nWalking down the street on a sunny day\n\nChorus:\nWe're on fire tonight!",
        "duration": 60
    },
    "prefer": "quality"
})
audio_url = result["data"]["result"]["audio_url"]
print(audio_url)
# Generate an instrumental (no lyrics needed)
result = pilot({
    "type": "music",
    "inputs": {
        "prompt": "Atmospheric ambient soundscape, uplifting, driving mood",
        "duration": 30
    },
    "prefer": "balanced"
})
audio_url = result["data"]["result"]["audio_url"]
# Generate rap
result = pilot({
    "type": "music",
    "inputs": {
        "prompt": "Lo-Fi Hip Hop, 100 BPM, male rap, melancholy, keyboard chords",
        "lyrics": "Verse 1:\nStarted from the bottom, now we climbing...",
        "duration": 60,
        "style": "rap"
    },
    "prefer": "balanced"
})
audio_url = result["data"]["result"]["audio_url"]
# Generate a sample/loop
result = pilot({
    "type": "music",
    "inputs": {
        "prompt": "drum loop, sad mood",
        "duration": 15,
        "style": "samples"
    },
    "prefer": "balanced"
})
audio_url = result["data"]["result"]["audio_url"]

cURL

# Generate a full song
curl -X POST "https://api.skillboss.co/v1/pilot" \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"type":"music","inputs":{"prompt":"upbeat pop, synth, 120bpm, female vocals","lyrics":"Walking down the street...","duration":60},"prefer":"quality"}'

# Generate an instrumental
curl -X POST "https://api.skillboss.co/v1/pilot" \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"type":"music","inputs":{"prompt":"ambient soundscape, uplifting","duration":30},"prefer":"balanced"}'

# Generate rap
curl -X POST "https://api.skillboss.co/v1/pilot" \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"type":"music","inputs":{"prompt":"Lo-Fi Hip Hop, male rap, 100 BPM","lyrics":"Started from the bottom...","duration":60,"style":"rap"}}'

# Generate a sample
curl -X POST "https://api.skillboss.co/v1/pilot" \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"type":"music","inputs":{"prompt":"drum loop, sad mood","duration":15,"style":"samples"}}'

Parameters

| Field | Required | Description |
|-------|----------|-------------|
| prompt | Yes | Style/genre description |
| lyrics | For songs, rap, and vocals | Lyrics with verse/chorus structure |
| duration | No | Length in seconds (default 30) |
| style | No | rap, samples, instrumental, vocals (routing hint) |
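The parameter rules can be checked locally before a request is sent. The sketch below is one possible reading of the table, not the service's own validation: `music_inputs` is a hypothetical helper, and treating `rap` and `vocals` as the styles that need lyrics is an assumption, since the table does not say how a "song" is detected when `style` is omitted.

```python
from typing import Optional

# Style values listed in the parameter table.
KNOWN_STYLES = {"rap", "samples", "instrumental", "vocals"}
# Assumption: these are the styles for which lyrics are mandatory.
STYLES_NEEDING_LYRICS = {"rap", "vocals"}


def music_inputs(prompt: str, lyrics: Optional[str] = None,
                 duration: int = 30, style: Optional[str] = None) -> dict:
    """Build the `inputs` object for a music request, validating the table's rules."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    if style is not None and style not in KNOWN_STYLES:
        raise ValueError(f"unknown style: {style!r}")
    if style in STYLES_NEEDING_LYRICS and not lyrics:
        raise ValueError(f"lyrics are required when style is {style!r}")
    inputs = {"prompt": prompt, "duration": duration}
    if lyrics:
        inputs["lyrics"] = lyrics
    if style:
        inputs["style"] = style
    return inputs
```

The result is meant to be used as the `inputs` value of a body such as `{"type": "music", "inputs": ..., "prefer": "balanced"}`.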

Response

audio_url = result["data"]["result"]["audio_url"]

---

Stem Separation

Split audio into separate instrument and vocal tracks through the SkillBoss API Hub.

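Modes

| Mode | Stems | Output | Use case |
|------|-------|--------|----------|
| single | 1 | Only the specified stem | Vocal isolation, drum extraction |
| two | 2 | Vocals + instrumental | Karaoke |
| four | 4 | Vocals, drums, bass, other | Standard mixing (default) |
| six | 6 | Adds guitar, piano | Full instrument separation |
| producer | 8 | Adds kick, snare, hihat | Beat production |
| studio | 12 | Adds cymbals, sub_bass, synth | Professional mixing |
| mastering | 16 | Maximum detail | Forensic analysis |

Single-stem options: vocals, drums, bass, guitar, piano, other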
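The mode table and the single-stem options translate directly into a lookup plus a small check. This is an illustrative sketch: `STEM_COUNTS`, `SINGLE_STEMS`, and `stem_inputs` are hypothetical names, and rejecting `stem` outside `single` mode is an assumption about how the service treats that field.

```python
from typing import Optional

# Mode name -> number of stems, copied from the table above.
STEM_COUNTS = {
    "single": 1, "two": 2, "four": 4, "six": 6,
    "producer": 8, "studio": 12, "mastering": 16,
}
SINGLE_STEMS = {"vocals", "drums", "bass", "guitar", "piano", "other"}


def stem_inputs(url: str, mode: str = "four", stem: Optional[str] = None) -> dict:
    """Build the `inputs` object for a stem-separation request from a URL."""
    if mode not in STEM_COUNTS:
        raise ValueError(f"unknown mode: {mode!r}")
    inputs = {"url": url, "mode": mode}
    if mode == "single":
        if stem not in SINGLE_STEMS:
            raise ValueError("single mode needs one of: " + ", ".join(sorted(SINGLE_STEMS)))
        inputs["stem"] = stem
    elif stem is not None:
        # Assumption: `stem` is only meaningful together with mode="single".
        raise ValueError("stem only applies to mode='single'")
    return inputs
```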

Python

# Extract stems from a URL
result = pilot({
    "type": "audio",
    "capability": "stem separation",
    "inputs": {
        "url": "https://youtube.com/watch?v=VIDEO_ID",
        "mode": "six"
    },
    "prefer": "quality"
})
download_urls = result["data"]["result"]["download_urls"]
for stem, url in download_urls.items():
    print(f"{stem}: {url}")
# Extract from a local file (base64-encoded)
import base64
with open("/path/to/song.mp3", "rb") as f:
    audio_b64 = base64.b64encode(f.read()).decode()
result = pilot({
    "type": "audio",
    "capability": "stem separation",
    "inputs": {
        "audio_data": audio_b64,
        "filename": "song.mp3",
        "mode": "four"
    },
    "prefer": "balanced"
})
download_urls = result["data"]["result"]["download_urls"]
# Extract a single stem
result = pilot({
    "type": "audio",
    "capability": "stem separation",
    "inputs": {
        "url": "https://youtube.com/watch?v=ID",
        "mode": "single",
        "stem": "vocals"
    },
    "prefer": "quality"
})
vocal_url = result["data"]["result"]["download_urls"]["vocals"]

cURL

# Extract stems from a URL
curl -X POST "https://api.skillboss.co/v1/pilot" \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"type":"audio","capability":"stem separation","inputs":{"url":"https://youtube.com/watch?v=VIDEO_ID","mode":"six"},"prefer":"quality"}'

# Single stem
curl -X POST "https://api.skillboss.co/v1/pilot" \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"type":"audio","capability":"stem separation","inputs":{"url":"URL","mode":"single","stem":"vocals"}}'

Response format

{
  "data": {
    "result": {
      "download_urls": {
        "vocals": "https://...",
        "drums": "https://...",
        "bass": "https://...",
        "other": "https://..."
      },
      "quality_scores": {
        "vocals": 0.95,
        "drums": 0.88
      }
    }
  }
}
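Since the response pairs each stem URL with an optional quality score, a caller may want to keep only the stems that clear a threshold. The sketch below assumes scores lie between 0 and 1 with higher meaning better, which the source does not state; `stems_meeting_quality` is a hypothetical helper, and stems without a score are dropped.

```python
def stems_meeting_quality(response: dict, min_score: float = 0.9) -> dict:
    """Return {stem: url} for stems whose quality score is at least min_score."""
    result = response["data"]["result"]
    urls = result["download_urls"]
    # quality_scores may be missing or partial; unscored stems count as 0.0.
    scores = result.get("quality_scores", {})
    return {name: url for name, url in urls.items()
            if scores.get(name, 0.0) >= min_score}
```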

---

Text-to-Speech

Generate speech in 50+ voices and 60+ languages through the SkillBoss API Hub.

Python

# Generate speech
result = pilot({
    "type": "tts",
    "inputs": {
        "text": "Hello, world! This is a test.",
        "voice": "alloy",
        "speed": 1.0
    },
    "prefer": "balanced"
})
audio_url = result["data"]["result"]["audio_url"]
print(audio_url)
# Specify a language
result = pilot({
    "type": "tts",
    "inputs": {
        "text": "Bonjour le monde",
        "voice": "alloy",
        "language": "fr",
        "speed": 1.0
    },
    "prefer": "quality"
})
audio_url = result["data"]["result"]["audio_url"]

cURL

# Generate speech
curl -X POST "https://api.skillboss.co/v1/pilot" \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"type":"tts","inputs":{"text":"Hello world, this is a test","voice":"alloy","speed":1.0},"prefer":"balanced"}'

Parameters

| Field | Required | Description |
|-------|----------|-------------|
| text | Yes | Text to speak (max 5000 characters) |
| voice | No | Voice name or ID (e.g. alloy, echo, fable) |
| speed | No | 0.25 to 4.0 (default 1.0) |
| language | No | ISO code; auto-detected when omitted |
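The 5000-character limit means longer texts have to be split across several requests. The following is a minimal sketch of client-side splitting and validation under the limits in the table; `split_for_tts` and `tts_inputs` are hypothetical helpers, and counting characters with Python's `len` is an assumption about how the service counts them.

```python
from typing import List, Optional

MAX_TTS_CHARS = 5000               # limit stated in the parameter table
MIN_SPEED, MAX_SPEED = 0.25, 4.0   # speed range stated in the parameter table


def split_for_tts(text: str, limit: int = MAX_TTS_CHARS) -> List[str]:
    """Split text into chunks of at most `limit` characters, breaking at spaces when possible."""
    chunks = []
    rest = text.strip()
    while len(rest) > limit:
        cut = rest.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no space available: fall back to a hard split
        chunks.append(rest[:cut].rstrip())
        rest = rest[cut:].lstrip()
    if rest:
        chunks.append(rest)
    return chunks


def tts_inputs(text: str, voice: Optional[str] = None,
               speed: float = 1.0, language: Optional[str] = None) -> dict:
    """Build the `inputs` object for one TTS request, enforcing the documented limits."""
    if not text or len(text) > MAX_TTS_CHARS:
        raise ValueError(f"text must be 1 to {MAX_TTS_CHARS} characters")
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}")
    inputs = {"text": text, "speed": speed}
    if voice:
        inputs["voice"] = voice
    if language:
        inputs["language"] = language
    return inputs
```

Each chunk from `split_for_tts` can then be sent as its own request, with the resulting audio files joined afterwards by whatever tool the caller prefers.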

Response

audio_url = result["data"]["result"]["audio_url"]

---

Speaker Diarization

Automatically segment audio and split it by speaker through the SkillBoss API Hub.

Python

# Diarize a local file
import base64
with open("./meeting.mp3", "rb") as f:
    audio_b64 = base64.b64encode(f.read()).decode()
result = pilot({
    "type": "stt",
    "capability": "speaker diarization",
    "inputs": {
        "audio_data": audio_b64,
        "filename": "meeting.mp3",
        "num_speakers": 3
    },
    "prefer": "quality"
})
for segment in result["data"]["result"]["segments"]:
    print(f"Speaker {segment['speaker']}: {segment['text']} [{segment['start']:.1f}s - {segment['end']:.1f}s]")
# Diarize from a URL
result = pilot({
    "type": "stt",
    "capability": "speaker diarization",
    "inputs": {
        "url": "https://youtube.com/watch?v=VIDEO_ID",
        "num_speakers": 2
    },
    "prefer": "balanced"
})
segments = result["data"]["result"]["segments"]

cURL

# Diarize from a URL
curl -X POST "https://api.skillboss.co/v1/pilot" \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"type":"stt","capability":"speaker diarization","inputs":{"url":"https://youtube.com/watch?v=VIDEO_ID","num_speakers":2},"prefer":"balanced"}'
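Once the segments are available, a common next step is to total the speaking time per speaker. This is a small sketch that assumes each segment carries the `speaker`, `start`, and `end` fields used in the Python example above; `talk_time` is a hypothetical helper, not part of the API.

```python
from collections import defaultdict


def talk_time(segments: list) -> dict:
    """Sum the duration in seconds of all segments, grouped by speaker."""
    totals = defaultdict(float)
    for seg in segments:
        totals[seg["speaker"]] += seg["end"] - seg["start"]
    return dict(totals)
```

For example, `talk_time(result["data"]["result"]["segments"])` gives a mapping from speaker label to seconds spoken.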

---

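Speech-to-Text (Transcription)

Transcribe audio or video through the SkillBoss API Hub, with speaker segmentation, word-level timestamps, and multiple output formats.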

Python

# Transcribe from a URL
result = pilot({
    "type": "stt",
    "inputs": {
        "url": "https://youtube.com/watch?v=VIDEO_ID",
        "speaker_diarization": True,
        "word_timestamps": True
    },
    "prefer": "balanced"
})
text = result["data"]["result"]["text"]
segments = result["data"]["result"].get("segments", [])
print(f"Transcript: {text}")
for seg in segments:
    print(f"[{seg.get('start', 0):.1f}s] {seg.get('speaker','?')}: {seg['text']}")
# Transcribe a local file
import base64
with open("./recording.mp3", "rb") as f:
    audio_b64 = base64.b64encode(f.read()).decode()
result = pilot({
    "type": "stt",
    "inputs": {
        "audio_data": audio_b64,
        "filename": "recording.mp3",
        "language": "en",
        "speaker_diarization": True
    },
    "prefer": "balanced"
})
text = result["data"]["result"]["text"]

cURL

# Transcribe from a URL
curl -X POST "https://api.skillboss.co/v1/pilot" \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"type":"stt","inputs":{"url":"https://youtube.com/watch?v=ID","speaker_diarization":true,"word_timestamps":true},"prefer":"balanced"}'

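Parameters

| Field | Required | Description |
|-------|----------|-------------|
| url | Yes (or audio_data) | URL to transcribe (YouTube, SoundCloud, direct link) |
| audio_data | Yes (or url) | Base64-encoded audio content |
| filename | With audio_data | Original filename, used for format detection |
| language | No | ISO 639-1 code; auto-detected when omitted |
| speaker_diarization | No | Enable speaker identification (default false) |
| word_timestamps | No | Enable word-level timestamps (default true) |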

Response

text = result["data"]["result"]["text"]
segments = result["data"]["result"].get("segments", [])  # present when speaker_diarization is enabled
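The segment list is enough to build subtitle files on the client. The sketch below converts segments to SRT text; it is an illustration only, it assumes every segment has `start`, `end`, and `text` (with `speaker` optional), and it says nothing about which output formats the service itself can return. `srt_timestamp` and `segments_to_srt` are hypothetical names.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used by SRT."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def segments_to_srt(segments: list) -> str:
    """Render transcript segments as numbered SRT cues, prefixing the speaker when known."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        speaker = seg.get("speaker")
        line = f"{speaker}: {seg['text']}" if speaker is not None else seg["text"]
        span = f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}"
        cues.append(f"{index}\n{span}\n{line}")
    return "\n\n".join(cues) + "\n"
```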

---

Noise Reduction

Remove background noise from audio through the SkillBoss API Hub.

Python

# Denoise a local file
import base64
with open("./noisy-audio.mp3", "rb") as f:
    audio_b64 = base64.b64encode(f.read()).decode()
result = pilot({
    "type": "audio",
    "capability": "noise reduction",
    "inputs": {
        "audio_data": audio_b64,
        "filename": "noisy-audio.mp3"
    },
    "prefer": "quality"
})
clean_url = result["data"]["result"]["audio_url"]
print(f"Clean audio: {clean_url}")
# Denoise from a URL
result = pilot({
    "type": "audio",
    "capability": "noise reduction",
    "inputs": {
        "url": "https://example.com/noisy.mp3"
    },
    "prefer": "balanced"
})
clean_url = result["data"]["result"]["audio_url"]

cURL

# Denoise from a URL
curl -X POST "https://api.skillboss.co/v1/pilot" \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"type":"audio","capability":"noise reduction","inputs":{"url":"https://example.com/noisy.mp3"},"prefer":"quality"}'

Response

clean_url = result["data"]["result"]["audio_url"]
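The file and URL variants differ only in how the audio reaches the `inputs` object, so the choice can be wrapped in one helper. This is a sketch under the field names used in the examples (`url`, `audio_data`, `filename`); `audio_inputs` is a hypothetical name, and treating anything that starts with http:// or https:// as remote is a simplifying assumption.

```python
import base64
from pathlib import Path


def audio_inputs(source: str, **extra) -> dict:
    """Build `inputs` from a remote URL or a local file path, plus any extra fields."""
    if source.startswith(("http://", "https://")):
        return {"url": source, **extra}
    path = Path(source)
    # Local files are sent inline as base64, with the filename for format detection.
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return {"audio_data": encoded, "filename": path.name, **extra}
```

The same helper fits the stem-separation and transcription requests, for example `audio_inputs("./meeting.mp3", num_speakers=3)`.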

---

Chain Mode (Multi-Step Workflows)

Use SkillBoss Chain mode to run several audio steps in a single call.

Python

# STT → Chat (summarize) → TTS pipeline
result = pilot({
    "chain": [
        {"type": "stt"},
        {"type": "chat", "capability": "summarize"},
        {"type": "tts"}
    ]
})
# Transcribe and translate
result = pilot({
    "chain": [
        {"type": "stt"},
        {"type": "chat", "capability": "translate to English"}
    ]
})
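The two chain bodies share one shape: an ordered list of steps, each with a `type` and an optional `capability`. A small builder makes that shape explicit. `chain_body` is a hypothetical helper, and because the examples do not show how the source audio is passed to a chain, the sketch leaves inputs out rather than guessing.

```python
def chain_body(*steps) -> dict:
    """Build a chain request body from step types or (type, capability) pairs."""
    chain = []
    for step in steps:
        if isinstance(step, str):
            chain.append({"type": step})
        else:
            step_type, capability = step
            chain.append({"type": step_type, "capability": capability})
    return {"chain": chain}
```

For instance, `chain_body("stt", ("chat", "summarize"), "tts")` reproduces the first body shown above.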

---

API Endpoint Summary

All features route through a single endpoint:

| Service | SkillBoss type | capability |
|---------|---------------|------------|
| Music generation | music | (none) |
| TTS | tts | (none) |
| STT / transcription | stt | (none) |
| Speaker diarization | stt | speaker diarization |
| Stem separation | audio | stem separation |
| Noise reduction | audio | noise reduction |

Unified endpoint: POST https://api.skillboss.co/v1/pilot

Authentication: Authorization: Bearer $SKILLBOSS_API_KEY
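Because every service differs only in `type` and `capability`, the routing table can be kept as data and the request body built from it. The sketch below is illustrative: the short service keys and the `build_body` helper are assumptions, while the `type`, `capability`, and `prefer` values come from the examples in this document.

```python
# Service key (hypothetical label) -> (type, capability) from the routing table.
ROUTES = {
    "music": ("music", None),
    "tts": ("tts", None),
    "stt": ("stt", None),
    "diarization": ("stt", "speaker diarization"),
    "stems": ("audio", "stem separation"),
    "denoise": ("audio", "noise reduction"),
}


def build_body(service: str, inputs: dict, prefer: str = "balanced") -> dict:
    """Build a /pilot request body for one of the services in ROUTES."""
    req_type, capability = ROUTES[service]
    body = {"type": req_type, "inputs": inputs, "prefer": prefer}
    if capability is not None:
        body["capability"] = capability
    return body
```

The body returned here is what the `pilot` function from the setup section would POST.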

Response Format Summary

| Feature | Result path |
|-----------|-------------|
| Music generation | data.result.audio_url |
| TTS | data.result.audio_url |
| STT (text) | data.result.text |
| STT (segments) | data.result.segments |
| Stem separation | data.result.download_urls |
| Noise reduction | data.result.audio_url |
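Indexing the response directly, as in `result["data"]["result"]["audio_url"]`, raises KeyError when a request fails or a field is absent. A lookup driven by the table above can return a default instead. This is a sketch with hypothetical names (`RESULT_KEYS`, `extract_result`); the shape of error responses is not documented in the source, so the helper only guards against missing keys.

```python
# Feature key (hypothetical label) -> field under data.result, from the summary table.
RESULT_KEYS = {
    "music": "audio_url",
    "tts": "audio_url",
    "stt_text": "text",
    "stt_segments": "segments",
    "stems": "download_urls",
    "denoise": "audio_url",
}


def extract_result(response: dict, feature: str, default=None):
    """Return the feature's result field, or `default` if any level is missing."""
    result = (response.get("data") or {}).get("result") or {}
    return result.get(RESULT_KEYS[feature], default)
```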

Data source: ClawHub · Chinese localization: 龙虾技能库