Voice Transcription

v1.0.0

A speech-to-text skill based on the SiliconFlow API (SenseVoiceSmall/TeleSpeechASR), with support for recognizing Sichuanese and other dialects.


by @datou3456

Data & API

Use case: use Voice Transcription for Data & API tasks.


Runtime dependencies

No special dependencies.

Install commands


Official: npx clawhub@latest install voice-transcription

Mirror (accelerated): npx clawhub@latest install voice-transcription --registry https://cn.longxiaskill.com (mirror available)

Localization notes

Voice Transcription install instructions. Install command: openclaw skills install voice-transcription


Skill documentation

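🎙️ Voice Transcription – Speech-to-Text

A speech-to-text skill built on the SiliconFlow API. It supports Mandarin, Cantonese, English, Japanese, Korean, and other languages, and also recognizes dialects such as Sichuanese well.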

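Models

Model | Characteristics | Best for
---|---|---
FunAudioLLM/SenseVoiceSmall | Lightweight; multilingual, with emotion recognition | Everyday conversation, meeting recordings
TeleAI/TeleSpeechASR | Developed in-house by China Telecom; stronger dialect recognition | Dialect speech such as Sichuanese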
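The table can be read as a simple lookup from scenario to model. The sketch below is illustrative only: `pick_model` and its `dialect` flag are hypothetical names, and the source does not describe how the script's automatic model selection actually works.

```python
# Model identifiers exactly as listed in the table.
GENERAL_MODEL = "FunAudioLLM/SenseVoiceSmall"
DIALECT_MODEL = "TeleAI/TeleSpeechASR"


def pick_model(dialect: bool = False) -> str:
    """Hypothetical helper: choose TeleSpeechASR for dialect speech,
    and SenseVoiceSmall for everyday multilingual audio."""
    return DIALECT_MODEL if dialect else GENERAL_MODEL
```

For a Sichuanese recording, `pick_model(dialect=True)` returns the TeleSpeechASR identifier, which can then be passed to the script's `--model` option.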

Configuration

Set the SILICONFLOW_API_KEY environment variable, or pass the api_key parameter when calling.

export SILICONFLOW_API_KEY="your-api-key-here"
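One way to honor both configuration paths is to let an explicit argument take precedence over the environment variable. This is a minimal sketch under that assumption; `resolve_api_key` is a hypothetical helper, not necessarily how the skill's script resolves the key.

```python
import os
from typing import Optional


def resolve_api_key(api_key: Optional[str] = None) -> str:
    """Return the explicit key if given, else fall back to SILICONFLOW_API_KEY.

    The precedence order (argument first, then environment) is an assumption.
    """
    key = api_key or os.environ.get("SILICONFLOW_API_KEY", "")
    if not key:
        raise ValueError(
            "No API key found: set SILICONFLOW_API_KEY or pass api_key"
        )
    return key
```

Failing early with a clear message avoids sending an unauthenticated request and then having to interpret the HTTP error.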

Usage

Command line

# Transcribe an audio file (model selected automatically)
python3 scripts/transcribe.py audio.mp3

# Specify a model
python3 scripts/transcribe.py audio.mp3 --model TeleAI/TeleSpeechASR

# Specify an API key
python3 scripts/transcribe.py audio.mp3 --api-key sk-xxx

# Write the output to a file
python3 scripts/transcribe.py audio.mp3 --output result.txt
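For readers who want to see what such a command might do underneath, here is a standard-library sketch of a transcription request. It is not the skill's actual `scripts/transcribe.py`. The endpoint URL, the `file` and `model` form fields, and the `text` response field are assumptions based on SiliconFlow's OpenAI-style transcription API and should be checked against the official API reference. `build_multipart` and `transcribe` are hypothetical names.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path
from typing import Dict, Tuple

# Assumption: SiliconFlow's OpenAI-style transcription endpoint.
API_URL = "https://api.siliconflow.cn/v1/audio/transcriptions"


def build_multipart(fields: Dict[str, str], file_path: Path) -> Tuple[bytes, str]:
    """Encode text fields plus one file as multipart/form-data.

    Returns the request body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    mime = mimetypes.guess_type(file_path.name)[0] or "application/octet-stream"
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; '
        f'filename="{file_path.name}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    parts.append(file_header + file_path.read_bytes() + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(file_path: str, api_key: str,
               model: str = "FunAudioLLM/SenseVoiceSmall") -> str:
    """POST the audio file and return the transcript text.

    Assumes the response is JSON with a top-level "text" field.
    """
    body, content_type = build_multipart({"model": model}, Path(file_path))
    request = urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": content_type,
        },
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read().decode("utf-8"))["text"]
```

Calling `transcribe("audio.mp3", api_key, model="TeleAI/TeleSpeechASR")` would correspond roughly to the `--model` example above. Writing the returned string to a file would mirror `--output`.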

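In conversation

Use this skill when the user provides an audio file or says something like "convert speech to text" or "listen to this recording".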

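Notes

- Audio file limits: duration up to 1 hour, file size up to 50 MB
- Supported formats: mp3, wav, m4a, flac, ogg, webm, and other common audio formats
- Free API quota: SiliconFlow provides a certain free call quota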
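The size and format limits above can be checked locally before uploading. The following is a rough sketch, assuming "50 MB" means 50 × 1024 × 1024 bytes. `check_audio_file` is a hypothetical helper, and the 1-hour duration limit is not checked because that would require decoding the audio.

```python
from pathlib import Path
from typing import List

# Assumption: "50 MB" is interpreted as 50 * 1024 * 1024 bytes.
MAX_BYTES = 50 * 1024 * 1024

# Formats named in the notes; the notes also allow other common audio
# formats, so an unknown extension is reported as a warning-style problem.
KNOWN_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".webm"}


def check_audio_file(path: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    p = Path(path)
    if not p.is_file():
        return [f"not a file: {p}"]
    problems = []
    if p.suffix.lower() not in KNOWN_EXTENSIONS:
        problems.append(f"extension {p.suffix!r} is not in the listed formats")
    size = p.stat().st_size
    if size > MAX_BYTES:
        problems.append(f"file is {size} bytes, over the {MAX_BYTES}-byte limit")
    return problems
```

Returning a list of problems instead of raising on the first one lets a caller report every issue at once.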
