📦 NEXUS Voice Transcriber

v1.0.0

Voice note transcription and archiving for OpenClaw agents. Powered by Deepgram Nova-3 or local Whisper. Transcribes audio messages, saves audio files and...


by @matthew00ita

File processing · Instant messaging


Runtime dependencies

🖥️ OS: Linux · macOS · Windows

Install command


Official: npx clawhub@latest install nexus-voice-transcriber

Mirror (sync in progress): npx clawhub@latest install nexus-voice-transcriber --registry https://cn.longxiaskill.com


Skill documentation

Setup

On first use, read references/whisper-models.md and references/troubleshooting.md. Make sure the dependencies are installed: ffmpeg, python3, and the required Python packages (openai-whisper; deepgram-sdk is optional).

When to Use

- The user sends a voice note, audio file, or video file that needs transcription.
- You need to archive both the original audio and the text transcript.
- You want speaker detection (if using Deepgram with diarization).
- You want quick local transcription without external APIs (Whisper).

Architecture

Memory lives in ~/voice-transcriber/. See below for structure.

~/voice-transcriber/
├── memory.md        # Provider preferences, defaults, history
├── transcripts/     # Saved transcripts (txt, json, srt)
├── audio/           # Saved original audio files
└── temp/            # Processing workspace (auto-cleaned)

Quick Reference

| Topic | File |
| --- | --- |
| Whisper model guide | references/whisper-models.md |
| Troubleshooting | references/troubleshooting.md |
| Main script | scripts/transcribe.py |

Core Rules

Detect Input Type

Before transcription:

- Local file path → verify it exists, check the format (mp3, wav, m4a, mp4, etc.)
- URL → download to temp/, then process
- Voice memo → usually single speaker, short
- Meeting / interview → likely multiple speakers, consider diarization
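The first two checks above can be sketched in Python as follows. This is a minimal illustration, not the logic of scripts/transcribe.py: the function name `classify_input` and the `KNOWN_EXTENSIONS` set are hypothetical, and the extension list only covers the formats the text names explicitly.

```python
from pathlib import Path
from urllib.parse import urlparse

# Hypothetical list: the text names mp3, wav, m4a, mp4 and then says "etc."
KNOWN_EXTENSIONS = {".mp3", ".wav", ".m4a", ".mp4"}


def classify_input(source: str) -> str:
    """Return 'url', 'file', 'missing', or 'unsupported' for a user-supplied source."""
    if urlparse(source).scheme in ("http", "https"):
        return "url"  # the caller would download this to temp/ before processing
    path = Path(source).expanduser()
    if not path.is_file():
        return "missing"
    if path.suffix.lower() not in KNOWN_EXTENSIONS:
        return "unsupported"
    return "file"
```

Whether a recording is a short voice memo or a multi-speaker meeting cannot be read from the path alone, so that decision is left to the caller.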

Choose Provider Based on Context

| Scenario | Best Provider | Why |
| --- | --- | --- |
| Privacy, no API keys | Local Whisper | Runs on-device, free |
| High accuracy, speed | Deepgram Nova-3 | Low latency, good accuracy |
| Speaker identification | Deepgram (with diarization) | Native speaker labels |
| No internet | Local Whisper | Offline capable |
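The table can be read as a small decision function. The sketch below is one possible encoding, assuming Local Whisper is the fallback whenever Deepgram cannot be used; `choose_provider` and its keyword arguments are hypothetical names, not options of scripts/transcribe.py.

```python
def choose_provider(*, needs_speakers=False, prefer_speed=False,
                    offline=False, private=False, has_api_key=False) -> str:
    """Map the scenarios in the table to 'whisper' or 'deepgram'."""
    # Deepgram needs a key, a network connection, and data that may leave the machine.
    deepgram_ok = has_api_key and not offline and not private
    if needs_speakers:
        if not deepgram_ok:
            raise ValueError("speaker labels require Deepgram (API key, network, no privacy constraint)")
        return "deepgram"
    if prefer_speed and deepgram_ok:
        return "deepgram"
    return "whisper"  # assumed default: works offline and without keys
```

For example, `choose_provider(private=True)` returns `"whisper"`, while asking for speaker labels without an API key raises an error instead of silently dropping the labels.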

Handle Long Audio

Files >25 MB or >2 hours:

1. Split into chunks with ffmpeg (see scripts/transcribe.py --split)
2. Process each chunk
3. Merge transcripts with proper timestamps
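The merge step amounts to shifting each chunk's timestamps by the total duration of the chunks before it. The sketch below assumes each chunk yields a list of segments with `start`, `end`, and `text` fields, and that the real duration of each chunk has been measured separately (stream-copied chunks may not be exactly the requested length). The function name `merge_chunk_segments` is hypothetical.

```python
def merge_chunk_segments(chunks):
    """Merge per-chunk segments into one timeline.

    chunks: ordered list of (duration_seconds, segments) pairs, where each
    segment is a dict with 'start' and 'end' (seconds from the start of its
    chunk) and 'text'.
    """
    merged = []
    offset = 0.0
    for duration, segments in chunks:
        for seg in segments:
            merged.append({
                "start": seg["start"] + offset,
                "end": seg["end"] + offset,
                "text": seg["text"],
            })
        offset += duration  # the next chunk starts where this one ended
    return merged
```

Using measured durations rather than a fixed chunk length keeps later timestamps from drifting when a chunk comes out slightly shorter or longer than requested.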

Save Artifacts

After successful transcription:

- Save the transcript to ~/voice-transcriber/transcripts/ with a meaningful name
- Save the original audio to ~/voice-transcriber/audio/ if the user wants archival
- Update memory.md with date, file, provider, duration
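A minimal sketch of the first and third steps is shown below. The date-prefixed file name and the one-line log format are assumptions made for illustration; the source only asks for "a meaningful name" and for memory.md to record date, file, provider, and duration. `save_transcript` is a hypothetical helper.

```python
from datetime import date
from pathlib import Path


def save_transcript(text, source_name, provider, duration_s,
                    root=Path.home() / "voice-transcriber", today=None):
    """Write the transcript and append a history line to memory.md."""
    today = today or date.today()
    root = Path(root)
    transcripts = root / "transcripts"
    transcripts.mkdir(parents=True, exist_ok=True)

    # Assumed naming scheme: <date>_<original file stem>.txt
    out = transcripts / f"{today.isoformat()}_{Path(source_name).stem}.txt"
    out.write_text(text, encoding="utf-8")

    # Assumed log format: one list item per transcription.
    with (root / "memory.md").open("a", encoding="utf-8") as log:
        log.write(f"- {today.isoformat()} | {source_name} | {provider} | {duration_s:.0f}s\n")
    return out
```

Copying the original audio into audio/ is left out here because it should only happen when the user asks for archival.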

Output Formats

Default to plain text (.txt). Offer alternatives:

- .txt: clean text, no timestamps
- .srt / .vtt: subtitles with timing
- .json: structured, with word-level timing (Deepgram) or segment timing (Whisper)

Common Traps

- Assuming one provider fits all → Whisper lacks diarization; Deepgram needs an API key.
- Uploading huge files directly → Timeouts. Split first.
- Ignoring audio quality → Noisy audio may need preprocessing (ffmpeg noise reduction).
- Not checking language → Whisper auto-detects but can fail on mixed-language content.
- Forgetting to save audio → The user may want the original file archived.

Requirements

Required:

- ffmpeg (audio conversion, splitting)
- python3 + pip
- Python packages: openai-whisper (local), requests (for Deepgram, if used)

Optional API keys (only if using Deepgram):

- DEEPGRAM_API_KEY: for Deepgram Nova-3 (speaker diarization available)

Local Whisper works without any API keys.
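A quick way to verify the required pieces before the first run is sketched below. It uses only the standard library; `missing_dependencies` is a hypothetical helper, and the default module name `whisper` is the import name that the openai-whisper package installs.

```python
import importlib.util
import shutil


def missing_dependencies(binaries=("ffmpeg",), modules=("whisper",)):
    """Return the names of required executables and Python modules that are not found."""
    # shutil.which looks the executable up on PATH, like the shell would.
    missing = [name for name in binaries if shutil.which(name) is None]
    # find_spec returns None for a top-level module that is not installed.
    missing += [name for name in modules if importlib.util.find_spec(name) is None]
    return missing
```

An empty list means local transcription should be able to start; when Deepgram is selected, `"requests"` could be added to `modules`.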

Provider Quick Reference

Local Whisper (No API Key)

    # Install
    pip install openai-whisper

    # Basic transcription (via script)
    python3 scripts/transcribe.py --file audio.wav --provider whisper --model base

    # Output formats: txt (default), srt, vtt, json
    python3 scripts/transcribe.py --file audio.wav --provider whisper --model medium --format srt

Models: tiny (fastest) → base → small → medium → large (most accurate).
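For the srt output, Whisper's segment timing maps directly onto subtitle blocks. The sketch below shows one way to render segments as SRT, assuming each segment is a dict with `start`, `end` (in seconds), and `text`, which is the shape of the `segments` list returned by openai-whisper's `transcribe`. The helper names are hypothetical and this is not necessarily how scripts/transcribe.py does it.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used by SRT."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments) -> str:
    """Render segments as numbered SRT blocks separated by blank lines."""
    blocks = []
    for number, seg in enumerate(segments, start=1):
        timing = f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}"
        blocks.append(f"{number}\n{timing}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)
```

WebVTT output differs mainly in its `WEBVTT` header line and in using a period instead of a comma before the milliseconds.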

Deepgram Nova-3 (API Key Required)

    # Set environment variable
    export DEEPGRAM_API_KEY="your_key_here"

    # Transcribe with speaker diarization
    python3 scripts/transcribe.py --file audio.wav --provider deepgram --diarize

    # Output JSON with speaker labels
    python3 scripts/transcribe.py --file audio.wav --provider deepgram --format json
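When diarization is enabled, the word-level JSON can be folded into speaker turns for a readable transcript. The sketch below assumes each word entry carries `word`, `start`, `end`, and an integer `speaker`, with an optional `punctuated_word`; that matches the word objects Deepgram returns for diarized requests as far as I know, but treat the field names as assumptions and check them against an actual response. `speaker_turns` is a hypothetical helper.

```python
def speaker_turns(words):
    """Group consecutive words from the same speaker into turns."""
    turns = []
    for w in words:
        token = w.get("punctuated_word", w["word"])
        if turns and turns[-1]["speaker"] == w["speaker"]:
            # Same speaker keeps talking: extend the current turn.
            turns[-1]["text"] += " " + token
            turns[-1]["end"] = w["end"]
        else:
            turns.append({
                "speaker": w["speaker"],
                "start": w["start"],
                "end": w["end"],
                "text": token,
            })
    return turns
```

Each turn can then be printed as, for example, `Speaker 0: Hello, there`.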

Audio Preprocessing

Extract Audio from Video

    ffmpeg -i video.mp4 -vn -acodec pcm_s16le -ar 16000 -ac 1 audio.wav

Reduce Noise

    ffmpeg -i noisy.wav -af "afftdn=nf=-25" clean.wav

Split Long Audio (10-minute chunks)

    ffmpeg -i long.mp3 -f segment -segment_time 600 -c copy temp/chunk_%03d.mp3
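If these commands are driven from Python, building them as argument lists avoids shell-quoting problems with unusual file names. The helpers below are a hypothetical sketch that reproduces the three commands above; they are not taken from scripts/transcribe.py.

```python
import subprocess


def extract_audio_cmd(video, wav):
    """16 kHz mono PCM WAV extracted from a video file."""
    return ["ffmpeg", "-i", video, "-vn", "-acodec", "pcm_s16le",
            "-ar", "16000", "-ac", "1", wav]


def denoise_cmd(src, dst, noise_floor_db=-25):
    """FFT-based denoising with ffmpeg's afftdn filter."""
    return ["ffmpeg", "-i", src, "-af", f"afftdn=nf={noise_floor_db}", dst]


def split_cmd(src, pattern, seconds=600):
    """Split into fixed-length chunks without re-encoding."""
    return ["ffmpeg", "-i", src, "-f", "segment",
            "-segment_time", str(seconds), "-c", "copy", pattern]


def run(cmd):
    """Run a prepared command and raise CalledProcessError if ffmpeg fails."""
    subprocess.run(cmd, check=True)
```

For example, `run(split_cmd("long.mp3", "temp/chunk_%03d.mp3"))` performs the 10-minute split shown above.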

Security & Privacy

Data that stays local:

- Transcripts in ~/voice-transcriber/transcripts/
- Original audio in ~/voice-transcriber/audio/
- Local Whisper processes entirely on-device

Data that leaves your machine (if using Deepgram):

- Audio file sent to the Deepgram API (api.deepgram.com)
- Transcript returned and stored locally

This skill does NOT:

- Store API keys in plain text (use environment variables)
- Auto-upload without confirmation
- Retain files on external servers after processing

External Endpoints

| Endpoint | Data Sent | Purpose |
| --- | --- | --- |
| api.deepgram.com/v1/listen | Audio file | Deepgram transcription |

Only called when the user explicitly chooses the Deepgram provider. Local Whisper sends nothing.
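The two guarantees above (key read from the environment, no upload without confirmation) can be enforced at the point where the request is built. The sketch below uses only the standard library and only constructs the request; nothing is sent until the caller passes it to `urllib.request.urlopen`. The `build_deepgram_request` name, the `confirmed` flag, and the `https://` scheme are assumptions; the `model` and `diarize` query parameters and the `Token` authorization scheme follow Deepgram's public API as I understand it.

```python
import os
import urllib.parse
import urllib.request

DEEPGRAM_URL = "https://api.deepgram.com/v1/listen"


def build_deepgram_request(audio_bytes, *, confirmed, diarize=False,
                           content_type="audio/wav", env=os.environ):
    """Prepare (but do not send) a transcription request for Deepgram."""
    if not confirmed:
        raise PermissionError("user has not confirmed the upload to Deepgram")
    api_key = env.get("DEEPGRAM_API_KEY")  # never read from a file or hard-coded
    if not api_key:
        raise RuntimeError("DEEPGRAM_API_KEY is not set")
    query = {"model": "nova-3"}
    if diarize:
        query["diarize"] = "true"
    return urllib.request.Request(
        f"{DEEPGRAM_URL}?{urllib.parse.urlencode(query)}",
        data=audio_bytes,
        headers={"Authorization": f"Token {api_key}", "Content-Type": content_type},
        method="POST",
    )
```

Raising before any network call keeps the local-only path free of side effects when the user has not opted in.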

Memory Template

Create ~/voice-transcriber/memory.md with this structure:

# Voice Transcriber Memory

## Status

status: ongoing
version: 1.0.0
last: YYYY-MM-DD
integration: pending

## Context

## Notes

*Updated:

Data source: ClawHub · Chinese localization: 龙虾技能库