audio-transcribe-summarize

v1.0.1

Transcribe audio/video files to text and generate structured summaries using the SenseAudio ASR API. Use when the user asks to transcribe, summarize, or take notes from audio files, video files, recordings, meetings, lectures, podcasts, or interviews.


by @q1lin570·MIT-0

Tags: Developer tools, Code generation, API development, File processing, Video processing


License

MIT-0

Free to use, modify, and redistribute; no attribution required.


Runtime dependencies

No special dependencies

Install command


Official: npx clawhub@latest install audio-transcribe-summarize

Mirror: npx clawhub@latest install audio-transcribe-summarize --registry https://cn.longxiaskill.com


Skill documentation

Audio/Video Transcription & Summarization

Transcribe audio/video files using the SenseASR API (api.senseaudio.cn), then summarize the content into structured notes.

{baseDir} refers to this skill's directory.

Prerequisites

Environment variable SENSEAUDIO_API_KEY configured (get your key at https://senseaudio.cn/platform/api-key)
Python 3.8+ with requests installed
For large files (>10MB): ffmpeg installed for splitting (macOS: brew install ffmpeg; Windows: download from ffmpeg.org and add it to PATH; Linux: apt install ffmpeg)

Quick Start

Run the transcription script:

python {baseDir}/scripts/transcribe.py <audio_file> [--model sense-asr-pro] [--language zh] [--speakers] [--sentiment] [--translate en]

The script outputs a transcript .txt file alongside the source file.
Read the transcript and generate a summary (see Summary Format below).

Workflow

Step 1: Assess the Audio File

Check file size and format:

Supported formats: wav, mp3, ogg, flac, aac, m4a, mp4
Max file size per request: 10MB
If the file is larger than 10MB, the script auto-splits it using ffmpeg

Step 2: Choose the Right Model

| Model | Use When |
| --- | --- |
| sense-asr-lite | Quick batch transcription, simple audio, cost-sensitive |
| sense-asr | General transcription, need speaker separation or timestamps |
| sense-asr-pro | High accuracy needed: meetings, interviews, complex audio |
| sense-asr-deepthink | Noisy audio, dialects, heavy jargon, speech-to-clean-text |

Default to sense-asr-pro for best quality.
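The Step 1 checks can be sketched in a few lines of Python. This is a minimal illustration, not the code inside transcribe.py: the function name `assess_audio` and both constants are hypothetical, and the sketch assumes "10MB" means 10 × 1024 × 1024 bytes.

```python
from pathlib import Path

# Values copied from Step 1; the names are illustrative, not part of the skill.
SUPPORTED_EXTENSIONS = {"wav", "mp3", "ogg", "flac", "aac", "m4a", "mp4"}
MAX_REQUEST_BYTES = 10 * 1024 * 1024  # assumption: "10MB" is read as 10 MiB


def assess_audio(path):
    """Return (supported, size_bytes, needs_split) for a local file."""
    p = Path(path)
    ext = p.suffix.lower().lstrip(".")  # "Talk.MP3" -> "mp3"
    size = p.stat().st_size
    return ext in SUPPORTED_EXTENSIONS, size, size > MAX_REQUEST_BYTES
```

A file reported with `needs_split` set to `True` is one the script would be expected to split with ffmpeg, so ffmpeg should be on PATH before transcribing it.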

Step 3: Transcribe

Run the transcription script. Key options:

# Basic transcription
python {baseDir}/scripts/transcribe.py recording.mp3

# Meeting with multiple speakers + emotion
python {baseDir}/scripts/transcribe.py meeting.wav \
  --model sense-asr-pro \
  --speakers --max-speakers 4 \
  --sentiment \
  --timestamps segment

# Transcribe and translate to English
python {baseDir}/scripts/transcribe.py lecture.mp3 \
  --model sense-asr \
  --translate en
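When the script is driven from another Python program, the command line can be assembled as an argument list rather than a shell string. The sketch below uses only the flags that appear in this document (`--model`, `--language`, `--speakers`, `--max-speakers`, `--sentiment`, `--timestamps`, `--translate`); the helper name `build_transcribe_command` is hypothetical, and the accepted values for each flag are assumed to match the examples above.

```python
import sys


def build_transcribe_command(script, audio, model="sense-asr-pro", language=None,
                             speakers=False, max_speakers=None, sentiment=False,
                             timestamps=None, translate=None):
    """Assemble an argv list for transcribe.py from the documented flags."""
    cmd = [sys.executable, str(script), str(audio), "--model", model]
    if language:
        cmd += ["--language", language]
    if speakers:
        cmd.append("--speakers")
        # --max-speakers is only emitted together with --speakers
        if max_speakers is not None:
            cmd += ["--max-speakers", str(max_speakers)]
    if sentiment:
        cmd.append("--sentiment")
    if timestamps:
        cmd += ["--timestamps", timestamps]
    if translate:
        cmd += ["--translate", translate]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Passing a list avoids shell quoting problems with file names that contain spaces.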

Step 4: Summarize

After transcription, read the transcript file and produce a summary using the format below.

Summary Format

Generate summaries in this structure:
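Locating and reading the transcript might look like the sketch below. The document only says the .txt file is written alongside the source file, so the naming rule used here (same folder, same stem, `.txt` suffix) is an assumption, and both helper names are hypothetical.

```python
from pathlib import Path


def transcript_path(source):
    """Assumed naming: same folder and stem as the source, with a .txt suffix."""
    return Path(source).with_suffix(".txt")


def load_transcript(source):
    """Read the transcript text, failing clearly if it has not been produced yet."""
    path = transcript_path(source)
    if not path.exists():
        raise FileNotFoundError(f"No transcript at {path}; run transcribe.py first")
    return path.read_text(encoding="utf-8")
```

If the script names its output differently, only `transcript_path` needs to change.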

# [Title - inferred from content]

Source: filename.mp3
Duration: X min Y sec
Date: YYYY-MM-DD
Speakers: [if speaker diarization was used]

Key Points

Point 1
Point 2
...

Detailed Summary

[2-4 paragraph summary of the content organized by topic/chronology]

Action Items

[ ] Action item 1 (assigned to Speaker X, if applicable)
[ ] Action item 2

Notable Quotes

"Direct quote from transcript" - Speaker X, [timestamp if available]

Full Transcript

Click to expand full transcript

[Full transcript text here, with speaker labels and timestamps if available]
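The template above can also be filled programmatically. The following is a minimal sketch under stated assumptions: the `Summary` dataclass and `render_summary` function are hypothetical names, the section order follows the template, and the output is plain text with no collapsible transcript section.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Summary:
    title: str
    source: str
    duration: str
    date: str
    speakers: str = ""  # left empty when diarization was not used
    key_points: List[str] = field(default_factory=list)
    detailed: str = ""
    action_items: List[str] = field(default_factory=list)
    quotes: List[str] = field(default_factory=list)
    transcript: str = ""


def render_summary(s):
    """Render the sections in the same order as the template."""
    lines = [f"# {s.title}", "", f"Source: {s.source}",
             f"Duration: {s.duration}", f"Date: {s.date}"]
    if s.speakers:
        lines.append(f"Speakers: {s.speakers}")
    lines += ["", "Key Points", ""] + list(s.key_points)
    lines += ["", "Detailed Summary", "", s.detailed]
    lines += ["", "Action Items", ""] + [f"[ ] {item}" for item in s.action_items]
    lines += ["", "Notable Quotes", ""] + list(s.quotes)
    lines += ["", "Full Transcript", "", s.transcript]
    return "\n".join(lines)
```

Keeping the fields in a dataclass makes it easy to leave a section empty, for example omitting the Speakers line when `--speakers` was not passed.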

Adapt the template based on content type:

Meeting: emphasize action items, decisions, speaker contributions
Lecture/Talk: emphasize key concepts, learning points, structure
Interview: emphasize Q&A pairs, key responses
Podcast: emphasize topics discussed, interesting insights

API Reference

For full SenseASR API parameters and response formats, see api-reference.md.

Data source: ClawHub · Chinese localization: 龙虾技能库