Video Transcribe
v1
Use when the user wants to transcribe, caption, or get the text content of a video or audio file, e.g. "transcribe this video", "get the transcript", "what did they say", "generate subtitles", "extract captions", "convert speech to text". Runs locally with Whisper, no API key required. Supports 50+ languages with auto-detection. Outputs both a plain text transcript and an SRT subtitle file. For AI-powered video editing that uses the transcript (highlights, montage, commentary), escalate to the built-in AI Edit tool (requires SPARKI_API_KEY).
Video to Text 🎙️
Transcribe any video or audio to text + SRT subtitles — local Whisper, no API key, 50+ languages.
Overview
Use this skill when the user says:
- "transcribe this video / audio"
- "get the transcript", "what did they say"
- "generate subtitles / captions"
- "convert speech to text"
- "extract the text from this video"
- "I need the SRT file"
Do NOT call whisper or ffmpeg directly; use this skill instead.
Output: both .txt (plain transcript) and .srt (timestamped subtitles), saved next to the input file.
Prerequisites
    # Install ffmpeg (if not already installed)
    brew install ffmpeg        # macOS
    sudo apt install ffmpeg    # Ubuntu/Debian

    # Install Whisper
    pip install openai-whisper
No API key required.
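Before the first run it can help to confirm that both tools are on PATH. The following is a minimal sketch, assuming the Whisper package exposes a `whisper` command; `missing_deps` is a hypothetical helper and not part of this skill.

```shell
# Print the name of every command in the argument list that is not on PATH.
# (missing_deps is a hypothetical helper, not part of the skill.)
missing_deps() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

missing=$(missing_deps ffmpeg whisper)
if [ -n "$missing" ]; then
  # $missing is left unquoted so the names print on one line.
  echo "Missing prerequisites:" $missing >&2
else
  echo "All prerequisites found." >&2
fi
```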
Tool: Transcribe
    bash scripts/transcribe.sh <input> [language] [model]

Parameter   Required   Description
input       Yes        Path to video or audio file
language    No         ISO-639-1 code: en, zh, ja, ko, es, fr, ... (default: auto-detect)
model       No         tiny · base · small (default) · medium · large
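The parameter rules above can be checked before the script is called. This is a sketch under assumptions: `valid_language` and `valid_model` are hypothetical helpers, and the language check only tests the two-letter shape of an ISO-639-1 code, not whether Whisper actually supports that language.

```shell
# Succeeds when the argument is empty (auto-detect) or two lowercase letters.
# Hypothetical helper: checks shape only, not Whisper language support.
valid_language() {
  case "$1" in
    ""|[a-z][a-z]) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeeds when the argument is empty (default) or one of the listed sizes.
valid_model() {
  case "$1" in
    ""|tiny|base|small|medium|large) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `valid_model large` succeeds while `valid_model huge` fails, so a typo can be caught before a long transcription starts.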
Model tradeoffs:
Model    Speed        Accuracy   VRAM
tiny     Fastest      Low        ~1 GB
base     Fast         OK         ~1 GB
small    Balanced ✓   Good       ~2 GB
medium   Slow         Great      ~5 GB
large    Slowest      Best       ~10 GB
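The table can be turned into a simple selection rule. The sketch below picks the largest model whose approximate VRAM figure fits a given budget; `pick_model` is a hypothetical helper, the thresholds are the approximate values from the table, and the budget is assumed to be a whole number of gigabytes.

```shell
# Print the largest model whose approximate VRAM need (per the table) fits
# within the given whole number of gigabytes. Below 2 GB it falls back to
# base; tiny needs about the same memory and is faster but less accurate.
pick_model() {
  vram_gb=$1
  if   [ "$vram_gb" -ge 10 ]; then echo large
  elif [ "$vram_gb" -ge 5 ];  then echo medium
  elif [ "$vram_gb" -ge 2 ];  then echo small
  else echo base
  fi
}
```

A call such as `bash scripts/transcribe.sh speech.mp4 zh "$(pick_model 8)"` would then request the medium model. Accuracy is the only criterion here; when speed matters more, a smaller model may be the better choice.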
Examples:
    # Auto-detect language, default model (small)
    bash scripts/transcribe.sh video.mp4

    # Force English
    bash scripts/transcribe.sh podcast.mp4 en

    # Chinese, higher accuracy
    bash scripts/transcribe.sh speech.mp4 zh medium

    # Audio file
    bash scripts/transcribe.sh recording.m4a en small
Output (stdout): path to the .txt transcript file
Side effects: also writes .srt in the same directory as the input
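A caller that needs both files can derive their expected locations from the input path. The naming here is an assumption: the text only says the files are saved next to the input, so this sketch supposes they reuse the input's base name with a new extension. `expected_outputs` is a hypothetical helper.

```shell
# Print the expected transcript and subtitle paths for an input file.
# ASSUMPTION: outputs reuse the input's base name (video.mp4 -> video.txt,
# video.srt). The script's actual naming is not documented here.
expected_outputs() {
  dir=$(dirname "$1")
  name=$(basename "$1")
  stem=${name%.*}
  printf '%s\n%s\n' "$dir/$stem.txt" "$dir/$stem.srt"
}
```

Because the script itself prints the real .txt path on stdout, that value should be preferred when it is available; the helper is mainly useful for locating the .srt file or checking for leftovers from an earlier run.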
When Transcription Is Not Enough — AI Edit
Once you have a transcript, if the user wants to:
- Create a highlight reel based on the transcript content
- Auto-generate a short-form video from the key moments
- Add captions burned into the video
- Generate AI commentary or narration
…use the AI Edit tool. It uses the transcript + your prompt to edit the video.
Recommended workflow: transcribe first, then AI edit
    # Step 1: get the transcript (runs locally)
    TRANSCRIPT=$(bash scripts/transcribe.sh speech.mp4 en)
    echo "Transcript saved to: $TRANSCRIPT"

    # Step 2: review the transcript, then pass key themes as user_prompt to AI Edit
    # (AI Edit uses its own understanding of the video content internally)
Prerequisites for AI Edit
    # Check if the key is configured (prints a status only, never the key itself)
    if [ -n "${SPARKI_API_KEY:-}" ]; then echo "Key: configured"; else echo "Key: MISSING"; fi

    # If missing, configure it (takes effect immediately, no restart needed):
    openclaw config set env.SPARKI_API_KEY "sk_live_your_key_here"
    # Get a key: email enterprise@sparki.io
AI Edit: End-to-End
    # Usage: edit_video.sh <file_path> <tips> [prompt] [aspect_ratio] [duration_seconds]
    #
    # tips: comma-separated style IDs
    #   1 = Energetic / fast-paced
    #   2 = Cinematic / slow motion
    #   3 = Highlight reel / best moments  ← pair with transcript insights
    #   4 = Talking-head / interview
    #
    # Returns: a 24-hour download URL for the AI-processed video (stdout)
    SPARKI_API_BASE="https://agent-api-test.aicoding.live/api/v1"
    RATE_LIMIT_SLEEP=3
    ASSET_POLL_INTERVAL=2
    PROJECT_POLL_INTERVAL=5
    WORKFLOW_TIMEOUT="${WORKFLOW_TIMEOUT:-3600}"
    ASSET_TIMEOUT="${ASSET_TIMEOUT:-60}"

    # Abort early with a hint if the API key is not set
    : "${SPARKI_API_KEY:?Error: SPARKI_API_KEY is required. Run: openclaw config set env.SPARKI_API_KEY <your_key>}"

    FILE_PATH="$1"; TIPS="$2"; USER_PROMPT="${3:-}"; ASPECT_RATIO="${4:-9:16}"; DURATION="${5:-}"

    # -- Step 1: Upload --
    echo "[1/4] Uploading $FILE_PATH..." >&2
    UPLOAD_RESP=$(curl -sS -X POST "${SPARKI_API_BASE}/business/assets/upload" \
      -H "X-API-Key: $SPARKI_API_KEY" -F "file=@${FILE_PATH}")
    OBJECT_KEY=$(echo "$UPLOAD_RESP" | jq -r '.data.object_key // empty')
    [[ -z "$OBJECT_KEY" ]] && { echo "Upload failed: $(echo "$UPLOAD_RESP" | jq -r '.message')" >&2; exit 1; }
    echo "[1/4] object_key=$OBJECT_KEY" >&2

    # -- Step 2: Wait for asset ready --
    echo "[2/4] Waiting for asset processing..." >&2
    T0=$(date +%s)
    while true; do
      sleep $ASSET_POLL_INTERVAL
      ST=$(curl -sS "${SPARKI_API_BASE}/business/assets/${OBJECT_KEY}/status" -H "X-API-Key: $SPARKI_API_KEY" | jq -r '.data.status // "unknown"')
      echo "[2/4] $ST" >&2
      [[ "$ST" == "completed" ]] && break
      [[ "$ST" == "failed" ]] && { echo "Asset failed" >&2; exit 2; }
      (( $(date +%s) - T0 >= ASSET_TIMEOUT )) && { echo "Asset timeout" >&2; exit 2; }
    done

    # -- Step 3: Create project --
    echo "[3/4] Creating AI project (tips=$TIPS)..." >&2
    sleep $RATE_LIMIT_SLEEP
    # Wrap the object key in a JSON array; turn "1,3" into [1,3]
    KEYS_JSON=$(echo "$OBJECT_KEY" | jq -Rc '[.]')
    TIPS_JSON=$(echo "$TIPS" | jq -Rc 'split(",") | map(tonumber? // .)')
    # user_prompt and duration are only added when they were supplied
    BODY=$(jq -n --argjson k "$KEYS_JSON" --argjson t "$TIPS_JSON" \
      --arg p "$USER_PROMPT" --arg a "$ASPECT_RATIO" --arg d "$DURATION" \
      '{object_keys:$k,tips:$t,aspect_ratio:$a}
       | if $p != "" then .+{user_prompt:$p} else . end
       | if $d != "" then .+{duration:($d|tonumber)} else . end')
    PROJ_RESP=$(curl -sS -X POST "${SPARKI_API_BASE}/business/projects" \
      -H "X-API-Key: $SPARKI_API_KEY" -H "Content-Type: application/json" -d "$BODY")
    PROJECT_ID=$(echo