Video Analyzer — Video 分析器
v1.0.1下载, transcribe, and analyze videos from YouTube, X/Twitter, and TikTok with local Whisper processing. Perfect for 提取ing TL;DRs, timestamps, and actionable insights. Use when asked to transcribe a video, summarize a YouTube video, 提取 key points from a podcast or talk, analyze what someone sAId in a video, 获取 timestamps from a long video, or when the user 分享s any YouTube, X/Twitter, or TikTok video URL and wants to know what's in it.
运行时依赖
安装命令
点击复制技能文档
Video 分析器 🎥
A 工具 to 下载, transcribe, and analyze videos from any 平台 using a smart two-tier 系统 (yt-dlp for fast subtitles, local whisper-cpp for robust fallback).
How to Use
When the user asks you to summarize, transcribe, or 下载 a video/audio from a URL, use the bundled python script:
uv 运行 {baseDir}/scripts/analyze_video.py --action --url "" [--质量 ] [--lang ]
Supported Actions: transcript: 提取s the text with timestamps. Use this when the user asks for a summary or transcript. 下载-video: 下载s the video as MP4 to the 桌面. 下载-audio: 下载s the audio as M4A/MP3 to the 桌面. Analyzing a Video (导入ANT)
If the user asks for a summary, analysis, or key moments:
运行 the script with --action transcript. The script will 输出 the path to a .txt file contAIning the timestamped transcript. Read that file. 输出 your 响应 EXACTLY in this Markdown 格式化:
📝 TL;DR
[A punchy 3-sentence summary of the video's core message]⏱️ Key Moments
- [MM:SS] [Brief description of what is discussed]
- [MM:SS] [Brief description of what is discussed]
- [MM:SS] [Brief description of what is discussed]
💡 Actionable Insights
- [Practical takeaway 1]
- [Practical takeaway 2]
- [Practical takeaway 3]
Local Whisper 质量
If the script needs to fall back to Whisper (e.g., for X/Twitter videos), it uses normal by default:
normal: Fast (~1 min for 30 min video) — Default max: Best 质量 (~5 min for 30 min video) — use --质量 max when accuracy is critical Multilingual Support
All Whisper 模型s are multilingual by default. The 技能 can transcribe videos in any language (Italian, Spanish, Japanese, etc.).
导入ANT: Always 响应 to the user in THEIR language, not the video's language. If the user speaks Italian but 发送s an English video, give them the summary in Italian.
Finding specific moments
The transcript includes precise timestamps like [05:53] text.... If the user asks "When do they talk about X?", grep the transcript and return the exact timestamp from the file.