Local Video Understanding

Local video comprehension skill. It uses ffmpeg to extract audio and frames, FunASR for speech recognition, and qwen3-vl for image understanding.

By @tomuiv

Runtime dependencies: no special dependencies

Install command (official):
npx clawhub@latest install local-video-understanding

Install command (mirror, faster in mainland China):
npx clawhub@latest install local-video-understanding --registry https://cn.longxiaskill.com

Data source: ClawHub. Chinese localization: 龙虾技能库 (Lobster Skill Library)
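
The extraction step named in the description (ffmpeg pulling audio and frames out of a video) might look roughly like the sketch below. This is not the skill's actual implementation: the function names, output paths, the 16 kHz mono audio format, and the one-frame-per-second sampling rate are all assumptions chosen for illustration. The FunASR and qwen3-vl stages are left out because the listing does not say how the skill calls them.

```python
import subprocess
from pathlib import Path


def audio_cmd(video, wav, sample_rate=16000):
    """Build an ffmpeg command that writes a mono WAV track from the video.

    Assumption: 16 kHz mono is used here because it is a common input
    format for speech recognition; the skill may use other settings.
    """
    return [
        "ffmpeg", "-y",           # overwrite the output file if it exists
        "-i", str(video),         # input video
        "-vn",                    # drop the video stream
        "-ac", "1",               # one audio channel (mono)
        "-ar", str(sample_rate),  # audio sample rate in Hz
        str(wav),
    ]


def frames_cmd(video, out_dir, fps=1.0):
    """Build an ffmpeg command that samples still frames at `fps` per second.

    Assumption: the frame file name pattern is hypothetical.
    """
    pattern = Path(out_dir) / "frame_%05d.jpg"
    return [
        "ffmpeg", "-y",
        "-i", str(video),
        "-vf", f"fps={fps}",      # video filter: keep `fps` frames per second
        str(pattern),
    ]


def extract(video, work_dir):
    """Run both commands; requires ffmpeg on PATH. Returns (wav, frames_dir)."""
    work_dir = Path(work_dir)
    frames_dir = work_dir / "frames"
    frames_dir.mkdir(parents=True, exist_ok=True)
    wav = work_dir / "audio.wav"
    subprocess.run(audio_cmd(video, wav), check=True)
    subprocess.run(frames_cmd(video, frames_dir), check=True)
    return wav, frames_dir
```

Calling `extract("talk.mp4", "out")` would leave `out/audio.wav` for the speech recognition stage and `out/frames/frame_00001.jpg`, `frame_00002.jpg`, and so on for the image understanding stage. Building the commands as argument lists rather than shell strings avoids quoting problems with file names that contain spaces.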