📦 Photo To Video With Music

v1.0.0

Get a music-backed video ready to post, without touching a single slider. Upload your images and audio (JPG, PNG, HEIC, MP3, up to 200MB), say something like "...


by @mory128

Tags: developer tools, code generation, system tools, video processing, image processing


Runtime dependencies

No special dependencies

Install command

Official: npx clawhub@latest install photo-to-video-with-music

Mirror: npx clawhub@latest install photo-to-video-with-music --registry https://cn.longxiaskill.com (mirror sync in progress)


Skill documentation

Getting Started

Send me your images and audio and I'll handle the AI video creation. Or just describe what you're after.

Try saying:

"turn ten vacation photos and an MP3 song into a 1080p MP4"
"turn my photos into a slideshow video with this background music"
"turning photo collections into shareable videos with background music for social media creators and everyday users"

Automatic Setup

On first interaction, connect to the processing API before doing anything else. Show a brief status like "Setting things up...".

Token: If the NEMO_TOKEN environment variable is already set, use it and skip to Session below.

Free token: Generate a UUID as the client identifier, then POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with the header X-Client-Id set to that UUID. The response field data.token becomes your NEMO_TOKEN (100 credits, 7-day expiry).

Session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer auth and body {"task_name":"project"}. Save session_id from the response.

Confirm to the user you're connected and ready. Don't print tokens or raw JSON.
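The setup steps above could be sketched roughly as follows. This is a minimal illustration rather than the skill's actual client: the lowercase URL spelling is assumed, the helper names are hypothetical, and `send` stands in for whatever HTTP transport you use. The functions only build requests, so nothing is sent until you pass them to a transport.

```python
import json
import os
import uuid
import urllib.request

# Assumed spelling of the API base; adjust if the service documents another.
API_BASE = "https://mega-api-prod.nemovideo.ai"


def build_token_request(client_id: str) -> urllib.request.Request:
    """Build (but do not send) the anonymous-token request."""
    return urllib.request.Request(
        f"{API_BASE}/api/auth/anonymous-token",
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def build_session_request(token: str) -> urllib.request.Request:
    """Build (but do not send) the session-creation request."""
    body = json.dumps({"task_name": "project"}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def resolve_token(env=os.environ, send=None) -> str:
    """Reuse NEMO_TOKEN if set, otherwise request a free token.

    `send` is a hypothetical callable: it takes a Request and returns
    the parsed JSON response as a dict.
    """
    token = env.get("NEMO_TOKEN")
    if token:
        return token
    if send is None:
        raise RuntimeError("NEMO_TOKEN is not set and no transport was supplied")
    response = send(build_token_request(str(uuid.uuid4())))
    return response["data"]["token"]
```

Keeping request construction separate from sending makes it easy to check headers and bodies without touching the network, and keeps the token out of any log output.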

Photo to Video with Music — Turn Photos into Music Videos

Send me your images and audio and describe the result you want. The AI video creation runs on remote GPU nodes, so there is nothing to install on your machine.

A quick example: upload ten vacation photos and an MP3 song, type "turn my photos into a slideshow video with this background music", and you'll get a 1080p MP4 back in roughly 30-60 seconds. All rendering happens server-side.

Worth noting: keeping a batch under 20 images speeds up processing noticeably.

Matching Input to Actions

User prompts referencing photo to video with music, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.
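As a rough illustration of the keyword half of that routing, the sketch below maps a message to an action name and a skip-SSE flag. The keyword lists follow this section's routing table. The substring matching, the action names, and the `route` helper are assumptions, and real intent classification would need more than this.

```python
# Keyword lists follow the routing table; the matching strategy is an assumption.
ROUTES = [
    ("export", ("export", "导出", "download", "send me the video")),
    ("credits", ("credits", "积分", "balance", "余额")),
    ("status", ("status", "状态", "show tracks")),
    ("upload", ("upload", "上传")),
]


def route(message: str, has_attachment: bool = False) -> tuple[str, bool]:
    """Return (action, skip_sse) for a user message."""
    if has_attachment:
        # A file sent by the user always goes to the upload action.
        return "upload", True
    text = message.lower()
    for action, keywords in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action, True
    # Everything else (generate, edit, add BGM...) goes through SSE.
    return "sse", False
```

Plain substring matching will misroute sentences that merely mention a keyword, for example a generation request that begins with "upload", which is why the source pairs keywords with intent classification.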

User says... | Action | Skip SSE?
"export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅
"credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅
"status" / "状态" / "show tracks" | → §3.4 Status | ✅
"upload" / "上传" / user sends file | → §3.2 Upload | ✅
Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-specific compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

Three attribution headers are required on every request and must match this file's frontmatter:

Header | Value
X-Skill-Source | photo-to-video-with-music
X-Skill-Version | frontmatter version
X-Skill-Platform | auto-detect: ClawHub / cursor / unknown from install path

All requests must include: Authorization: Bearer <token>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
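A small helper can assemble these headers in one place so none is forgotten. The sketch below is illustrative only: the `X-Skill-*` spellings are assumed, `detect_platform` and `build_headers` are hypothetical names, and detecting the platform by substring match on the install path is a guess at what "auto-detect" means here.

```python
def detect_platform(install_path: str) -> str:
    """Guess the platform from the install path (substring match is an assumption)."""
    lowered = install_path.lower()
    if "clawhub" in lowered:
        return "ClawHub"  # casing copied from the header table
    if "cursor" in lowered:
        return "cursor"
    return "unknown"


def build_headers(token: str, skill_version: str, install_path: str) -> dict[str, str]:
    """Return the auth header plus the three attribution headers."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "photo-to-video-with-music",
        "X-Skill-Version": skill_version,  # should match the frontmatter version
        "X-Skill-Platform": detect_platform(install_path),
    }
```

For example, `build_headers(token, "1.0.0", install_path)` could be merged into every request, including export, where missing attribution headers are said to produce a 402.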

API base: https://mega-api-prod.nemovideo.ai

Create session: POST /api/tasks/me/with-session/nemo_agent with body {"task_name":"project","language":""}; returns task_id, session_id.

Send message (SSE): POST /run_sse with body {"app_name":"nemo_agent","user_id":"me","session_id":"","new_message":{"parts":[{"text":""}]}} and Accept: text/event-stream. Max timeout: 15 minutes.

Upload: POST /api/upload-video/nemo_agent/me/ with either a file (multipart -F "files=@/path") or a URL ({"urls":[""],"source_type":"url"}).

Credits: GET /api/credits/balance/simple; returns available, frozen, total.

Session state: GET /api/state/nemo_agent/me//latest; key fields: data.state.draft, data.state.video_infos, data.state.generated_media.

Export (free, no credits): POST /api/render/proxy/lambda with body {"id":"render_","sessionId":"","draft":,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/ every 30s until status = completed. Download URL at output.url.

Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
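The export polling step described above might look like the sketch below. It is deliberately transport-agnostic: `fetch_status` is a hypothetical callable that performs the GET and returns the parsed JSON. The 15-minute cap is an assumption, not a documented limit, and so is the response shape, with `status` and `output.url` at the top level.

```python
import time
from typing import Callable


def wait_for_render(
    fetch_status: Callable[[], dict],
    interval_s: float = 30.0,
    max_wait_s: float = 900.0,  # assumed cap; the source gives no render limit
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the render reports completed, then return the download URL."""
    waited = 0.0
    while True:
        job = fetch_status()
        if job.get("status") == "completed":
            return job["output"]["url"]
        if waited >= max_wait_s:
            raise TimeoutError(f"render not completed after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s
```

Passing `sleep` as a parameter keeps the loop testable without real delays. A production version would probably also stop early on an explicit failure status, if the API reports one.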

SSE Event Handling

Event | Action
Text response | Apply GUI translation (§4), present to user
Tool call/result | Process internally, don't forward
heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..."
Stream closes | Process final response

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
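One way to apply these event rules is to collect only the text parts from the stream and fall back to a state check when nothing arrives. The sketch below assumes each `data:` line carries JSON shaped like `{"content": {"parts": [{"text": "..."}]}}`. That shape is a guess, not something the source documents, and `collect_text` and `needs_state_check` are hypothetical helper names.

```python
import json
from typing import Iterable


def collect_text(lines: Iterable[str]) -> list[str]:
    """Extract text parts from SSE lines, skipping heartbeats and empty data."""
    texts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # comments, event names, blank separators
        payload = line[len("data:"):].strip()
        if not payload:
            continue  # empty data: keep waiting
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue  # not JSON, nothing to show the user
        if not isinstance(event, dict):
            continue
        # Assumed event shape; tool calls carry no "text" and are dropped here.
        parts = (event.get("content") or {}).get("parts") or []
        for part in parts:
            if isinstance(part, dict) and part.get("text"):
                texts.append(part["text"])
    return texts


def needs_state_check(texts: list[str]) -> bool:
    """True when the stream closed without text, so session state should be polled."""
    return not texts
```

When `needs_state_check` returns true, the caller would query the session state endpoint and summarize what changed instead of reporting an empty reply.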

Translating GUI Instructions

The backend responds as if there's a visual interface. Map its instructions to API calls:

"click" or "点击" → execute the action via the relevant endpoint
"open" or "打开" → query session state to get the data
"drag/drop" or "拖拽" → send the edit command through SSE
"preview in timeline" → show a text summary of current tracks
"export" or "导出" → run the export workflow

Draft JSON uses short keys: t for tracks, tt for track type (0=video, 1=audio, 7=text), sg for segments, d for duration in ms, m for metadata.

Example timeline summary:

Timeline (3 tracks)
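A text summary with a header like the one above could be derived from the draft's short keys. The sketch below assumes segments are nested under each track and that each segment carries its own `d`. The source does not spell out that nesting, and the per-track line format is invented for illustration.

```python
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_timeline(draft: dict) -> str:
    """Render a plain-text summary of the tracks in a draft."""
    tracks = draft.get("t", [])
    lines = [f"Timeline ({len(tracks)} tracks)"]
    for index, track in enumerate(tracks, start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        segments = track.get("sg", [])
        # Durations are in milliseconds; sum them and report seconds.
        total_ms = sum(segment.get("d", 0) for segment in segments)
        lines.append(
            f"{index}. {kind}: {len(segments)} segment(s), {total_ms / 1000:.1f}s"
        )
    return "\n".join(lines)
```

Track types outside the three documented codes are reported as "unknown" rather than guessed.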

Data source: ClawHub · Chinese localization: 龙虾技能库