Caption Generator Extension Chrome: Skill Tool

Caption Generator Extension Chrome

v1.0.0 ⬇ 32 downloads 0 👤 imo14reifey 📅 2026/04/16

Introduction

Turn video files into captioned video files with this skill. Works with MP4, MOV, WebM, AVI files up to 500MB. YouTubers and content creators use it for addin...

Skill Documentation

---
name: caption-generator-extension-chrome
version: "1.0.0"
displayName: "Caption Generator Extension Chrome — Generate and Embed Video Captions"
description: >
  Turn video files into captioned video files with this skill. Works with MP4, MOV, WebM, AVI files up to 500MB. YouTubers and content creators use it for adding auto-generated subtitles to videos directly in Chrome. Processing takes 30-60 seconds on cloud GPUs and you get 1080p MP4 files.
metadata: {"openclaw": {"emoji": "💬", "requires": {"env": ["NEMO_TOKEN"], "configPaths": ["~/.config/nemovideo/"]}, "primaryEnv": "NEMO_TOKEN", "variant": "short_prompts"}}
---

## Getting Started

> Share your video files and I'll get started on AI caption generation. Or just tell me what you're thinking.

**Try saying:**

- "add my video files"
- "export 1080p MP4"
- "add captions in English and Spanish"

### Getting Connected

Before handling any user request, establish a connection to the backend API. Show a brief status like "Connecting...".

**If `NEMO_TOKEN` is in the environment**, use it directly and create a session. Otherwise, acquire a free starter token:

- Generate a UUID as client identifier
- POST to `https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token` with the `X-Client-Id` header
- The response includes a `token` with 100 free credits valid for 7 days; use it as `NEMO_TOKEN`

**Then create a session** by POSTing to `https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent` with Bearer authorization and body `{"task_name":"project","language":"en"}`. The `session_id` in the response is needed for all following requests.

Tell the user you're ready. Keep the technical details out of the chat.

# Caption Generator Extension Chrome — Generate and Embed Video Captions

Drop your video files in the chat and tell me what you need. I'll handle the AI caption generation on cloud GPUs, so you don't need anything installed locally.
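
The Getting Connected steps can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the skill's actual implementation: the helper names (`anonymous_token_request`, `create_session_request`, `send_json`, `connect`) are hypothetical, and it assumes both endpoints reply with JSON that carries `token` and `session_id` at the top level, which this document does not confirm. The attribution headers that the document requires on every request are left out for brevity.

```python
import json
import os
import urllib.request
import uuid

API_BASE = "https://mega-api-prod.nemovideo.ai"


def anonymous_token_request(client_id: str) -> urllib.request.Request:
    """Build the POST that asks for a free starter token."""
    return urllib.request.Request(
        f"{API_BASE}/api/auth/anonymous-token",
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def create_session_request(token: str, language: str = "en") -> urllib.request.Request:
    """Build the POST that creates a session with Bearer authorization."""
    body = json.dumps({"task_name": "project", "language": language}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def send_json(request: urllib.request.Request) -> dict:
    """Send a request and decode the JSON reply (performs network I/O)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def connect() -> tuple[str, str]:
    """Return (token, session_id), reusing NEMO_TOKEN when it is set."""
    token = os.environ.get("NEMO_TOKEN")
    if not token:
        client_id = str(uuid.uuid4())  # fresh client identifier
        # ASSUMPTION: `token` is a top-level field of the JSON reply.
        token = send_json(anonymous_token_request(client_id))["token"]
    # ASSUMPTION: `session_id` is a top-level field of the JSON reply.
    session = send_json(create_session_request(token))
    return token, session["session_id"]
```

Calling `connect()` once at startup would yield the token and session ID that later requests need. The request builders are kept separate from `send_json` so they can be inspected without touching the network.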
Here's a typical use: you send a 3-minute YouTube tutorial video, ask me to "add captions in English and Spanish to this video", and about 30-60 seconds later you have an MP4 file ready to download. The whole thing runs at 1080p by default.

One thing worth knowing: clips under 5 minutes generate captions significantly faster.

## Matching Input to Actions

User prompts referencing caption generator extension chrome, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.

| User says... | Action | Skip SSE? |
|-------------|--------|----------|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |

## Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

Include `Authorization: Bearer ` and all attribution headers on every request; omitting them triggers a 402 on export.

Headers are derived from this file's YAML frontmatter. `X-Skill-Source` is `caption-generator-extension-chrome`, `X-Skill-Version` comes from the `version` field, and `X-Skill-Platform` is detected from the install path (`~/.clawhub/` = `clawhub`, `~/.cursor/skills/` = `cursor`, otherwise `unknown`).

**API base**: `https://mega-api-prod.nemovideo.ai`

**Create session**: POST `/api/tasks/me/with-session/nemo_agent`, body `{"task_name":"project","language":""}`, returns `task_id`, `session_id`.
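
The attribution header rules described in this section could be captured in a small helper like the one below. Treat it as a hedged sketch: the function names `detect_platform` and `attribution_headers` are hypothetical, the substring match is only one assumed way to "detect from the install path", and the version is passed in by the caller rather than parsed from the frontmatter.

```python
SKILL_SOURCE = "caption-generator-extension-chrome"


def detect_platform(install_path: str) -> str:
    """Map an install path to the X-Skill-Platform value.

    ASSUMPTION: a plain substring check is enough to recognise the
    `~/.clawhub/` and `~/.cursor/skills/` locations named in the document.
    """
    normalized = install_path.replace("\\", "/")
    if "/.clawhub/" in normalized:
        return "clawhub"
    if "/.cursor/skills/" in normalized:
        return "cursor"
    return "unknown"


def attribution_headers(token: str, version: str, install_path: str) -> dict[str, str]:
    """Build the headers the document asks for on every request."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": SKILL_SOURCE,
        "X-Skill-Version": version,  # the frontmatter `version` field
        "X-Skill-Platform": detect_platform(install_path),
    }


if __name__ == "__main__":
    # Example with placeholder values; the token is not a real credential.
    print(attribution_headers("example-token", "1.0.0", "~/.clawhub/skills/caption"))
```

Building the headers in one place makes it harder to forget one of them on the export call, where the document says a missing header leads to a 402.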
**Send message (SSE)**: POST `/run_sse`, body `{"app_name":"nemo_agent","user_id":"me","session_id":"","new_message":{"parts":[{"text":""}]}}` with `Accept: text/event-stream`. Max timeout: 15 minutes.

**Upload**: POST `/api/upload-video/nemo_agent/me/`. File: multipart `-F "files=@/path"`, or URL: `{"urls":[""],"source_type":"url"}`

**Credits**: GET `/api/credits/balance/simple`, returns `available`, `frozen`, `total`

**Session state**: GET `/api/state/nemo_agent/me//latest`, key fields: `data.state.draft`, `data.state.video_infos`, `data.state.generated_media`

**Export** (free, no credits): POST `/api/render/proxy/lambda`, body `{"id":"render_","sessionId":"","draft":,"output":{"format":"mp4","quality":"high"}}`. Poll GET `/api/render/proxy/lambda/` every 30s until `status` = `completed`. Download URL at `output.url`.

Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

### Error Handling

| Code | Meaning | Action |
|------|---------|--------|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with `?bind=` (get `` from create-session or state response when needed). Registered: "Top up credits in your account" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |

### Backend Response Translation

The backend assumes a GUI exists.
Translate these into API actions:

| Backend says | You do |
|-------------|--------|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |

### Reading the SSE Stream

Text events go straight to the user (after GUI translation). Tool calls stay internal. Heartbeats and empty `data:` lines mean the backend is still working; show "⏳ Still working..." every 2 minutes.

About 30% of edit operations close the stream without any text. When that happens, poll `/api/state` to confirm the timeline changed, then tell the user what was updated.

**Draft field mapping**: `t`=tracks, `tt`=track type (0=video, 1=audio, 7=text), `sg`=segments, `d`=duration(ms), `m`=metadata.

```
Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)
```

## Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "add captions in English and Spanish to this video". Concrete instructions get better results.

Max file size is 500MB. Stick to MP4, MOV, WebM, AVI for the smoothest experience.

Export as MP4 for widest compatibility across platforms and devices.

## Common Workflows

**Quick edit**: Upload → "add captions in English and Spanish to this video" → Download MP4. Takes 30-60 seconds for a 30-second clip.

**Batch style**: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

**Iterative**: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.
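
For the iterative workflow, a track summary can be produced from the draft using the field mapping given under Reading the SSE Stream. The sketch below is illustrative only: the function name `summarize_draft` is hypothetical, and the nesting it assumes (a draft holding a `t` list, each track holding `tt` and an `sg` list, each segment holding `d` in milliseconds) is a guess at the structure, since the document lists the field names but not how they nest.

```python
# Track type codes as listed in the draft field mapping.
TRACK_TYPES = {0: "Video", 1: "Audio", 7: "Text"}


def summarize_draft(draft: dict) -> str:
    """Render a short, user-facing summary of the tracks in a draft.

    ASSUMPTION: draft["t"] is a list of tracks, each with "tt" (type code)
    and "sg" (list of segments), and each segment has "d" in milliseconds.
    """
    tracks = draft.get("t", [])
    lines = [f"Timeline ({len(tracks)} tracks):"]
    for index, track in enumerate(tracks, start=1):
        label = TRACK_TYPES.get(track.get("tt"), "Unknown")
        segments = track.get("sg", [])
        total_ms = sum(segment.get("d", 0) for segment in segments)
        lines.append(
            f"{index}. {label}: {len(segments)} segment(s), {total_ms / 1000:g}s"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up draft for illustration; real drafts come from the state endpoint.
    sample = {
        "t": [
            {"tt": 0, "sg": [{"d": 10000}]},
            {"tt": 1, "sg": [{"d": 10000}]},
            {"tt": 7, "sg": [{"d": 3000}]},
        ]
    }
    print(summarize_draft(sample))
```

Unknown type codes fall back to "Unknown" rather than raising, so a summary can still be shown if the backend adds track types that the mapping does not list.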

Install Command

clawhub install caption-generator-extension-chrome