🎬 Text To Video Best App: Skill Tool

v1.0.0

Turn a 150-word product description into 1080p ready-to-share videos just by typing what you need. Whether it's generating videos from written scripts or tex...

by @whitejohnk-26 · MIT-0
License
MIT-0
Last updated
2026/4/17
Security scan
VirusTotal: benign
OpenClaw: safe (medium confidence)
The skill's requirements and runtime instructions are internally consistent with a text→video cloud-rendering service; nothing requested appears disproportionate, though there are minor metadata inconsistencies and a few runtime behaviors you should be aware of.
Assessment recommendations
This skill appears to be what it says: a cloud text→video frontend that uses a single service token. Before installing: (1) Confirm you trust the domain (mega-api-prod.nemovideo.ai) since the skill will upload text and any files you provide; (2) If you prefer, supply your own NEMO_TOKEN rather than letting the skill obtain an anonymous token; (3) Be mindful when uploading local files — only upload files you intend to send to the service; (4) Clarify the config path mention (~/.config/nemovideo/)...
Detailed analysis
Purpose and capability
Name/description (text-to-video) match the actions described in SKILL.md (session creation, uploads, render/export endpoints). The single required credential (NEMO_TOKEN) is appropriate for authenticating to the stated backend.
Instruction scope
Runtime instructions include creating an anonymous token if none is present, creating/storing a session_id, uploading user files (multipart or URL), and polling render status. These are expected for a cloud render workflow, but the skill will perform network requests and may upload user-supplied local files when the user asks to upload — ensure you only upload files you intend to share. The skill also derives X-Skill-Platform from install paths, which implies it may check filesystem paths to determine platform attribution.
Install mechanism
Instruction-only skill with no install spec and no code files. No downloads or archive extraction are requested, so there is low installation risk.
Credential requirements
Only NEMO_TOKEN is declared as required, which is proportional. However, the SKILL.md frontmatter also lists a config path (~/.config/nemovideo/) while the registry metadata reported none — a minor metadata mismatch worth clarifying. The skill's behavior to auto-request an anonymous token if NEMO_TOKEN is absent is consistent with its purpose.
Persistence and privileges
always is false and the skill does not request persistent system-level privileges. It instructs storing a session_id for use during the session, which is normal. Autonomous invocation is allowed (platform default) but is not combined with broad credential access here.
Security is layered; review the code before running it.

License

MIT-0

Free to use, modify, and redistribute; no attribution required.

Runtime dependencies

No special dependencies

Versions

latest · v1.0.0 · 2026/4/17

- Initial release of Text to Video Best App: convert text prompts or scripts into ready-to-share 1080p videos in minutes.
- Automatically handles authentication and session setup; 100 free credits for new users.
- Supports direct uploads, script-to-video conversion, audio/background music, and quick export to MP4 and other formats.
- Includes error handling, session state management, and workflow tips for efficient video production.
- Easy prompt matching to streamline tasks like upload, credits check, export, and video editing.


Install command

Official: npx clawhub@latest install text-to-video-best-app
Mirror (accelerated): npx clawhub@latest install text-to-video-best-app --registry https://cn.longxiaskill.com

Skill documentation

Getting Started

Send me your text prompts and I'll handle the AI video creation. Or just describe what you're after.

Try saying:

  • "convert a 150-word product description into a 1080p MP4"
  • "turn this script into a 30-second video with visuals and background music"
  • "generate a video from my written script for marketers, content creators, and educators"

First-Time Connection

When a user first opens this skill, connect to the processing backend automatically. Briefly let them know (e.g. "Setting up...").

Authentication: Check whether NEMO_TOKEN is set in the environment. If it is, skip to step 2.

  1. Obtain a free token: Generate a random UUID as the client identifier. POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with the header X-Client-Id set to that UUID. The response field data.token is your NEMO_TOKEN (100 free credits, valid for 7 days).
  2. Create a session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with an Authorization: Bearer header carrying NEMO_TOKEN, Content-Type: application/json, and body {"task_name":"project","language":""}. Store the returned session_id for all subsequent requests.

Keep setup communication brief. Don't display raw API responses or token values to the user.
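
The connection steps above could be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the skill's actual implementation. The helper names (`anonymous_token_request`, `create_session_request`, `send`, `connect`) are hypothetical, and where `session_id` sits in the reply is an assumption, since the source only says it is returned.

```python
import json
import os
import uuid
import urllib.request

API_BASE = "https://mega-api-prod.nemovideo.ai"


def anonymous_token_request(client_id: str) -> urllib.request.Request:
    """Build the POST that asks for a free anonymous token."""
    return urllib.request.Request(
        f"{API_BASE}/api/auth/anonymous-token",
        headers={"X-Client-Id": client_id},
        method="POST",
    )


def create_session_request(token: str) -> urllib.request.Request:
    """Build the POST that creates a session for the given token."""
    body = json.dumps({"task_name": "project", "language": ""}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a request and decode the JSON reply (performs a network call)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def connect() -> tuple:
    """Return (token, session_id), reusing NEMO_TOKEN when it is set."""
    token = os.environ.get("NEMO_TOKEN")
    if not token:
        # Step 1: a random UUID serves as the client identifier.
        reply = send(anonymous_token_request(str(uuid.uuid4())))
        token = reply["data"]["token"]
    # Step 2: create the session.
    session = send(create_session_request(token))
    # Assumption: session_id is at the top level or nested under "data".
    session_id = session.get("session_id") or session["data"]["session_id"]
    return token, session_id
```

Keeping the request builders separate from `send` means the URLs, headers, and bodies can be inspected without touching the network. Neither the token nor the raw reply is printed, in line with the guidance above.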

# Text to Video Best App — Convert Text Into Shareable Videos

This tool takes your text prompts and runs AI video creation through a cloud rendering pipeline. You upload, describe what you want, and download the result.

Say you have a 150-word product description and want to turn it into a 30-second video with visuals and background music: the backend processes it in about 1-2 minutes and hands you a 1080p MP4.

Tip: scripts under 100 words render faster and produce more focused videos.

Matching Input to Actions

User prompts referencing text to video best app, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.

| User says... | Action |
|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits |
| "status" / "状态" / "show tracks" | → §3.4 State |
| "upload" / "上传" / user sends file | → §3.2 Upload |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE |
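
The keyword side of this routing can be illustrated with a small sketch. It covers plain substring matching only, not the intent classification mentioned above. The function name `route` and the returned action labels are hypothetical, chosen here for illustration.

```python
# Keyword lists copied from the routing table above.
ROUTES = [
    ("export", ["export", "导出", "download", "send me the video"]),
    ("credits", ["credits", "积分", "balance", "余额"]),
    ("state", ["status", "状态", "show tracks"]),
    ("upload", ["upload", "上传"]),
]


def route(message: str, has_file: bool = False) -> str:
    """Return the action label for a user message; fall back to the SSE path."""
    if has_file:
        # A file sent by the user always goes to the upload action.
        return "upload"
    text = message.lower()
    for action, keywords in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action
    # Everything else (generate, edit, add BGM...) goes through SSE.
    return "sse"
```

For example, `route("Please export it")` yields `"export"`, while `route("add BGM at 35%")` falls through to `"sse"`.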

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

Headers are derived from this file's YAML frontmatter. X-Skill-Source is text-to-video-best-app, X-Skill-Version comes from the version field, and X-Skill-Platform is detected from the install path (~/.clawhub/ = clawhub, ~/.cursor/skills/ = cursor, otherwise unknown).

All requests must include Authorization: Bearer (carrying NEMO_TOKEN), X-Skill-Source, X-Skill-Version, and X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
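
One possible way to assemble these headers is sketched below. The function names `detect_platform` and `build_headers` are hypothetical, and the version string is passed in by the caller rather than parsed from the frontmatter.

```python
from pathlib import Path


def _is_under(path: Path, parent: Path) -> bool:
    """True when path is parent itself or lies somewhere below it."""
    try:
        path.relative_to(parent)
        return True
    except ValueError:
        return False


def detect_platform(install_path: str) -> str:
    """Map the install path to the X-Skill-Platform value."""
    home = Path.home()
    path = Path(install_path).expanduser()
    if _is_under(path, home / ".clawhub"):
        return "clawhub"
    if _is_under(path, home / ".cursor" / "skills"):
        return "cursor"
    return "unknown"


def build_headers(token: str, version: str, install_path: str) -> dict:
    """Return the headers that every request must carry."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "text-to-video-best-app",
        "X-Skill-Version": version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```

Building the headers in one place makes it harder to send a request without the attribution headers, which the text above ties to export failing with 402.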

API base: https://mega-api-prod.nemovideo.ai

Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":""} — returns task_id, session_id.

Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"","new_message":{"parts":[{"text":""}]}} with Accept: text/event-stream. Max timeout: 15 minutes.

Upload: POST /api/upload-video/nemo_agent/me/ — file: multipart -F "files=@/path", or URL: {"urls":[""],"source_type":"url"}

Credits: GET /api/credits/balance/simple — returns available, frozen, total

Session state: GET /api/state/nemo_agent/me//latest — key fields: data.state.draft, data.state.video_infos, data.state.generated_media

Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_","sessionId":"","draft":,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/ every 30s until status = completed. Download URL at output.url.

Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
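
The export polling described above might look like the sketch below. The status lookup is injected as a callable, so the loop does not assume any particular HTTP client. The name `wait_for_render` and the `max_polls` cap are hypothetical. The reply is assumed to be a JSON object with `status` and `output.url` at the top level, which the source implies but does not spell out.

```python
import time
from typing import Callable


def wait_for_render(
    fetch_status: Callable[[], dict],
    interval_seconds: float = 30.0,
    max_polls: int = 30,  # hypothetical cap; the source gives no limit
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until status is "completed", then return the download URL."""
    for attempt in range(max_polls):
        reply = fetch_status()
        if reply.get("status") == "completed":
            # Assumption: the download URL sits at output.url in the reply.
            return reply["output"]["url"]
        if attempt < max_polls - 1:
            sleep(interval_seconds)
    raise TimeoutError("render did not complete within the polling budget")
```

Passing `sleep` as a parameter lets the loop be exercised without waiting in real time.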

SSE Event Handling

| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
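
A rough sketch of the stream handling follows. The source does not document the event payload, so the shape used here (`content.parts[].text`, mirroring the `parts` structure of the request body) is an assumption, and `collect_text` is a hypothetical name. A return value of `None` marks the no-text case, where the caller would poll session state instead.

```python
import json
from typing import Iterable, Optional


def collect_text(lines: Iterable[str]) -> Optional[str]:
    """Join text parts from SSE lines; None means no text arrived."""
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # comments, event names, and blank separators
        payload = line[len("data:"):].strip()
        if not payload:
            continue  # heartbeat or empty data: keep waiting
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue  # not JSON; nothing to present
        if not isinstance(event, dict):
            continue
        # Assumption: text lives at content.parts[].text. Parts without a
        # "text" key (for example tool calls) are skipped, not forwarded.
        parts = (event.get("content") or {}).get("parts") or []
        chunks.extend(
            part["text"]
            for part in parts
            if isinstance(part, dict) and part.get("text")
        )
    return "".join(chunks) or None
```
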

Backend Response Translation

The backend assumes a GUI exists. Translate these into API actions:

| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |

Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.

Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)
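
To make the field mapping concrete, here is a small sketch that turns a draft into a track summary. The nesting is an assumption: tracks are taken to be a list under `t`, each with a `tt` type and a list of segments under `sg`, each segment carrying a duration `d` in milliseconds. The source only gives the key names, and `summarize_draft` is a hypothetical helper.

```python
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_draft(draft: dict) -> list:
    """Return one summary line per track using the short draft keys."""
    lines = []
    for index, track in enumerate(draft.get("t", []), start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        segments = track.get("sg", [])
        # d is in milliseconds; sum the segments and convert to seconds.
        total_ms = sum(segment.get("d", 0) for segment in segments)
        lines.append(
            f"{index}. {kind}: {len(segments)} segment(s), {total_ms / 1000:g}s"
        )
    return lines
```
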

Error Codes

  • 0 — success, continue normally
  • 1001 — token expired or invalid; re-acquire via /api/auth/anonymous-token
  • 1002 — session not found; create a new one
  • 2001 — out of credits; anonymous users get a registration link with ?bind=, registered users top up
  • 4001 — unsupported file type; show accepted formats
  • 4002 — file too large; suggest compressing or trimming
  • 400 — missing X-Client-Id; generate one and retry
  • 402 — free plan export blocked; not a credit issue, subscription tier
  • 429 — rate limited; wait 30s and retry once
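
The codes above lend themselves to a simple lookup, sketched here. The action labels and function names are hypothetical; only the codes and the 30-second single retry for 429 come from the list.

```python
import time
from typing import Callable

# Action labels are illustrative names, not part of the backend API.
ERROR_ACTIONS = {
    0: "continue",
    1001: "reacquire_token",
    1002: "create_session",
    2001: "out_of_credits",
    4001: "show_accepted_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "explain_subscription_tier",
    429: "wait_30s_and_retry_once",
}


def action_for(code: int) -> str:
    """Look up what to do for a response code."""
    return ERROR_ACTIONS.get(code, "report_unknown_error")


def call_with_rate_limit_retry(
    call: Callable[[], int],
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run call(); on 429 wait 30 seconds and retry exactly once."""
    code = call()
    if code == 429:
        sleep(30)
        code = call()
    return code
```
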

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "turn this script into a 30-second video with visuals and background music" — concrete instructions get better results.

Max file size is 200MB. Stick to TXT, DOCX, PDF, SRT for the smoothest experience.

Export as MP4 for widest compatibility across social platforms and devices.

Common Workflows

Quick edit: Upload → "turn this script into a 30-second video with visuals and background music" → Download MP4. Takes 1-2 minutes for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.

Data source: ClawHub · Chinese localization: 龙虾技能库