Free Text To Video Ai Generator - Skill Tool

Name: Free Text To Video Ai Generator - Skill Tool
Author: siddylcon

v1.0.0

Cloud-based free-text-to-video-ai-generator tool that handles generating videos from written text or scripts. Upload TXT, DOCX, PDF, plain text files (up to...

Categories: developer tools, code generation, API tools, file processing

License

MIT-0

Last updated

2026/4/11

Security scan

VirusTotal: harmless

OpenClaw: suspicious (medium confidence)

The skill's declared purpose (text→video) matches its runtime instructions and required token, but there are inconsistencies (missing source/homepage, mismatched metadata about config paths) and it will request/use a bearer token and upload local files — so proceed with caution.

Assessment and recommendations

This skill appears to do what it says (text→video) and only needs a single API token, but there are a few red flags: no source/homepage to verify the vendor, and a mismatch in metadata about a local config path (~/.config/nemovideo/) which could indicate sloppy packaging or an attempt to access local config. Before installing: (1) ask the publisher for a homepage, privacy policy, and API ownership info; (2) do not store sensitive credentials in NEMO_TOKEN — use a limited/ephemeral token if possi...

Detailed analysis

ℹ Purpose and capability

Name/description align with the instructions: the skill calls a cloud API to create videos from text and requires a NEMO_TOKEN for that API. However, the SKILL.md frontmatter declares a required config path (~/.config/nemovideo/) while the registry metadata shows no required config paths — an incoherence that should be clarified. Source/homepage are absent, so the backend domain and ownership cannot be independently verified.

ℹ Instruction scope

Runtime instructions are specific and constrained to the described service: obtain (or create) a token, create a session, send SSE messages, upload user files (multipart or URL), poll rendering status, and download results. These instructions do require the agent to read the skill's own frontmatter and detect install path to populate attribution headers, and to access any local file paths the user asks to upload. The skill does not instruct the agent to read unrelated system files or arbitrary secrets, but the upload capability means user-provided files can be transmitted to the remote API.

ℹ Install mechanism

No install spec and no code files (instruction-only), which reduces disk-write risk. However, the skill has no verifiable source or homepage and points at a single API host (mega-api-prod.nemovideo.ai) — lack of provenance increases risk because you cannot audit the backend or confirm operator identity.

ℹ Credential requirements

Only one required environment variable is declared (NEMO_TOKEN), which is proportional for a cloud API client. The SKILL.md allows creating an anonymous token if none is present. The inconsistency between the SKILL.md frontmatter (which lists a config path) and the registry metadata (which lists none) is a discrepancy to resolve — the config path could grant access to local credentials if used, so its presence in one place but not the other is concerning.

✓ Persistence and privileges

The skill is not always-enabled and uses normal autonomous invocation. It does not request elevated platform privileges. Be aware that autonomous invocation plus network access means the agent could call the remote API without further prompts, but that is standard for skills that implement cloud services.

Security is layered; review the code before running it.

License

MIT-0

Free to use, modify, and redistribute; no attribution required.

Runtime dependencies

No special dependencies

Version

latest · v1.0.0 · 2026/4/11

Initial release of Free Text to Video AI Generator.
- Instantly generate 1080p MP4 videos from text prompts or uploaded documents (TXT, DOCX, PDF).
- Simple, cloud-based workflow: upload, describe your needs, and receive a finished video in 1–2 minutes.
- Automatic connection, authentication, and session management; no manual setup required.
- Supports timeline editing, BGM, aspect ratio changes, overlays, and batching via chat instructions.
- Free usage tier with 100 credits valid for 7 days; export jobs are queued and trackable.
- Robust error handling, live status updates, and support for common video and audio formats.

Install command

Official: npx clawhub@latest install free-text-to-video-ai-generator

Mirror: npx clawhub@latest install free-text-to-video-ai-generator --registry https://cn.clawhub-mirror.com

Skill documentation

Getting Started

Got text prompts to work with? Send them over and tell me what you need, and I'll take care of the AI video creation.

Try saying:

"generate a 100-word product description into a 1080p MP4"
"turn this text into a 30-second explainer video with visuals and voiceover"
"generate videos from written text or scripts for marketers, content creators, and students"

Getting Connected

Before handling any user request, establish a connection to the backend API. Show a brief status like "Connecting...".

If NEMO_TOKEN is in the environment, use it directly and create a session. Otherwise, acquire a free starter token:

1. Generate a UUID as the client identifier.
2. POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with the X-Client-Id header.
3. The response includes a token with 100 free credits valid for 7 days; use it as NEMO_TOKEN.
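The token steps above can be sketched in Python roughly as follows. This is a minimal sketch, not the skill's own code: the function names are hypothetical, the empty request body is an assumption (the source documents only the header), and the name of the response field that holds the token is not documented here, so the sketch stops at building the request.

```python
import urllib.request
import uuid

API_BASE = "https://mega-api-prod.nemovideo.ai"


def new_client_id() -> str:
    """Generate a random UUID string to identify this client."""
    return str(uuid.uuid4())


def build_anonymous_token_request(client_id: str) -> urllib.request.Request:
    """Prepare (but do not send) the POST that asks for a starter token."""
    return urllib.request.Request(
        f"{API_BASE}/api/auth/anonymous-token",
        data=b"",  # assumption: empty body; only the header is documented
        headers={"X-Client-Id": client_id},
        method="POST",
    )
```

Passing the prepared request to `urllib.request.urlopen` would send it. Building the request separately keeps the sketch testable without network access.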

Then create a session by POSTing to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer authorization and body {"task_name":"project","language":"en"}. The session_id in the response is needed for all following requests.
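A session request along these lines might look like the sketch below. The function name is hypothetical and the `Content-Type` header is an assumption, since the source states only the URL, the Bearer authorization, and the JSON body. The attribution headers described under Cloud Render Pipeline Details are left out for brevity.

```python
import json
import urllib.request

API_BASE = "https://mega-api-prod.nemovideo.ai"


def build_create_session_request(token: str, language: str = "en") -> urllib.request.Request:
    """Prepare (but do not send) the session-creation POST."""
    body = json.dumps({"task_name": "project", "language": language}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",  # assumption: not stated in the source
        },
        method="POST",
    )
```

The response is expected to carry `session_id`, which later calls reuse.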

Tell the user you're ready. Keep the technical details out of the chat.

# Free Text to Video AI Generator — Turn Text Into AI Videos

This tool takes your text prompts and runs AI video creation through a cloud rendering pipeline. You upload, describe what you want, and download the result.

Say you have a 100-word product description and want to turn this text into a 30-second explainer video with visuals and voiceover — the backend processes it in about 1-2 minutes and hands you a 1080p MP4.

Tip: shorter, clearer text prompts produce more accurate and focused video results.

Matching Input to Actions

User prompts referencing free text to video ai generator, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.

User says...	Action	Skip SSE?
"export" / "导出" / "download" / "send me the video"	→ §3.5 Export	✅
"credits" / "积分" / "balance" / "余额"	→ §3.3 Credits	✅
"status" / "状态" / "show tracks"	→ §3.4 State	✅
"upload" / "上传" / user sends file	→ §3.2 Upload	✅
Everything else (generate, edit, add BGM…)	→ §3.1 SSE	❌
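The routing table can be approximated with plain keyword matching, as in this sketch. It covers only the keyword half of "keyword and intent classification"; the action names and the `has_attachment` flag are hypothetical choices for illustration.

```python
# Keywords copied from the routing table; order decides ties.
ROUTES = [
    ("export", ("export", "导出", "download", "send me the video")),
    ("credits", ("credits", "积分", "balance", "余额")),
    ("state", ("status", "状态", "show tracks")),
    ("upload", ("upload", "上传")),
]


def route(message: str, has_attachment: bool = False) -> str:
    """Return the action for a user message; fall back to the SSE path."""
    if has_attachment:
        return "upload"
    text = message.lower()
    for action, keywords in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action
    return "sse"
```

Anything that matches no keyword, such as a generate or edit request, falls through to `"sse"`, which matches the last row of the table.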

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

Include Authorization: Bearer and all attribution headers on every request — omitting them triggers a 402 on export.

Skill attribution — read from this file's YAML frontmatter at runtime:

X-Skill-Source: free-text-to-video-ai-generator
X-Skill-Version: from frontmatter version
X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)
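One way to assemble these three headers is sketched below. The helper names are hypothetical, and the path matching is an assumption about what "detect from install path" means: a `.clawhub` path component maps to `clawhub`, and `.cursor` followed by `skills` maps to `cursor`. The version is passed in rather than parsed from frontmatter.

```python
from pathlib import PurePosixPath


def detect_platform(install_path: str) -> str:
    """Map an install path to a platform label, defaulting to 'unknown'."""
    parts = PurePosixPath(install_path).parts
    if ".clawhub" in parts:
        return "clawhub"
    for index, part in enumerate(parts[:-1]):
        if part == ".cursor" and parts[index + 1] == "skills":
            return "cursor"
    return "unknown"


def attribution_headers(version: str, install_path: str) -> dict:
    """Build the three attribution headers listed above."""
    return {
        "X-Skill-Source": "free-text-to-video-ai-generator",
        "X-Skill-Version": version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```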

API base: https://mega-api-prod.nemovideo.ai

Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":""} — returns task_id, session_id.

Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"","new_message":{"parts":[{"text":""}]}} with Accept: text/event-stream. Max timeout: 15 minutes.
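The message body for `/run_sse` could be built as in this sketch. The function and constant names are hypothetical, and the `Content-Type` header is an assumption; the payload keys and the `Accept` header come from the line above.

```python
import json

SSE_TIMEOUT_SECONDS = 15 * 60  # "Max timeout: 15 minutes"


def build_sse_message(session_id: str, text: str) -> tuple[dict, bytes]:
    """Return (headers, encoded JSON body) for a POST to /run_sse."""
    headers = {
        "Accept": "text/event-stream",
        "Content-Type": "application/json",  # assumption: not stated in the source
    }
    payload = {
        "app_name": "nemo_agent",
        "user_id": "me",
        "session_id": session_id,
        "new_message": {"parts": [{"text": text}]},
    }
    return headers, json.dumps(payload).encode("utf-8")
```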

Upload: POST /api/upload-video/nemo_agent/me/ — file: multipart -F "files=@/path", or URL: {"urls":[""],"source_type":"url"}

Credits: GET /api/credits/balance/simple — returns available, frozen, total

Session state: GET /api/state/nemo_agent/me//latest — key fields: data.state.draft, data.state.video_infos, data.state.generated_media

Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_","sessionId":"","draft":,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/ every 30s until status = completed. Download URL at output.url.
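The export-then-poll flow might be structured as below. This is a sketch under assumptions: `fetch_status` is a hypothetical callable that performs the GET and returns the decoded JSON, the response is assumed to expose `status` and `output.url` at the top level, and `max_polls` is an invented safety cap. The source shows only the `render_` prefix of the job ID, so the full ID is left to the caller.

```python
import time
from typing import Callable


def build_export_body(render_id: str, session_id: str, draft: dict) -> dict:
    """Assemble the JSON body for POST /api/render/proxy/lambda."""
    return {
        "id": render_id,  # the source shows only the "render_" prefix
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def poll_render(
    fetch_status: Callable[[], dict],
    interval_s: float = 30.0,
    max_polls: int = 40,  # assumption: safety cap, not from the source
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until status is 'completed', then return the download URL."""
    for attempt in range(max_polls):
        result = fetch_status()
        if result.get("status") == "completed":
            return result["output"]["url"]
        if attempt < max_polls - 1:
            sleep(interval_s)
    raise TimeoutError("render did not complete within the polling budget")
```

Injecting `sleep` keeps the loop easy to exercise without waiting 30 seconds per poll.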

Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

Error Handling

Code	Meaning	Action
0	Success	Continue
1001	Bad/expired token	Re-auth via anonymous-token (tokens expire after 7 days)
1002	Session not found	New session §3.0
2001	No credits	Anonymous: show registration URL with `?bind=` (get from create-session or state response when needed). Registered: "Top up credits in your account"
4001	Unsupported file	Show supported formats
4002	File too large	Suggest compress/trim
400	Missing X-Client-Id	Generate Client-Id and retry (see §1)
402	Free plan export blocked	Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export."
429	Rate limit (1 token/client/7 days)	Retry in 30s once
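The table lends itself to a simple lookup, sketched here. The action strings paraphrase the Action column, and the fallback for unlisted codes is an assumption, since the source does not say how to treat them.

```python
# Codes and actions paraphrased from the error table above.
ERROR_ACTIONS = {
    0: "continue",
    1001: "re-authenticate via anonymous-token",
    1002: "create a new session",
    2001: "show the registration URL or ask the user to top up credits",
    4001: "show supported formats",
    4002: "suggest compressing or trimming the file",
    400: "generate a client ID and retry",
    402: "ask the user to register or upgrade their plan",
    429: "retry once after 30 seconds",
}


def next_action(code: int) -> str:
    """Look up the documented action; unlisted codes get a generic fallback."""
    return ERROR_ACTIONS.get(code, "report the unrecognized code")
```

Note that 402 maps to a plan upgrade rather than a credit top-up, which mirrors the distinction the table draws between 402 and 2001.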

Translating GUI Instructions

The backend responds as if there's a visual interface. Map its instructions to API calls:

"click" or "点击" → execute the action via the relevant endpoint
"open" or "打开" → query session state to get the data
"drag/drop" or "拖拽" → send the edit command through SSE
"preview in timeline" → show a text summary of current tracks
"Export" or "导出" → run the export workflow

Reading the SSE Stream

Text events go straight to the user (after GUI translation). Tool calls stay internal. Heartbeats and empty data: lines mean the backend is still working — show "⏳ Still working..." every 2 minutes.

About 30% of edit operations close the stream without any text. When that happens, poll /api/state to confirm the timeline changed, then tell the user what was updated.
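A rough way to tell working signals from real payloads is sketched below. It treats comment lines (starting with `:`) and empty `data:` lines as heartbeats, following general Server-Sent Events conventions. The function names are hypothetical, and because the event schema is not documented here, the sketch cannot separate text events from tool calls; it only reports whether any non-empty data arrived.

```python
from typing import Iterable


def classify_sse_line(line: str) -> tuple[str, str]:
    """Return (kind, payload); kind is 'heartbeat', 'data', or 'other'."""
    stripped = line.rstrip("\r\n")
    if stripped.startswith(":"):
        return "heartbeat", ""
    if stripped.startswith("data:"):
        payload = stripped[len("data:"):].strip()
        return ("data", payload) if payload else ("heartbeat", "")
    return "other", stripped


def stream_had_data(lines: Iterable[str]) -> bool:
    """True if at least one line carried a non-empty data payload."""
    return any(classify_sse_line(line)[0] == "data" for line in lines)
```

When `stream_had_data` returns `False` for a closed stream, that is the cue to poll the session state as described above.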

Draft JSON uses short keys: t for tracks, tt for track type (0=video, 1=audio, 7=text), sg for segments, d for duration in ms, m for metadata.

Example timeline summary:

Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)
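A summary like this could be derived from the draft's short keys, as in the sketch below. The nesting is an assumption: `t` is taken to be a list of tracks, each with a `tt` type code and an `sg` list of segments that carry `d` in milliseconds. Clip names, start offsets, and volume levels are not covered by the documented keys, so the sketch reports only track type and total duration.

```python
TRACK_TYPES = {0: "Video", 1: "Audio", 7: "Text"}  # from the tt codes above


def summarize_draft(draft: dict) -> str:
    """Produce a short text summary of the tracks in a draft."""
    tracks = draft.get("t", [])
    lines = [f"Timeline ({len(tracks)} tracks):"]
    for index, track in enumerate(tracks, start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "Unknown")
        total_ms = sum(segment.get("d", 0) for segment in track.get("sg", []))
        lines.append(f"{index}. {kind}: {total_ms / 1000:g}s")
    return "\n".join(lines)


# Hypothetical draft used only to illustrate the assumed shape.
sample = {"t": [{"tt": 0, "sg": [{"d": 10000}]}, {"tt": 7, "sg": [{"d": 3000}]}]}
print(summarize_draft(sample))
```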

Common Workflows

Quick edit: Upload → "turn this text into a 30-second explainer video with visuals and voiceover" → Download MP4. Takes 1-2 minutes for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "turn this text into a 30-second explainer video with visuals and voiceover" — concrete instructions get better results.

Max file size is 500MB. Stick to TXT, DOCX, PDF, plain text for the smoothest experience.
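A local check before uploading can catch both issues early. This is a sketch with hypothetical names; it reads "500MB" as 500 × 1024 × 1024 bytes, which is an assumption, and it only warns about file types outside the recommended set rather than rejecting them.

```python
from pathlib import PurePath

MAX_UPLOAD_BYTES = 500 * 1024 * 1024  # assumption: "500MB" read as 500 MiB
RECOMMENDED_SUFFIXES = {".txt", ".docx", ".pdf"}


def precheck_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of warnings; an empty list means the file looks fine."""
    warnings = []
    if size_bytes > MAX_UPLOAD_BYTES:
        warnings.append("file exceeds the 500MB limit; compress or trim it first")
    if PurePath(filename).suffix.lower() not in RECOMMENDED_SUFFIXES:
        warnings.append("file type is outside the recommended TXT, DOCX, PDF set")
    return warnings
```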

Export as MP4 for widest compatibility across social platforms and devices.

Data source: ClawHub · Chinese localization: 龙虾技能库
