Text to Image AI Free

Name: Text to Image AI Free
Author: tk8544-b

v1.0.0

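With this skill, content creators can turn text prompts into AI-generated visuals. It supports JPG, PNG, WEBP, MP4, and other formats up to 200MB, renders at 1080p on cloud GPUs, and outputs an MP4 file in 20-40 seconds. It is intended for image generation by video content creators.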

Tags: developer tools, API tools, AI model access, video processing, image processing

License

MIT-0

Last updated

2026/4/11

Security scan

VirusTotal

Harmless

OpenClaw

Suspicious

medium confidence

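The skill's stated purpose (text → image/video via the Nemo backend) largely matches its instructions, but there are inconsistencies and minor scope creep (automatic anonymous-token creation, implicit filesystem probing, and a mismatch in the declared config path), so exercise caution before installing.

Assessment recommendations

- Use your own NEMO_TOKEN from a trusted source rather than letting the agent generate an anonymous token on your behalf.
- Be aware that the agent will upload any file you provide (up to about 200MB) to an external service; check the privacy policy and terms before sending sensitive content.
- The instructions ask the agent to probe its install path to set headers, and SKILL.md references a config path that is not declared in the registry. If you want to limit filesystem access, confirm whether the agent runtime actually performs these checks.
- If you proceed, monitor network requests and avoid keeping tokens you do not control. For higher assurance, ask the publisher for the source code or an official homepage, and for clarification of where tokens are stored and what the config path is used for.
...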

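Detailed analysis

ℹ Purpose and capability

The skill's name and description match its runtime instructions and its required environment variable (NEMO_TOKEN). The skill legitimately calls a remote rendering API and uploads user media. However, the SKILL.md frontmatter declares a config path (~/.config/nemovideo/) that is not listed among the registry-level "required config paths" (the registry metadata specifies none), which is a metadata mismatch worth noting.

⚠ Instruction scope

The instructions tell the agent to check for NEMO_TOKEN and, if it is missing, to generate a UUID and call the anonymous-token endpoint to obtain a token; it then creates a session, uploads user files, streams SSE, and polls for status. These operations require network access and send user media (up to 200MB) to a remote backend. The instructions also ask the agent to detect its install path in order to set attribution headers, which requires probing the filesystem; this goes beyond the declared environment variable, yet the result is used on every request. The flow is coherent for a rendering skill, but it extends the agent's actions beyond simple API calls (token creation + filesystem checks + large file uploads).

✓ Install mechanism

There is no install spec and there are no code files; this is an instruction-only skill. Install risk is low, since an installer downloads and writes nothing.

ℹ Credential requirements

Only one environment variable (NEMO_TOKEN) is declared, which is appropriate for an API-backed rendering service. However, SKILL.md says that if no token is present, an anonymous token is created and used automatically (the agent calls the auth endpoint to obtain a short-lived token). The SKILL.md frontmatter also references a config path (~/.config/nemovideo/) that is absent from the registry manifest; this mismatch means the skill may expect undeclared on-disk configuration.

✓ Persistence and privileges

The skill uses always:false and normal autonomous invocation; it does not request elevated or permanent platform privileges. No instruction modifies other skills or global agent settings. One potential persistence risk remains: if the runtime stores tokens, the agent may obtain and reuse an anonymous token (7-day expiry). SKILL.md does not explicitly instruct the agent to store tokens persistently.

Security is layered; review the code before running it.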

License terms (MIT-0): free to use, modify, and redistribute, with no attribution required.

Runtime dependencies

No special dependencies

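Versions

latest · v1.0.0 · 2026/4/11

Initial release of Text to Image AI Free.
- Generates AI images from text prompts for video content creators.
- Requires no local setup; runs on cloud GPUs and outputs a 1080p MP4 file in 20-40 seconds.
- Supports common file formats (JPG, PNG, WEBP, MP4, and others) up to 200MB.
- Handles user authentication and session creation automatically.
- Provides actions to generate, upload, export, and check credits or status.
- Offers a streamlined workflow with clear error handling and user notifications.

● Harmless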

Install command

Official: npx clawhub@latest install text-to-image-ai-free

Mirror (faster in China): npx clawhub@latest install text-to-image-ai-free --registry https://cn.clawhub-mirror.com

Skill documentation

name: text-to-image-ai-free
version: "1.0.0"
displayName: "Text to Image AI (Free) - Generate Images from Text Prompts"
description: >
  Content creators can turn text prompts into AI-generated visuals. Supports JPG, PNG, WEBP, MP4, and other formats up to 200MB, with 1080p cloud GPU rendering and MP4 output in 20-40 seconds. Intended for image generation by video content creators.
metadata: {"openclaw": {"emoji": "🖼️", "requires": {"env": ["NEMO_TOKEN"], "configPaths": ["~/.config/nemovideo/"]}, "primaryEnv": "NEMO_TOKEN", "variant": "greeting_v2"}}

Getting Started

Send me your text prompts and I'll handle the AI image generation. Or just describe what you're after.

Try saying:

"turn a short descriptive text prompt like 'sunset over a mountain lake' into a 1080p MP4"
"generate a realistic image of a futuristic city at night with neon lights"
"generate images from text descriptions for my video content"

Quick Start Setup

This skill connects to a cloud processing backend. On first use, set up the connection automatically and let the user know ("Connecting...").

Token check: Look for NEMO_TOKEN in the environment. If found, skip to session creation. Otherwise:

1. Generate a UUID as the client identifier.
2. POST https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with an X-Client-Id header.
3. Extract data.token from the response; this is your NEMO_TOKEN (100 free credits, 7-day expiry).
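
The token check above can be sketched in Python as follows. This is a minimal illustration, not the skill's own implementation: `resolve_token` and the `post_json` transport callable are hypothetical names introduced here, and the HTTP call is injected so the logic can be exercised without touching the network.

```python
import uuid
from typing import Callable, Mapping

ANON_TOKEN_URL = "https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token"

# Hypothetical transport signature: (url, headers) -> decoded JSON response.
PostJson = Callable[[str, Mapping[str, str]], dict]


def resolve_token(env: Mapping[str, str], post_json: PostJson) -> str:
    """Return NEMO_TOKEN from the environment, or request an anonymous one."""
    token = env.get("NEMO_TOKEN")
    if token:
        return token  # a token is already configured, skip the auth call
    client_id = str(uuid.uuid4())  # fresh UUID used as the client identifier
    response = post_json(ANON_TOKEN_URL, {"X-Client-Id": client_id})
    return response["data"]["token"]  # the token lives at data.token
```

In practice `env` would be `os.environ` and `post_json` a thin wrapper around whatever HTTP client the runtime provides.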

Session: POST https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer auth and body {"task_name":"project"}. Keep the returned session_id for all operations.
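
As a rough sketch, the session request could be assembled with the standard library like this. `build_session_request` is a hypothetical helper, the `Content-Type` header is an assumption (the text only specifies Bearer auth and the body), and the request is built but not sent.

```python
import json
import urllib.request

BASE_URL = "https://mega-api-prod.nemovideo.ai"
SESSION_PATH = "/api/tasks/me/with-session/nemo_agent"


def build_session_request(token: str) -> urllib.request.Request:
    """Build (but do not send) the POST that creates a session."""
    body = json.dumps({"task_name": "project"}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + SESSION_PATH,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            # Assumption: the backend expects a JSON content type.
            "Content-Type": "application/json",
        },
    )
```

Passing the result to `urllib.request.urlopen` would perform the call; the returned session_id then needs to be kept for later operations.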

Let the user know with a brief "Ready!" when setup is complete. Don't expose tokens or raw API output.

# Text to Image AI Free — Generate Images from Text Prompts

Drop your text prompts in the chat and tell me what you need. I'll handle the AI image generation on cloud GPUs — you don't need anything installed locally.

Here's a typical use: you send a short descriptive text prompt like 'sunset over a mountain lake', or ask for a realistic image of a futuristic city at night with neon lights, and about 20-40 seconds later you've got an MP4 file ready to download. The whole thing runs at 1080p by default.

One thing worth knowing — shorter, specific prompts tend to produce more accurate and usable images.

Matching Input to Actions

User prompts referencing text to image ai free, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.

| User says... | Action | Skip SSE? |
| --- | --- | --- |
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
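
The routing table can be approximated with plain keyword matching, as in the sketch below. It covers only the keyword half of "keyword and intent classification"; the `route` function and the action labels are hypothetical, and naive substring matching can misfire on longer sentences that merely mention a keyword.

```python
from typing import Tuple

# Keyword groups taken from the table above, checked in table order.
ROUTES = [
    ("export", ("export", "导出", "download", "send me the video")),
    ("credits", ("credits", "积分", "balance", "余额")),
    ("state", ("status", "状态", "show tracks")),
    ("upload", ("upload", "上传")),
]


def route(message: str, has_file: bool = False) -> Tuple[str, bool]:
    """Return (action, skip_sse) for a user message."""
    text = message.lower()
    for action, keywords in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action, True  # direct API call, no SSE stream
    if has_file:
        return "upload", True  # the user attached a file without a keyword
    return "sse", False  # everything else goes through the SSE endpoint
```

For example, `route("add BGM")` falls through to `("sse", False)`, while `route("余额多少")` yields `("credits", True)`.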

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

Base URL: https://mega-api-prod.nemovideo.ai

| Endpoint | Method | Purpose |
| --- | --- | --- |
| `/api/tasks/me/with-session/nemo_agent` | POST | Start a new editing session. Body: `{"task_name":"project","language":""}`. Returns `session_id`. |
| `/run_sse` | POST | Send a user message. Body includes `app_name`, `session_id`, `new_message`. Stream response with `Accept: text/event-stream`. Timeout: 15 min. |
| `/api/upload-video/nemo_agent/me/` | POST | Upload a file (multipart) or URL. |
| `/api/credits/balance/simple` | GET | Check remaining credits (`available`, `frozen`, `total`). |
| `/api/state/nemo_agent/me//latest` | GET | Fetch current timeline state (`draft`, `video_infos`, `generated_media`). |
| `/api/render/proxy/lambda` | POST | Start export. Body: `{"id":"render_","sessionId":"","draft":,"output":{"format":"mp4","quality":"high"}}`. Poll status every 30s. |
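
The "poll status every 30s" step for exports might look like the loop below. It is only a sketch: the status endpoint and its response shape are not given here, so `check_status` is a hypothetical callable that returns a download URL once the render is done and `None` otherwise, and `max_attempts` is an assumed safety limit.

```python
import time
from typing import Callable, Optional


def poll_until_done(
    check_status: Callable[[], Optional[str]],
    interval_s: float = 30.0,
    max_attempts: int = 20,  # assumption: give up after about 10 minutes
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call check_status every interval_s seconds until it returns a result."""
    for attempt in range(max_attempts):
        result = check_status()
        if result is not None:
            return result
        if attempt < max_attempts - 1:
            sleep(interval_s)  # wait before the next poll
    raise TimeoutError("render did not finish within the polling window")
```

Injecting `sleep` keeps the loop testable without real delays.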

Accepted file types: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
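
A client-side pre-check against this list and the 200MB limit could be sketched as below. This is an optional convenience, not something the backend requires; `check_upload` and its return labels are hypothetical, and treating 200MB as 200 × 1024 × 1024 bytes is an assumption.

```python
import os

ACCEPTED_EXTENSIONS = frozenset({
    "mp4", "mov", "avi", "webm", "mkv",
    "jpg", "png", "gif", "webp",
    "mp3", "wav", "m4a", "aac",
})
MAX_UPLOAD_BYTES = 200 * 1024 * 1024  # assumption: "200MB" means 200 MiB


def check_upload(filename: str, size_bytes: int) -> str:
    """Return 'ok', 'unsupported', or 'too_large' for a candidate upload."""
    extension = os.path.splitext(filename)[1].lower().lstrip(".")
    if extension not in ACCEPTED_EXTENSIONS:
        return "unsupported"
    if size_bytes > MAX_UPLOAD_BYTES:
        return "too_large"
    return "ok"
```

Note that the list is matched literally, so a `.jpeg` file is reported as unsupported even though `.jpg` is accepted.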

Headers are derived from this file's YAML frontmatter. X-Skill-Source is text-to-image-ai-free, X-Skill-Version comes from the version field, and X-Skill-Platform is detected from the install path (~/.clawhub/ = clawhub, ~/.cursor/skills/ = cursor, otherwise unknown).

Include Authorization: Bearer and all attribution headers on every request — omitting them triggers a 402 on export.
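
One way to assemble these headers is sketched below. The helper names are hypothetical, the home directory is passed in explicitly so the path check is deterministic, and the sketch assumes POSIX-style paths.

```python
from pathlib import PurePosixPath
from typing import Dict


def detect_platform(install_path: str, home: str) -> str:
    """Map the skill's install path to an X-Skill-Platform value."""
    path = PurePosixPath(install_path)
    roots = (
        (PurePosixPath(home) / ".clawhub", "clawhub"),
        (PurePosixPath(home) / ".cursor" / "skills", "cursor"),
    )
    for root, platform in roots:
        try:
            path.relative_to(root)  # raises ValueError if path is not under root
        except ValueError:
            continue
        return platform
    return "unknown"


def request_headers(token: str, version: str, install_path: str, home: str) -> Dict[str, str]:
    """Build the auth and attribution headers sent with every request."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "text-to-image-ai-free",
        "X-Skill-Version": version,
        "X-Skill-Platform": detect_platform(install_path, home),
    }
```

Here `version` would come from the frontmatter `version` field ("1.0.0" in this release).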

Error Handling

| Code | Meaning | Action |
| --- | --- | --- |
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with `?bind=` (get from create-session or state response when needed). Registered: "Top up credits in your account" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
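
The table lends itself to a simple dispatch map, sketched below. The action labels and the `action_for` helper are hypothetical names chosen for illustration, and the fallback for codes not listed in the table is an assumption.

```python
# Error code -> short label for the recovery step described in the table.
ERROR_ACTIONS = {
    0: "continue",
    1001: "reauthenticate",            # bad or expired token
    1002: "create_new_session",        # session not found
    2001: "show_credit_options",       # no credits left
    4001: "show_supported_formats",    # unsupported file
    4002: "suggest_compress_or_trim",  # file too large
    400: "generate_client_id_and_retry",
    402: "suggest_register_or_upgrade",  # plan tier issue, not credits
    429: "retry_once_after_30s",
}


def action_for(code: int) -> str:
    """Look up the recovery step for a backend code."""
    # Assumption: unknown codes are surfaced to the user rather than retried.
    return ERROR_ACTIONS.get(code, "report_unknown_error")
```

Keeping 402 and 2001 as separate entries mirrors the table's point that a blocked export is a plan issue rather than a credit shortage.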

Reading the SSE Stream

Text events go straight to the user (after GUI translation). Tool calls stay internal. Heartbeats and empty data: lines mean the backend is still working — show "⏳ Still working..." every 2 minutes.

About 30% of edit operations close the stream without any text. When that happens, poll /api/state to confirm the timeline changed, then tell the user what was updated.
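
A line-level sketch of this handling is shown below. The event payload schema is not described here, so the code only separates heartbeats from non-empty `data:` lines and does not distinguish text events from tool calls; treating SSE comment lines (those starting with a colon) as heartbeats is an assumption, and both helper names are hypothetical.

```python
from typing import Iterable, List, Tuple


def classify_sse_line(line: str) -> Tuple[str, str]:
    """Return (kind, payload) where kind is 'heartbeat', 'data', or 'other'."""
    stripped = line.rstrip("\r\n")
    if stripped.startswith(":"):
        return "heartbeat", ""  # SSE comment line, often used as keep-alive
    if stripped.startswith("data:"):
        payload = stripped[len("data:"):].strip()
        if payload:
            return "data", payload
        return "heartbeat", ""  # empty data: line, backend still working
    return "other", stripped


def summarize_stream(lines: Iterable[str]) -> Tuple[List[str], bool]:
    """Collect payloads and report whether a state poll is needed."""
    payloads = [p for kind, p in map(classify_sse_line, lines) if kind == "data"]
    # If the stream closed without any payload, fall back to polling state.
    return payloads, not payloads
```

When the second return value is true, the caller would query the state endpoint to confirm what changed before replying to the user.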

Backend Response Translation

The backend assumes a GUI exists. Translate these into API actions:

| Backend says | You do |
| --- | --- |
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |

Draft JSON uses short keys: t for tracks, tt for track type (0=video, 1=audio, 7=text), sg for segments, d for duration in ms, m for metadata.

Example timeline summary:

Timeline (3 tracks): 1. Video: city timelapse (0-10s) 2. BGM: Lo-fi (0-10s, 35%) 3. Title: "Urban Dreams" (0-3s)
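
A summary in that style could be derived from the draft's short keys roughly as follows. The nesting is an assumption made for illustration (tracks under `t`, each with `tt` and a list of segments under `sg`, each segment carrying `d` in milliseconds); the real draft may differ, and metadata under `m` is ignored here.

```python
TRACK_TYPES = {0: "Video", 1: "Audio", 7: "Text"}


def summarize_draft(draft: dict) -> str:
    """Render a one-line track summary from a draft using short keys."""
    tracks = draft.get("t", [])
    parts = [f"Timeline ({len(tracks)} tracks):"]
    for index, track in enumerate(tracks, start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "Unknown")
        # Assumption: a track's length is the sum of its segment durations.
        total_ms = sum(segment.get("d", 0) for segment in track.get("sg", []))
        parts.append(f"{index}. {kind}: {total_ms / 1000:g}s")
    return " ".join(parts)
```

Names such as "city timelapse" or "Lo-fi" in the example summary would have to come from the metadata, whose layout is not specified.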

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "generate a realistic image of a futuristic city at night with neon lights" — concrete instructions get better results.

Max file size is 200MB. Stick to JPG, PNG, WEBP, MP4 for the smoothest experience.

Export as MP4 for widest compatibility when embedding images into video sequences.

Common Workflows

Quick edit: Upload → "generate a realistic image of a futuristic city at night with neon lights" → Download MP4. Takes 20-40 seconds for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.

Data source: ClawHub · Chinese localization: 龙虾技能库 (Lobster Skill Library)
