Generate cinematic video clips with Veo 3, review them in a browser preview, iterate with feedback, and assemble final A/B test videos — all with minimal token spend.
Quick Start
cd ~/.openclaw/workspace/skills/video-production

# 1. Generate all clips from storyboard
.venv/bin/python3 scripts/batch_generate.py --storyboard /path/to/storyboard.json
# 2. Open browser preview
.venv/bin/python3 scripts/generate_preview.py --storyboard /path/to/storyboard.json
# 3. (After feedback) Re-generate only revised scenes
.venv/bin/python3 scripts/apply_feedback.py --storyboard storyboard.json --feedback feedback.json
# 4. Assemble final video
.venv/bin/python3 scripts/ffmpeg_assembler.py --storyboard storyboard.json
A/B Video Architecture
Target: 15-second videos, 3 clips × 5s each
[HOOK: 5s] → [CORE: 5s] → [CTA/PAYOFF: 5s]
↑ ↑
swap for A/B swap for A/B
Economics:
- 5 Veo prompts → 4 unique A/B videos (2 hooks × 1 core × 2 CTAs)
- 7 prompts → 9 videos | 9 prompts → 16+ videos
- Transitions at 5s and 10s marks — clean for analytics
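The economics above are pure combinatorics: unique finals = hooks × cores × CTAs. A minimal sketch (the per-role splits are illustrative, not read from a real storyboard):

```python
# Unique assembled videos for a given prompt split: hooks x cores x ctas.
from itertools import product

def count_finals(hooks: int, cores: int, ctas: int) -> int:
    """Count unique final videos assembled from interchangeable slots."""
    return len(list(product(range(hooks), range(cores), range(ctas))))

# 5 prompts split 2/1/2 -> 4 finals; 7 split 3/1/3 -> 9; 9 split 4/1/4 -> 16
assert count_finals(2, 1, 2) == 4
assert count_finals(3, 1, 3) == 9
assert count_finals(4, 1, 4) == 16
```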
Pipeline Overview
storyboard.json
↓
batch_generate.py → clips/scene_01.mp4 ... scene_05.mp4
↓
generate_preview.py → preview.html (opens in browser, zero tokens)
↓
[review + paste feedback JSON to Muffin]
↓
[Muffin suggests revised prompts, updates storyboard.json]
↓
apply_feedback.py → re-generates only 'revise' scenes
↓
ffmpeg_assembler.py → final_AA.mp4, final_BA.mp4, final_AB.mp4, final_BB.mp4
Token cost: Only when writing storyboard + interpreting feedback. Preview, generation, and assembly are all zero tokens.
Storyboard Format
{
"project": "my-video",
"output_dir": "clips",
"final_output": "final.mp4",
"scenes": [
{
"id": "scene_01",
"role": "hook_a",
"label": "Hook A",
"order": 1,
"duration": 5,
"aspect_ratio": "16:9",
"prompt": "..."
}
],
"_ab_combinations": {
"video_1_AA": ["scene_01", "scene_03", "scene_04"],
"video_2_BA": ["scene_02", "scene_03", "scene_04"],
"video_3_AB": ["scene_01", "scene_03", "scene_05"],
"video_4_BB": ["scene_02", "scene_03", "scene_05"]
}
}
See scripts/storyboard_template.json for full template.
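A quick sanity check before generating saves a wasted Veo run. A sketch (the helper name is ours, not part of the scripts/ toolchain) that verifies required scene fields and that every `_ab_combinations` entry references a real scene id:

```python
# Validate a storyboard dict against the format above (illustrative helper).
REQUIRED = {"id", "role", "order", "duration", "aspect_ratio", "prompt"}

def validate_storyboard(board: dict) -> list[str]:
    """Return a list of problems; an empty list means the board looks sane."""
    errors = []
    ids = set()
    for scene in board.get("scenes", []):
        missing = REQUIRED - scene.keys()
        if missing:
            errors.append(f"{scene.get('id', '?')}: missing {sorted(missing)}")
        ids.add(scene.get("id"))
    for combo, scene_ids in board.get("_ab_combinations", {}).items():
        for sid in scene_ids:
            if sid not in ids:
                errors.append(f"{combo}: unknown scene {sid}")
    return errors
```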
Feedback Format
Paste this JSON to Muffin after reviewing preview.html:
{
"scenes": [
{ "id": "scene_01", "action": "approve", "notes": "" },
{ "id": "scene_02", "action": "revise", "notes": "slower camera, warmer light" }
]
}
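Splitting that feedback into approve/revise lists is the core of what apply_feedback.py acts on. A sketch (function name is ours):

```python
def split_feedback(feedback: dict) -> tuple[list[str], list[str]]:
    """Partition scene ids by reviewer action: (approved, revise)."""
    approved = [s["id"] for s in feedback["scenes"] if s["action"] == "approve"]
    revise = [s["id"] for s in feedback["scenes"] if s["action"] == "revise"]
    return approved, revise
```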
Veo 3 API — Current Limits (Gemini API, verified 2026-02-23)
| Parameter | Supported |
|---|---|
| aspect_ratio | ✅ |
| number_of_videos | ✅ |
| negative_prompt | ✅ |
| duration_seconds | ❌ Broken (throws 400 even with valid values) |
| fps | ❌ Vertex AI only |
| compression_quality | ❌ Vertex AI only |
| enhance_prompt | ❌ Vertex AI only |
Models: veo-3.1-generate-preview (best) | veo-3.1-fast-generate-preview | veo-3.0-generate-001
SDK: google-genai (NOT google-generativeai)
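A small guard that encodes the support table above can fail fast instead of burning a request on a 400. This helper is ours (the actual generation call goes through the google-genai SDK):

```python
# Parameters that 400 or are Vertex-only on the Gemini API, per the table above.
UNSUPPORTED = {"duration_seconds", "fps", "compression_quality", "enhance_prompt"}

def veo_config(**params) -> dict:
    """Reject config keys the Gemini API endpoint does not accept."""
    bad = UNSUPPORTED & params.keys()
    if bad:
        raise ValueError(f"Not supported on Gemini API: {sorted(bad)}")
    return params

cfg = veo_config(aspect_ratio="16:9", number_of_videos=1)
# cfg can then feed the SDK, e.g. types.GenerateVideosConfig(**cfg)
```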
Prompting Techniques
Motion in every sentence — static prompts make Veo's output stall and stutter. Every sentence should describe camera OR subject movement.
Character continuity — Veo can't maintain exact characters across clips. Describe physical details explicitly in every scene that includes the same character.
✅ "The same client character from the opening — dark jacket, professional bearing, 30s-40s"
Stitch continuity — For seamless cuts, open each prompt with the color/light state the previous clip ends on.
✅ "Warm amber light, a direct visual continuation from the post-production suite..."
Single continuous shot — Each prompt is one continuous clip. Design it as one camera move that reveals multiple elements — not a montage description.
Content policy — Environmental/prop-only scenes generate reliably. Stressed people on phones can silently return no video. Keep humans calm or describe the environment instead.
Quota Management
When you hit the daily limit (429 RESOURCE_EXHAUSTED), use the quota watcher:
# Sets a cron that retries every 30 min, texts Master when done
chmod +x scripts/quota_watcher.sh

# Add to crontab:
(crontab -l 2>/dev/null | grep -v quota_watcher; \
 echo "*/30 * * * * /path/to/quota_watcher.sh >> /tmp/quota_watcher.log 2>&1") | crontab -
See api-quota-watcher skill for the generic pattern.
Scripts
| Script | Purpose |
|---|---|
| scripts/batch_generate.py | Generate all scenes from storyboard, skip existing |
| scripts/generate_preview.py | Build preview.html with video players + feedback form |
| scripts/apply_feedback.py | Re-generate only scenes marked 'revise' |
| scripts/ffmpeg_assembler.py | Stitch approved clips → final MP4 (cut or crossfade) |
| scripts/quota_watcher.sh | Retry + notify cron for quota recovery |
| scripts/storyboard_template.json | Starting storyboard template |
Environment Setup
cd ~/.openclaw/workspace/skills/video-production
uv venv .venv
uv pip install google-genai Pillow requests

# API key must be in ~/.zshenv:
export GOOGLE_API_KEY="AIza..."
Assembling A/B Combinations
After all scenes are approved, run the assembler once per combination:
# Assemble all 4 A/B videos
for combo in AA BA AB BB; do
# Edit storyboard or pass scene list directly
.venv/bin/python3 scripts/ffmpeg_assembler.py \
--storyboard storyboard.json \
--output "final_${combo}.mp4"
done
Or hardcode in _ab_combinations in storyboard.json — assembler reads it automatically.
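One way the assembler can expand `_ab_combinations` is into ffmpeg concat lists, one per final. A sketch (the output layout and helper name are assumptions, not the ffmpeg_assembler.py internals):

```python
def concat_lists(board: dict) -> dict[str, str]:
    """Build an ffmpeg concat-demuxer file body for each A/B combination."""
    out_dir = board.get("output_dir", "clips")
    lists = {}
    for combo, scene_ids in board.get("_ab_combinations", {}).items():
        lines = [f"file '{out_dir}/{sid}.mp4'" for sid in scene_ids]
        lists[combo] = "\n".join(lines)
    return lists

# Write each value to a .txt file and run:
#   ffmpeg -f concat -safe 0 -i list.txt -c copy final_AA.mp4
```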
Format Adaptation
| Format | Notes |
|---|---|
| 16:9 (master) | Default — all scripts use this |
| 9:16 (vertical) | Change aspect_ratio to "9:16" in storyboard |
| 1:1 (square) | Change aspect_ratio to "1:1" |
Generate separate storyboards per format for best results. Don't crop 16:9 to 9:16 in post — re-generate with proper aspect.
What Veo 3 Does Well
- Atmospheric/mood shots
- Smooth camera movements (push-in, crane, tracking)
- Lighting transitions within a single clip
- Office/studio/urban environments
- Abstract beauty (nature, space, product)
What Veo 3 Struggles With
- Exact text on screen (add in post via After Effects/Resolve)
- Maintaining character consistency across clips
- Very fast montage within a single generation
- Complex multi-person scenes
- Specific prop/brand details
Character Registry & Learning System
Clean Slate Default
Every new campaign starts fresh. No inherited characters, no assumed cast, no prompt weights from previous runs. If you want continuity from a past campaign, explicitly say so:
"Use HERO_01 from the MMM campaign"
Character IDs (Bootstrap Defaults)
If no cast is defined, use these placeholders:
HERO_01 — Primary UGC creator
FRIEND_01 — Recurring side character
HAND_MODEL_01 — Hands-only product handler
First approved output becomes the canonical identity baseline for that campaign.
Character Bible (Per Campaign)
When characters are defined, maintain a character_registry.json in the project folder:
{
"HERO_01": {
"identity": {
"age_range": "28-35",
"gender": "male",
"skin_tone": "...",
"hair": "...",
"build": "..."
},
"wardrobe": {
"preferred": [],
"avoid": [],
"signature": ""
},
"camera_rules": {
"preferred_framing": "medium close-up",
"avoid": []
},
"negative_constraints": [],
"reference_frames": [],
"phrase_weights": {}
}
}
CAST Block Injection
When characters are defined, every prompt must include:
CAST:
- HERO: HERO_01 (identity locked; must match reference frames exactly)
Do not alter identity traits across frames or across future assets.
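The CAST block can be rendered mechanically from the registry so every prompt stays consistent. A sketch (function name is ours; the template string follows the example above):

```python
def cast_block(role: str, char_id: str) -> str:
    """Render the CAST block that gets prepended to every prompt."""
    return (
        "CAST:\n"
        f"- {role}: {char_id} (identity locked; must match reference frames exactly)\n"
        "Do not alter identity traits across frames or across future assets."
    )
```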
Verification Thresholds
After generation, run vision model consistency check against reference frames:
- >= 85 → auto-pass
- 75–84 → escalate to Master (Telegram), do not auto-regen
- <= 74 → auto-fail, apply stabilize patch, retry once → then escalate if still failing
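The thresholds map directly to a triage function:

```python
def triage(score: int) -> str:
    """Map a verification score to the action defined above."""
    if score >= 85:
        return "auto_pass"
    if score >= 75:
        return "escalate"
    return "auto_fail"

assert triage(88) == "auto_pass"
assert triage(80) == "escalate"
assert triage(74) == "auto_fail"
```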
Learning Loop
After every human review decision, update:
- Approved → increase weights for phrases that produced good consistency; add best frames to approved reference set
- Rejected → identify drift attributes; downweight or ban phrases causing drift; add negative constraints
- Borderline → apply stabilize patch for that engine+character+scene combo
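A minimal sketch of the weight update (the delta values are assumptions to tune per campaign, not a specified part of the system):

```python
def update_weights(weights: dict, phrases: list[str], outcome: str) -> dict:
    """Nudge phrase weights up on approval, down on rejection."""
    delta = {"approved": 0.1, "rejected": -0.2}.get(outcome, 0.0)
    for p in phrases:
        weights[p] = round(weights.get(p, 0.0) + delta, 3)
    return weights
```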
Generation Log
Append every attempt to generation_log.jsonl (never deleted):
{
"timestamp": "...",
"campaign": "...",
"scene_id": "...",
"engine": "veo-3.1-generate-preview",
"attempt": 1,
"characters": ["HERO_01"],
"prompt": "...",
"output": "clips/scene_01.mp4",
"verification_score": 88,
"drift_notes": "",
"decision": "auto_pass",
"human_outcome": "approved",
"worked_phrases": [],
"failed_phrases": []
}
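Appending in JSONL keeps the log append-only and crash-safe. A sketch (helper name is ours):

```python
import json
import time

def log_attempt(path: str, entry: dict) -> None:
    """Append one generation attempt as a single JSON line."""
    entry.setdefault("timestamp", time.strftime("%Y-%m-%dT%H:%M:%S"))
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry, ensure_ascii=False) + "\n")
```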
Escalation Policy — Ask Before Guessing
Escalate to Master via Telegram (never silently loop) when:
- Verification score is borderline (75–84)
- Character is on a new engine for the first time
- Scene type is new for that character+engine combo
- Same prompt has failed 2+ times in a row
Escalation message must include: scene ID, engine, score, drift notes, and 2–3 options.
Archive (Persists Across Campaigns)
Even though each campaign starts clean, these persist in the skill folder:
generation_log.jsonl — full audit trail
approved_references/ — canonical frames by campaign, available to load on request
campaign_phrase_weights/ — weight archives per campaign, loadable for continuity