Giggle Generation Speech
v1
Use when the user wants to generate speech, voiceover, or text-to-audio. Converts text to AI voice via the Giggle.pro TTS API. Triggers: generate speech, text-to-speech, TTS, voiceover, read this text aloud, synthesize speech.
Text-to-Audio
Synthesizes text into AI voice/voiceover via giggle.pro. Supports multiple voice tones, emotions, and speaking rates.
⚠️ Review Before Installing
Please review the following before installing. This skill will:
- Write to ~/.openclaw/skills/giggle-generation-speech/logs/: task status files used for Cron deduplication
- Register Cron (30s interval): async polling when the user initiates speech generation; removed when complete
- Forward raw stdout: script output (audio links, status) is passed to the user as-is
Requirements: python3, GIGGLE_API_KEY (system environment variable), pip package: requests
API Key: Set the system environment variable GIGGLE_API_KEY. The script will prompt if it is not configured.
No inline Python: All commands must be executed via the exec tool. Never use heredoc inline code.
No Retry on Error: If script execution encounters an error, do not retry. Report the error to the user directly and stop.
Execution Flow (Phase 1 Submit + Phase 2 Cron + Phase 3 Sync Fallback)
Speech generation typically takes 10–30 seconds. The skill uses a three-phase architecture: fast submit, Cron poll, and sync fallback.
Important: Never pass GIGGLE_API_KEY in exec's env parameter. The API key is read from the system environment variable.
Phase 0: Guide User to Select Voice and Emotion (required)
Before submitting, you must guide the user to select a voice and an emotion. Do not use defaults.
1. Run --list-voices to get the available voices: python3 scripts/text_to_audio_api.py --list-voices
2. Display the voice list to the user in a readable format (voice_id, name, style, gender, etc.) and guide them to pick one
3. Ask for the user's preferred emotion (e.g. joy, sad, neutral, angry, surprise). Use neutral if they have no preference
4. Only after the user confirms voice and emotion, proceed to the Phase 1 submit

Phase 1: Submit Task (exec completes in ~10 seconds)
First send a message to the user: "Speech generation in progress, usually takes 10–30 seconds. Results will be sent automatically."
    # Must specify user-selected voice and emotion
    python3 scripts/text_to_audio_api.py \
      --text "The weather is nice today" \
      --voice-id "Calm_Woman" \
      --emotion "joy" \
      --speed 1.2 \
      --no-wait --json

    # View available voices
    python3 scripts/text_to_audio_api.py --list-voices
Response example:
    {"status": "started", "task_id": "xxx"}
Immediately store task_id in memory (addMemory):
giggle-generation-speech task_id: xxx (submitted: YYYY-MM-DD HH:mm)
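As an illustration of the step above, the following minimal sketch turns the submit response into the memory line. It assumes the response keys are `status` and `task_id` as in the example; the helper name `memory_entry` is hypothetical and not part of the skill's script.

```python
import json
from datetime import datetime


def memory_entry(stdout: str, now: datetime) -> str:
    """Build the memory line for a freshly submitted task (hypothetical helper)."""
    response = json.loads(stdout)
    # Assumption: a successful submit reports status "started" plus a task_id.
    if response.get("status") != "started" or not response.get("task_id"):
        raise ValueError(f"unexpected submit response: {stdout!r}")
    stamp = now.strftime("%Y-%m-%d %H:%M")
    return f"giggle-generation-speech task_id: {response['task_id']} (submitted: {stamp})"


if __name__ == "__main__":
    print(memory_entry('{"status": "started", "task_id": "xxx"}', datetime.now()))
```

Raising on an unexpected response matches the no-retry rule: the error is surfaced rather than resubmitted.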
Phase 2: Register Cron (30 second interval)
Use the cron tool to register the polling job. Strictly follow the parameter format:
    {
      "action": "add",
      "job": {
        "name": "giggle-generation-speech-",
        "schedule": { "kind": "every", "everyMs": 30000 },
        "payload": {
          "kind": "systemEvent",
          "text": "Speech task poll: exec python3 scripts/text_to_audio_api.py --query --task-id , handle stdout per Cron logic. If stdout is non-JSON plain text, forward to user and remove Cron. If stdout is JSON, do not send message, keep waiting. If stdout is empty, remove Cron immediately."
        },
        "sessionTarget": "main"
      }
    }
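For illustration only, the registration payload can be assembled programmatically. This sketch assumes the task ID is substituted after `--task-id` and that a caller-supplied suffix completes the job name; the source does not say what follows the trailing hyphen, so `name_suffix` and the function name `build_cron_job` are hypothetical.

```python
import json

# Poll instruction mirroring the payload text above, with the task ID templated in.
POLL_TEXT = (
    "Speech task poll: exec python3 scripts/text_to_audio_api.py "
    "--query --task-id {task_id}, handle stdout per Cron logic. "
    "If stdout is non-JSON plain text, forward to user and remove Cron. "
    "If stdout is JSON, do not send message, keep waiting. "
    "If stdout is empty, remove Cron immediately."
)


def build_cron_job(task_id: str, name_suffix: str) -> dict:
    """Return the cron tool parameters for one polling job (hypothetical helper)."""
    return {
        "action": "add",
        "job": {
            # Assumption: the suffix is chosen by the caller, e.g. part of the task ID.
            "name": f"giggle-generation-speech-{name_suffix}",
            "schedule": {"kind": "every", "everyMs": 30000},  # 30 second interval
            "payload": {"kind": "systemEvent", "text": POLL_TEXT.format(task_id=task_id)},
            "sessionTarget": "main",
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_cron_job("xxx", "xxx"), indent=2))
```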
Cron trigger handling (based on exec stdout):
| stdout pattern | Action |
| --- | --- |
| Non-empty plain text (not starting with {) | Forward to user as-is, remove Cron |
| stdout empty | Already pushed; remove Cron immediately, do not send message |
| JSON (starts with {, has "status" field) | Do not send message, do not remove Cron, keep waiting |

Phase 3: Sync Wait (optimistic path, fallback when Cron hasn't fired)
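Both the Cron handler in Phase 2 and the sync wait in this phase branch on the shape of the script's stdout. A minimal sketch of that branching follows; the `Action` enum and `classify_stdout` are hypothetical names, and the check relies only on the "starts with {" rule stated for Cron handling rather than fully parsing the JSON.

```python
from enum import Enum


class Action(Enum):
    FORWARD_AND_REMOVE = "forward stdout to the user as-is, then remove Cron"
    REMOVE_SILENTLY = "remove Cron, send no message"
    KEEP_WAITING = "send no message, leave Cron in place"


def classify_stdout(stdout: str) -> Action:
    """Map exec stdout to the handling rule for a poll (hypothetical helper)."""
    text = stdout.strip()
    if not text:
        # Empty output: the result was already pushed by another path.
        return Action.REMOVE_SILENTLY
    if text.startswith("{"):
        # JSON status object: the task is still in progress.
        return Action.KEEP_WAITING
    # Any other non-empty text is the final ready/failed message.
    return Action.FORWARD_AND_REMOVE
```

Checking for emptiness first keeps whitespace-only output from being forwarded as a message.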
Execute this step whether or not Cron registration succeeded.
    python3 scripts/text_to_audio_api.py --query --task-id --poll --max-wait 120
Handling logic:
- Returns plain text (speech ready/failed message) → Forward to user as-is, remove Cron
- stdout empty → Cron already pushed; remove Cron, do not send message
- exec timeout → Cron continues polling

View Voice List
When the user wants to see the available voices, run:
    python3 scripts/text_to_audio_api.py --list-voices
The script calls GET /api/v1/project/preset_tones and displays voice_id, name, style, gender, age, and language to the user.
Link Return Rule
Audio links returned to the user must be full signed URLs (with Policy, Key-Pair-Id, and Signature query params).
Correct: https://assets.giggle.pro/...?Policy=...&Key-Pair-Id=...&Signature=...
Wrong: unsigned URLs with only the base path (no query params).
The script handles encoding ~ as %7E; keep it as-is when forwarding.
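The rule above can be checked mechanically before a link is forwarded. This is a minimal sketch using only the standard library; `is_signed_url` is a hypothetical helper, and it only confirms that the three query parameters are present and non-empty, not that the signature is valid.

```python
from urllib.parse import parse_qs, urlsplit

REQUIRED_PARAMS = ("Policy", "Key-Pair-Id", "Signature")


def is_signed_url(url: str) -> bool:
    """Return True if the URL carries all three signing query parameters."""
    # parse_qs drops parameters with blank values by default,
    # so "Policy=" counts as missing.
    query = parse_qs(urlsplit(url).query)
    return all(query.get(name) for name in REQUIRED_PARAMS)


if __name__ == "__main__":
    # The path and parameter values below are made up for illustration.
    print(is_signed_url("https://assets.giggle.pro/a.mp3?Policy=p&Key-Pair-Id=k&Signature=s%7Ex"))
    print(is_signed_url("https://assets.giggle.pro/a.mp3"))
```

The check reads the URL without rewriting it, so the %7E encoding is left untouched when the original string is forwarded.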
New Request vs Query Old Task
When the user initiates a new speech generation request, you must run Phase 1 to submit a new task. Do not reuse an old task_id from memory.
Only when the user explicitly asks about a previous task's progress should you query the old task_id from memory.
Parameter Reference
| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| --text | yes | - | Text to synthesize |
| --voice-id | yes | - | Voice ID; must get via --list-voices and guide user to choose |
| --emotion | yes | - | Emotion: joy, sad, neutral, a