OpenClaw Voice GPT Realtime
v0.1.4
Make real phone calls through your OpenClaw agent via OpenAI's Realtime API. ~200-300ms latency, natural voice, IVR navigation, voicemail detection.
Voice Calls (OpenAI Realtime)
Make real phone calls through your OpenClaw agent. Ask it to book a restaurant, check store hours, or schedule an appointment: it dials the number, handles the conversation, and reports back with structured results.
Uses OpenAI's Realtime API for single-model speech-to-speech with ~200-300ms response latency. No separate STT or TTS: one model does it all.
Setup
This skill requires a Twilio account and an OpenAI API key with Realtime API access.
Set your credentials in the plugin config (via OpenClaw settings or openclaw.json):
twilio.accountSid: your Twilio Account SID
twilio.authToken: your Twilio Auth Token
fromNumber: a Twilio voice-capable phone number (E.164 format, e.g. +17075551234)
openai.apiKey: your OpenAI API key
publicUrl: a public HTTPS origin that routes to the plugin's server (port 3335 by default). Must not be localhost/private/internal.
Set up a tunnel (Cloudflare Tunnel, ngrok, Tailscale Funnel, etc.) so Twilio can reach the webhook server.
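The constraints on fromNumber and publicUrl can be checked before the plugin is started. The following is a minimal sketch, not the plugin's own validation: the function names are hypothetical, the E.164 pattern is the common "plus sign and up to 15 digits" form, and DNS names are accepted without being resolved.

```python
import ipaddress
import re
from urllib.parse import urlparse

# E.164: a plus sign, a non-zero leading digit, at most 15 digits in total.
E164_PATTERN = re.compile(r"\+[1-9]\d{1,14}")


def check_from_number(number: str) -> bool:
    """Return True if the number looks like an E.164 phone number."""
    return bool(E164_PATTERN.fullmatch(number))


def check_public_url(url: str) -> bool:
    """Return True if the URL is HTTPS and not obviously local or private.

    Hypothetical helper: it only inspects the URL text and never resolves DNS,
    so a public-looking name that points at a private address still passes.
    """
    parsed = urlparse(url)
    host = parsed.hostname
    if parsed.scheme != "https" or not host:
        return False
    if host == "localhost" or host.endswith((".localhost", ".local", ".internal")):
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return True  # a DNS name rather than an IP literal
    return not (ip.is_private or ip.is_loopback or ip.is_link_local)
```

For example, check_public_url("https://voice.example.com") passes, while "https://localhost:3335" and "http://voice.example.com" are rejected.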
Verify setup:
openclaw voicecall-rt status
Usage
Just tell your agent who to call and why:
"Call Tony's Pizza at +14155551234 and reserve a table for 4 on Friday at 7pm"
"Call the barbershop at +14155559876 and book a haircut for Saturday morning"
"Call +14155550000 and ask if they have the iPhone 16 Pro in stock"
The agent writes a system prompt for the voice AI and dials the number. The voice AI then handles the conversation autonomously, including navigating phone menus (DTMF), detecting voicemail, and reporting the outcome. The plugin wraps prompts with safety guardrails and blocks deceptive identity behavior.
CLI
openclaw voicecall-rt call -n +14155551234 -t "check store hours"
openclaw voicecall-rt status
openclaw voicecall-rt active
Inbound calls
Optionally receive calls by enabling inbound.enabled and setting a policy (open or allowlist). Disabled by default.
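The inbound policy amounts to a small decision rule. The sketch below illustrates that rule only and is not the plugin's code: the function name, its parameters, and the shape of the allowlist (a set of E.164 numbers) are assumptions; only the policy names open and allowlist come from the description above.

```python
def accept_inbound(caller: str, enabled: bool, policy: str, allowlist: set[str]) -> bool:
    """Decide whether an inbound call should be answered (hypothetical helper)."""
    if not enabled:
        return False  # inbound calls are disabled by default
    if policy == "open":
        return True  # any caller is accepted
    if policy == "allowlist":
        return caller in allowlist  # only listed numbers are accepted
    raise ValueError(f"unknown inbound policy: {policy!r}")
```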
Cost
About $0.31/min total (~$0.06 OpenAI input + ~$0.24 OpenAI output + ~$0.014 Twilio). A typical 5-minute call costs ~$1.55.
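The per-minute figures can be turned into a rough estimator. This is a sketch that assumes the three rates quoted above stay constant over the call; the constant and function names are made up for illustration, and actual billing depends on OpenAI and Twilio pricing at the time of the call.

```python
# Approximate per-minute rates in USD, taken from the figures quoted above.
OPENAI_INPUT_PER_MIN = 0.06
OPENAI_OUTPUT_PER_MIN = 0.24
TWILIO_PER_MIN = 0.014


def estimate_call_cost(minutes: float) -> float:
    """Return an approximate call cost in USD, rounded to cents."""
    per_minute = OPENAI_INPUT_PER_MIN + OPENAI_OUTPUT_PER_MIN + TWILIO_PER_MIN
    return round(per_minute * minutes, 2)
```

With the unrounded rate of $0.314/min, a 5-minute call comes to about $1.57, slightly above the ~$1.55 obtained from the rounded $0.31 figure.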
Notes
The voice AI waits for the callee to speak before talking ("listen first"), so there is no awkward overlap on pickup.
Server binds to 127.0.0.1 by default. Only exposed via your tunnel.
Max 5 concurrent calls by default (configurable via calls.maxConcurrent).
Debug mode (debug: true) enables call recording, verbose logging, and latency metrics; recordings/transcripts may contain sensitive data.