Speech to Text Skill for OpenClaw
Purpose
This skill recognizes speech from voice messages sent via any messenger connected to OpenClaw, using various STT providers, including Yandex SpeechKit.
When to Activate
Use this skill when:
- The user sends a voice message via any messenger connected to OpenClaw
- You need to convert speech to text
- Audio file transcription is required
- A text version of a voice message is needed
How It Works
- Receive the audio file from OpenClaw
Example path from OpenClaw:
/home/user_folder/.openclaw/media/inbound/file_1---9a53bac2-0392-41e7-8300-1c08e8eec027.ogg
- Audio processing
- Speech recognition
- Result handling
Important: Always call the processor using the absolute path to the script. Do not use cd <skill_dir> && python3 scripts/... because cd cannot be allowlisted, so that form triggers an approval prompt on every call.
python3 /path/to/sergei-mikhailov-stt/scripts/stt_processor.py --file "/path/to/audio.ogg"
The script resolves all paths (config, .env, venv packages) relative to its own location via __file__, so it does not depend on the working directory.
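The path resolution described above can be sketched as follows. This is a minimal illustration, not the skill's actual code: the helper name `resolve_skill_paths`, the assumed layout (a `scripts/` folder one level below the skill root), and the `venv` folder name are all assumptions.

```python
from pathlib import Path


def resolve_skill_paths(script_file: str) -> dict:
    """Derive skill paths from the script's own location (hypothetical helper)."""
    script_path = Path(script_file).resolve()
    # Assumed layout: <skill_root>/scripts/<script>.py
    skill_root = script_path.parent.parent
    return {
        "root": skill_root,
        "config": skill_root / "config.json",
        "env": skill_root / ".env",
        "venv": skill_root / "venv",  # assumed virtual environment folder name
    }
```

Inside the script this would be called as `resolve_skill_paths(__file__)`, which gives the same result no matter which directory the caller was in when it launched the script.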
Quick Start
clawhub install sergei-mikhailov-stt
cd ~/.openclaw/workspace/skills/sergei-mikhailov-stt
bash setup.sh
The setup script creates a Python virtual environment, installs dependencies, and copies example configuration files. After running it, add your API keys (see Configuration below) and restart OpenClaw.
On Debian/Ubuntu, you may need to install the venv package first: sudo apt install python3-venv
To verify that everything is configured correctly, run the diagnostic script:
bash check.sh
It checks Python, FFmpeg, the virtual environment, dependencies, and API keys, and tells you exactly what to fix if something is missing.
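For readers who want to see what such a diagnostic amounts to, here is a rough Python sketch of three of the checks. It is not a port of the actual script: the function name `run_checks`, the minimum Python version, and the assumption that the keys are visible as environment variables are all hypothetical.

```python
import os
import shutil
import sys

REQUIRED_ENV_VARS = ("YANDEX_API_KEY", "YANDEX_FOLDER_ID")


def run_checks(env=None) -> list:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    env = os.environ if env is None else env
    problems = []
    if sys.version_info < (3, 8):  # assumed minimum version, not taken from the skill
        problems.append("Python 3.8 or newer is required")
    if shutil.which("ffmpeg") is None:
        problems.append("FFmpeg was not found on PATH")
    for var in REQUIRED_ENV_VARS:
        if not env.get(var):
            problems.append(f"{var} is not set")
    return problems
```

Returning a list of messages rather than stopping at the first failure lets the caller report everything that needs fixing in one pass.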
Configuration
- Set API keys (recommended: via the OpenClaw config)
Add credentials to ~/.openclaw/openclaw.json:
{
  "skills": {
    "entries": {
      "sergei-mikhailov-stt": {
        "env": {
          "YANDEX_API_KEY": "your_api_key_here",
          "YANDEX_FOLDER_ID": "your_folder_id_here"
        }
      }
    }
  }
}
- Alternative: via the .env file
Edit the .env file created by setup.sh in the skill folder:
YANDEX_API_KEY=your_api_key_here
YANDEX_FOLDER_ID=your_folder_id_here
STT_DEFAULT_PROVIDER=yandex
- Restart OpenClaw to apply changes
- Provider configuration (optional)
The config.json file (also created by setup.sh) lets you tune provider parameters:
{
  "default_provider": "yandex",
  "providers": {
    "yandex": {
      "api_key": "${YANDEX_API_KEY}",
      "folder_id": "${YANDEX_FOLDER_ID}",
      "lang": "ru-RU"
    }
  }
}
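The `${...}` placeholders in config.json imply that values are filled in from environment variables at load time. The source does not show how the skill does this, so the following is only a sketch of one plausible approach; `expand_placeholders` is a hypothetical helper.

```python
import json
import os
import re

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_placeholders(value, env=None):
    """Recursively replace ${NAME} in strings with values from env."""
    env = os.environ if env is None else env
    if isinstance(value, str):
        # Unknown variables are left untouched so a missing key is easy to spot.
        return _PLACEHOLDER.sub(lambda m: env.get(m.group(1), m.group(0)), value)
    if isinstance(value, dict):
        return {k: expand_placeholders(v, env) for k, v in value.items()}
    if isinstance(value, list):
        return [expand_placeholders(v, env) for v in value]
    return value


raw = json.loads('{"providers": {"yandex": {"api_key": "${YANDEX_API_KEY}", "lang": "ru-RU"}}}')
config = expand_placeholders(raw, env={"YANDEX_API_KEY": "secret"})
print(config["providers"]["yandex"]["api_key"])  # prints: secret
```

Leaving unresolved placeholders in place, rather than substituting an empty string, makes a missing API key visible in later validation.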
Adding a New STT Provider
- Create the provider class
class NewProvider(BaseSTTProvider):
    name = "new_provider"

    def recognize(self, audio_file_path: str, language: str = 'ru-RU') -> str:
        # Recognition implementation
        pass

    def validate_config(self, config: dict) -> bool:
        # Configuration validation
        pass

    def get_supported_formats(self) -> list:
        return ['ogg', 'wav', 'mp3']
- Register the provider
Add to scripts/stt_processor.py in the _get_provider method:
if provider_name == 'new_provider':
    return NewProvider(provider_config)
- Update configuration
Add the new provider section to config.json:
{
  "providers": {
    "new_provider": {
      "api_key": "${NEW_PROVIDER_API_KEY}",
      "model": "latest"
    }
  }
}
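To show how the three steps fit together, here is a self-contained sketch. The `BaseSTTProvider` below is a hypothetical stand-in (the skill's real base class is not shown in this document), `EchoProvider` is a toy provider that calls no STT API, and the dictionary registry is a swapped-in alternative to the if-statement registration described above.

```python
from abc import ABC, abstractmethod


class BaseSTTProvider(ABC):
    """Hypothetical stand-in for the skill's base class; the real one may differ."""

    name = "base"

    def __init__(self, config: dict):
        self.config = config

    @abstractmethod
    def recognize(self, audio_file_path: str, language: str = "ru-RU") -> str:
        ...

    def get_supported_formats(self) -> list:
        return ["ogg", "wav", "mp3"]


class EchoProvider(BaseSTTProvider):
    """Toy provider that echoes the file path instead of calling an STT API."""

    name = "echo"

    def recognize(self, audio_file_path: str, language: str = "ru-RU") -> str:
        extension = audio_file_path.rsplit(".", 1)[-1].lower()
        if extension not in self.get_supported_formats():
            raise ValueError(f"Unsupported format: {extension}")
        return f"[{language}] {audio_file_path}"


# Registry keyed by provider name; stands in for a chain of if-statements.
PROVIDERS = {cls.name: cls for cls in (EchoProvider,)}


def get_provider(provider_name: str, config: dict) -> BaseSTTProvider:
    """Instantiate a provider with its own section of the config."""
    try:
        provider_cls = PROVIDERS[provider_name]
    except KeyError:
        raise ValueError(f"Unknown provider: {provider_name}") from None
    return provider_cls(config.get("providers", {}).get(provider_name, {}))
```

With a registry like this, adding a provider means adding one class to the tuple, whereas the if-statement form needs a new branch per provider.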
Usage Examples
Basic scenario
User: [sends a voice message]
OpenClaw: Recognized text: "Hello, how are you?"
With language specified
User: Transcribe this English voice message
OpenClaw: Recognized text (en-US): "Hello, how are you today?"
With metadata
User: Analyze this voice message
OpenClaw:
Recognized text: "Meeting tomorrow at 3 PM"
Language: ru-RU
Confidence: 95%
Provider: Yandex SpeechKit
Error Handling
When the skill returns an error, explain