Telegram Voice To Voice Macos: Skill Tool

Author: Fiberian

v0.1.3

[Auto-translated] Telegram voice-to-voice for macOS Apple Silicon: transcribe inbound .ogg voice notes with yap (Speech.framework) and reply with Telegram voice notes v...

0 · 1,528 · 4 current · 4 total

by @fiberian1981 (Fiberian)·MIT-0

System Tools · Developer Tools · Automation

License

MIT-0

Last updated

2026/2/26

Security scan

VirusTotal: benign

OpenClaw: safe (high confidence)

The skill's requests and included scripts are consistent with a macOS-only Telegram voice-to-voice workflow: required binaries, file paths, and behavior match the description and there are no unexpected network endpoints or credential requests.

Assessment and recommendations

This skill appears to do exactly what it says: transcribe .ogg voice notes locally (yap) and produce Telegram voice notes via say+ffmpeg. Before installing, confirm you are on macOS Apple Silicon and that you trust the local 'yap' and 'ffmpeg' binaries you will provide. Understand the skill will read inbound .ogg files from ~/.openclaw/media/inbound and create TTS output in ~/.openclaw/workspace/voice_out and (per the SKILL.md) expects a per-user state file voice_state/telegram.json in the works...

Detailed analysis

✓ Purpose and capabilities

Name/description align with what the skill asks for: yap for Speech.framework transcription, say + ffmpeg for TTS/encoding, and defaults for reading macOS locale. The included helper scripts implement transcription and TTS and operate on the documented ~/.openclaw media/workspace paths.

ℹ Instruction scope

SKILL.md instructs the agent to read inbound .ogg files from ~/.openclaw/media/inbound and to write reply files under workspace paths; the helper scripts do the transcription and TTS but do not implement the per-user 'voice_state/telegram.json' preference logic described in SKILL.md (that state management is expected to be done by the agent). The instructions do not request secrets or contact unknown external endpoints — sending replies is delegated to the agent's message tool as expected.

✓ Install mechanism

No install spec (instruction-only plus two small shell scripts). Nothing downloads or executes remote code; risk from install-time actions is low.

✓ Credential requirements

The skill requires no credentials or sensitive environment variables. It accesses files under the user's home (~/.openclaw/*) and the macOS system locale, which are proportionate to the described functionality.

ℹ Persistence and permissions

The skill is not always-enabled and does not request elevated privileges, but it does read and write files in the user's home (~/.openclaw/workspace and voice_state paths). Autonomous invocation is allowed by default (normal for skills); combined with file I/O this is expected for the workflow but worth noting.

Security is layered; review the code before running it.

License

MIT-0

Free to use, modify, and redistribute; no attribution required.

Runtime dependencies

🖥️ OS: macOS

Versions

latest · v0.1.3 · 2026/2/9

telegram-voice-to-voice-macos v0.1.3

- Updated scripts/tts_telegram_voice.sh (details not shown).
- No user-facing changes in SKILL.md.

● Benign

Install command

Official: npx clawhub@latest install telegram-voice-to-voice-macos

Mirror (accelerated): npx clawhub@latest install telegram-voice-to-voice-macos --registry https://cn.clawhub-mirror.com

Skill documentation

This is an OpenClaw skill.

Requirements

- macOS on Apple Silicon.
- yap CLI available in PATH (Speech.framework transcription).
  - Project: https://github.com/finnvoor/yap (by finnvoor)
- ffmpeg available in PATH.

Compatibility note (important)

This skill is macOS-only (uses say + Speech.framework). The skill registry cannot enforce OS restrictions, so installing/running it on Linux/Windows will result in runtime failures.
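Because the registry cannot enforce the platform, an agent could check the host itself before invoking any helper script. The following is a minimal sketch, not part of the skill; the function name `is_supported_host` is hypothetical, and it assumes that "macOS on Apple Silicon" corresponds to a `Darwin` system reporting an `arm64` machine.

```python
import platform
from typing import Optional


def is_supported_host(system: Optional[str] = None,
                      machine: Optional[str] = None) -> bool:
    """Return True only for macOS (Darwin) reporting an arm64 machine.

    Both arguments default to the values reported by the running
    interpreter; they are parameters only so the check can be exercised
    on other platforms.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    # Note: a Python interpreter running under Rosetta reports x86_64
    # even on Apple Silicon hardware, so this check is conservative.
    return system == "Darwin" and machine == "arm64"
```

Calling `is_supported_host()` with no arguments checks the current machine; failing early with a clear message is friendlier than the runtime failure from a missing `say` binary.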

Persistent reply mode (voice vs text)

Store a small per-user preference file in the workspace:

State file: voice_state/telegram.json
Key: Telegram sender user id (string)
Values:

- "voice" (default): reply with a Telegram voice note
- "text": reply with a single text message

If the file does not exist or the sender id is missing: assume "voice".
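The helper scripts do not manage this state, so the agent has to. One possible shape for that logic is sketched below, assuming the file is a flat JSON object mapping sender id to mode. The function names `load_mode` and `save_mode` are hypothetical, and treating a malformed file the same as a missing one is an assumption rather than documented behavior.

```python
import json
from pathlib import Path

DEFAULT_MODE = "voice"
VALID_MODES = {"voice", "text"}


def _read_state(state_file: Path) -> dict:
    """Load the state file, treating a missing or malformed file as empty."""
    try:
        state = json.loads(state_file.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return {}
    return state if isinstance(state, dict) else {}


def load_mode(state_file: Path, sender_id: str) -> str:
    """Return the stored mode for a sender, defaulting to "voice"."""
    mode = _read_state(state_file).get(str(sender_id))
    if isinstance(mode, str) and mode in VALID_MODES:
        return mode
    return DEFAULT_MODE


def save_mode(state_file: Path, sender_id: str, mode: str) -> None:
    """Persist the mode for a sender, creating the directory if needed."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    state = _read_state(state_file)
    state[str(sender_id)] = mode
    state_file.parent.mkdir(parents=True, exist_ok=True)
    state_file.write_text(json.dumps(state, indent=2), encoding="utf-8")
```

Sender ids are coerced to strings because the key is documented as a string, while Telegram user ids often arrive as integers.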

Toggle commands

If an inbound text message is exactly:

/audio off → set state to "text" and confirm with a short text reply.
/audio on → set state to "voice" and confirm with a short text reply.
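The word "exactly" matters here: a message that merely contains a command should not flip the mode. A minimal sketch of that matching follows; `parse_toggle` is a hypothetical name, and the decision not to trim whitespace or fold case is one strict reading of "exactly", not something the skill specifies.

```python
from typing import Optional

# Exact inbound text -> reply mode to store for the sender.
TOGGLE_COMMANDS = {
    "/audio off": "text",
    "/audio on": "voice",
}


def parse_toggle(message_text: str) -> Optional[str]:
    """Return the new mode if the message is exactly a toggle command.

    Returns None for anything else, including commands with extra
    whitespace, different casing, or trailing words.
    """
    return TOGGLE_COMMANDS.get(message_text)
```

When `parse_toggle` returns a mode, the agent would store it for the sender and send a short text confirmation instead of running the voice pipeline.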

Getting the inbound audio (.ogg)

Telegram voice notes often show up only as a placeholder in the message text. OpenClaw saves the attachment itself to disk (typically as .ogg) under:

~/.openclaw/media/inbound/

Recommended approach:

1) If the inbound message context includes an attachment path, use it.
2) Otherwise, take the most recent *.ogg from ~/.openclaw/media/inbound/.
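The two-step lookup above can be sketched as follows. The function name `pick_inbound_ogg` is hypothetical; "most recent" is interpreted as the latest modification time, and falling back to the directory scan when the given attachment path does not exist is an added assumption.

```python
from pathlib import Path
from typing import Optional


def pick_inbound_ogg(attachment_path: Optional[str],
                     inbound_dir: Path) -> Optional[Path]:
    """Pick the voice note to transcribe.

    Prefers the attachment path from the message context; otherwise
    returns the most recently modified *.ogg in inbound_dir, or None
    if there is no candidate.
    """
    if attachment_path:
        candidate = Path(attachment_path).expanduser()
        if candidate.is_file():
            return candidate
    oggs = [p for p in inbound_dir.expanduser().glob("*.ogg") if p.is_file()]
    return max(oggs, key=lambda p: p.stat().st_mtime, default=None)
```

A typical call would pass `Path("~/.openclaw/media/inbound")` as `inbound_dir`. Note that the "most recent file" fallback can pick the wrong note when several users send audio at about the same time, which is why the explicit attachment path is preferred.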

Transcription

Default locale: macOS system locale.

Optional env:

YAP_LOCALE — override the transcription locale (e.g. it-IT, en-US).

Preferred:

yap transcribe --locale "${YAP_LOCALE:-}"

- If YAP_LOCALE is not set, the helper script will use the macOS system locale (from defaults read -g AppleLocale).
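The locale fallback described here could be reproduced on the agent side roughly as below. This is a sketch under assumptions, not the helper script's actual logic: the names `system_locale` and `resolve_locale` are hypothetical, and converting an `AppleLocale` value such as `en_US` into the hyphenated `en-US` form (and dropping any `@...` suffix) is an assumption based on the `it-IT` and `en-US` examples above.

```python
import os
import subprocess
from typing import Callable, Mapping, Optional


def system_locale() -> Optional[str]:
    """Read the macOS system locale via `defaults read -g AppleLocale`."""
    try:
        result = subprocess.run(
            ["defaults", "read", "-g", "AppleLocale"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip() or None


def resolve_locale(env: Mapping[str, str] = os.environ,
                   fallback: Callable[[], Optional[str]] = system_locale
                   ) -> Optional[str]:
    """Return YAP_LOCALE if set and non-empty, else the system locale."""
    override = env.get("YAP_LOCALE", "").strip()
    if override:
        return override
    raw = fallback()
    if not raw:
        return None
    # Assumption: normalise "en_US@currency=EUR" style values to "en-US".
    return raw.split("@", 1)[0].replace("_", "-")
```

The `env` and `fallback` parameters exist only so the resolution order can be checked without a macOS host.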

If transcription fails or is empty: ask the user to repeat or send text.

Helper script:

scripts/transcribe_telegram_ogg.sh [path.ogg]
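An agent calling the helper script still has to apply the "fails or is empty" rule itself. The wrapper below is a hedged sketch: it assumes the script writes the transcript to stdout and signals failure with a non-zero exit code, neither of which is stated in the skill documentation. The name `transcribe` and the `run` parameter are hypothetical.

```python
import subprocess
from typing import Callable, Optional


def transcribe(ogg_path: str,
               script: str = "scripts/transcribe_telegram_ogg.sh",
               run: Callable[..., subprocess.CompletedProcess] = subprocess.run
               ) -> Optional[str]:
    """Run the helper script and return the transcript, or None.

    None covers both documented failure cases: the transcription
    failed, or it produced no text. The caller should then ask the
    user to repeat the message or send text.
    """
    try:
        result = run([script, ogg_path], capture_output=True, text=True)
    except OSError:
        # Script missing or not executable.
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None
```

Collapsing every failure into `None` keeps the calling code to a single branch: transcript available, or ask the user to try again.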

Reply behavior

Mode: voice (default)

Voice default: SYSTEM (uses the current macOS system voice). You can override by passing a specific voice name to the helper script.

1) Generate the reply text.
2) Convert reply text to an OGG/Opus voice note using:

scripts/tts_telegram_voice.sh "" [SYSTEM|VoiceName]

The script prints the generated .ogg path to stdout.

3) Send the .ogg back to Telegram as a voice note (not a generic audio file):

use the message tool with asVoice: true and media:
optionally set replyTo to thread the response

Notes:

Use SYSTEM to rely on the current macOS system voice (recommended).
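The voice-mode steps can be tied together as in the sketch below. It assumes the TTS script takes the reply text as its first argument and the voice as its second, and that its stdout contains only the generated path. The dictionary returned by `build_voice_message` is a hypothetical payload shape that reuses the `asVoice`, `media`, and `replyTo` names mentioned above; the real message tool's interface may differ.

```python
import subprocess
from typing import Any, Dict, Optional


def synthesize_voice_note(reply_text: str,
                          voice: str = "SYSTEM",
                          script: str = "scripts/tts_telegram_voice.sh") -> str:
    """Run the TTS helper and return the .ogg path it prints to stdout."""
    result = subprocess.run(
        [script, reply_text, voice],
        capture_output=True, text=True, check=True,
    )
    ogg_path = result.stdout.strip()
    if not ogg_path:
        raise RuntimeError("TTS script produced no output path")
    return ogg_path


def build_voice_message(ogg_path: str,
                        reply_to: Optional[str] = None) -> Dict[str, Any]:
    """Build arguments for the message tool (hypothetical shape)."""
    message: Dict[str, Any] = {"asVoice": True, "media": ogg_path}
    if reply_to is not None:
        # Optional: thread the response under the inbound message.
        message["replyTo"] = reply_to
    return message
```

Keeping `asVoice` set to true is what distinguishes a Telegram voice note from a generic audio attachment, so it is hard-coded here rather than exposed as an option.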

Mode: text

Reply with a single text message:

Transcription: <...>
Reply: <...>

Data source: ClawHub · Chinese localization: 龙虾技能库
