gtts - Text-to-Speech

v0.1.0

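A Python library for Google Text-to-Speech (gTTS) that converts text to audio. Suitable for creating audiobooks, podcasts, or other synthesized speech. Long text is handled by chunking, and the resulting audio segments can be merged with pydub or ffmpeg.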

by @lnj22·MIT-0

Productivity tools

License

MIT-0

Last updated

2026/4/15

Security scan

VirusTotal

Harmless

OpenClaw

Safe

high confidence

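This skill is an instruction-only guide to converting text to audio with the gTTS Python library. Its requirements and instructions are consistent with that purpose, and it does not request unrelated credentials or privileged access.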

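Assessment recommendations

This is an ordinary gTTS usage tutorial. Before using it: (1) run the pip install inside a virtual environment; (2) install ffmpeg if you plan to use the pydub or ffmpeg concatenation approach; (3) note that gTTS sends text to Google's public TTS endpoints, so avoid sending sensitive or confidential text; (4) gTTS is not the same as Google Cloud Text-to-Speech (it needs no API key) and may be rate-limited or fail; (5) check the licenses and versions of the packages before installing them. The skill itself does not request keys or elevated privileges. ...

Detailed analysis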

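✓ Purpose and capabilities

The name and description match the SKILL.md content, which covers converting long text with gTTS and concatenating the resulting audio segments with pydub. The instructions do not request unrelated services or capabilities.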

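✓ Instruction scope

The runtime instructions only demonstrate installing gtts and pydub, chunking text, creating temporary files, generating MP3s, concatenating audio, and cleaning up. The steps operate only on the supplied text and temporary files, and note that network access to the Google TTS endpoints is required; they do not read unrelated system files or environment variables.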

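ℹ Install mechanism

The registry has no formal install specification (the skill is instruction-only). SKILL.md suggests 'pip install gtts pydub', with ffmpeg as an optional tool for concatenation. This is normal, but it means the user installs packages locally; ffmpeg is an external binary dependency that must be installed separately.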

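✓ Credential requirements

No environment variables, credentials, or configuration paths are requested. The only external interaction is network calls to the Google TTS service (gTTS uses public Google endpoints and needs no API key), so the skill itself does not ask for any secrets to be exposed.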

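✓ Persistence and privileges

The skill is not always-on, does not request persistent system privileges, and does not modify other skills or agent-level configuration. It is an instruction-only skill that the user or agent invokes normally.

Security is layered; review the code before running it.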

License

MIT-0

Free to use, modify, and redistribute; no attribution required.

Runtime dependencies

No special dependencies

Version

latest · v0.1.0 · 2026/4/15

Batch-published from all-task-skills-dedup

● Harmless

Install command

Official: npx clawhub@latest install pg-essay-to-audiobook-gtts

Mirror: npx clawhub@latest install pg-essay-to-audiobook-gtts --registry https://cn.longxiaskill.com (mirror available)

Skill documentation

Use the gTTS (Google Text-to-Speech) Python library to convert text to audio.

When to use this skill

Converting text to spoken audio (e.g., for audiobooks, podcasts)
Speech synthesis from text input
Handling long text by chunking and concatenating audio

Installation

pip install gtts pydub

Install ffmpeg as well if you plan to concatenate audio. pydub relies on it to decode and encode MP3, and the ffmpeg-based approach calls it directly:

# Ubuntu/Debian
sudo apt-get install ffmpeg

# macOS
brew install ffmpeg

# Windows
# Download from https://ffmpeg.org/download.html
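The concatenation steps in this document rely on an ffmpeg binary, so a quick check before generating audio can avoid a late failure. A minimal sketch, assuming ffmpeg is expected on PATH; the helper name `ffmpeg_available` is hypothetical and not part of gTTS or pydub:

```python
import shutil


def ffmpeg_available(binary="ffmpeg"):
    """Return True if the named executable can be found on PATH."""
    return shutil.which(binary) is not None


if __name__ == "__main__":
    if not ffmpeg_available():
        print("ffmpeg not found on PATH; install it before concatenating audio")
```
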

Usage

Basic usage

from gtts import gTTS

text = "Hello, this is a test."
tts = gTTS(text=text, lang='en')
tts.save("output.mp3")

Handling long text

For long text, chunk it and concatenate the audio files:

import os

from gtts import gTTS
from pydub import AudioSegment


def text_to_audio(text, lang='en', chunk_size=1000):
    """Convert long text to audio by chunking."""
    # Split text into fixed-size chunks (a chunk may end mid-word)
    chunks = [text[i:i+chunk_size] for i in range(0, len(text), chunk_size)]

    # Synthesize each chunk to its own temporary MP3
    temp_files = []
    for i, chunk in enumerate(chunks):
        tts = gTTS(text=chunk, lang=lang)
        temp_file = f"temp_chunk_{i}.mp3"
        tts.save(temp_file)
        temp_files.append(temp_file)

    # Concatenate audio files (pydub needs ffmpeg to decode and encode MP3)
    combined = AudioSegment.empty()
    for file in temp_files:
        combined += AudioSegment.from_mp3(file)

    combined.export("output.mp3", format="mp3")

    # Cleanup
    for file in temp_files:
        os.remove(file)


# Usage
long_text = "Your long text here..."
text_to_audio(long_text, lang='en')
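Fixed-width slicing can cut a word or sentence in half, which tends to produce an audible break at the chunk boundary. One possible alternative is to split on sentence ends and fall back to a hard cut only for oversized sentences. The sketch below is an illustration, not part of gTTS: `split_text` is a hypothetical helper, and it assumes sentences end in `.`, `!` or `?` followed by whitespace.

```python
import re


def split_text(text, max_chars=1000):
    """Split text into chunks of at most max_chars, preferring sentence ends."""
    # Split after ., ! or ? followed by whitespace; punctuation stays attached.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        # Hard-split any single sentence that is longer than the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate  # sentence still fits in the current chunk
        else:
            chunks.append(current)
            current = sentence  # start a new chunk with this sentence
    if current:
        chunks.append(current)
    return chunks
```

The returned list could replace the `chunks` list comprehension in `text_to_audio`, leaving the synthesis and concatenation steps unchanged.
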

Using ffmpeg for concatenation

import os
import subprocess

from gtts import gTTS


def text_to_audio_ffmpeg(text, lang='en', chunk_size=1000):
    """Convert long text to audio using ffmpeg for concatenation."""
    chunks = [text[i:i+chunk_size] for i in range(0, len(text), chunk_size)]

    # Synthesize each chunk to its own temporary MP3
    temp_files = []
    for i, chunk in enumerate(chunks):
        tts = gTTS(text=chunk, lang=lang)
        temp_file = f"temp_chunk_{i}.mp3"
        tts.save(temp_file)
        temp_files.append(temp_file)

    # Create file list for ffmpeg's concat demuxer
    with open("file_list.txt", "w") as f:
        for file in temp_files:
            f.write(f"file '{file}'\n")

    # Concatenate using ffmpeg; -y overwrites an existing output.mp3,
    # and check=True raises CalledProcessError if ffmpeg fails
    subprocess.run([
        "ffmpeg", "-y", "-f", "concat", "-safe", "0",
        "-i", "file_list.txt", "-c", "copy", "output.mp3"
    ], check=True)

    # Cleanup
    for file in temp_files:
        os.remove(file)
    os.remove("file_list.txt")


# Usage
long_text = "Your long text here..."
text_to_audio_ffmpeg(long_text, lang='en')
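The fixed `temp_chunk_{i}.mp3` names above are safe to write into the list file as they are. If chunk files instead live under arbitrary paths, a single quote in a path would break the `file '...'` line. A minimal sketch of escaping such paths under ffmpeg's quoting rules (close the quote, add an escaped quote, reopen); the helper names `concat_list_entry` and `build_concat_list` are hypothetical:

```python
def concat_list_entry(path):
    """Format one path as a `file` line for ffmpeg's concat demuxer."""
    # Inside single quotes, a literal ' is written as '\'' (close, escape, reopen).
    escaped = str(path).replace("'", "'\\''")
    return f"file '{escaped}'\n"


def build_concat_list(paths):
    """Return the full text of a concat list file for the given paths."""
    return "".join(concat_list_entry(p) for p in paths)
```

The result of `build_concat_list(temp_files)` could be written to `file_list.txt` in place of the loop that writes each line.
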

Notes

gTTS sends text to Google's public TTS endpoints
Avoid sending sensitive or confidential information
gTTS is different from Google Cloud Text-to-Speech (no API key required)
May be rate-limited or subject to service changes
For production use, consider Google Cloud Text-to-Speech API
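Because requests may be rate-limited, long jobs with many chunks can benefit from retrying a failed chunk after a pause. The following is a generic sketch of retry with exponential backoff, not a gTTS feature; `with_retries` and its parameters are hypothetical, and the delays are illustrative rather than tuned to any documented limit.

```python
import time


def with_retries(func, attempts=4, base_delay=1.0,
                 retry_on=(Exception,), sleep=time.sleep):
    """Call func(); on a listed exception, wait and retry with doubling delays."""
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))  # 1x, 2x, 4x, ... base_delay
```

A chunk could then be synthesized with something like `with_retries(lambda: gTTS(text=chunk, lang='en').save(temp_file), retry_on=(gTTSError,))`, assuming `gTTSError` is imported from `gtts` alongside `gTTS`.
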
