📦 ASR Personal Hotwords

v1.0.1

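Automatically mines high-frequency hotwords and ambiguous terms from OpenClaw conversation logs and generates a hotword list that ASR models can use to improve transcription accuracy.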

by @lovejing0306 (莫小苝)

Productivity tools

Security scan

VirusTotal: harmless

OpenClaw: safe (high confidence)

The skill's code, instructions, and resource access are consistent with its stated purpose (mining hotwords from local OpenClaw session histories) and do not request unrelated secrets or hidden endpoints.

Recommendations

This skill appears to do what it says: it reads your OpenClaw session files and LLM config, runs a local NLP+LLM pipeline, and writes hotwords into the agent workspace. Before installing, confirm: (1) you are comfortable that the skill will read ~/.OpenClaw/OpenClaw.json and session files under ~/.OpenClaw/agents, which contain conversation history; (2) you trust the configured LLM provider (base_url + API key) because the skill will send prompts to that provider; (3) you want optional recurring runs (the skill can create a cron/sub-agent task on first run with your ...

Detailed analysis

✓ Purpose and capabilities

The skill claims to extract OpenClaw conversations and build ASR hotwords. Its files and instructions read OpenClaw session JSONL files, read ~/.OpenClaw/OpenClaw.json for LLM config, perform local NLP and LLM refinement, and write results into the agent workspace. All of this is coherent with the stated purpose.

ℹ Instruction scope

SKILL.md and the code instruct the agent to read ~/.OpenClaw/OpenClaw.json and session files under ~/.OpenClaw/agents/.../sessions/*.jsonl, write outputs under the skill workspace (output/, hotwords.md), and optionally create a recurring cron/sub-agent task. These actions are expected for this feature, but they mean the skill will access historical conversation data and (with consent) schedule periodic runs. The user should be aware of that data access and the cron behavior.

✓ Installation mechanism

No remote download/install spec is embedded; installation is instruction-only (pip install -r requirements.txt). Dependencies are reasonable for NLP and LLM calls (jieba, pypinyin, openai, anthropic). Nothing in the install steps pulls arbitrary code from unknown URLs.

✓ Credential requirements

The skill obtains the LLM API key, base_url, and model from ~/.OpenClaw/OpenClaw.json (as documented) and does not request unrelated credentials or environment variables. There are no extra declared secrets; LLM calls use the configured provider, which is appropriate for LLM refinement.

ℹ Persistence and permissions

The skill creates local marker files (.installed, .cron_configured) and can (with user approval during the first run) create a recurring sub-agent/cron job to run automatically. always:false and normal autonomous invocation apply. The periodic scheduling is optional but gives the skill persistent periodic execution if enabled.

Security is layered; review the code before running it.

Runtime dependencies

No special dependencies

Install commands

Official: npx clawhub@latest install asr-hotwords

Mirror (accelerated): npx clawhub@latest install asr-hotwords --registry https://cn.longxiaskill.com (mirror sync in progress)

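Skill documentation

ASR Personal Hotword - OpenClaw Skill

Features

Automatically mines a hotword/ambiguous-term list from OpenClaw conversation logs so that ASR models can transcribe speech more accurately.

Pipeline: extract session conversations → local hotword mining (jieba word segmentation + LLM refinement) → output the ambiguous-term list → automatically export hotwords.md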

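Trigger conditions

This skill is triggered when the user says something like the following:

"挖掘对话中的热词" (mine hotwords from the conversations)
"提取对话歧义词" (extract ambiguous terms from the conversations)
"跑一下热词挖掘" (run hotword mining)
"更新热词表" (update the hotword list)

Installation procedure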

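When the user asks to install or enable this skill, follow the steps below.

Note: in all paths below, SKILL_DIR is the directory that contains this SKILL.md (that is, the skill's root directory). At execution time, use the read tool to obtain this file's path and take its parent directory.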

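Step 1: Install the skill

Install the skill into the skills directory of the current agent's workspace.

Resolving the install location:

1. Get the agent name of the current session (for example main or doctor).
2. Read ~/.OpenClaw/OpenClaw.json and find that agent's workspace setting in agents.list; if the agent has no dedicated workspace, use agents.defaults.workspace.
3. Install the skill to {workspace}/skills/asr-hotwords/

Examples:

main agent (workspace: ~/.OpenClaw/workspace) → ~/.OpenClaw/workspace/skills/asr-hotwords/
doctor agent (workspace: ~/.OpenClaw/workspace-doctor) → ~/.OpenClaw/workspace-doctor/skills/asr-hotwords/

Step 2: Install dependencies

pip3 install -r SKILL_DIR/requirements.txt --quiet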
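The workspace lookup described in Step 1 can be sketched roughly as follows. This is only an illustration: the function name `resolve_skill_dir` is hypothetical, and the assumption that each `agents.list` entry identifies its agent with an `id` key is not confirmed by the skill's documentation.

```python
import os


def resolve_skill_dir(config: dict, agent_name: str, skill: str = "asr-hotwords") -> str:
    """Return {workspace}/skills/<skill> for the given agent (hypothetical helper)."""
    agents = config.get("agents", {})
    workspace = None
    # Look for an agent-specific workspace in agents.list.
    # ASSUMPTION: each entry names its agent under an "id" key.
    for entry in agents.get("list", []):
        if entry.get("id") == agent_name and entry.get("workspace"):
            workspace = entry["workspace"]
            break
    # Fall back to agents.defaults.workspace when the agent has no workspace of its own.
    if workspace is None:
        workspace = agents.get("defaults", {}).get("workspace")
    if workspace is None:
        raise ValueError(f"no workspace configured for agent {agent_name!r}")
    return os.path.join(os.path.expanduser(workspace), "skills", skill)


# In-memory stand-in for the parsed config file.
example_config = {
    "agents": {
        "defaults": {"workspace": "/home/user/.OpenClaw/workspace"},
        "list": [
            {"id": "main"},
            {"id": "doctor", "workspace": "/home/user/.OpenClaw/workspace-doctor"},
        ],
    }
}
print(resolve_skill_dir(example_config, "main"))
print(resolve_skill_dir(example_config, "doctor"))
```

Here main has no workspace of its own and falls back to the default, while doctor resolves to its dedicated workspace, mirroring the two examples in Step 1.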

Step 3: Verify the OpenClaw LLM configuration

Check that ~/.OpenClaw/OpenClaw.json contains a valid provider configuration (API key + baseUrl).
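A minimal sketch of the Step 3 check, under stated assumptions: the source does not say where providers live inside the JSON file, so the function below takes an already-extracted mapping of provider name to settings, and the key names `apiKey` and `baseUrl` are assumed. The helper name `has_valid_provider` is hypothetical.

```python
def has_valid_provider(providers: dict) -> bool:
    """Return True if at least one provider has a non-empty API key and an HTTP(S) base URL.

    ASSUMPTION: `providers` maps provider names to dicts with "apiKey" and "baseUrl".
    """
    for settings in providers.values():
        api_key = str(settings.get("apiKey", "")).strip()
        base_url = str(settings.get("baseUrl", "")).strip()
        if api_key and base_url.startswith(("http://", "https://")):
            return True
    return False


print(has_valid_provider({"example": {"apiKey": "sk-placeholder", "baseUrl": "https://llm.example.com/v1"}}))
print(has_valid_provider({"example": {"apiKey": "", "baseUrl": "https://llm.example.com/v1"}}))
```

This only checks that the fields are present and plausibly shaped; it does not contact the provider.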

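Step 4: Test run

Run the full pipeline once on the previous day's data to confirm that the end-to-end flow works:

cd SKILL_DIR && nohup bash run.sh > run.log 2>&1 &

hotwords.md is exported automatically when the run finishes. Progress can be followed with tail -f SKILL_DIR/run.log.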

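Step 5: Record the hotword list path

Append the hotword list configuration to TOOLS.md in the current agent's workspace (the path is generated dynamically from the actual workspace):

ASR hotword (ambiguous-term) list

Path: {workspace}/skills/asr-hotwords/hotwords.md
Purpose: when transcribing speech with an ASR model, read this file automatically and inject it into the prompt
Updates: run manually, or updated automatically on a schedule

Replace {workspace} with the actual path resolved in Step 1.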

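Step 6: Set the execution-mode marker

Create the .installed marker file under SKILL_DIR:

touch SKILL_DIR/.installed

Installation is now complete; notify the user that the skill is ready.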

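First-use guidance

When the user triggers this skill for the first time (through one of the phrases listed under the trigger conditions), check whether a .cron_configured file exists under SKILL_DIR.

If .cron_configured does not exist (first use):

Before running the mining job, ask the user:

"This is the first time the hotword mining skill is being used. Would you like to set up a daily scheduled run? (Recommended: every night, automatically mine the previous day's conversations for hotwords and update hotwords.md.)"

User agrees: ask for the run time (default 02:00), create an OpenClaw cron scheduled task (sub-agent mode), then create the .cron_configured file.
User declines: create the .cron_configured file directly (this records that the question has been asked, so it is not repeated).

In either case, continue with the mining task the user requested afterwards.

# Mark as configured
touch SKILL_DIR/.cron_configured

If .cron_configured already exists:

Skip the guidance and run the task the user requested directly.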
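The marker-file logic above amounts to an "ask once" gate. A small sketch of that gate, with hypothetical helper names (`needs_cron_prompt`, `mark_cron_asked`) and a temporary directory standing in for SKILL_DIR:

```python
import tempfile
from pathlib import Path

MARKER = ".cron_configured"  # marker file name as given in the skill doc


def needs_cron_prompt(skill_dir: Path) -> bool:
    """True on first use, i.e. while the marker file is absent."""
    return not (skill_dir / MARKER).exists()


def mark_cron_asked(skill_dir: Path) -> None:
    """Record that the user has been asked, whether they agreed or declined."""
    (skill_dir / MARKER).touch()


with tempfile.TemporaryDirectory() as tmp:
    skill_dir = Path(tmp)  # stand-in for SKILL_DIR
    first = needs_cron_prompt(skill_dir)   # marker absent: ask the user
    mark_cron_asked(skill_dir)
    second = needs_cron_prompt(skill_dir)  # marker present: skip the question
    print(first, second)
```

Because the marker is written on both the "agree" and "decline" paths, the file records only that the question was asked, not which answer was given.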

Manual execution

When the user triggers this skill:

Default: mine the previous day
cd SKILL_DIR && nohup bash run.sh > run.log 2>&1 &

Specific date
cd SKILL_DIR && nohup bash run.sh --date 2026-04-26 > run.log 2>&1 &

Date range
cd SKILL_DIR && nohup bash run.sh --start 2026-04-20 --end 2026-04-26 > run.log 2>&1 &

Export the hotword list only (no re-mining)
cd SKILL_DIR && bash run.sh --export-only
cd SKILL_DIR && bash run.sh --export-only -f json -o hotwords.json
cd SKILL_DIR && bash run.sh --export-only -f csv -o hotwords.csv
cd SKILL_DIR && bash run.sh --export-only -f txt
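For an agent that assembles these invocations programmatically, the three mining modes can be sketched as below. The helper names are hypothetical, and the rules that a single date and a range are mutually exclusive and that a range needs both ends are assumptions, not documented behavior of run.sh.

```python
from datetime import date, timedelta
from typing import List, Optional


def previous_day(today: date) -> str:
    """ISO date of the day before `today` (the default mining target per the doc)."""
    return (today - timedelta(days=1)).isoformat()


def build_run_args(day: Optional[str] = None,
                   start: Optional[str] = None,
                   end: Optional[str] = None) -> List[str]:
    """Build the run.sh argument list for default, single-date, or range mode."""
    # ASSUMPTION: --date cannot be combined with --start/--end.
    if day and (start or end):
        raise ValueError("use either a single date or a start/end range")
    # ASSUMPTION: a range requires both --start and --end.
    if bool(start) != bool(end):
        raise ValueError("a range needs both start and end")
    args = ["bash", "run.sh"]
    if day:
        args += ["--date", day]
    elif start and end:
        args += ["--start", start, "--end", end]
    return args


print(build_run_args())
print(build_run_args(day="2026-04-26"))
print(build_run_args(start="2026-04-20", end="2026-04-26"))
print(previous_day(date(2026, 4, 27)))
```

The resulting list can be passed to `subprocess.Popen` with `cwd` set to SKILL_DIR; redirecting output to run.log is left to the caller.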

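After the run completes, report to the user:

How many messages were extracted
How many hotwords were mined
The top 10 hotwords

Output files

File | Description
output/vocab_{date}.json | Raw mining results (archived by date)
hotwords.md | Hotword list (exported automatically on every run, in prompt format, ready for direct use by ASR models)

Configuration

Edit config.yaml:

extract.agents: list of agents to extract from (["self"] is the current agent, ["*"] is all agents)
extract.max_content_len: maximum number of characters per message
extract.min_freq: minimum term-frequency threshold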

The LLM API key and model information are read automatically from ~/.OpenClaw/OpenClaw.json; no manual configuration is needed.
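To make the two numeric settings concrete, here is a toy sketch of how a length cap and a frequency threshold could interact. It is not the skill's implementation: whitespace splitting stands in for jieba segmentation, and treating max_content_len as truncation (rather than skipping long messages) is an assumption. The function name `count_candidates` is hypothetical.

```python
from collections import Counter
from typing import Dict, List


def count_candidates(messages: List[str], max_content_len: int, min_freq: int) -> Dict[str, int]:
    """Count terms across messages and keep those seen at least `min_freq` times."""
    counts: Counter = Counter()
    for text in messages:
        # ASSUMPTION: messages longer than max_content_len are truncated, not dropped.
        clipped = text[:max_content_len]
        # Whitespace split is a stand-in for real word segmentation (e.g. jieba).
        counts.update(clipped.split())
    return {term: n for term, n in counts.items() if n >= min_freq}


messages = ["OpenClaw ASR hotword", "ASR hotword list", "ASR"]
print(count_candidates(messages, max_content_len=100, min_freq=2))
```

Raising min_freq shrinks the candidate set that would be handed on for refinement; lowering max_content_len limits how much of each long message contributes at all.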

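Anchor vocabulary mechanism

First run: anchors is empty
Subsequent runs: all historical vocab results under the output/ directory are loaded automatically as anchors (merged and deduplicated)
The vocabulary only grows and never shrinks, accumulating over time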
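The merge-and-deduplicate step can be sketched as follows. The on-disk format of the vocab files is not documented here, so this example assumes each output/vocab_{date}.json is a plain JSON array of term strings; the function name `load_anchors` is hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import List


def load_anchors(output_dir: Path) -> List[str]:
    """Merge every vocab_*.json under output_dir into one deduplicated, ordered list.

    ASSUMPTION: each file holds a JSON array of term strings.
    """
    anchors: List[str] = []
    seen = set()
    # Sorting the paths makes the merge order deterministic (oldest date first).
    for path in sorted(output_dir.glob("vocab_*.json")):
        for term in json.loads(path.read_text(encoding="utf-8")):
            if term not in seen:
                seen.add(term)
                anchors.append(term)
    return anchors


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    empty_run = load_anchors(out)  # first run: no history, so no anchors
    (out / "vocab_2026-04-25.json").write_text(json.dumps(["热词", "ASR"]), encoding="utf-8")
    (out / "vocab_2026-04-26.json").write_text(json.dumps(["ASR", "OpenClaw"]), encoding="utf-8")
    merged = load_anchors(out)
    print(empty_run, merged)
```

Since terms are only ever appended, repeated runs over a growing output/ directory reproduce the "only grows, never shrinks" behavior described above.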

Data source: ClawHub · Chinese localization: 龙虾技能库