Distil Open Claw Pii (Local PII Redaction)

Name: Distil Open Claw Pii (Local PII Redaction)
Author: Jacek Golebiowski

v1.1.1

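Redacts PII (personally identifiable information) from text locally using a fine-tuned 1B SLM. Text never leaves your machine. Supports detection and redaction of many sensitive data types, including names, emails, phone numbers, addresses, SSNs, credit cards, and IBANs.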

by @jgolebiowski (Jacek Golebiowski)·MIT-0

Security · Data & APIs · Developer tools · Local processing · Privacy protection

License

MIT-0

Last updated

2026/4/8

Security scan

VirusTotal

Harmless

OpenClaw

Suspicious

medium confidence

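The skill largely does what it claims (local PII redaction using a local model), but there are some inconsistencies and one privacy-risk detail you should understand before installing.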

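Assessment recommendations

Key things to consider before installing:
- The metadata claims no required binaries, but the scripts actually need llama-server (llama.cpp), curl, and Python. Confirm that you have (or are willing to install) these dependencies.
- The model is downloaded from Hugging Face (official domain) into $HOME/.distil-pii; expect about 5 GB of disk usage and a network download.
- Important privacy detail: the local model is instructed to include the original PII values in the 'entities' array. By default the script prints only redacted_text, but the original values are present in the model response and will be printed if you use --show-entities (or if a bug or logging step captures the full response). If you need a stronger guarantee that original values are never returned, modify the system prompt/code so the model never includes them (for example, store hashed/masked values or omit the 'value' field entirely).
- Confirm that llama-server actually binds to localhost (not 0.0.0.0), and make sure your firewall blocks external access to port 8712 to avoid exposure on the local network.
- If you handle high-risk PII, run the setup in an isolated environment (VM/container) until you have verified its behavior; check the server logs and confirm there are no unexpected outbound connections.
- If you plan to ...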

Detailed analysis

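⚠ Purpose and capability

The skill claims local redaction with no required binaries, but the provided scripts need a local 'llama-server' (llama.cpp) binary, curl, and Python. The registry metadata lists no required binaries even though setup.sh explicitly checks for llama-server and downloads the model. This inconsistency is confusing and should be corrected or confirmed.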
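Because the registry metadata does not declare these binaries, a preflight check before running setup.sh can make the gap visible. The following is a minimal sketch, not part of the skill: the function name `missing_binaries` and the tuple `REQUIRED_BINARIES` are hypothetical, and the list of binaries is taken from the review above rather than from the scripts themselves.

```python
import shutil

# Binaries the review above says the scripts rely on (an assumption taken
# from the review text, not read from setup.sh itself).
REQUIRED_BINARIES = ("llama-server", "curl")


def missing_binaries(names=REQUIRED_BINARIES, which=shutil.which):
    """Return the names that cannot be located on PATH."""
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    missing = missing_binaries()
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("All required binaries were found on PATH.")
```

The `which` parameter exists only so the lookup can be swapped out in tests; in normal use the default `shutil.which` is sufficient.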

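⚠ Instruction scope

SKILL.md instructs the agent to 'NEVER include the user's raw input' and to return only redacted text by default. However, the system prompt and output schema in scripts/redact.py explicitly require the entities array to include the original value field (the raw PII). The script prints only the redacted text by default, but the model response will contain the original values (and --show-entities prints them). This contradiction raises the risk of accidental exposure (logging, debugging, or misuse of --show-entities). The script communicates only with localhost, not with external endpoints.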
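One way to narrow this exposure is to strip raw values from the parsed model response before anything logs or prints it. The sketch below assumes the response is a dict with a 'redacted_text' key and an 'entities' list whose items carry a 'value' field, as the review describes. The function name `scrub_entities`, the replacement field `value_hmac`, and the `type` field in the usage example are hypothetical and not part of scripts/redact.py.

```python
import hashlib
import hmac


def scrub_entities(response, key):
    """Return a copy of the response with each raw entity value replaced
    by a truncated keyed digest. The input dict is left unmodified."""
    scrubbed = dict(response)
    cleaned = []
    for entity in response.get("entities", []):
        entity = dict(entity)            # shallow copy so the caller's data is untouched
        raw = entity.pop("value", None)  # drop the raw PII value
        if raw is not None:
            digest = hmac.new(key, str(raw).encode("utf-8"), hashlib.sha256)
            entity["value_hmac"] = digest.hexdigest()[:16]
        cleaned.append(entity)
    scrubbed["entities"] = cleaned
    return scrubbed
```

For example, `scrub_entities({"redacted_text": "Hi [PERSON]", "entities": [{"type": "PERSON", "value": "Alice"}]}, key=b"local-secret")` keeps the redacted text and drops the name. A keyed digest is used instead of a plain hash because values such as phone numbers have little entropy and can be recovered from an unkeyed hash by brute force; omitting the field altogether is the stricter option.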

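✓ Install mechanism

There is no package install spec; setup.sh downloads a GGUF model of about 5 GB from huggingface.co (a known host) and starts a local llama-server. Downloading from Hugging Face is expected behavior for a local model; the install does not use suspicious URLs or extract untrusted archives. It does start a background server and write to $HOME/.distil-pii.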

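✓ Credential requirements

The skill requests no environment variables or external credentials, which is appropriate for a local redactor. It does create files under $HOME/.distil-pii (the model and a PID file), which is proportionate to its purpose.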

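ℹ Persistence and privileges

The skill runs a persistent local server (llama-server) and stores the model and PID file under $HOME/.distil-pii. always:false (good). Running a local HTTP server on port 8712 is expected, but you should confirm that the server binds only to localhost and verify that the process is trusted.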
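A quick way to test the binding is to try connecting to port 8712 both on the loopback address and on the machine's own LAN address. The sketch below is illustrative only: the function names are hypothetical, and the LAN address must be supplied by the reader. Traffic from a machine to its own LAN address often bypasses firewall rules, so this probes the bind address, not the firewall.

```python
import socket


def port_accepts_connections(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def exposure_report(lan_address, port=8712):
    """Compare reachability on loopback and on a non-loopback address."""
    return {
        "loopback": port_accepts_connections("127.0.0.1", port),
        "lan": port_accepts_connections(lan_address, port),
    }
```

With the server running, calling `exposure_report` with your LAN address should show the loopback entry as reachable. If the `lan` entry is also reachable, the server is probably bound to a wildcard or external address and deserves a closer look.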

Security is layered; review the code before running it.

License

MIT-0

Free to use, modify, and redistribute; no attribution required.

Runtime dependencies

No special dependencies

Version

latest · v1.1.1 · 2026/4/8

Updated the documentation to discourage use of --show-entities in normal workflows

● Harmless

Install command

Official: npx clawhub@latest install distil-open-claw-pii

Mirror (accelerated): npx clawhub@latest install distil-open-claw-pii --registry https://cn.clawhub-mirror.com

Skill documentation

When to use

Use this skill when the user asks to redact, anonymize, sanitize, or remove PII / personal data from text.

Privacy guarantee

CRITICAL: NEVER include the user's raw input text in your own responses, context, or reasoning. The entire point of this skill is that the frontier LLM (you) never sees the PII. You pass the text directly to the redaction script and only return the redacted output.

Prerequisites

Python 3
curl (for model download)

The setup script handles everything else (model download + server startup).

First-time setup

If the model server is not running yet, run:

bash scripts/setup.sh

This downloads the GGUF model (~5 GB) and starts the local inference server on port 8712.
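To decide whether setup is still needed, or to wait until the server is up after running it, one option is to poll the port. This is a minimal sketch with a hypothetical helper name, `wait_for_port`. An open port only shows that a process is listening on 8712, not that the model has finished loading.

```python
import socket
import time


def wait_for_port(host="127.0.0.1", port=8712, timeout=60.0, interval=0.5):
    """Poll until host:port accepts a TCP connection or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    if wait_for_port(timeout=1.0):
        print("Server is listening on port 8712.")
    else:
        print("Nothing is listening on port 8712; run: bash scripts/setup.sh")
```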

How to redact

Pass the user's text directly to the redaction script. Do not echo or repeat the raw text yourself.

python scripts/redact.py "text to redact"

For longer text, pipe it via stdin:

echo "text to redact" | python scripts/redact.py
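When the script is called from Python instead of a shell, the same stdin approach can be sketched as below. The wrapper name `redact_via_stdin` is hypothetical; the only documented behavior it relies on is that scripts/redact.py reads text from stdin and prints the redacted text.

```python
import subprocess
import sys


def redact_via_stdin(text, script="scripts/redact.py", python=sys.executable):
    """Send text to the redaction script on stdin and return what it prints."""
    result = subprocess.run(
        [python, script],
        input=text,           # delivered on stdin, not as a command-line argument
        capture_output=True,
        text=True,
        check=True,           # raise CalledProcessError if the script fails
    )
    return result.stdout.rstrip("\n")
```

Passing the text on stdin also keeps it out of the process argument list, where other local users could otherwise see it through tools such as `ps`. Avoid logging the `text` argument or the exception details on failure, since either may contain the raw input.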

Return the output to the user as-is.

`--show-entities` flag (use sparingly)

Adding --show-entities outputs the full JSON including the original PII values. Only use this when the user explicitly asks to see which entities were detected or needs the mapping for a downstream task. In normal redaction workflows, omit this flag -- displaying the raw entity values defeats the purpose of PII redaction.

python scripts/redact.py --show-entities "text to redact"

How to stop the server

bash scripts/stop.sh

Output format

By default the script prints only the redacted text -- PII tokens replace the sensitive data and the original values are never shown:

Hi, my name is [PERSON] and I need help with my recent order #ORD-29481. You can reach me at [EMAIL] or call me at [PHONE]. I'm a [AGE_YEARS:34]-year-old [MARITAL_STATUS] woman living at [ADDRESS]...
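For downstream checks, such as confirming that a document had anything redacted at all, the placeholder tokens can be counted without touching the original values. The pattern below is inferred only from the sample line above (uppercase labels in square brackets, optionally followed by a colon and a detail such as `34`). It is an assumption about the token format, not a documented grammar, and the function names are hypothetical.

```python
import re
from collections import Counter

# Matches tokens such as [PERSON] or [AGE_YEARS:34]; the optional part after
# the colon is captured separately.
PLACEHOLDER = re.compile(r"\[([A-Z][A-Z_]*)(?::([^\[\]]+))?\]")


def find_placeholders(text):
    """Return (label, detail) pairs; detail is None when there is no colon part."""
    return [(m.group(1), m.group(2)) for m in PLACEHOLDER.finditer(text)]


def count_placeholders(text):
    """Count how many times each placeholder label appears."""
    return Counter(label for label, _ in find_placeholders(text))
```

Applied to the sample line, this finds six labels once each (PERSON, EMAIL, PHONE, AGE_YEARS, MARITAL_STATUS, ADDRESS) and leaves the order number `#ORD-29481` alone.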

With --show-entities, the script returns full JSON including original PII values (see flag note above for when this is appropriate).

See examples/ for full input/output samples.

Data source: ClawHub · Chinese localization: 龙虾技能库
