Redact — Skill Tool

Name: Redact — Skill Tool
Author: noah

v0.1.1

Privacy redaction toolkit for images, PDFs, Word documents, and PowerPoint presentations. Use when the user needs to redact, mask, or replace sensitive/priva...

by @darknoah (noah)·MIT-0

File processing · Productivity tools · Image processing · Document tools

License

MIT-0

Last updated

2026/3/24

Security scan

VirusTotal

Suspicious

OpenClaw

Safe

high confidence

The skill's code, dependencies, and runtime instructions are coherent with a local redaction/OCR toolkit. Nothing in the bundle appears to require unrelated credentials or network endpoints, though it relies on heavy OCR libraries and writes temporary files to disk.

Assessment recommendations

This package appears to be a legitimate local redaction/OCR toolkit. Before installing, note: 1) It requires heavy ML packages (paddlepaddle/paddleocr/paddlex) that may download large binaries or model files during install or first run and may require a lot of disk space and time. 2) The scripts create a cache directory (~/.cache/redact_temp) and temporary result directories; inspect and clean these if you need to avoid leaving extracted content on disk. 3) The code sets DISABLE_MODEL_SOURCE_CHE...

Detailed analysis

✓ Purpose and capability

Name and description claim OCR-based redaction for images/PDF/docx/pptx; the repository includes scripts for reading and redacting each format and lists expected Python OCR and document libraries (PaddleOCR/PPStructureV3, PyMuPDF, python-docx, python-pptx, Pillow). These dependencies and scripts are expected for the stated purpose.

ℹ Instruction scope

SKILL.md instructs running the included scripts with a rules CSV and to use 'uv sync' to install dependencies. The scripts operate on local files and perform OCR and in-place replacements/masking. They create temporary directories and a persistent cache directory (~/.cache/redact_temp) for intermediate outputs; some temp directories are cleaned up but others may remain depending on code paths. The runtime instructions do not ask for unrelated files, credentials, or external endpoints, but they do not explicitly warn about model weight downloads (see install_mechanism note).
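
Since some intermediate output may remain in the cache directory, it can be cleared by hand after a run. The following is a minimal sketch, not part of the skill: `clear_cache` and `DEFAULT_CACHE` are hypothetical names, and the only detail taken from the analysis above is the path ~/.cache/redact_temp.

```python
import shutil
from pathlib import Path

# Cache location named in the analysis above.
DEFAULT_CACHE = Path.home() / ".cache" / "redact_temp"


def clear_cache(cache_dir: Path = DEFAULT_CACHE) -> int:
    """Delete the cache directory and return how many files it held.

    Hypothetical helper: the skill itself does not ship this function.
    """
    if not cache_dir.is_dir():
        return 0  # nothing to clean
    file_count = sum(1 for p in cache_dir.rglob("*") if p.is_file())
    shutil.rmtree(cache_dir)
    return file_count
```

Calling `clear_cache()` with no argument removes the default cache directory. Pass another path to clean one of the temporary result directories instead.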

ℹ Install mechanism

There is no platform install spec in the registry; SKILL.md recommends using 'uv sync' to create a venv and install dependencies from pyproject. Dependencies include paddlepaddle/paddleocr/paddlex which are large and may pull model binaries or wheels from package/model hosting during installation or first use. No arbitrary URL downloads or obscure installers are present in the bundle itself.

✓ Credential requirements

The skill declares no required environment variables, credentials, or config paths. The code sets a few environment variables locally (e.g., DISABLE_MODEL_SOURCE_CHECK, FLAGS_use_mkldnn) which affect runtime behavior but are internal to the scripts. No secret-exposing env vars are requested.

✓ Persistence and privileges

always is false and the skill does not request elevated or persistent platform privileges. It writes temporary data to disk (creates ~/.cache/redact_temp and various temp dirs) but does not modify other skills or system-wide agent settings.

Security is layered; review the code before running it.

License

MIT-0

Free to use, modify, and redistribute; no attribution required.

Runtime dependencies

No special dependencies

Versions

latest · v0.1.1 · 2026/3/22

- Added read.py script to support extracting text (via OCR) from images, PDFs, Word, and PowerPoint files
- Updated SKILL.md with usage and output formats for read.py, including structured and JSON output
- Switched to uv for environment setup and dependency management instructions
- Removed the obsolete scripts/init-runtime.sh file

Install command

Official: npx clawhub@latest install redact

Mirror (faster): npx clawhub@latest install redact --registry https://cn.clawhub-mirror.com

Skill documentation

Privacy redaction toolkit using PPStructureV3 OCR for text detection and replacement.

Scripts

Script	Format	Command
`read.py`	Images / PDF / Word / PowerPoint	`read.py [--info] [--mode json]`
`redact-image.py`	Images (png, jpg, etc.)	`redact-image.py`
`redact-pdf.py`	PDF	`redact-pdf.py`
`redact-document.py`	Word (docx, doc)	`redact-document.py`
`redact-presentation.py`	PowerPoint (pptx, ppt)	`redact-presentation.py`
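
The table above can be read as a lookup from file extension to script. The sketch below shows that lookup. `pick_script` and `SCRIPT_BY_EXTENSION` are hypothetical names, and the image entries are an assumption, because the table lists only png and jpg followed by "etc."

```python
from pathlib import Path

# Extension-to-script mapping taken from the table above. The image list
# is an assumption: the table only names png and jpg, followed by "etc."
SCRIPT_BY_EXTENSION = {
    ".png": "redact-image.py",
    ".jpg": "redact-image.py",
    ".pdf": "redact-pdf.py",
    ".docx": "redact-document.py",
    ".doc": "redact-document.py",
    ".pptx": "redact-presentation.py",
    ".ppt": "redact-presentation.py",
}


def pick_script(path: str) -> str:
    """Return the redaction script name for a file (hypothetical helper)."""
    suffix = Path(path).suffix.lower()  # match case-insensitively
    try:
        return SCRIPT_BY_EXTENSION[suffix]
    except KeyError:
        raise ValueError(f"no redaction script listed for {suffix!r}") from None
```

For example, `pick_script("report.PDF")` returns `"redact-pdf.py"`, and an unlisted extension raises `ValueError` rather than guessing.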

CSV Rules Format

target_text,replacement_text
张三,李四
手机号,
身份证号,

Rule	Effect
`original_text,new_text`	Replace with new text
`original_text,`	Empty = mask with █ (documents) or solid color block (images/PDF)
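
To illustrate how a rules file in this format might be read, here is a minimal sketch using the standard `csv` module. `load_rules` is a hypothetical helper, not the skill's own parser. It assumes the header row shown above and represents an empty replacement as `None`, meaning "mask".

```python
import csv
import io
from typing import Optional


def load_rules(csv_text: str) -> list[tuple[str, Optional[str]]]:
    """Parse rules into (target, replacement) pairs; None means "mask"."""
    rules = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        target = (row.get("target_text") or "").strip()
        if not target:
            continue  # skip rows without a target
        replacement = (row.get("replacement_text") or "").strip()
        rules.append((target, replacement or None))
    return rules


sample = "target_text,replacement_text\n张三,李四\n手机号,\n身份证号,\n"
print(load_rules(sample))
```

With the sample rules, the first pair carries a replacement and the other two carry `None`, so they would be masked rather than replaced.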

Masking Behavior

Format	Empty Replacement
Images, PDF	Solid color block overlay
Word, PowerPoint	`█` characters (same length as target)
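
The document-side behavior can be sketched on a plain string. This is an illustration under assumptions, not the skill's implementation: `apply_rules` is a hypothetical helper, rules are applied in order, and "same length" is taken to mean the number of characters in the target.

```python
from typing import Optional

MASK_CHAR = "█"


def apply_rules(text: str, rules: list[tuple[str, Optional[str]]]) -> str:
    """Replace each target, or mask it with █ when the replacement is None."""
    for target, replacement in rules:
        if replacement is None:
            # Mask with one █ per character of the target.
            replacement = MASK_CHAR * len(target)
        text = text.replace(target, replacement)
    return text


rules = [("张三", "李四"), ("13800000000", None)]
print(apply_rules("Contact: 张三, phone 13800000000", rules))
```

Because the rules run one after another, a later rule can match text produced by an earlier replacement. Ordering the rules from most to least specific avoids surprises.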

Read Features

read.py supports:

Reading text from images, PDF, Word, and PowerPoint files
OCR for image files and embedded images
Page-aware output for PDF / Word / PowerPoint
--info structured output:

- - ... for OCR text extracted from images

JSON Output

Document-like files (pdf, docx, doc, pptx) output:

{
  "type": "pptx",
  "pages": [
    {
      "page_index": 1,
      "content": [
        { "type": "text", "text": "..." },
        { "type": "image", "text": "ocr text..." }
      ]
    }
  ]
}

Image files output:

{
  "type": "image",
  "content": "..."
}
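
A consumer of this output has to handle both shapes. The sketch below flattens either one into plain text. `extract_text` is a hypothetical helper, and it assumes only the fields shown in the two samples above.

```python
import json


def extract_text(payload: str) -> str:
    """Flatten read.py-style JSON into plain text, one entry per line."""
    data = json.loads(payload)
    if data.get("type") == "image":
        # Image files carry their OCR text directly in "content".
        return data.get("content", "")
    lines = []
    for page in data.get("pages", []):
        for item in page.get("content", []):
            # Both "text" and "image" entries expose a "text" field.
            lines.append(item.get("text", ""))
    return "\n".join(lines)
```

Filtering on `item["type"]` inside the inner loop would separate native text from OCR text extracted from embedded images.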

Features

Feature	Image	PDF	Document	Presentation
Read text	✅	✅	✅	✅
JSON output	✅	✅	✅	✅
Text replacement	✅	✅	✅	✅
Solid color mask	✅	✅	-	-
█ character mask	-	-	✅	✅
OCR detection	✅	✅	✅ (images)	✅ (images)
Tables	-	✅	✅	✅
Headers/Footers	-	✅	✅	-
Embedded images	-	✅	✅	✅

Environment Setup

Install dependencies with uv:

# Enter the skill directory
cd skills/redact

# Sync dependencies (creates the virtual environment and installs them automatically)
uv sync

Dependencies

Python 3.10+
PaddleOCR / PPStructureV3
python-docx, python-pptx, PyMuPDF, Pillow

Data source: ClawHub · Chinese localization: 龙虾技能库
