Security Scan
OpenClaw
Safe
High confidence: The skill's code, dependencies, and runtime instructions are coherent with a local redaction/OCR toolkit. Nothing in the bundle appears to require unrelated credentials or network endpoints, though it relies on heavy OCR libraries and writes temporary files to disk.
Assessment and Recommendations
This package appears to be a legitimate local redaction/OCR toolkit. Before installing, note: 1) It requires heavy ML packages (paddlepaddle/paddleocr/paddlex) that may download large binaries or model files during install or first run, and may need considerable disk space and time. 2) The scripts create a cache directory (~/.cache/redact_temp) and temporary result directories; inspect and clean these if you need to avoid leaving extracted content on disk. 3) The code sets DISABLE_MODEL_SOURCE_CHE...
✓ Purpose and Capability
Name and description claim OCR-based redaction for images/PDF/docx/pptx; the repository includes scripts for reading and redacting each format and lists expected Python OCR and document libraries (PaddleOCR/PPStructureV3, PyMuPDF, python-docx, python-pptx, Pillow). These dependencies and scripts are expected for the stated purpose.
ℹ Instruction Scope
SKILL.md instructs the user to run the included scripts with a rules CSV and to use 'uv sync' to install dependencies. The scripts operate on local files and perform OCR and in-place replacement or masking. They create temporary directories and a persistent cache directory (~/.cache/redact_temp) for intermediate outputs; some temp directories are cleaned up, but others may remain depending on the code path. The runtime instructions do not ask for unrelated files, credentials, or external endpoints, but they do not explicitly warn about model weight downloads (see install_mechanism note).
ℹ Install Mechanism
There is no platform install spec in the registry; SKILL.md recommends using 'uv sync' to create a venv and install dependencies from pyproject. Dependencies include paddlepaddle/paddleocr/paddlex which are large and may pull model binaries or wheels from package/model hosting during installation or first use. No arbitrary URL downloads or obscure installers are present in the bundle itself.
✓ Credential Requirements
The skill declares no required environment variables, credentials, or config paths. The code sets a few environment variables locally (e.g., DISABLE_MODEL_SOURCE_CHECK, FLAGS_use_mkldnn) which affect runtime behavior but are internal to the scripts. No secret-exposing env vars are requested.
✓ Persistence and Privileges
always is false and the skill does not request elevated or persistent platform privileges. It writes temporary data to disk (creates ~/.cache/redact_temp and various temp dirs) but does not modify other skills or system-wide agent settings.
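The report recommends inspecting and cleaning the cache directory. A minimal sketch of how that could be done from Python follows. Only the path ~/.cache/redact_temp comes from the report; the helper names `cache_usage` and `clear_cache` are hypothetical, and the other temporary directories the scripts may create are not covered because their locations are not stated.

```python
import shutil
from pathlib import Path

# Cache location named in the scan report. Other temp dirs are not handled here.
CACHE_DIR = Path.home() / ".cache" / "redact_temp"


def cache_usage(cache_dir: Path = CACHE_DIR) -> tuple[int, int]:
    """Return (file_count, total_bytes) for everything under cache_dir."""
    if not cache_dir.exists():
        return 0, 0
    files = [p for p in cache_dir.rglob("*") if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


def clear_cache(cache_dir: Path = CACHE_DIR) -> None:
    """Delete cache_dir and its contents; a missing directory is not an error."""
    shutil.rmtree(cache_dir, ignore_errors=True)
```

Calling `cache_usage()` before `clear_cache()` shows how much extracted content would be removed.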
Security is layered; review the code before running it.
Runtime Dependencies
No special dependencies
Version
latest · v0.1.1 · 2026/3/22
- Added read.py script to support extracting text (via OCR) from images, PDFs, Word, and PowerPoint files
- Updated SKILL.md with usage and output formats for read.py, including structured and JSON output
- Switched to uv for environment setup and dependency management instructions
- Removed the obsolete scripts/init-runtime.sh file
● Suspicious
Install Command
Official: npx clawhub@latest install redact
Mirror: npx clawhub@latest install redact --registry https://cn.clawhub-mirror.com
Skill Documentation
Privacy redaction toolkit using PPStructureV3 OCR for text detection and replacement.
Scripts
| Script | Format | Command |
|---|---|---|
| read.py | Images / PDF / Word / PowerPoint | read.py [--info] [--mode json] |
| redact-image.py | Images (png, jpg, etc.) | redact-image.py |
| redact-pdf.py | PDF | redact-pdf.py |
| redact-document.py | Word (docx, doc) | redact-document.py |
| redact-presentation.py | PowerPoint (pptx, ppt) | redact-presentation.py |
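Because each format has its own script, a caller has to choose one from the input file's extension. The following is a sketch of such a dispatcher, not part of the skill itself: `pick_redact_script` and `SCRIPT_BY_EXTENSION` are hypothetical names, and the image extensions are an assumption, since the table only says "png, jpg, etc.".

```python
from pathlib import Path

# Extension-to-script mapping taken from the Scripts table.
# Assumption: the image entries are a guess at what "png, jpg, etc." covers.
SCRIPT_BY_EXTENSION = {
    ".png": "redact-image.py",
    ".jpg": "redact-image.py",
    ".jpeg": "redact-image.py",
    ".pdf": "redact-pdf.py",
    ".docx": "redact-document.py",
    ".doc": "redact-document.py",
    ".pptx": "redact-presentation.py",
    ".ppt": "redact-presentation.py",
}


def pick_redact_script(path: str) -> str:
    """Return the script name for a file, based on its lowercased extension."""
    suffix = Path(path).suffix.lower()
    try:
        return SCRIPT_BY_EXTENSION[suffix]
    except KeyError:
        raise ValueError(f"no redaction script for {suffix!r}") from None
```

The function only returns a script name; the arguments each script expects are not shown in the table and are left out here.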
CSV Rules Format
target_text,replacement_text
张三,李四
手机号,
身份证号,
| Rule | Effect |
|---|---|
| original_text,new_text | Replace with new text |
| original_text, | Empty replacement: mask with █ (documents) or a solid color block (images/PDF) |
Masking Behavior
| Format | Empty Replacement |
|---|---|
| Images, PDF | Solid color block overlay |
| Word, PowerPoint | █ characters (same length as target) |
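The rule semantics for text-based formats can be sketched on a plain string. This is an illustration of the documented behavior, not the skill's own code: `load_rules` and `apply_rules` are hypothetical names, and treating `target_text,replacement_text` as a header row to skip is an assumption.

```python
import csv
import io


def load_rules(csv_text: str) -> list[tuple[str, str]]:
    """Parse target,replacement rows; skip blank rows and the assumed header."""
    rules = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row or not row[0] or row[0] == "target_text":
            continue
        replacement = row[1] if len(row) > 1 else ""
        rules.append((row[0], replacement))
    return rules


def apply_rules(text: str, rules: list[tuple[str, str]]) -> str:
    """Apply each rule in order; an empty replacement masks with █."""
    for target, replacement in rules:
        # One █ per character of the target, so the masked text keeps its length.
        text = text.replace(target, replacement or "█" * len(target))
    return text
```

Rules run in order, so a later rule can match text produced by an earlier replacement. Whether the real scripts behave the same way is not stated in the documentation.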
Read Features
read.py supports:
- Reading text from images, PDF, Word, and PowerPoint files
- OCR for image files and embedded images
- Page-aware output for PDF / Word / PowerPoint
--info structured output:
- ... for OCR text extracted from images
JSON Output
Document-like files (pdf, docx, doc, pptx) output:
{
  "type": "pptx",
  "pages": [
    {
      "page_index": 1,
      "content": [
        { "type": "text", "text": "..." },
        { "type": "image", "text": "ocr text..." }
      ]
    }
  ]
}
Image files output:
{
  "type": "image",
  "content": "..."
}
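A consumer of the JSON output has to handle both shapes. The sketch below flattens either one into a single string. It assumes the JSON has already been captured as a string and that the fields are exactly those shown in the two examples; `extract_text` is a hypothetical helper name.

```python
import json


def extract_text(read_output: str) -> str:
    """Join all text found in a read.py JSON document into one string."""
    data = json.loads(read_output)
    if data["type"] == "image":
        # Image files carry their OCR text directly in "content".
        return data["content"]
    lines = []
    for page in data["pages"]:
        # Both "text" and "image" items store their text under "text".
        for item in page["content"]:
            lines.append(item["text"])
    return "\n".join(lines)
```

Page boundaries are dropped in this version. Keeping `page_index` alongside each line would be a small change if page-aware output is needed.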
Features
| Feature | Image | PDF | Document | Presentation |
|---|---|---|---|---|
| Read text | ✅ | ✅ | ✅ | ✅ |
| JSON output | ✅ | ✅ | ✅ | ✅ |
| Text replacement | ✅ | ✅ | ✅ | ✅ |
| Solid color mask | ✅ | ✅ | - | - |
| █ character mask | - | - | ✅ | ✅ |
| OCR detection | ✅ | ✅ | ✅ (images) | ✅ (images) |
| Tables | - | ✅ | ✅ | ✅ |
| Headers/Footers | - | ✅ | ✅ | - |
| Embedded images | - | ✅ | ✅ | ✅ |
Environment Setup
Install dependencies with uv:
# Enter the skill directory
cd skills/redact
# Sync dependencies (creates the virtual environment and installs automatically)
uv sync
Dependencies
- Python 3.10+
- PaddleOCR / PPStructureV3
- python-docx, python-pptx, PyMuPDF, Pillow
Data source: ClawHub · Chinese localization: 龙虾技能库