
📄 HTML Parse

v0.4.0

Parse HTML documents into structured Markdown using MinerU. Analyzes HTML structure and converts it into well-organized Markdown preserving hierarchy and for...

by @mzlzyca (mzlzyCA)·MIT-0
License
MIT-0
Last updated
2026/4/3
Security scan
VirusTotal
Benign
OpenClaw
Safe
high confidence
The skill's requirements and runtime instructions align with its stated purpose (using the MinerU CLI and token to convert HTML to structured Markdown); nothing requested is disproportionate or unexpectedly broad.
Recommendations
This skill is coherent: it delegates HTML parsing to the MinerU CLI and requires only the MINERU_TOKEN. Before installing, verify the mineru-open-api package's provenance (npm page or the GitHub repo) and trustworthiness of mineru.net. Keep your MINERU_TOKEN secret, review token permissions, and avoid parsing sensitive or private HTML unless you accept that content will be sent to MinerU's service and may incur charges or be stored by that service. If you prefer more control, consider running an...
Detailed analysis
Purpose and capabilities
Name/description (HTML → structured Markdown) match the declared binary (mineru-open-api) and the single required env var (MINERU_TOKEN). The requested binaries and token are what this CLI-based parsing workflow would legitimately need.
Instruction scope
SKILL.md only instructs using mineru-open-api commands (extract, crawl, auth), installing the CLI, and setting MINERU_TOKEN. It does not direct the agent to read unrelated files or credentials, nor to exfiltrate data to unexpected endpoints.
Install mechanism
Install options are standard package sources: npm package 'mineru-open-api' and a Go install from github.com/opendatalab. No arbitrary download URLs or extract-from-unknown-host steps are present.
Credential requirements
Only MINERU_TOKEN is required (declared as primary credential), which is proportionate. Caution: using the skill will send HTML content to MinerU's service (remote API), so the token grants API access and may expose uploaded content or incur usage costs—avoid sending sensitive documents unless you trust MinerU and the token's permissions.
Persistence and privileges
Skill is not always-enabled and uses normal agent invocation. It does not request persistent system-wide changes or access other skills' config.
Security is layered; review the code before running it.

License

MIT-0

Free to use, modify, and redistribute; no attribution required.

Runtime dependencies

No special dependencies

Versions

latest · v0.4.0 · 2026/3/27

SEO: expand description for better ClawHub vector search discovery

● Benign

Install command

Official: npx clawhub@latest install html-parse
Mirror: npx clawhub@latest install html-parse --registry https://cn.clawhub-mirror.com

Skill documentation

Parse local HTML files into structured Markdown using MinerU. Preserves document hierarchy. For live web pages, use mineru-open-api crawl.

Install

npm install -g mineru-open-api
# or via Go (macOS/Linux):
go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest

Quick Start

# Parse a local HTML file (requires token)
mineru-open-api extract page.html -o ./out/

# Parse a remote HTML URL (requires token)
mineru-open-api extract https://example.com/page.html -o ./out/

# Parse a live web page (requires token)
mineru-open-api crawl https://example.com/article -o ./out/

Authentication

Token required:

mineru-open-api auth             # Interactive token setup
export MINERU_TOKEN="your-token" # Or via environment variable

Create token at: https://mineru.net/apiManage/token
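
Because every HTML command needs a token, a script can check for it before calling the CLI. The sketch below is a minimal guard under stated assumptions: require_token is a hypothetical helper name, not part of mineru-open-api, and it inspects only the MINERU_TOKEN environment variable, so it says nothing about a token configured through mineru-open-api auth.

```shell
# Hypothetical helper (not part of mineru-open-api): succeed only if
# MINERU_TOKEN is set to a non-empty value in the environment.
require_token() {
    if [ -z "${MINERU_TOKEN:-}" ]; then
        echo "MINERU_TOKEN is not set; create one at https://mineru.net/apiManage/token" >&2
        return 1
    fi
    return 0
}

# Example use in a script:
#   require_token && mineru-open-api extract page.html -o ./out/
```

Failing early keeps a missing token from surfacing later as an API error in the middle of a batch.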

Capabilities

  • Supported input: local .html file or remote HTML URL
  • HTML requires extract or crawl (token required)
  • HTML is NOT supported by flash-extract
  • Language hint with --language (default: ch, use en for English)
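
The choice between extract and crawl, plus the optional language hint, could be wrapped roughly as follows. This is a sketch, not CLI behavior: pick_subcommand and build_command are hypothetical helpers, the rule "URLs not ending in .html or .htm are live pages" is an illustrative heuristic, and it assumes --language may follow the other arguments. The helper prints the command instead of running it.

```shell
# Hypothetical helper: pick a subcommand for an input.
# Heuristic (an assumption, not a CLI rule): URLs that do not end in
# .html/.htm are treated as live pages and sent to `crawl`; local files
# and direct .html URLs go to `extract`.
pick_subcommand() {
    case "$1" in
        http://*.html|https://*.html|http://*.htm|https://*.htm) echo extract ;;
        http://*|https://*) echo crawl ;;
        *) echo extract ;;
    esac
}

# Print (do not run) the command line for an input.
# $2 is an optional language hint such as "en"; when omitted, no
# --language flag is added and the CLI default applies.
build_command() {
    input=$1
    lang=${2:-}
    cmd="mineru-open-api $(pick_subcommand "$input") $input -o ./out/"
    if [ -n "$lang" ]; then
        cmd="$cmd --language $lang"
    fi
    echo "$cmd"
}

build_command page.html en
build_command https://example.com/article
```

Printing the command first makes it easy to review what would be sent to the remote service before any content leaves the machine.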

Notes

  • HTML is NOT supported by flash-extract — use extract or crawl
  • For live web pages with dynamic content, use crawl instead of extract
  • Output goes to stdout by default; use -o to save to a file or directory
  • All progress/status messages go to stderr; document content goes to stdout
  • MinerU is open-source by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU
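
Since document content goes to stdout and progress messages go to stderr, ordinary shell redirection can separate the two. The wrapper below is a minimal sketch: html_to_markdown and the MINERU_BIN variable are hypothetical names introduced for illustration, and the only CLI usage assumed is extract without -o, as described in the notes above.

```shell
# MINERU_BIN is an illustrative variable so the binary can be swapped
# out (for example, in tests); it defaults to the real CLI name.
MINERU_BIN=${MINERU_BIN:-mineru-open-api}

# Hypothetical wrapper: convert one HTML file.
#   $1 = input HTML file
#   $2 = file that receives the Markdown (stdout)
#   $3 = file that receives progress/status messages (stderr)
html_to_markdown() {
    "$MINERU_BIN" extract "$1" > "$2" 2> "$3"
}

# Example use:
#   html_to_markdown page.html page.md extract.log
```

Keeping the log separate means the Markdown file contains only document content, which matters when the output is piped into another tool.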
Source: ClawHub · Chinese localization: 龙虾技能库