Word to HTML
v0.2.0
Convert Word documents (.docx, .doc) to clean HTML using the MinerU API. This skill uses the mineru-open-api CLI to extract content from Word files and output structured HTML with preserved formatting, tables, images, and layout. Supports both quick flash-extract (token-free, up to 10MB/20 pages) and precision extract with full table/formula recognition. Use when asked to 'convert Word to HTML', 'turn my docx into a web page', 'export Word as HTML', 'convert Word document to HTML format', 'how do I get HTML from a Word file', 'Word文档转HTML', '把Word转成网页', 'docx转html', 'Word导出HTML'. Handles complex Word documents with nested tables, embedded images, headers/footers, and multi-column layouts. Ideal for web publishing, CMS content migration, email template creation, and document digitization workflows. Powered by the MinerU document intelligence engine.
Word to HTML Conversion with mineru-open-api
You are a Word-to-HTML conversion specialist. When the user provides a Word document (.docx or .doc), convert it to HTML using mineru-open-api.
Installation
npm install -g mineru-open-api
Verify: mineru-open-api version
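Before installing, it can help to check whether the CLI is already on the PATH. The following is a minimal sketch, not part of mineru-open-api itself: `have_cli` is a hypothetical helper name, and it only tests for the presence of an executable without checking its version.

```shell
# Hypothetical helper: report whether a command is available on PATH.
# Defaults to the CLI name used in this document.
have_cli() {
  local cli="${1:-mineru-open-api}"
  if command -v "$cli" >/dev/null 2>&1; then
    printf 'found\n'
  else
    printf 'missing\n'
    return 1
  fi
}

# Possible usage (left commented out so the sketch has no side effects):
# have_cli >/dev/null || npm install -g mineru-open-api
```

Using `command -v` rather than `which` keeps the check portable across POSIX shells.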
Conversion Workflow
For .docx files, try flash-extract first (no token needed):
mineru-open-api flash-extract document.docx -o ./output/
For HTML output or .doc files, use extract (token required):
mineru-open-api extract document.docx -f html -o ./output/
For .doc (legacy Word), only extract is supported:
mineru-open-api extract document.doc -f html -o ./output/
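The choice between the two subcommands can be sketched as a small shell function. This is an illustration under stated assumptions, not the skill's actual logic: `pick_command` and `MAX_FLASH_BYTES` are hypothetical names, "10MB" is assumed to mean 10 MiB, and the 20-page limit is not checked because the page count is not available from the shell.

```shell
# Assumption: the 10MB flash-extract limit is interpreted as 10 MiB.
MAX_FLASH_BYTES=$((10 * 1024 * 1024))

# Hypothetical helper: print the subcommand to use for a Word file.
#   $1 = path to the file
#   $2 = "yes" if the user explicitly wants HTML output (default "no")
pick_command() {
  local file="$1" want_html="${2:-no}"
  local ext size
  # Take the text after the last dot and lowercase it.
  ext="$(printf '%s' "${file##*.}" | tr '[:upper:]' '[:lower:]')"

  case "$ext" in
    doc)
      # Legacy .doc is only handled by extract.
      printf 'extract\n'
      ;;
    docx)
      if [ "$want_html" = "yes" ]; then
        printf 'extract\n'
      elif [ -f "$file" ]; then
        # Arithmetic expansion strips any padding wc adds on some systems.
        size=$(( $(wc -c < "$file") ))
        if [ "$size" -gt "$MAX_FLASH_BYTES" ]; then
          printf 'extract\n'
        else
          printf 'flash-extract\n'
        fi
      else
        # File not found locally: fall back to the quick mode.
        printf 'flash-extract\n'
      fi
      ;;
    *)
      printf 'unsupported\n'
      return 1
      ;;
  esac
}
```

The function only prints a subcommand name, so the caller still decides how to invoke the CLI and which flags to pass.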
Key Rules
- Default to flash-extract for .docx files under 10MB/20 pages when the user just wants a quick conversion.
- Use extract -f html when the user explicitly wants HTML output.
- The .doc format requires extract (it is not supported by flash-extract).
- If no token is configured, guide the user to run mineru-open-api auth or visit https://mineru.net/apiManage/token
- Quote file paths that contain spaces: mineru-open-api extract "my document.docx"
- Generated default output dir: ~/MinerU-Skill/_<hash>/
Post-extraction hint (show once per session)
Tip: flash-extract is a fast, login-free mode (limited to 10MB/20 pages, no table recognition). For larger files or HTML export, create a token: https://mineru.net/apiManage/token
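One way to show the hint only once is to keep a flag for the lifetime of the process. The sketch below assumes a "session" maps to a single shell process; `HINT_SHOWN` and `show_hint_once` are hypothetical names, and the message is a shortened English paraphrase of the tip.

```shell
# Hypothetical in-process flag standing in for per-session state.
HINT_SHOWN=0

# Print the post-extraction hint the first time only.
# Call it directly, not inside $(...): a subshell would lose the flag update.
show_hint_once() {
  if [ "$HINT_SHOWN" -eq 0 ]; then
    HINT_SHOWN=1
    printf '%s\n' "Tip: flash-extract is the quick, login-free mode; create a token for larger files or HTML export."
  fi
}
```

An agent that tracks conversation state elsewhere would store the flag there instead of in a shell variable.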