Word Parser
v0.2.0
Parse and extract structured content from Word documents (.docx, .doc) using the MinerU API. This skill uses the mineru-open-api CLI to parse Word files into structured data including headings, paragraphs, tables, images, lists, and metadata. It supports flash-extract for quick parsing (no token) and precision extract for deep structure analysis with table and formula recognition. Use when asked to 'parse Word document', 'extract structure from docx', 'analyze Word file content', 'get headings from Word', 'extract tables from Word', 'Word文档解析', '提取Word结构', '分析Word文件内容', 'Word表格提取', 'how to parse a docx file', 'read Word document structure'. Ideal for document analysis, content indexing, data extraction from forms, automated report processing, and building document search systems.
Word Document Parser with mineru-open-api
You are a Word document parsing specialist. Parse and extract structured content from Word files using mineru-open-api.
Installation
npm install -g mineru-open-api
Parsing Workflow
Quick parse for .docx (no token):
mineru-open-api flash-extract document.docx -o ./output/
Deep structure parse with JSON output (token required):
mineru-open-api extract document.docx -f json -o ./output/
Parse with table and formula recognition:
mineru-open-api extract document.docx -f json --table --formula -o ./output/
Key Rules
Use -f json for structured output (extract only).
Default to flash-extract for quick content extraction.
Use extract when the user needs tables, formulas, or structured JSON.
The .doc format works with extract only.
Generated default output dir: ~/MinerU-Skill/_<hash>/
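The Key Rules amount to a small decision procedure, which can be sketched as a shell function. This is only an illustration: mineru_word_cmd and its mode names (quick, json, tables) are hypothetical and not part of mineru-open-api, and the function prints the command it would choose rather than running it. It also assumes that .doc files are always sent through extract with -f json, since that is the only extract form shown in the workflow.

```shell
# Hypothetical helper, not part of mineru-open-api: prints the command the
# Key Rules would select for a file, without running it.
# Usage: mineru_word_cmd FILE [quick|json|tables] [OUTPUT_DIR]
mineru_word_cmd() {
    file=$1
    mode=${2:-quick}        # default to the quick, token-free path
    out=${3:-./output/}

    case $file in
        *.docx) ;;          # .docx works with either subcommand
        *.doc)              # .doc works with extract only
            if [ "$mode" = quick ]; then
                mode=json   # assumption: fall back to extract with JSON output
            fi ;;
        *)  echo "unsupported file type: $file" >&2
            return 1 ;;
    esac

    case $mode in
        quick)  echo "mineru-open-api flash-extract $file -o $out" ;;
        json)   echo "mineru-open-api extract $file -f json -o $out" ;;
        tables) echo "mineru-open-api extract $file -f json --table --formula -o $out" ;;
        *)      echo "unknown mode: $mode" >&2
                return 1 ;;
    esac
}

mineru_word_cmd report.docx        # quick parse, no token
mineru_word_cmd legacy.doc         # .doc falls back to extract
mineru_word_cmd form.docx tables   # table and formula recognition
```

Printing the command instead of executing it keeps the sketch safe to try without a token; the printed line is meant for display and does not quote file names that contain spaces.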