Pdf2word Skills
v1.0.0
Convert scanned PDF documents into Word text documents using a free, local OCR engine or a remote API.
PDF to Word Converter
A skill that extracts text from scanned PDF documents and converts it into reusable Word (.docx) files using the free, local docr OCR engine.
Prerequisites
Initialize the OCR engine by downloading the binaries:
bash scripts/install.sh
Install the required Python dependencies:
pip install -r scripts/requirements.txt
Usage
Run the Python script, passing the input PDF file and the desired output .docx file path. You can also append any additional standard docr arguments (such as engine preferences).
python scripts/pdf2word.py <input.pdf> <output.docx> [docr_args...]
Examples
Convert a single file with the default local engine:
python scripts/pdf2word.py sample.pdf sample_output.docx
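To convert a whole folder, the single-file command above can be wrapped in a short loop. The following is a minimal sketch, not part of the skill itself: the helper names `build_command` and `convert_folder` and the folder names in the usage note are hypothetical, and the sketch assumes it is run from the directory that contains scripts/pdf2word.py.

```python
import subprocess
import sys
from pathlib import Path


def build_command(pdf_path, out_dir, extra_args=()):
    """Build the pdf2word.py command line for one PDF (hypothetical helper)."""
    pdf_path = Path(pdf_path)
    # The output name mirrors the input: report.pdf -> report.docx
    out_path = Path(out_dir) / (pdf_path.stem + ".docx")
    return [sys.executable, "scripts/pdf2word.py",
            str(pdf_path), str(out_path), *extra_args]


def convert_folder(in_dir, out_dir, extra_args=()):
    """Convert every *.pdf in in_dir, one subprocess call per file."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for pdf_path in sorted(Path(in_dir).glob("*.pdf")):
        # check=True raises CalledProcessError if the script exits non-zero
        subprocess.run(build_command(pdf_path, out_dir, extra_args), check=True)
```

For example, `convert_folder("scans", "converted")` would process each PDF in a folder named scans, and any trailing docr arguments can be supplied through `extra_args`.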
Using Other API Engines
By default, the script uses the local RapidOCR engine. The underlying docr tool also supports other engines, such as the Google Gemini API, for potentially higher recognition accuracy on complex layouts.
To use Gemini, first configure your API key:
mkdir -p ~/.ocr
echo "gemini_api_key=your_gemini_key" > ~/.ocr/config
Then pass the -engine gemini argument to the script:
python scripts/pdf2word.py sample.pdf sample_output.docx -engine gemini
If your document has tables, you can instruct Gemini to output them in Markdown format so the script can parse them into native Word tables:
python scripts/pdf2word.py sample.pdf sample_output.docx -engine gemini -prompt "Extract all text and preserve tables in Markdown format using | symbols."
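To illustrate how pipe-delimited Markdown could become a native Word table, here is a minimal sketch. It is an assumption about the general approach, not the script's actual code: `parse_markdown_table` and `add_table` are hypothetical helper names, and only python-docx's `Document.add_table` and `Table.cell` are real library calls.

```python
def parse_markdown_table(lines):
    """Turn Markdown table lines into a list of rows (lists of cell strings)."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line.startswith("|"):
            continue  # not a table line
        # Drop the outer pipes, then split the remainder into cells
        inner = line[1:-1] if line.endswith("|") else line[1:]
        cells = [cell.strip() for cell in inner.split("|")]
        # Skip the header separator row, e.g. |---|:---:|
        if all(cell and set(cell) <= set("-:") for cell in cells):
            continue
        rows.append(cells)
    return rows


def add_table(document, rows):
    """Write parsed rows into a python-docx Document as a native table."""
    table = document.add_table(rows=len(rows), cols=max(len(r) for r in rows))
    for r, row in enumerate(rows):
        for c, text in enumerate(row):
            table.cell(r, c).text = text
    return table
```

With python-docx installed, `add_table(Document(), rows)` followed by `document.save(...)` would produce the table. The parser deliberately ignores escaped pipes and alignment markers to keep the sketch short.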
How it Works
The script calls docr, which uses the specified OCR model (RapidOCR by default) to read text from the scanned PDF. The extracted text is stored in a temporary file. The python-docx library then reads that text and constructs a formatted Word document. Temporary files are cleaned up automatically.
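The flow described above (OCR to temporary text, build the document, clean up) can be sketched as follows. This is an illustration under stated assumptions rather than the script's implementation: `convert`, `run_ocr`, and `build_docx` are hypothetical names, and the two callables stand in for the docr invocation and the python-docx step, whose real interfaces are not shown here.

```python
import tempfile
from pathlib import Path


def convert(pdf_path, docx_path, run_ocr, build_docx):
    """OCR into a temporary text file, then build the .docx from that text.

    run_ocr(pdf_path, text_path) and build_docx(text, docx_path) are
    hypothetical callables supplied by the caller.
    """
    # TemporaryDirectory removes itself and the text file on exit,
    # even if the OCR step or the document step raises.
    with tempfile.TemporaryDirectory() as tmp_dir:
        text_path = Path(tmp_dir) / "ocr_output.txt"
        run_ocr(Path(pdf_path), text_path)
        text = text_path.read_text(encoding="utf-8")
        build_docx(text, Path(docx_path))
    return Path(docx_path)
```

Passing the two steps in as callables keeps the cleanup logic independent of any particular OCR engine, which matches the way the engine can be switched on the command line.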