PDF Converter
v1.0.2
PDF conversion toolkit featuring AI layout analysis and OCR. Converts PDFs to Word, Markdown, JSON, PPT, CSV, HTML, and XML for seamless LLM data processing.
PDF Converter

Purpose

Wraps the ComPDFKitConversion Python SDK into a reusable local conversion workflow, supporting PDF / image to Word, PPT, Excel, HTML, RTF, Image, TXT, JSON, Markdown, and CSV (10 output formats in total).

Agent Skills Standard Compatibility

This skill uses an Anthropic Agent Skills-compatible directory structure: pdf-convert-compdf/. The entry point is SKILL.md; helper scripts are placed in scripts/. The document uses the $ARGUMENTS and ${CLAUDE_SKILL_DIR} conventions for distribution and execution in Claude Code / Agent Skills-compatible environments.

Input / Output

Input: The target format (word/excel/ppt/html/rtf/image/txt/json/markdown/csv), the PDF or image path, and the output path are passed via skill arguments or the command line. An optional PDF password and conversion parameters may also be provided.

Supported input file types:
PDF files (.pdf)
Image files (.jpg/.jpeg/.png/.bmp/.tif/.tiff/.webp/.jp2/.gif/.tga)

Output: A file in the corresponding format (.docx, .pptx, .xlsx, .html, .rtf, image, .txt, .json, .md, .csv), or a clear error message.

Prerequisites

Supports Windows and macOS. The conversion SDK must be installed first:

    pip install ComPDFKitConversion
On first run, the script automatically downloads license.xml from the ComPDF server and caches it in the scripts/ directory:

    https://download.compdf.com/skills/license/license.xml
The script reads the ... field from license.xml and uses that key for LibraryManager.license_verify(...) authentication; it does not pass the XML file path directly to the SDK. To use a custom license, place your own license.xml in the scripts/ directory; the script will use it directly without downloading.

During SDK initialization, the resource directory is always set to the directory containing pdf-convert-compdf.py, i.e., the scripts/ directory itself.

When --enable-ocr or --enable-AI-layout (enabled by default) is used, the skill also requires scripts/documentAI.model. If the file does not exist, the script will automatically download it from:

    https://download.compdf.com/skills/model/documentAI.model
To reuse an existing model file, you can override the default model path via an environment variable:

    export COMPDF_DOCUMENT_AI_MODEL="/path/to/documentAI.model"
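The override rule above can be sketched in a few lines of Python. This is only an illustration of the described precedence (environment variable first, scripts/documentAI.model otherwise); the function name resolve_model_path and its parameters are hypothetical and are not taken from pdf-convert-compdf.py.

```python
import os
from pathlib import Path

# Environment variable named in the text above.
ENV_VAR = "COMPDF_DOCUMENT_AI_MODEL"


def resolve_model_path(scripts_dir, environ=None):
    """Return the model path: the env override if set, else scripts/documentAI.model.

    resolve_model_path is a hypothetical helper, not part of the real script.
    """
    env = os.environ if environ is None else environ
    override = env.get(ENV_VAR, "").strip()
    if override:
        # A non-empty override wins over the default location.
        return Path(override).expanduser()
    return Path(scripts_dir) / "documentAI.model"
```

Passing the environment as a parameter keeps the helper easy to exercise without modifying the real process environment.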
Workflow

Confirm the Python package is installed:

    python -m pip show ComPDFKitConversion
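When the check has to happen from Python rather than from a shell, the same question can be answered with the standard library. The sketch below assumes the installed distribution name matches the pip package name given above; installed_version is a hypothetical helper.

```python
from importlib import metadata


def installed_version(package="ComPDFKitConversion"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this interpreter's environment.
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("ComPDFKitConversion is not installed; run: pip install ComPDFKitConversion")
    else:
        print(f"ComPDFKitConversion {version} is installed")
```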
The script automatically downloads license.xml on first run; the scripts/ directory is used directly as the SDK resource path.

In Agent Skills / Claude Code environments, prefer using the skill's built-in script path variable:

    python "${CLAUDE_SKILL_DIR}/scripts/pdf-convert-compdf.py" word input.pdf output.docx
    python "${CLAUDE_SKILL_DIR}/scripts/pdf-convert-compdf.py" ppt input.pdf output.pptx
    python "${CLAUDE_SKILL_DIR}/scripts/pdf-convert-compdf.py" excel input.pdf output.xlsx
For more control, append common parameters:

    python "${CLAUDE_SKILL_DIR}/scripts/pdf-convert-compdf.py" excel input.pdf output.xlsx --page-ranges "1-3,5" --excel-all-content --excel-worksheet-option for-page
    python "${CLAUDE_SKILL_DIR}/scripts/pdf-convert-compdf.py" word input.pdf output.docx --enable-ocr --page-layout-mode flow
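As an aid to reading a value such as "1-3,5", the sketch below expands a page-range string into individual page numbers. It assumes 1-based pages and inclusive ranges, which the text does not state; the actual parsing is done by the script and SDK and may differ. expand_page_ranges is a hypothetical helper.

```python
def expand_page_ranges(spec):
    """Expand a string like "1-3,5" into a list of page numbers.

    Assumes 1-based page numbers and inclusive ranges (an assumption,
    not confirmed by the skill documentation).
    """
    pages = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            pages.extend(range(start, end + 1))
        else:
            pages.append(int(part))
    return pages
```

Under these assumptions, expand_page_ranges("1-3,5") yields pages 1, 2, 3, and 5.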
On startup, the script ensures scripts/license.xml exists (downloading it automatically from the ComPDF server if missing), reads the license key from it for SDK authentication, and uses the scripts/ directory as the resource path.

If --enable-ocr or --enable-AI-layout (enabled by default) is active, the script checks whether scripts/documentAI.model exists; if not, it downloads the file automatically before initializing the Document AI model.

Check the return code; if it is not success, handle license, password, resource, model, or input file issues according to the error name.

documentAI.model Download Optimization

The script preferentially uses the model file pointed to by COMPDF_DOCUMENT_AI_MODEL.
The default model path is scripts/documentAI.model.
During automatic download, the file is first written to documentAI.model.part and then atomically renamed to the final file upon success, preventing partial file corruption.
On download failure, the script retries automatically with back-off intervals of 2s / 5s / 10s.

Invoking Directly as a Skill

In environments that support Agent Skills, the skill can be called directly:

    /pdf-convert-compdf word input.pdf output.docx
    /pdf-convert-compdf excel input.pdf output.xlsx --excel-worksheet-option for-page
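The model download behavior described above (a temporary .part file, an atomic rename, and retries with 2s / 5s / 10s back-off) can be sketched as follows. This is a minimal illustration, not the script's actual code: the names download_atomically and fetch_bytes are hypothetical, the total of four attempts (one initial try plus one retry per interval) is an assumption, and the payload is read into memory for brevity where a real implementation would more likely stream it.

```python
import os
import time
import urllib.request
from pathlib import Path

# Back-off intervals in seconds, as listed in the text above.
BACKOFF_SECONDS = (2, 5, 10)


def fetch_bytes(url):
    """Download a URL and return its body (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=60) as response:
        return response.read()


def download_atomically(url, dest, fetch=fetch_bytes, sleep=time.sleep):
    """Write to <dest>.part, then rename to dest only after a complete download."""
    dest = Path(dest)
    part = dest.with_name(dest.name + ".part")
    attempts = len(BACKOFF_SECONDS) + 1  # assumption: 1 try + 3 retries
    last_error = None
    for attempt in range(attempts):
        try:
            part.write_bytes(fetch(url))
            os.replace(part, dest)  # atomic when both paths share a filesystem
            return dest
        except OSError as exc:  # urllib.error.URLError is a subclass of OSError
            last_error = exc
            part.unlink(missing_ok=True)  # never leave a partial file behind
            if attempt < len(BACKOFF_SECONDS):
                sleep(BACKOFF_SECONDS[attempt])
    raise RuntimeError(f"download failed after {attempts} attempts") from last_error
```

Because the final file name only ever appears through os.replace, a reader of dest sees either the previous complete file or the new complete file, never a half-written one.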
When the skill receives arguments, it passes them through to the script as-is:

    python "${CLAUDE_SKILL_DIR}/scripts/pdf-convert-compdf.py" $ARGUMENTS
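For callers that drive the script from Python instead of a shell, the pass-through above might look like the sketch below. The helpers build_command and run_conversion are hypothetical, and the sketch assumes the script signals failure through a non-zero process exit code, which the text does not confirm.

```python
import shlex
import subprocess
import sys
from pathlib import Path


def build_command(skill_dir, arguments, python=sys.executable):
    """Build the argv list for the converter script from a raw argument string."""
    script = Path(skill_dir) / "scripts" / "pdf-convert-compdf.py"
    # shlex.split honors shell-style quoting, so "1-3,5" stays one argument.
    return [python, str(script), *shlex.split(arguments)]


def run_conversion(skill_dir, arguments):
    """Run the converter and return (exit code, stdout, stderr)."""
    completed = subprocess.run(
        build_command(skill_dir, arguments),
        capture_output=True,
        text=True,
    )
    # Assumption: a non-zero exit code means the conversion failed and the
    # error name is printed to stdout or stderr.
    return completed.returncode, completed.stdout, completed.stderr
```

Passing an argv list rather than a single shell string avoids quoting problems with paths that contain spaces. Note that shlex.split follows POSIX quoting rules, so Windows paths with backslashes are better passed as a pre-split list.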
If the environment does not support direct skill invocation, fall back to a regular command-line call.

Supported Output Formats

word → calls CPDFConversion.start_pdf_to_word
excel → calls CPDFConversion.start_pdf_to_excel
ppt → calls CPDFConversion.start_pdf_to_ppt
html → calls CPDFConversion.start_pdf_to_html
rtf → calls CPDFConversion.start_pdf_to_rtf
image → calls CPDFConversion.start_pdf_to_image