PDFlux-PDF2Markdown
v1.0.2
Convert unstructured documents into LLM-ready structured data. Supports PDF, Word, PPT, and images; extracts paragraphs, formulas, tables, charts, and other elements in one step; generates up to 8 levels of headings; and outputs Markdown organized in reading order. Useful for field extraction, comparison and validation, knowledge retrieval, and intelligent Q&A.
Run a JavaScript workflow that submits a single local file to the pdflux synchronous API through PDRouter (POST /openAPI/{serviceCode}/file/markdown) and prints the response result in one step. This skill targets only the latest OpenAPI flow and does not support deprecated legacy routes.
Installation
npx skills add PaodingAI/skills
Usage
node skills/pdflux-saas-markdown/scripts/upload_to_markdown.js [output-markdown-path]
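The invocation above can be wrapped so that later steps work from the script's stdout instead of calling the API directly. What follows is a minimal sketch, not part of the skill itself: the helper names `buildArgs` and `convertToMarkdown` are hypothetical, and it assumes the input file path is passed as the first positional argument, which the usage line does not show. It relies only on behavior this document describes: the script requires PAODINGAI_API_KEY, prints the result to stdout, writes progress and errors to stderr, and exits with a non-zero code on failure.

```javascript
// Sketch: call the conversion script from Node instead of reimplementing the API flow.
const { spawnSync } = require("node:child_process");

const SCRIPT = "skills/pdflux-saas-markdown/scripts/upload_to_markdown.js";

// Build the argument list passed to the Node binary.
// ASSUMPTION: the input file is the first positional argument; the usage line
// only documents the optional output path.
function buildArgs(inputPath, outputPath) {
  const args = [SCRIPT, inputPath];
  if (outputPath) args.push(outputPath);
  return args;
}

// Run the script and return its stdout (Markdown, or JSON when the response
// has no markdown field).
function convertToMarkdown(inputPath, outputPath, env = process.env) {
  if (!env.PAODINGAI_API_KEY) {
    // Fail early, mirroring the script's own check for a missing key.
    throw new Error("PAODINGAI_API_KEY is not set; ask the user for a key first.");
  }
  const result = spawnSync(process.execPath, buildArgs(inputPath, outputPath), {
    env,
    encoding: "utf8",
    maxBuffer: 64 * 1024 * 1024, // large documents can produce a lot of Markdown
  });
  if (result.error) throw result.error;
  if (result.status !== 0) {
    // Progress and errors are on stderr; a non-zero exit code signals failure.
    throw new Error(`conversion failed (exit ${result.status}): ${result.stderr}`);
  }
  return result.stdout;
}
```

A caller would typically pass a working-file path as `outputPath` so that the Markdown can be re-read and filtered later without rerunning the conversion.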
Execution Constraints
You must invoke scripts/upload_to_markdown.js directly. Do not reimplement the API flow yourself.
The behavior contract below explains what the script does, what it outputs, and when to use it. It is not a manual checklist for the model to imitate step by step.
Even if the task is only to extract tables, read fields, inspect body text, or prepare input for later scripts, you must run this script first and continue from the generated Markdown.
Only inspect or modify the script implementation when the script itself is unavailable, failing, or needs a fix. Do not bypass it during normal use.

When to Use
Use this skill when the user wants to parse a document, retrieve specific document content, or extract tables from a document.
Use this skill when the user says things like "convert to Markdown", "output Markdown", "export Markdown", or "extract Markdown", and return the Markdown content directly.
When later work depends on the document content, such as summarization, field extraction, document-processing scripts, table comparison, or rule-based validation, use this skill first to parse the document.
When the document content is only needed as input for subsequent steps, do not default to showing the full raw Markdown to the user. Prefer saving it to a temporary or working file first, then read, filter, and extract only what is needed.
When the user explicitly asks for the original Markdown output or clearly wants a direct document-to-Markdown conversion, show the full Markdown directly.

Environment Variables
PAODINGAI_API_KEY: Required. The Bearer API key for PDRouter OpenAPI. If it is missing, the script fails immediately. In a skill workflow, the AI should ask the user to provide a valid key, or inject it into the environment before retrying.
PAODINGAI_API_BASE_URL: Optional. Defaults to https://platform.paodingai.com/.
PD_ROUTER_SERVICE_CODE: Optional. Defaults to pdflux.
PDFLUX_INCLUDE_IMAGES: Optional. Boolean. Markdown output does not include image data by default.

Default Behavior and Optional Settings
Parsed results do not include chart or image extraction by default.
If charts, images, or similar content are required, enable them explicitly through API parameters. These results are usually returned as base64 and will increase token usage.
Markdown output does not include image data by default. If you need embedded image data, set PDFLUX_INCLUDE_IMAGES=true.

Script Behavior
Read the token from PAODINGAI_API_KEY. If it is missing, fail immediately and prompt the AI to ask the user for a key or inject the environment variable first.
Send one request with the local file to POST /openAPI/{serviceCode}/file/markdown using Authorization: Bearer <token>.
Parse the final API response and output the Markdown text if the response contains a markdown field; otherwise, output the JSON response payload.
If output-markdown-path is provided, the script also writes the same output text to that file while still printing it to stdout.
The script writes progress and errors to stderr and returns a non-zero exit code on failure.
When the goal is to retrieve specific content, fields, or tables, read the parsed result and return only the necessary information instead of echoing the full raw output to the user.
When the user explicitly asks to "convert to Markdown", "output Markdown", or expresses an equivalent intent, return the Markdown content directly when it is present in the API response.
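When only tables are needed, the saved Markdown can be filtered rather than shown in full. The sketch below is one possible approach under a stated assumption: it treats tables as GitHub-style pipe tables (a header row followed by a delimiter row such as `|---|---|`). This document does not specify which table syntax the service emits, so the pattern may need adjusting, and the function name `extractPipeTables` is hypothetical.

```javascript
// Sketch: pull only the tables out of previously generated Markdown.
// ASSUMPTION: tables are pipe tables (header row, then a delimiter row made of
// dashes, optional colons, and pipes). Adjust if the output uses another syntax.
function extractPipeTables(markdown) {
  const lines = markdown.split(/\r?\n/);
  const isRow = (line) => line.trim().startsWith("|");
  const isDelimiter = (line) =>
    /^\s*\|?\s*:?-{1,}:?\s*(\|\s*:?-{1,}:?\s*)*\|?\s*$/.test(line);

  const tables = [];
  let i = 0;
  while (i < lines.length) {
    // A table starts at a row that is immediately followed by a delimiter row.
    if (isRow(lines[i]) && i + 1 < lines.length && isDelimiter(lines[i + 1])) {
      const start = i;
      i += 2;
      // Consume the body rows until the first line that is not a table row.
      while (i < lines.length && isRow(lines[i])) i += 1;
      tables.push(lines.slice(start, i).join("\n"));
    } else {
      i += 1;
    }
  }
  return tables;
}
```

In practice the input would be read from the working file written by the script, for example with `fs.readFileSync(path, "utf8")`, and only the matching tables would be returned to the user.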