PDF Reader (Iyeque)

Name: PDF Reader (Iyeque)
Rating: 4

v1.1.0

Extract text, search inside PDFs, and produce summaries.

by @iyeque·MIT-0

Categories: developer tools, code generation, network tools, browser automation, file processing

License

MIT-0

Free to use, modify, and redistribute; no attribution required.

Runtime dependencies

No special dependencies

Install command

Official: npx clawhub@latest install iyeque-pdf-reader

Mirror: npx clawhub@latest install iyeque-pdf-reader --registry https://cn.longxiaskill.com

Skill documentation

PDF Reader Skill

The pdf-reader skill extracts text and retrieves metadata from PDF files using PyMuPDF (fitz).

Tool API

The skill provides two commands:

extract

Extracts plain text from the specified PDF file.

Parameters:
file_path (string, required): Path to the PDF file to extract text from.
--max_pages (integer, optional): Maximum number of pages to extract.

Usage:

python3 skills/pdf-reader/reader.py extract /path/to/document.pdf
python3 skills/pdf-reader/reader.py extract /path/to/document.pdf --max_pages 5

Output: Plain text content from the PDF.
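The source of reader.py is not shown here, so the following is only a minimal sketch of how page-limited extraction could look with PyMuPDF. The function names page_limit and extract_text are hypothetical, and the actual script may be structured differently.

```python
def page_limit(page_count, max_pages=None):
    """Number of pages to read: all of them, or at most max_pages."""
    if max_pages is None:
        return page_count
    return max(0, min(page_count, max_pages))


def extract_text(path, max_pages=None):
    """Return plain text from the first pages of a PDF (illustrative only)."""
    # Imported lazily so page_limit stays usable without PyMuPDF installed.
    import pymupdf  # older PyMuPDF releases expose this module as `fitz`

    with pymupdf.open(path) as doc:
        if doc.needs_pass:
            # Assumption: a password-protected file is reported as an error.
            raise ValueError("PDF is encrypted and requires a password")
        limit = page_limit(doc.page_count, max_pages)
        return "\n".join(doc.load_page(i).get_text() for i in range(limit))
```

Capping the page range before any page is loaded keeps the work proportional to max_pages rather than to the size of the document.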

metadata

Retrieve metadata about the document.

Parameters: file_path (string, required): Path to the PDF file.

Usage:

python3 skills/pdf-reader/reader.py metadata /path/to/document.pdf

Output: JSON object with PDF metadata including:

title: Document title
author: Document author
subject: Document subject
creator: Application that created the PDF
producer: PDF producer
creationDate: Creation date
modDate: Modification date
format: PDF format version
encryption: Encryption information (if any)

Implementation Notes

Uses PyMuPDF (imported as pymupdf) for fast, reliable PDF processing
Supports encrypted PDFs (returns an error if a password is required)
Handles large PDFs efficiently with the max_pages option
Returns structured JSON for the metadata command

Example

# Extract text from the first 3 pages
python3 skills/pdf-reader/reader.py extract report.pdf --max_pages 3

# Get document metadata
python3 skills/pdf-reader/reader.py metadata report.pdf
# Output:
# {
#   "title": "Annual Report 2024",
#   "author": "John Doe",
#   "creationDate": "D:20240115120000",
#   ...
# }
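For callers that want to consume the metadata from Python, a wrapper along these lines may help. It is a sketch under assumptions: the READER path, the helper names read_metadata and parse_pdf_date, and the idea that the script exits with a non-zero status on failure are not confirmed by the documentation. The date parser handles only the full 14-digit form seen in the example and ignores any timezone suffix.

```python
import json
import subprocess
from datetime import datetime

# Assumed script location; adjust to wherever the skill is installed.
READER = "skills/pdf-reader/reader.py"


def read_metadata(pdf_path):
    """Run the metadata command and decode its JSON output."""
    result = subprocess.run(
        ["python3", READER, "metadata", pdf_path],
        capture_output=True,
        text=True,
        check=True,  # assumption: failures produce a non-zero exit status
    )
    return json.loads(result.stdout)


def parse_pdf_date(value):
    """Convert 'D:YYYYMMDDHHmmSS...' to a naive datetime, ignoring any UTC offset."""
    digits = value[2:] if value.startswith("D:") else value
    return datetime.strptime(digits[:14], "%Y%m%d%H%M%S")
```

With the example output, parse_pdf_date("D:20240115120000") yields 15 January 2024 at 12:00:00.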

Error Handling

Returns an error message if the file is not found or is not a valid PDF
Returns an error if the PDF is encrypted and requires a password
Gracefully handles corrupted or malformed PDFs
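The first two failure cases can also be caught before the skill is invoked. The check below is a hypothetical pre-flight helper, not part of reader.py: it looks for the "%PDF-" marker near the start of the file, which is a common heuristic and not a full validity check. The error strings are illustrative and may not match what the skill prints.

```python
import os


def precheck_pdf(path):
    """Return an error message, or None if the file looks like a PDF."""
    if not os.path.isfile(path):
        return f"File not found: {path}"
    with open(path, "rb") as fh:
        header = fh.read(1024)
    # Heuristic: the PDF header marker should appear near the start of the file.
    if b"%PDF-" not in header:
        return f"Not a valid PDF: {path}"
    return None
```

A file that passes this check can still be corrupted or password-protected, so the errors returned by the skill itself remain the source of truth.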
