Pdf Rename
v1.0.0
Rename academic PDF papers to the standardized format "[Year] [Venue] Title.pdf" using a three-stage pipeline (Extract → Verify → Rename). Use when the user asks to organize, batch-rename, or metadata-enrich PDF files in a folder. Activates on keywords like "rename PDFs", "organize papers", "batch rename PDFs", "rename papers by metadata", "pdf重命名", "文献整理".
PDF Rename — Academic Paper Organizer
Rename academic PDFs to: [Year] [Venue] Title.pdf
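As a minimal sketch of the target format, the helper below assembles a name from already-verified metadata. The function name `build_filename`, the literal square brackets, and the set of characters treated as invalid are assumptions made for illustration, not taken from the skill's scripts.

```python
import re

# Characters assumed to be unsafe in filenames on common platforms.
_INVALID = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def build_filename(year, venue, title):
    """Return '[Year] [Venue] Title.pdf' for already-verified metadata."""
    # Replace unsafe characters with spaces, then collapse runs of whitespace.
    clean_title = _INVALID.sub(" ", title)
    clean_title = re.sub(r"\s+", " ", clean_title).strip(" .")
    return f"[{year}] [{venue}] {clean_title}.pdf"


print(build_filename(2017, "NeurIPS", "Attention Is All You Need"))
```

Sanitizing the title here, rather than at extraction time, keeps the manifest values human-readable while still producing a name the filesystem accepts.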
Three-stage pipeline (strict order):
Extract → Verify → Rename
Anti-error principle: Never re-parse PDF content during the Rename stage. The manifest is the single source of truth.
Quick Start
# Stage 1: Extract metadata → generate manifest
python scripts/extract.py ""
# Stage 2: Verify (manually or via web search), then inject verified data
# → Edit scripts/VERIFIED_DATA dict with web-verified values
python scripts/apply_verified.py ""
# Stage 3: Preview rename plan
python scripts/execute.py "" --preview
# Execute rename (with backup)
python scripts/execute.py "" --execute
Workflow Details
Stage 1: Extract
scripts/extract.py reads every PDF in the folder and generates manifest.json.
For each PDF it extracts:
Title: from PDF first-page text (heuristic: first non-metadata line)
Year: from filename prefix (most reliable) or PDF text (conference-year pattern)
Venue: inferred from PDF text (NeurIPS, ICML, arXiv, etc.)
Status: needs_verification (title/year from auto-extraction)
Manifest schema: see references/manifest_spec.md
⚠️ PDF text extraction is unreliable for titles. Expected quality: filename > PDF text for title. Always verify with a web search before executing the rename.
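The year and venue heuristics from Stage 1 can be sketched roughly as follows. The function names, the regular expression, and the `KNOWN_VENUES` tuple are hypothetical and only illustrate the idea; the actual extraction script may work differently.

```python
import re

# Hypothetical venue keywords, for illustration only.
KNOWN_VENUES = ("NeurIPS", "ICML", "ICLR", "CVPR", "arXiv")


def year_from_filename(filename):
    """Return a 4-digit year (19xx/20xx) found at the start of the name, or None."""
    match = re.match(r"\s*[\[(]?((?:19|20)\d{2})(?!\d)", filename)
    return int(match.group(1)) if match else None


def venue_from_text(text):
    """Return the first known venue mentioned in the text, or None."""
    lowered = text.lower()
    for venue in KNOWN_VENUES:
        if venue.lower() in lowered:
            return venue
    return None
```

A prefix match like this is cheap but not authoritative: an arXiv-style name such as `2006.12345.pdf` would also yield 2006, which is one reason every auto-extracted value starts as needs_verification.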
Stage 2: Verify
Before running the rename, verify manually or via web search that:
Title is correct (filename is often sufficient, but multi-word titles may differ)
Year is correct (arXiv submission year ≠ conference year)
Venue is correct
Inject verified data via scripts/apply_verified.py:
Key = original filename (exact match)
Value = {'title', 'year', 'venue', 'confirmed': True}
Set confirmed: False, or omit the entry, for files to skip.
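To make the key/value shape concrete, here is a small sketch of a VERIFIED_DATA dict and one possible way to merge it into manifest entries. The manifest layout (a list of dicts with `filename` and `status` keys), the helper name `apply_verified`, and the sample entries are assumptions for illustration; the authoritative schema is in references/manifest_spec.md.

```python
VERIFIED_DATA = {
    # Key: original filename (exact match).
    "1706.03762.pdf": {
        "title": "Attention Is All You Need",
        "year": 2017,
        "venue": "NeurIPS",
        "confirmed": True,
    },
    # confirmed=False means the file is skipped.
    "draft_v2.pdf": {
        "title": "Untitled Draft",
        "year": 2023,
        "venue": "arXiv",
        "confirmed": False,
    },
}


def apply_verified(manifest, verified):
    """Return new manifest entries with confirmed values merged in."""
    updated = []
    for entry in manifest:
        data = verified.get(entry["filename"])
        if data and data.get("confirmed"):
            # Build a new dict so the input manifest is left untouched.
            entry = {
                **entry,
                "title": data["title"],
                "year": data["year"],
                "venue": data["venue"],
                "status": "ready",
            }
        updated.append(entry)
    return updated
```

Entries that are missing from the dict or not confirmed keep their previous status, so they fall through to the skip path in Stage 3.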
Stage 3: Rename
scripts/execute.py reads the manifest and renames files:
Status must be ready to execute
Duplicate titles → append (1), (2), etc.
Files with status needs_verification or manual_review are skipped
Backup is created automatically at /_backup_YYYYMMDD_HHMMSS/

Key Design Decisions

| Problem | Solution |
| --- | --- |
| PDF title extraction garbled/incomplete | Use filename as primary title source; PDF text only for venue/year hints |
| Wrong year from arXiv ID vs conference year | Verify with web search; inject corrected year in VERIFIED_DATA |
| Duplicate papers (same content, different filenames) | Detect via title similarity; rename both with (1), (2) suffixes |
| Accidental data loss | Always create a timestamped backup before renaming |

Scripts

| Script | Purpose |
| --- | --- |
| scripts/extract.py | Stage 1: extract PDF metadata → manifest.json |
| scripts/apply_verified.py | Stage 2: inject verified data into manifest |
| scripts/execute.py | Stage 3: rename files from manifest (preview or execute) |
| scripts/find_duplicates.py | Utility: detect near-duplicate titles in manifest |

References

references/manifest_spec.md: Full manifest JSON schema
references/venue_abbrev.md: Standard venue abbreviation map
references/anti_patterns.md: Common mistakes and how to avoid them
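The Stage 3 rules (status gate plus numbered suffixes for duplicates) can be sketched as a pure planning function that touches no files, which is roughly what a preview mode needs. The name `plan_renames`, the manifest keys, the placement of the suffix before `.pdf`, and keying duplicates on the full target name rather than the title alone are all assumptions for illustration.

```python
from collections import Counter


def plan_renames(manifest):
    """Return (old_name, new_name) pairs for entries whose status is 'ready'."""
    # Entries with any other status (needs_verification, manual_review) are skipped.
    ready = [e for e in manifest if e.get("status") == "ready"]
    targets = [f"[{e['year']}] [{e['venue']}] {e['title']}" for e in ready]

    totals = Counter(targets)  # how often each target name occurs
    seen = Counter()           # running index per duplicated target
    plan = []
    for entry, target in zip(ready, targets):
        if totals[target] > 1:
            # Every member of a duplicate group gets a suffix: (1), (2), ...
            seen[target] += 1
            target = f"{target} ({seen[target]})"
        plan.append((entry["filename"], target + ".pdf"))
    return plan
```

Because the plan is computed only from manifest values, it honors the anti-error principle: nothing in this step re-reads PDF content.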