Gmail Link Archiver
v1.1.0
Connects to Gmail via IMAP, filters emails by subject prefix keyword in a specified mailbox, crawls links found in the filtered emails using Playwright (to bypass bot detection), converts the crawled content to Markdown, and saves it to the OpenClaw workspace. Use when the user wants to archive web content from email links, save newsletter links as Markdown, or crawl URLs from filtered emails.
Gmail Link Archiver
Archive web content from your email links. This skill connects to Gmail via IMAP, filters emails by a subject prefix keyword, crawls every link using Playwright (headless Chromium), converts pages to Markdown, and saves them to your OpenClaw workspace.
Quick Start
- Install dependencies (one-time)
This automatically installs:
  - playwright (Python) + Chromium browser binary
  - html2text for HTML→Markdown conversion
- First run: interactive setup
The first run will prompt you for:
| Setting | Description | Default |
|---|---|---|
| IMAP server | Gmail IMAP host | imap.gmail.com |
| IMAP port | SSL port | 993 |
| Gmail address | Your full email address | (none) |
| App password | Gmail App Password (NOT your regular password) | (none) |
| Default mailbox | IMAP folder to search | INBOX |
| Subject prefix | Filter emails whose subject starts with this | (none) |
| Workspace path | Where to save Markdown files | ~/OpenClaw-workspace/mail-archive |
Credentials are saved locally to ~/.config/gmail-link-archiver/config.json with 0600 permissions. They are sent only to the IMAP server and are never logged.
Gmail App Password: You need to generate an App Password at https://myaccount.google.com/apppasswords (requires 2-Step Verification to be enabled).
- Subsequent runs
After the first setup, subsequent runs read credentials from the saved config:
    # Use saved config defaults
    python3 references/gmail_link_archiver.py
    # Override mailbox and prefix on the fly
    python3 references/gmail_link_archiver.py --mailbox "INBOX" --subject-prefix "[Newsletter]"
    # Save to a different workspace
    python3 references/gmail_link_archiver.py --workspace ~/my-archive
    # Limit number of links to crawl
    python3 references/gmail_link_archiver.py --max-links 10
    # Re-run the setup interview
    python3 references/gmail_link_archiver.py --reconfigure
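The setup step above stores credentials in a JSON file readable only by its owner. A minimal sketch of how such a file could be written, assuming POSIX file modes; the function names and the config keys in the usage note are hypothetical and not taken from the script itself:

```python
import json
import os
from pathlib import Path


def save_config(config: dict, path: Path) -> None:
    """Write config as JSON, readable and writable by the owner only."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the file with mode 0600 from the start, so the secret is never
    # briefly readable by other users.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=2)
    # os.open applies the mode only to newly created files, so tighten the
    # permissions of a pre-existing file as well.
    os.chmod(path, 0o600)


def load_config(path: Path) -> dict:
    """Read the saved config back."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

For example, `save_config({"imap_server": "imap.gmail.com", "imap_port": 993}, Path.home() / ".config" / "gmail-link-archiver" / "config.json")` would create the file with the permissions described above.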
How It Works
- Connect: Authenticates to Gmail via IMAP SSL
- Filter: Searches the specified mailbox for emails matching the subject prefix
- Extract: Parses email bodies (HTML + plain text) to find HTTP/HTTPS links
- Crawl: Opens each link in headless Chromium via Playwright (bypasses bot detection, renders JavaScript)
- Convert: Transforms the crawled HTML into clean Markdown with metadata headers
- Save: Writes each Markdown file to the workspace directory
Pipeline Diagram
    Gmail IMAP ──► Filter by Subject ──► Extract Links
                                              │
                                              ▼
                              Playwright + Chromium (headless)
                                              │
                                              ▼
                                HTML → Markdown (html2text)
                                              │
                                              ▼
                                Save to OpenClaw Workspace
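The Extract, Crawl, and Convert steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the script's actual code: `extract_links` and `crawl_to_markdown` are hypothetical names, the URL regex is deliberately simple, and waiting for `networkidle` is one possible choice for JavaScript-heavy pages.

```python
import re
from html.parser import HTMLParser

# Simple pattern for bare URLs in plain-text bodies (an assumption; real
# emails may need a more careful pattern).
URL_RE = re.compile(r"https?://[^\s<>\"')\]]+")


class _HrefCollector(HTMLParser):
    """Collects http(s) href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value and value.startswith(("http://", "https://")):
                    self.hrefs.append(value)


def extract_links(html_body: str = "", text_body: str = "") -> list[str]:
    """Return unique HTTP/HTTPS links from both body types, in first-seen order."""
    collector = _HrefCollector()
    collector.feed(html_body)
    candidates = collector.hrefs + URL_RE.findall(text_body)
    seen, links = set(), []
    for url in candidates:
        url = url.rstrip(".,;:!?")  # drop trailing sentence punctuation
        if url not in seen:
            seen.add(url)
            links.append(url)
    return links


def crawl_to_markdown(url: str, timeout_ms: int = 30_000) -> str:
    """Render a page in headless Chromium and convert its HTML to Markdown."""
    # Imported lazily so link extraction works without these packages installed.
    from playwright.sync_api import sync_playwright
    import html2text

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(url, wait_until="networkidle", timeout=timeout_ms)
            html = page.content()
        finally:
            browser.close()
    converter = html2text.HTML2Text()
    converter.body_width = 0  # do not hard-wrap lines
    return converter.handle(html)
```

Deduplicating before crawling keeps a newsletter that repeats the same link in its HTML and plain-text parts from being fetched twice.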
CLI Reference
    usage: gmail_link_archiver.py [-h] [--mailbox MAILBOX] [--subject-prefix PREFIX]
                                  [--workspace PATH] [--max-links N] [--reconfigure]

    Options:
      --mailbox, -m         IMAP mailbox to search (default: from config)
      --subject-prefix, -s  Subject prefix to filter emails
      --workspace, -w       Directory to save Markdown files
      --max-links           Max number of links to crawl (default: 50)
      --reconfigure         Re-run the setup interview
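An interface like the one above could be declared with `argparse` along these lines. This is a sketch only: the flag names and the `--max-links` default come from the reference above, while `build_parser`, the help strings, and the choice of `None` for unset options (meaning "fall back to the saved config") are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options."""
    parser = argparse.ArgumentParser(prog="gmail_link_archiver.py")
    # None means "not given on the command line; use the saved config value".
    parser.add_argument("--mailbox", "-m", default=None,
                        help="IMAP mailbox to search (default: from config)")
    parser.add_argument("--subject-prefix", "-s", default=None, metavar="PREFIX",
                        help="Subject prefix to filter emails")
    parser.add_argument("--workspace", "-w", default=None, metavar="PATH",
                        help="Directory to save Markdown files")
    parser.add_argument("--max-links", type=int, default=50, metavar="N",
                        help="Max number of links to crawl (default: 50)")
    parser.add_argument("--reconfigure", action="store_true",
                        help="Re-run the setup interview")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```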
Output Format
Each crawled page is saved as a Markdown file with YAML frontmatter:
    ---
    source: https://example.com/article
    crawled_at: 2026-03-27T12:00:00Z
    ---

    # Article Title

    Article content converted to clean Markdown...
Files are named using a sanitized version of the URL plus a short hash for uniqueness.
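One way to produce such file names and the frontmatter header is sketched below. The exact sanitization rules, hash function, and hash length used by the script are not stated here, so the SHA-256 digest, the 8-character suffix, and the helper names `filename_for` and `render_document` are all assumptions.

```python
import hashlib
import re
from datetime import datetime, timezone
from urllib.parse import urlparse


def filename_for(url: str, max_len: int = 80) -> str:
    """Build '<sanitized-host-and-path>-<short hash>.md' for a URL."""
    parsed = urlparse(url)
    raw = f"{parsed.netloc}{parsed.path}"
    # Collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^A-Za-z0-9]+", "-", raw).strip("-").lower()
    slug = slug[:max_len].rstrip("-") or "page"
    # Hash the full URL so pages differing only in query string stay distinct.
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:8]
    return f"{slug}-{digest}.md"


def render_document(url: str, markdown: str, crawled_at: datetime) -> str:
    """Prefix the Markdown body with YAML frontmatter."""
    stamp = crawled_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"---\nsource: {url}\ncrawled_at: {stamp}\n---\n\n{markdown}"
```

Hashing the full URL rather than the slug means two links that sanitize to the same name still get different files.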
Example Usage with Claude
Ask Claude to run the archiver:
"Run the Gmail Link Archiver to crawl links from my emails with subject starting with '[ReadLater]'"
Claude will execute:
    python3 references/gmail_link_archiver.py --subject-prefix "[ReadLater]"
Or to set up fresh:
"Set up the Gmail Link Archiver with my credentials"
    python3 references/gmail_link_archiver.py --reconfigure
Troubleshooting
"App password" rejected?
- Ensure 2-Step Verification is enabled on your Google account
- Generate a new App Password at https://myaccount.google.com/apppasswords
- Use the 16-character password without spaces
Playwright/Chromium issues?
    # Reinstall Chromium
    python3 -m playwright install chromium
    # Install system dependencies (Linux)
    sudo python3 -m playwright install-deps chromium
No emails found?
- Check the mailbox name (use INBOX, [Gmail]/All Mail, etc.)
- Verify the subject prefix matches exactly (case-sensitive)
- Try a broader prefix
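To see why a subject may fail to match, it can help to test the prefix by hand. The IMAP `SEARCH SUBJECT` key itself matches substrings case-insensitively (RFC 3501), so an exact, case-sensitive prefix check has to happen on the client after the headers are fetched. The sketch below shows one such check; `subject_has_prefix` is a hypothetical helper and how the script actually filters is not shown in this document.

```python
from email.header import decode_header, make_header


def subject_has_prefix(raw_subject: str, prefix: str) -> bool:
    """Case-sensitive prefix test on a possibly MIME-encoded Subject header."""
    # Subjects with non-ASCII characters arrive as RFC 2047 encoded words,
    # so decode them before comparing.
    decoded = str(make_header(decode_header(raw_subject)))
    return decoded.strip().startswith(prefix)


if __name__ == "__main__":
    print(subject_has_prefix("[ReadLater] Some article", "[ReadLater]"))
    print(subject_has_prefix("Re: [ReadLater] Some article", "[ReadLater]"))
```

Note that replies and forwards ("Re: ...", "Fwd: ...") no longer start with the prefix, which is a common reason for an empty result.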
Permission denied on config file?
    chmod 600 ~/.config/gmail-link-archiver/config.json
Security
- Credentials are stored locally at ~/.config/gmail-link-archiver/config.json
- File permissions are set to 0600 (owner read/write only)
- Credentials are never transmitted anywhere except to the IMAP server
- Credentials are never logged or printed to stdout
- Use Gmail App Passwords (not your mai