Lark/Feishu Sheets & Cloud File Download (with PDF extraction)
v3
Read, write, and manage Lark/Feishu Sheets (spreadsheets) and download Lark/Feishu cloud files via the Lark OpenAPI. Reads Feishu app credentials (appId/appSecret) from ~/.openclaw/openclaw.json to authenticate with the Lark OpenAPI. Use when a user provides a Lark/Feishu sheet link (URL path like /sheets/<token>) and you need to fetch cell values, write/update cells, add/clone sheet tabs, convert to CSV/JSON, or feed the data into summaries/reports/analysis. Also use when a user provides a Lark/Feishu file link (URL path like /file/<token>) and needs to download the file (PDF, etc.) locally. Triggers: 'feishu sheet', 'lark sheet', 'spreadsheet', 'write to sheet', 'update sheet', 'export sheet', 'feishu file', 'lark file', 'download file', 'feishu download', 'lark download', 'cloud file'.
Read, write, and manage Lark/Feishu Sheets, and download Lark/Feishu cloud files, by calling the official OpenAPI from local scripts.
Prerequisites

- python3 on PATH
- Feishu/Lark app credentials configured in ~/.openclaw/openclaw.json under channels.feishu:

    {
      "channels": {
        "feishu": {
          "appId": "cli_xxx",
          "appSecret": "xxx",
          "domain": "feishu"
        }
      }
    }

- The Feishu/Lark app must have Sheets read & write permissions and Drive file download permissions enabled in the developer console.
- The target spreadsheet/file must be shared with the app/bot identity.

Quick Start

Get spreadsheet token from the URL
Example URL: https://.../sheets/YOUR_SPREADSHEET_TOKEN?sheet=SHEET_ID

- spreadsheet_token = YOUR_SPREADSHEET_TOKEN
- sheet query param (often a sheetId) = SHEET_ID

Read / Export

    # Export a single range to CSV
    python3 {baseDir}/scripts/sheets_export.py \
      --token YOUR_SPREADSHEET_TOKEN \
      --range 'SHEET_ID!A1:Z200' \
      --csv /tmp/sheet.csv

    # Or export to JSON (recommended for multi-range)
    python3 {baseDir}/scripts/sheets_export.py \
      --url 'https://xxx.larksuite.com/sheets/YOUR_SPREADSHEET_TOKEN?sheet=SHEET_ID' \
      --range 'SHEET_ID!A1:Z200' \
      --json /tmp/sheet.json

Then load /tmp/sheet.csv or /tmp/sheet.json and continue with analysis/summarization.
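
As a minimal sketch of the "load and continue" step, the exported CSV can be read with Python's standard csv module. The helper names load_csv_rows and rows_to_records are hypothetical, and the sketch assumes the file is UTF-8 and that the first row of the range is a header; neither is stated by the export script. The layout of the JSON export is not documented here, so only the CSV case is shown.

```python
import csv


def load_csv_rows(path):
    """Read an exported CSV into a list of rows (each row a list of strings)."""
    # Assumption: the export is UTF-8 encoded.
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle))


def rows_to_records(rows):
    """Treat the first row as a header and map the remaining rows to dicts."""
    # Assumption: the exported range starts with a header row.
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]
```

Calling rows_to_records(load_csv_rows("/tmp/sheet.csv")) gives one dict per data row, which is usually a convenient shape for summaries or further analysis.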
Write / Update

    # List all sheet tabs
    python3 {baseDir}/scripts/sheets_write.py \
      --token YOUR_SPREADSHEET_TOKEN list-sheets

    # Write values to a single range
    python3 {baseDir}/scripts/sheets_write.py \
      --token YOUR_SPREADSHEET_TOKEN \
      write --range 'SheetId!A1:C2' --values '[["a","b","c"],["d","e","f"]]'

    # Write values from a JSON file
    python3 {baseDir}/scripts/sheets_write.py \
      --token YOUR_SPREADSHEET_TOKEN \
      write --range 'SheetId!A1:C2' --values-file /tmp/data.json

    # Batch write multiple ranges at once
    python3 {baseDir}/scripts/sheets_write.py \
      --token YOUR_SPREADSHEET_TOKEN \
      batch-write --batch '[{"range":"Sheet1!A1:B1","values":[["x","y"]]},{"range":"Sheet1!A2:B2","values":[["1","2"]]}]'

    # Add a new sheet tab
    python3 {baseDir}/scripts/sheets_write.py \
      --token YOUR_SPREADSHEET_TOKEN \
      add-sheet --title 'NewSheet'

    # Clone an existing sheet's values into a new tab
    python3 {baseDir}/scripts/sheets_write.py \
      --token YOUR_SPREADSHEET_TOKEN \
      clone-sheet --source-sheet-id abc123 --title 'ClonedSheet' --clone-range 'A1:Z200'

Using a URL instead of --token

Both scripts accept --url to auto-extract the spreadsheet token:

    python3 {baseDir}/scripts/sheets_write.py \
      --url 'https://xxx.larksuite.com/sheets/YOUR_SPREADSHEET_TOKEN?sheet=SHEET_ID' \
      write --range 'SHEET_ID!A1:B1' --values '[["hello","world"]]'

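
To illustrate what auto-extraction from a URL involves, here is a minimal sketch using urllib.parse. It assumes the token is the path segment immediately after /sheets/ and that the optional sheet query parameter carries the sheet ID, as in the example URL above. The function name parse_sheet_url is hypothetical, and the scripts' own parsing may differ.

```python
from urllib.parse import parse_qs, urlparse


def parse_sheet_url(url):
    """Return (spreadsheet_token, sheet_id_or_None) from a /sheets/ URL."""
    parsed = urlparse(url)
    segments = [part for part in parsed.path.split("/") if part]
    if "sheets" not in segments:
        raise ValueError(f"no /sheets/ segment in URL: {url!r}")
    position = segments.index("sheets")
    if position + 1 >= len(segments):
        raise ValueError(f"no token after /sheets/ in URL: {url!r}")
    token = segments[position + 1]
    # parse_qs returns lists; take the first 'sheet' value if present.
    sheet_id = parse_qs(parsed.query).get("sheet", [None])[0]
    return token, sheet_id
```

The sheet ID returned here is what the range examples use as the prefix before the exclamation mark (for instance 'SHEET_ID!A1:B1').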
File Download

Download cloud files (PDF, documents, etc.) from Lark/Feishu Drive.

Get file token from the URL

Example URL: https://.../file/YOUR_FILE_TOKEN

- file_token = YOUR_FILE_TOKEN

Download a file

    # Download by URL (PDF files auto-extract text to .txt)
    python3 {baseDir}/scripts/file_download.py \
      --url "https://.../file/YOUR_FILE_TOKEN" \
      --out /tmp/report.pdf

    # Download by file token directly
    python3 {baseDir}/scripts/file_download.py \
      --file-token YOUR_FILE_TOKEN \
      --out /tmp/report.pdf

    # Force text extraction for non-.pdf files
    python3 {baseDir}/scripts/file_download.py \
      --file-token YOUR_FILE_TOKEN \
      --out /tmp/document.bin --extract-text

Reading downloaded PDF content

When --out ends with .pdf, the script automatically:

- Extracts text to a .txt file (e.g. /tmp/report.pdf → /tmp/report.txt)
- Extracts embedded images to a _images/ directory (e.g. /tmp/report_images/img-000.png, ...)
- If text is garbled/unreadable, renders each page as a PNG image to a _pages/ directory for visual reading

Text extraction priority: pdfplumber → pypdf → pdftotext (poppler). All Python packages are auto-installed via pip on first use. The script includes garbled-text detection: if the extracted text is unreadable (e.g. scanned PDF, special fonts), pages are rendered to images automatically.

Image extraction priority: pypdf → pdfimages (poppler).

Page rendering (garbled fallback): pymupdf → pdf2image.
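
The text extraction priority above is a fallback chain: try one extractor, and move to the next if it fails or returns nothing. The following is a rough sketch of that pattern, not the code in file_download.py. The function names are hypothetical, the libraries are imported lazily so that a missing package only disables its own step, and the pip auto-install behaviour is left out.

```python
import subprocess


def _with_pdfplumber(path):
    import pdfplumber  # lazy import: a missing package only fails this step
    with pdfplumber.open(path) as pdf:
        return "\n".join(page.extract_text() or "" for page in pdf.pages)


def _with_pypdf(path):
    from pypdf import PdfReader
    return "\n".join(page.extract_text() or "" for page in PdfReader(path).pages)


def _with_pdftotext(path):
    # Requires the poppler 'pdftotext' binary; '-' sends the text to stdout.
    result = subprocess.run(
        ["pdftotext", path, "-"], capture_output=True, text=True, check=True
    )
    return result.stdout


DEFAULT_EXTRACTORS = (_with_pdfplumber, _with_pypdf, _with_pdftotext)


def extract_text(path, extractors=DEFAULT_EXTRACTORS):
    """Return text from the first extractor that succeeds with non-empty output."""
    errors = []
    for extractor in extractors:
        try:
            text = extractor(path)
        except Exception as exc:  # missing package, missing binary, bad PDF, ...
            errors.append(f"{extractor.__name__}: {exc}")
            continue
        if text.strip():
            return text
        errors.append(f"{extractor.__name__}: empty output")
    raise RuntimeError("no extractor produced text: " + "; ".join(errors))
```

Passing the extractors as a parameter keeps the order explicit and makes the chain easy to exercise with stand-in functions.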
The typical workflow is:

1. Run the download script
2. If text is readable → read /tmp/report.txt with the Read tool
3. If text is garbled → read page images in /tmp/report_pages/ with the Read tool (AI vision)
4. Read embedded images in /tmp/report_images/ for charts, diagrams, etc.
5. Summarize / analyze the content

For non-PDF files, use --extract-text to force extraction.
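
The workflow above depends on knowing where the outputs land and whether the text is usable. As a sketch under stated assumptions, derived_paths follows the naming shown in the examples (report.pdf gives report.txt, report_images/ and report_pages/), and looks_garbled is a purely illustrative heuristic based on the share of readable characters. Both names are hypothetical; the script's actual detection logic is not described in this document.

```python
from pathlib import Path

_PUNCTUATION = set(".,;:!?'\"()[]-%/&+=")


def derived_paths(pdf_path):
    """Map a downloaded PDF path to the output locations named in the docs."""
    pdf = Path(pdf_path)
    stem = pdf.with_suffix("")  # /tmp/report.pdf -> /tmp/report
    return {
        "text": pdf.with_suffix(".txt"),
        "images": Path(f"{stem}_images"),
        "pages": Path(f"{stem}_pages"),
    }


def looks_garbled(text, min_ratio=0.7):
    """Illustrative heuristic: too few letters, digits, or common punctuation."""
    visible = [ch for ch in text if not ch.isspace()]
    if not visible:
        return True  # nothing extracted, e.g. a scanned PDF
    readable = sum(1 for ch in visible if ch.isalnum() or ch in _PUNCTUATION)
    return readable / len(visible) < min_ratio
```

With these helpers, a caller could read the .txt file when looks_garbled returns False and fall back to the images under the pages directory otherwise. The 0.7 threshold is an arbitrary starting point.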
Write Subcommands Reference

| Subcommand  | Description                                       | Key flags                                  |
|-------------|---------------------------------------------------|--------------------------------------------|
| list-sheets | List all sheet tabs (id, title, index)            | (none)                                     |
| write       | Write values to a single range                    | --range, --values or --values-file         |
| batch-write | Write values to multiple ranges in one call       | --batch or --batch-file                    |
| add-sheet   | Create a new empty sheet tab                      | --title                                    |
| clone-sheet | Clone values from an existing sheet to a new tab  | --source-sheet-id, --title, --clone-range  |

All subcommands support --dry-run to preview without executing.
Notes / Gotchas