Xiaohongshu Note Batch Download

v2.1.0

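Batch download of Xiaohongshu notes. Automates downloading Xiaohongshu notes (images + text) to local storage through the Chrome DevTools Protocol of an already logged-in Chrome.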


by @weilixiong·MIT-0

Productivity tools


License

MIT-0


Free to use, modify, and redistribute; no attribution required.


Runtime dependencies

No special dependencies

Install command


Official: npx clawhub@latest install xhs-download

Mirror (faster): npx clawhub@latest install xhs-download --registry https://cn.longxiaskill.com


Skill documentation

Xiaohongshu Note Batch Download

Batch-download Xiaohongshu notes to a local folder via the Chrome DevTools Protocol (CDP).

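Prerequisites

Chrome is logged in to Xiaohongshu (it does not matter how Chrome was launched).

Chrome has remote debugging enabled: launch it with --remote-debugging-port=9222

Python 3 environment: pip3 install websocket-client requests

If Chrome is already running without the debugging port, close it and relaunch it with the flag above.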
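Before moving on, it can help to confirm that the debugging port is actually listening. The sketch below is a hypothetical helper (`cdp_available` is not part of this skill) that queries the DevTools HTTP endpoint `/json/version` on the port given above and treats any connection or parsing failure as "not available".

```python
import http.client
import json
from urllib.request import urlopen


def cdp_available(host="127.0.0.1", port=9222, timeout=3):
    """Return True if a DevTools HTTP endpoint answers on host:port.

    Hypothetical helper for illustration; not part of the skill itself.
    """
    try:
        with urlopen(f"http://{host}:{port}/json/version", timeout=timeout) as resp:
            info = json.load(resp)
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error, or a non-JSON reply.
        return False
    # Chrome reports its version string under the "Browser" key.
    return isinstance(info, dict) and "Browser" in info
```

If this returns False, Chrome is most likely running without the `--remote-debugging-port=9222` flag.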

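Usage (4 steps)

Step 1: Find the target account's profile_id

Open the Xiaohongshu web version, go to the target account's home page, and copy the last segment of the URL:

https://www.xiaohongshu.com/user/profile/64902d2d000000001c0294eb

↑ The last segment (64902d2d000000001c0294eb) is the profile_id.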
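Copying the ID by hand works, but the same extraction can be sketched in a few lines. `extract_profile_id` is a hypothetical helper name, and the sketch assumes the profile URL always has the `/user/profile/<id>` shape shown above.

```python
from urllib.parse import urlparse


def extract_profile_id(url):
    """Return the last path segment of a /user/profile/<id> URL.

    Hypothetical helper; assumes the URL shape shown in Step 1.
    """
    parts = urlparse(url).path.rstrip("/").split("/")
    if len(parts) < 2 or parts[-2] != "profile" or not parts[-1]:
        raise ValueError(f"not a profile URL: {url!r}")
    return parts[-1]


print(extract_profile_id(
    "https://www.xiaohongshu.com/user/profile/64902d2d000000001c0294eb"
))
```

Because only the URL path is inspected, any query string appended to the address is ignored.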

Step 2: Get the Chrome tab_id

Run in a terminal:

curl -s http://127.0.0.1:9222/json | python3 -m json.tool | grep -E '"id"|"url"'

Find the entry whose url contains xiaohongshu.com and copy its id value.
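The same lookup can be done from Python instead of curl and grep. This is a minimal sketch: `list_targets` and `find_tab_id` are hypothetical names, and it assumes the `/json` endpoint returns a list of objects with `id`, `type`, and `url` fields, as in the curl output above.

```python
import json
from urllib.request import urlopen


def list_targets(host="127.0.0.1", port=9222):
    """Fetch the list of debuggable targets from Chrome's /json endpoint."""
    with urlopen(f"http://{host}:{port}/json", timeout=5) as resp:
        return json.load(resp)


def find_tab_id(targets, needle="xiaohongshu.com"):
    """Return the id of the first page target whose URL contains needle."""
    for target in targets:
        if target.get("type") == "page" and needle in target.get("url", ""):
            return target["id"]
    return None  # no matching tab is open


# Example with hand-written sample data (not real Chrome output):
sample = [
    {"id": "AAA", "type": "page", "url": "https://example.com/"},
    {"id": "BBB", "type": "page", "url": "https://www.xiaohongshu.com/explore"},
]
print(find_tab_id(sample))
```

With Chrome running, `find_tab_id(list_targets())` would give the value to use as the tab_id.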

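Step 3: Edit the script configuration

Change the three variables at the top of the script below to your own values:

Variable      What to fill in                  Example
PROFILE_ID    Account ID from Step 1           "64902d2d000000001c0294eb"
TAB_ID        Tab ID from Step 2               "4C23291E2F8B1524..."
SAVE_DIR      Folder to save the notes into    "~/Downloads/my-notes/"

Step 4: Run the script

Save the script as download.py, then run:

python3 download.py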
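The script below uses each note's title as its folder name. A title containing a path separator or another character that file systems reject could produce unexpected paths, so one option is to sanitize titles first. The sketch below is an assumption, not part of the original script: `safe_dirname` is a hypothetical helper, and the character set it replaces is one reasonable choice, not a requirement.

```python
import re


def safe_dirname(title, fallback="untitled", max_len=60):
    """Turn a note title into a single, file-system-friendly folder name.

    Hypothetical helper for illustration; not part of the original script.
    """
    # Replace path separators, characters Windows rejects, and control chars.
    cleaned = re.sub(r'[\\/:*?"<>|\x00-\x1f]', "_", title)
    # Trim surrounding spaces and dots, cut to max_len, then trim again.
    cleaned = cleaned.strip(" .")[:max_len].strip(" .")
    return cleaned or fallback


print(safe_dirname("OOTD / summer: day 1"))
```

In the script, this would be applied where the note folder path is built, for example `os.path.join(SAVE_DIR, safe_dirname(title))`.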

Full script

import json, time, requests, os, subprocess, websocket, re

# ===== Edit these =====
PROFILE_ID = "your target account ID"
TAB_ID = "your Chrome tab_id"
SAVE_DIR = "~/Downloads/your-folder/"
# ======================

SAVE_DIR = os.path.expanduser(SAVE_DIR)
os.makedirs(SAVE_DIR, exist_ok=True)


def send(ws, method, params=None):
    """Send a CDP command. Three arguments: ws connection, method name, params dict."""
    msg_id = int(time.time() * 1000) % 100000
    msg = {"id": msg_id, "method": method, "params": params or {}}
    ws.send(json.dumps(msg))
    # Skip events and unrelated replies until the response with our id arrives
    while True:
        resp = json.loads(ws.recv())
        if resp.get("id") == msg_id:
            return resp


def download_image(url, path):
    """Download an image; the Referer header is required."""
    r = requests.get(url, headers={"Referer": "https://www.xiaohongshu.com/"}, timeout=30)
    if len(r.content) > 100:
        with open(path, 'wb') as f:
            f.write(r.content)
        return True
    return False


# 1. Connect to Chrome
ws = websocket.create_connection(
    f"ws://127.0.0.1:9222/devtools/page/{TAB_ID}", timeout=30)
print(f"Connected to Chrome tab: {TAB_ID}")

# 2. Navigate to the target account's home page
url = f"https://www.xiaohongshu.com/user/profile/{PROFILE_ID}"
send(ws, "Page.navigate", {"url": url})
time.sleep(6)
print(f"Navigated to: {url}")

# 3. Scroll to load all notes (30 times, enough for the vast majority of accounts)
print("Scrolling to load notes...")
for i in range(30):
    send(ws, "Input.synthesizeScrollGesture", {
        "x": 500, "y": 600,
        "xDistance": 0, "yDistance": -800,
        "speed": 2000
    })
    time.sleep(2)
    if (i + 1) % 10 == 0:
        print(f"  Scrolled {i+1}/30 times")

# 4. Extract the note list from the DOM
result = send(ws, "Runtime.evaluate", {"expression": """
(() => {
    const cards = document.querySelectorAll(".feeds-container .note-item");
    return JSON.stringify(Array.from(cards).map(c => {
        const a = c.querySelector("a[href*='/explore/']");
        return {
            href: a ? a.getAttribute("href") : "",
            title: a ? a.innerText.trim().substring(0, 60) : ""
        };
    }));
})()
"""})
notes = json.loads(result["result"]["result"]["value"])
print(f"\nFound {len(notes)} notes\n")

# 5. Get every note's xsec token from __INITIAL_STATE__
state_result = send(ws, "Runtime.evaluate", {"expression": """
(() => {
    const state = window.__INITIAL_STATE__ || {};
    const user = state.user || {};
    const notes = user.notes || {};
    const items = (notes._value && notes._value.items) ? notes._value.items : [];
    return JSON.stringify(items.map(n => ({
        id: (n.note && n.note.id) || '',
        xsecToken: (n.note && n.note.xsecToken) || ''
    })));
})()
"""})
token_map = {}
for item in json.loads(state_result["result"]["result"]["value"]):
    if item["id"]:
        token_map[item["id"]] = item["xsecToken"]

# 6. Download note by note
downloaded = 0
skipped = 0
for note in notes:
    href = note.get("href", "")
    title = note.get("title", "").strip()
    if not href or not title:
        continue

    note_dir = os.path.join(SAVE_DIR, title)
    if os.path.exists(note_dir):
        print(f"⏭ Skipped (already exists): {title[:30]}")
        skipped += 1
        continue

    os.makedirs(note_dir, exist_ok=True)

    # Extract the note_id
    m = re.search(r'/explore/([a-f0-9]+)', href)
    note_id = m.group(1) if m else ""
    if not note_id:
        continue

    xsec_token = token_map.get(note_id, "")

    # Navigate to the detail page
    if xsec_token:
        detail_url = f"https://www.xiaohongshu.com/explore/{note_id}?xsec_token={xsec_token}"
    else:
        detail_url = f"https://www.xiaohongshu.com/explore/{note_id}"
    send(ws, "Page.navigate", {"url": detail_url})
    time.sleep(4)

    # Extract the image list
    img_result = send(ws, "Runtime.evaluate", {"expression": """
    (() => {
        const swiper = document.querySelector('.swiper-wrapper');
        const imgs = swiper ? swiper.querySelectorAll('img') : [];
        return JSON.stringify(Array.from(imgs).map((img, i) => ({ i, src: img.src })));
    })()
    """})
    images = json.loads(img_result["result"]["result"]["value"])

    # Download the images
    img_count = 0
    for img in images:
        src = img.get("src", "")
        if src:
            path = os.path.join(note_dir, f"{img['i']+1}.jpg")
            if download_image(src, path):
                img_count += 1

    # Extract the body text
    text_result = send(ws, "Runtime.evaluate", {"expression": """
