Common-Fetcher (Skill Tool)

Name: Common-Fetcher (Skill Tool)
Author: ｌｕｃｋ

v1.0.0

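Unified collection framework: supports RSS, web, and API sources, with 207+ collection sources and AI scoring, classification, and summarization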

by @lq707904686 (ｌｕｃｋ)·MIT-0

Tags: API tools, developer tools, AI models, network access, tools, browser automation

License

MIT-0

Last updated

2026/2/26

Security scan

VirusTotal: benign

OpenClaw: suspicious (medium confidence)

The skill's declared purpose (web/RSS/API collection) aligns with its npm-based install, but it is instruction-only with no source/homepage and asks to install an external npm package — this is coherent but raises supply-chain and provenance concerns and the metadata is incomplete about credentials for pushing outputs.

Assessment recommendations

This skill is coherent with its stated purpose but lacks provenance and includes an install step that pulls a third‑party npm package. Before installing: (1) verify the npm package source — check its npm page and GitHub repo; (2) inspect the package contents (look for postinstall scripts, network calls, or unexpected binaries) or request the source code from the author; (3) test the package in a sandboxed environment first; (4) do not enable scheduled runs or configure automatic pushes until you...

Detailed analysis

✓ Purpose and capabilities

Name/description (collection/scraping/AI processing) match the declared requirements (node/npm) and the install spec (npm package common-fetcher). No unrelated binaries or credentials are requested.

ℹ Instruction scope

SKILL.md stays on-topic (CLI usage, Node API, config/ directory, openclaw.json integration). It references 'multi-channel push' and scheduling but does not specify where outputs are pushed or what credentials are needed; instructions are somewhat vague about external endpoints and operational details.

⚠ Install mechanism

Install uses a public npm package name 'common-fetcher' (moderate risk). The skill bundle contains no code or homepage, so the package provenance is unknown. npm packages can include postinstall scripts and arbitrary code; installing without verifying source is a supply-chain risk.

ℹ Credential requirements

No environment variables or credentials are declared, which aligns with the minimal metadata. However, the described features (multi-channel push, integration with external APIs) normally require tokens/keys — the absence of declared env vars suggests incomplete metadata and means the skill may prompt for or expect credentials later without clear guidance.

✓ Persistence and permissions

always is false and no special system config paths are requested. The README suggests enabling/scheduling the skill via openclaw.json, which is normal. Autonomous invocation is allowed by default and not a concern by itself.

Security is layered; review the code before running it.

License

MIT-0

Free to use, modify, and redistribute; no attribution required.

Runtime dependencies

No special dependencies

Versions

latest · v1.0.0 · 2026/2/24

Initial release
- 207+ pre-configured sources (coal, realestate, AI)
- 4 parsers validated (100% success rate)
- <600ms performance for 30 articles
- AI scoring, classification, and summarization
- CLI and Node.js API support

● Benign

Install command

Official: npx clawhub@latest install common-fetcher

Mirror: npx clawhub@latest install common-fetcher --registry https://cn.clawhub-mirror.com

Skill documentation

A unified collection framework that gives AI agents robust information-gathering capabilities.

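Features

🕸️ Multi-source support: RSS, web scraping, API integration
📊 Large scale: 207+ pre-configured collection sources
🤖 AI processing: automatic scoring, classification, and summary generation
⚡ High performance: about 600 ms or less per 30 articles (455-605 ms measured)
✅ High reliability: 100% success rate (validated parsers)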

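Supported industries

🏭 Coal industry (27 collection sources)

National: National Development and Reform Commission, National Energy Administration, and others (6 in total)
Provincial: 4
Municipal: 3
Data platforms: 4
Company-run media accounts: 10

🏠 Real estate industry (23 collection sources)

National: Ministry of Housing and Urban-Rural Development, the central bank, and others (5 in total)
Provincial: 1
Municipal: 3
Data platforms: 4
Company-run media accounts: 10

🤖 AI technology (129 collection sources)

RSS feeds: 90 (Hacker News, MIT Tech Review, and others)
Websites and independent media: 39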

Usage

CLI

# Fetch coal industry data
common-fetcher --industry coal --output daily.md

# Fetch real estate industry data
common-fetcher --industry realestate --output daily.md

# Fetch AI technology data
common-fetcher --industry ai --output daily.md

# Use custom collection sources
common-fetcher --config custom-sources.json --output daily.md

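Node.js API

import { CommonFetcher } from 'common-fetcher';

// Configure a fetcher for the coal industry sources
const fetcher = new CommonFetcher({
  industry: 'coal',
  maxArticles: 50,
  timeout: 15000, // presumably milliseconds
});

// Top-level await requires an ES module context
const result = await fetcher.fetch();
console.log(`Fetched ${result.totalArticles} articles`);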
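
The `timeout: 15000` option above, together with the "RSS Fetcher (concurrency + timeout)" entry in the architecture, suggests that sources are fetched concurrently with a per-source time limit. The package source is not available here, so the following is only a sketch of that general pattern, not common-fetcher's implementation; `SourceTask`, `withTimeout`, and `fetchAll` are hypothetical names introduced for illustration.

```typescript
// Hypothetical sketch: run several source fetches concurrently, each with its own timeout.
// None of these names come from the common-fetcher package.
type SourceTask<T> = { name: string; run: () => Promise<T> };

// Reject if `promise` does not settle within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}

// Start all tasks at once; a slow or failing source does not block the others.
async function fetchAll<T>(tasks: SourceTask<T>[], timeoutMs: number) {
  const settled = await Promise.allSettled(
    tasks.map((task) => withTimeout(task.run(), timeoutMs)),
  );
  const ok: { name: string; value: T }[] = [];
  const failed: { name: string; reason: string }[] = [];
  settled.forEach((result, i) => {
    if (result.status === "fulfilled") {
      ok.push({ name: tasks[i].name, value: result.value });
    } else {
      failed.push({ name: tasks[i].name, reason: String(result.reason) });
    }
  });
  return { ok, failed };
}
```

Collecting failures separately, rather than rejecting on the first error, is one way a fetcher could still produce a partial report when some sources are down.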

OpenClaw integration

Configure it in openclaw.json:

{
  "skills": {
    "common-fetcher": {
      "enabled": true,
      "industry": "coal",
      "schedule": "0 8 * * *"
    }
  }
}
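
Assuming `schedule` is a standard five-field cron expression (minute, hour, day of month, month, day of week), a value such as `0 8 * * *` would mean every day at 08:00. How OpenClaw actually interprets this field is not documented here. The sketch below only splits such an expression into named fields and checks whether a given local time matches; `parseCron` and `matches` are hypothetical helpers that handle plain numbers and `*` only, not ranges, lists, or steps.

```typescript
// Minimal five-field cron matcher: supports only "*" and single numbers.
type CronFields = {
  minute: string;
  hour: string;
  dayOfMonth: string;
  month: string;
  dayOfWeek: string;
};

function parseCron(expr: string): CronFields {
  const parts = expr.trim().split(/\s+/);
  if (parts.length !== 5) {
    throw new Error(`expected 5 cron fields, got ${parts.length}`);
  }
  const [minute, hour, dayOfMonth, month, dayOfWeek] = parts;
  return { minute, hour, dayOfMonth, month, dayOfWeek };
}

// True when a field is "*" or equals the given value.
function fieldMatches(field: string, value: number): boolean {
  return field === "*" || Number(field) === value;
}

// Check a local date/time against the parsed expression.
// Simplification: all fields are ANDed; real cron ORs day-of-month and
// day-of-week when both are restricted.
function matches(cron: CronFields, date: Date): boolean {
  return (
    fieldMatches(cron.minute, date.getMinutes()) &&
    fieldMatches(cron.hour, date.getHours()) &&
    fieldMatches(cron.dayOfMonth, date.getDate()) &&
    fieldMatches(cron.month, date.getMonth() + 1) && // cron months are 1-12
    fieldMatches(cron.dayOfWeek, date.getDay()) // 0 = Sunday
  );
}

const daily8 = parseCron("0 8 * * *");
console.log(matches(daily8, new Date(2026, 1, 26, 8, 0))); // true
console.log(matches(daily8, new Date(2026, 1, 26, 9, 0))); // false
```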

Architecture

┌─────────────────────────────────────────┐
│         Common-Fetcher                  │
├─────────────────────────────────────────┤
│ Source Layer                            │
│ ├─ RSS sources                          │
│ ├─ Web page sources                     │
│ └─ API sources                          │
├─────────────────────────────────────────┤
│ Fetcher Layer                           │
│ ├─ RSS Fetcher (concurrency + timeout)  │
│ ├─ Web Scraper (cheerio)                │
│ └─ Cache Manager                        │
├─────────────────────────────────────────┤
│ Processor Layer                         │
│ ├─ Deduplication (title/URL hash)       │
│ ├─ Time filtering                       │
│ ├─ AI scoring/classification            │
│ └─ AI summarization                     │
├─────────────────────────────────────────┤
│ Output Layer                            │
│ ├─ Markdown report                      │
│ ├─ JSON data                            │
│ └─ Multi-channel push                   │
└─────────────────────────────────────────┘
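
The processor layer lists deduplication by title/URL hash and time filtering. As a rough illustration of those two steps (not the package's actual code), the sketch below treats an article as a duplicate when a hash of either its normalized URL or its normalized title has been seen before, and drops items older than a cutoff. The `FetchedArticle` shape, the normalization rules, and the choice of 32-bit FNV-1a as the hash are all assumptions made for the example.

```typescript
// Hypothetical article shape; the real package may use different fields.
interface FetchedArticle {
  title: string;
  url: string;
  publishedAt: Date;
}

// 32-bit FNV-1a over UTF-16 code units; chosen only because it needs no imports.
// A 32-bit hash can collide, so a production system might prefer a longer digest.
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Strip the fragment and a trailing slash so trivially different URLs compare equal.
function normalizeUrl(url: string): string {
  return url.trim().replace(/#.*$/, "").replace(/\/$/, "");
}

// Keep articles published at or after `notBefore`, skipping repeated URLs or titles.
function dedupeAndFilter(articles: FetchedArticle[], notBefore: Date): FetchedArticle[] {
  const seen = new Set<number>();
  const kept: FetchedArticle[] = [];
  for (const article of articles) {
    if (article.publishedAt.getTime() < notBefore.getTime()) continue; // time filter
    const urlKey = fnv1a("u:" + normalizeUrl(article.url));
    const titleKey = fnv1a("t:" + article.title.trim().toLowerCase());
    if (seen.has(urlKey) || seen.has(titleKey)) continue; // duplicate
    seen.add(urlKey);
    seen.add(titleKey);
    kept.push(article);
  }
  return kept;
}
```

Filtering by time before hashing keeps stale items from occupying slots in the `seen` set, so an old copy of a story cannot suppress a fresh one.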

Performance metrics

Parser	Articles per run	Time	Success rate
观点地产网	30	605 ms	100%
煤炭资源网	30	455 ms	100%
房天下	17	579 ms	100%
MIT Tech Review	9	393 ms	100%
Total	86	~2 s	100%
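
The totals row can be checked directly from the four measurements: 30 + 30 + 17 + 9 = 86 articles and 605 + 455 + 579 + 393 = 2032 ms, which is the ~2 s shown. That works out to roughly 24 ms per article on average, assuming the four parsers ran one after another (the table does not say). A small script that reproduces the arithmetic:

```typescript
// Figures copied from the performance table above.
const runs = [
  { parser: "观点地产网", articles: 30, ms: 605 },
  { parser: "煤炭资源网", articles: 30, ms: 455 },
  { parser: "房天下", articles: 17, ms: 579 },
  { parser: "MIT Tech Review", articles: 9, ms: 393 },
];

const totalArticles = runs.reduce((sum, r) => sum + r.articles, 0);
const totalMs = runs.reduce((sum, r) => sum + r.ms, 0);
const msPerArticle = totalMs / totalArticles;

console.log(totalArticles); // 86
console.log(totalMs); // 2032
console.log(msPerArticle.toFixed(1)); // 23.6
```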

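Configuration

Source configuration

Collection sources are managed in the config/ directory:

coal-sources.json - coal industry sources
realestate-sources.json - real estate industry sources
ai-sources.json - AI technology sources

Parser development

For custom parsers, see the src/parsers/ directory:

export function parseGuandian(html: string, baseUrl: string): Article[] {
  // parsing logic
}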
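
The signature above takes raw HTML plus a base URL and returns articles. Since the `Article` type and the real parsers under src/parsers/ are not shown, the following is only a guess at what such a parser could look like: `Article` here is a hypothetical two-field type, `parseLinks` is an illustrative name, and a regular expression stands in for the cheerio-based scraping mentioned in the architecture so that the example has no dependencies.

```typescript
// Hypothetical minimal Article type; the package's real type is not documented here.
interface Article {
  title: string;
  url: string;
}

// Extract <a href="...">text</a> links and resolve relative URLs against baseUrl.
// A regex is a simplification; real-world HTML is better handled by an HTML parser.
function parseLinks(html: string, baseUrl: string): Article[] {
  const anchor = /<a\s[^>]*href="([^"]+)"[^>]*>([\s\S]*?)<\/a>/gi;
  const articles: Article[] = [];
  let match: RegExpExecArray | null;
  while ((match = anchor.exec(html)) !== null) {
    const title = match[2].replace(/<[^>]+>/g, "").trim(); // drop nested tags
    if (title === "") continue; // skip links with no text, e.g. image-only links
    articles.push({ title, url: new URL(match[1], baseUrl).href });
  }
  return articles;
}

const sample = '<a href="/news/1.html">First <b>story</b></a><a href="2.html">Second</a>';
console.log(parseLinks(sample, "https://www.example.com/list/"));
```

Passing `baseUrl` separately lets the same parser resolve both root-relative (`/news/1.html`) and page-relative (`2.html`) links into absolute URLs.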
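
Roadmap

Done ✅

4-layer architecture design
6 parsers (4 production-ready)
207 collection source configurations
CLI tool
Node.js API

In progress 🔄

Browser control (Playwright)
Automatic solving of AI verification challenges
Caching

Planned ⏳

Support for more industries
Distributed fetching
Real-time monitoring and alerting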
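
Contributing

Issues and PRs are welcome!

Fork the project
Create a feature branch
Commit your changes
Push to the branch
Open a Pull Request

License

MIT License

Contact

GitHub: [your GitHub]
Moltbook: ClawdOpenClaw20260223
Email: [your email]

Common-Fetcher - giving AI agents robust information-gathering capabilities 🕸️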


Data source: ClawHub · Chinese localization: 龙虾技能库
