Crawl4AI Web Scraper
v1.0.1
Web scraping using a local Crawl4AI instance. Use for fetching full page content with JavaScript rendering. Better than Tavily for complex pages. Unlimited usage.
Crawl4AI Web Scraper
Local Crawl4AI instance for full web page extraction with JavaScript rendering.
Endpoints
Proxy (port 11234): clean output, OpenWebUI-compatible
Returns: [{page_content, metadata}]
Use for: Simple content extraction
Direct (port 11235): full output with all data
Returns: {results: [{markdown, html, links, media, ...}]}
Use for: When you need links, media, or other metadata
Usage
# Via script
node {baseDir}/scripts/crawl4AI.js "url"
node {baseDir}/scripts/crawl4AI.js "url" --json
Script options:
--json: Full JSON response
Output: Clean markdown from the page.
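The two response shapes listed under Endpoints can be handled by a single helper. The following is a minimal sketch, not the bundled script's actual logic: `extractMarkdown` is a hypothetical name, the sample payloads are hand-written, and the `raw_markdown` fallback is an assumption for instances that return `markdown` as an object rather than a string.

```javascript
// Sketch: normalize the Proxy and Direct response shapes into markdown strings.
// page_content, results, and markdown are the field names given in this document;
// raw_markdown is an assumption, not confirmed by this document.
function extractMarkdown(response) {
  // Proxy shape: [{ page_content, metadata }]
  if (Array.isArray(response)) {
    return response.map((doc) => doc.page_content ?? "");
  }
  // Direct shape: { results: [{ markdown, html, links, media, ... }] }
  if (response && Array.isArray(response.results)) {
    return response.results.map((result) => {
      if (typeof result.markdown === "string") return result.markdown;
      // Assumed fallback for instances that wrap markdown in an object.
      return result.markdown?.raw_markdown ?? "";
    });
  }
  throw new TypeError("Unrecognized Crawl4AI response shape");
}

// Hand-written sample payloads (not real crawl output).
const proxySample = [
  { page_content: "# Hello", metadata: { source: "https://example.com" } },
];
const directSample = {
  results: [{ markdown: "# Hello", html: "<h1>Hello</h1>", links: {}, media: {} }],
};

console.log(extractMarkdown(proxySample));
console.log(extractMarkdown(directSample));
```

Returning one string per page keeps the helper usable for both endpoints; callers that need links or media would read them from the Direct response instead.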
Configuration
Required environment variable:
CRAWL4AI_URL: Your Crawl4AI instance URL (e.g., http://localhost:11235)
Optional:
CRAWL4AI_KEY: API key if your instance requires authentication
Features
JavaScript rendering: Handles dynamic content
Unlimited usage: Local instance, no API limits
Full content: HTML, markdown, links, media, tables
Better than Tavily for complex pages with JS
API
Uses your local Crawl4AI instance's REST API. The auth header is sent only if CRAWL4AI_KEY is set.
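The configuration rules above (a required instance URL, and an auth header sent only when a key is present) could be sketched as follows. This is an illustration under stated assumptions, not the script's actual implementation: `buildCrawlRequest` is a hypothetical name, and the `/crawl` path, the `{ urls: [...] }` request body, and the `Bearer` scheme are assumptions that should be checked against your instance.

```javascript
// Sketch: build a request descriptor from the environment variables above.
// CRAWL4AI_URL and CRAWL4AI_KEY come from this document. The /crawl path,
// the { urls: [...] } body, and the Bearer scheme are assumptions.
function buildCrawlRequest(targetUrl, env = process.env) {
  if (!env.CRAWL4AI_URL) {
    throw new Error("CRAWL4AI_URL is required");
  }
  const headers = { "Content-Type": "application/json" };
  // Attach the auth header only when a key is configured.
  if (env.CRAWL4AI_KEY) {
    headers.Authorization = `Bearer ${env.CRAWL4AI_KEY}`;
  }
  return {
    endpoint: new URL("/crawl", env.CRAWL4AI_URL).toString(),
    options: {
      method: "POST",
      headers,
      body: JSON.stringify({ urls: [targetUrl] }),
    },
  };
}

// Usage: pass the descriptor to fetch (built into Node 18+).
// const { endpoint, options } = buildCrawlRequest("https://example.com");
// const data = await (await fetch(endpoint, options)).json();
```

Keeping request construction separate from the network call makes the conditional auth behavior easy to check without a running instance.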