Headless Brave Browser
v0.2.0
Headless web search and content extraction via the Brave Search API. Features exponential-backoff retry, circuit-breaker fault isolation, bounded-concurrency parallel page fetching, structured leveled logging, and smart paragraph-boundary truncation. No browser required. Use for web research, documentation lookup, URL content extraction, and any workflow requiring scriptable, non-interactive web search.
brave-search
Headless web search and content extraction via the Brave Search API.
Setup
Run once before first use:
    cd <skill-root>
    npm ci
Required environment variable:
    export BRAVE_API_KEY="your-key-here"
Get a free API key at brave.com/search/api.
Usage
Search
    node scripts/search.js "query"                  # Basic (5 results)
    node scripts/search.js "query" -n 10            # Up to 20 results
    node scripts/search.js "query" --content        # Include page content
    node scripts/search.js "query" -n 3 --content   # Combined
    node scripts/search.js "query" --json           # Newline-delimited JSON
    node scripts/search.js --help                   # Full options + env vars
Extract page content
    node scripts/content.js https://example.com/article
    node scripts/content.js https://example.com/article --json
    node scripts/content.js https://example.com/article --max-length 8000
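The --max-length flag caps the extracted content, and the description above mentions truncation at paragraph boundaries. The following is a minimal sketch of how such a cut might work. The function name `truncateAtParagraph` and its fallback rules are assumptions for illustration, not the skill's actual code.

```javascript
// Hypothetical sketch: not the skill's own truncation implementation.
function truncateAtParagraph(text, maxLength) {
  if (text.length <= maxLength) return text;
  const slice = text.slice(0, maxLength);

  // Prefer the last blank-line paragraph break inside the limit.
  const paragraphBreak = slice.lastIndexOf('\n\n');
  if (paragraphBreak > 0) return slice.slice(0, paragraphBreak).trimEnd();

  // Otherwise fall back to the last whitespace so no word is cut in half.
  const lastSpace = Math.max(slice.lastIndexOf(' '), slice.lastIndexOf('\n'));
  return (lastSpace > 0 ? slice.slice(0, lastSpace) : slice).trimEnd();
}

const sample = 'First paragraph.\n\nSecond paragraph that runs long.';
console.log(truncateAtParagraph(sample, 30)); // prints "First paragraph."
```

Cutting at a paragraph break keeps the returned text shorter than the limit but avoids ending mid-sentence, which tends to matter when the output is fed to another tool.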
Output format (plain text)
    --- Result 1 ---
    Title: Page Title
    URL: https://example.com/page
    Snippet: Description from Brave Search
    Content:
    # Page Title

    Extracted markdown content...

    --- Result 2 ---
    ...
Pass --json to get one JSON object per line instead, suitable for piping.
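A downstream script could consume the --json output line by line. The sketch below parses newline-delimited JSON from a string. The field names `title`, `url`, and `snippet` are assumed from the plain-text labels above and are not confirmed by the skill's documentation.

```javascript
// Parse newline-delimited JSON: one JSON object per non-empty line.
function parseNdjson(text) {
  return text
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line));
}

// Sample input standing in for captured stdout; field names are assumptions.
const sampleOutput = [
  '{"title":"Page Title","url":"https://example.com/page","snippet":"Description"}',
  '{"title":"Other","url":"https://example.com/other","snippet":"More"}',
  '',
].join('\n');

for (const result of parseNdjson(sampleOutput)) {
  console.log(`${result.title} -> ${result.url}`);
}
```

Because logs go to stderr, stdout should contain only result objects, so a parser like this does not need to skip log lines.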
Exit codes
    Code  Meaning
    0     Success
    1     Invalid input or configuration error
    2     Page had no extractable content (content.js)
    130   Interrupted (SIGINT)
Configuration (environment variables)
All behaviour is configurable without touching code:
    Variable              Default  Description
    BRAVE_API_KEY         (none)   Required. Brave Search subscription token
    LOG_LEVEL             info     debug · info · warn · error · silent
    LOG_JSON              false    Emit logs as newline-delimited JSON to stderr
    FETCH_TIMEOUT_MS      15000    Per-page fetch timeout
    SEARCH_TIMEOUT_MS     10000    Brave API call timeout
    MAX_CONTENT_LENGTH    5000     Max chars of extracted content
    MAX_RETRY_ATTEMPTS    3        Retry attempts on transient errors
    RETRY_BASE_DELAY_MS   500      Base delay for exponential backoff
    RETRY_MAX_DELAY_MS    30000    Backoff delay cap
    CONCURRENCY_LIMIT     3        Parallel page fetches when --content is set
    CB_FAILURE_THRESHOLD  5        Consecutive failures before circuit opens
    CB_RESET_TIMEOUT_MS   60000    Circuit breaker reset window
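To illustrate how the three retry settings might interact, here is a minimal sketch of exponential backoff with full jitter using the documented defaults (500 ms base, 30000 ms cap, 3 attempts). The function names are hypothetical. The sketch treats MAX_RETRY_ATTEMPTS as the total number of attempts and retries on every error, whereas the skill itself retries only on transient ones. Both simplifications are assumptions.

```javascript
// Full jitter: a random delay in [0, min(cap, base * 2^attempt)).
// `random` is injectable so the function can be exercised deterministically.
function backoffDelayMs(attempt, baseMs = 500, maxMs = 30000, random = Math.random) {
  const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Hypothetical retry wrapper; `task` is any function returning a promise.
async function withRetry(task, { maxAttempts = 3, baseMs = 500, maxMs = 30000 } = {}) {
  const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    try {
      return await task(attempt);
    } catch (error) {
      lastError = error;
      // Wait only if another attempt will follow.
      if (attempt < maxAttempts - 1) await sleep(backoffDelayMs(attempt, baseMs, maxMs));
    }
  }
  throw lastError;
}

// Upper bounds of the delay before each retry with the default settings.
for (let attempt = 0; attempt < 3; attempt += 1) {
  console.log(`attempt ${attempt}: delay below ${Math.min(30000, 500 * 2 ** attempt)} ms`);
}
```

Full jitter spreads retries from concurrent page fetches across the whole interval, which reduces the chance that they hit the same host at the same moment.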
All variables are validated at startup: misconfigured runs fail immediately with a descriptive list of every bad value rather than crashing mid-execution.
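The "report every bad value at once" behaviour can be sketched as a validator that accumulates errors instead of throwing on the first one. The schema shape, the `loadConfig` name, and the error wording below are assumptions for illustration. Only the variable names and defaults come from the table above.

```javascript
// Hypothetical schema covering a subset of the documented variables.
const SCHEMA = {
  FETCH_TIMEOUT_MS: { default: 15000, min: 1 },
  MAX_CONTENT_LENGTH: { default: 5000, min: 1 },
  CONCURRENCY_LIMIT: { default: 3, min: 1 },
};

function loadConfig(env) {
  const errors = [];
  const config = {};

  if (env.BRAVE_API_KEY) config.BRAVE_API_KEY = env.BRAVE_API_KEY;
  else errors.push('BRAVE_API_KEY is required');

  for (const [name, rule] of Object.entries(SCHEMA)) {
    const raw = env[name];
    if (raw === undefined || raw === '') {
      config[name] = rule.default; // unset variables fall back to the default
      continue;
    }
    const value = Number(raw);
    if (!Number.isInteger(value) || value < rule.min) {
      errors.push(`${name} must be an integer >= ${rule.min} (got "${raw}")`);
    } else {
      config[name] = value;
    }
  }

  // Fail once, listing every problem found.
  if (errors.length > 0) {
    throw new Error(`Invalid configuration:\n- ${errors.join('\n- ')}`);
  }
  return config;
}

try {
  loadConfig({ FETCH_TIMEOUT_MS: 'abc', CONCURRENCY_LIMIT: '0' });
} catch (error) {
  console.error(error.message); // lists all three problems in one message
}
```

In a real entry point the argument would be `process.env`. Passing the environment in explicitly keeps the validator easy to test.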
Architecture
See references/ARCHITECTURE.md for a full module breakdown.
scripts/
├── search.js           ← Search CLI entry point
├── content.js          ← Content extraction CLI entry point
├── content-fetcher.js  ← HTTP fetch + Readability + DOM fallback
├── config.js           ← Schema-validated env config
├── circuit-breaker.js  ← Fault isolation (CLOSED → OPEN → HALF_OPEN)
├── retry.js            ← Exponential backoff with full jitter
├── concurrency.js      ← Bounded parallel execution pool
├── utils.js            ← htmlToMarkdown, smartTruncate, parseURL
├── logger.js           ← Structured leveled logger → stderr
└── errors.js           ← Typed error hierarchy
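The CLOSED → OPEN → HALF_OPEN cycle named in the tree can be sketched as a small state machine. The class below is an illustrative assumption and not the contents of circuit-breaker.js. Its defaults mirror the documented threshold of 5 consecutive failures and the 60000 ms reset window, and the `now` parameter is a hypothetical hook for injecting a clock.

```javascript
// Hypothetical circuit breaker sketch; method names are illustrative.
class CircuitBreaker {
  constructor({ failureThreshold = 5, resetTimeoutMs = 60000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;
    this.state = 'CLOSED';
    this.failures = 0;
    this.openedAt = 0;
  }

  // Returns true when a call may proceed; moves OPEN to HALF_OPEN after the window.
  allowRequest() {
    if (this.state === 'OPEN' && this.now() - this.openedAt >= this.resetTimeoutMs) {
      this.state = 'HALF_OPEN';
    }
    return this.state !== 'OPEN';
  }

  recordSuccess() {
    this.failures = 0;
    this.state = 'CLOSED';
  }

  // A failed trial call in HALF_OPEN reopens the circuit immediately.
  recordFailure() {
    this.failures += 1;
    if (this.state === 'HALF_OPEN' || this.failures >= this.failureThreshold) {
      this.state = 'OPEN';
      this.openedAt = this.now();
    }
  }
}

let clock = 0; // fake clock so the demo runs instantly
const breaker = new CircuitBreaker({ failureThreshold: 2, resetTimeoutMs: 1000, now: () => clock });
breaker.recordFailure();
breaker.recordFailure();
console.log(breaker.state, breaker.allowRequest()); // OPEN false
clock = 1000;
console.log(breaker.allowRequest(), breaker.state); // true HALF_OPEN
```

While the circuit is open, callers can skip the network request entirely, which keeps a failing upstream from consuming the retry budget of every page fetch.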