proxy-web-fetch
v1.0.0
Proxy Web Page Fetch Tool - Fetches and parses web page content into structured Markdown or text via the OpenClaw Manager proxy. Use when: - Need to fetch an...
Version
Initial release of the Proxy Web Page Fetch Tool.
- Fetches and parses web page content to Markdown or plain text via the OpenClaw Manager proxy.
- Supports options for caching, image retention, page summaries, and metadata extraction.
- No manual API key configuration needed; authentication is handled internally by the Manager.
- Configurable via the required `WEB_FETCH_PROXY_URL` environment variable.
- Includes a shell script for command-line usage and various fetch scenarios.
Skill Documentation
Fetch and parse web page content via the OpenClaw Manager Web Fetch Proxy. The Manager handles API key injection from encrypted storage automatically — no manual key configuration needed.
The proxy URL is configured via the WEB_FETCH_PROXY_URL environment variable (required). If not set, the skill will not be available.
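Because the skill is unavailable without `WEB_FETCH_PROXY_URL`, a calling script may want to fail fast before issuing any request. The following is a minimal sketch of such a check; the function name `require_proxy_url` is hypothetical and not part of the skill, and stripping a trailing slash is only a convenience so that appending `/` never produces a double slash.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the skill): print the proxy base URL,
# or fail with a message on stderr when WEB_FETCH_PROXY_URL is unset or empty.
require_proxy_url() {
  if [ -z "${WEB_FETCH_PROXY_URL:-}" ]; then
    echo "WEB_FETCH_PROXY_URL is not set; the fetch skill is unavailable." >&2
    return 1
  fi
  # Strip a single trailing slash so "${base}/" never yields "//".
  printf '%s\n' "${WEB_FETCH_PROXY_URL%/}"
}

# Typical use inside a script:
#   base="$(require_proxy_url)" || exit 1
#   curl --request POST --url "${base}/" ...
```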
Quick Start
Basic cURL Usage
curl --request POST \
--url "${WEB_FETCH_PROXY_URL}/" \
--header 'Content-Type: application/json' \
--data '{
"url": "https://www.example.com"
}'
Script Usage
A wrapper shell script is provided for convenience.
# Basic Fetch (returns Markdown by default)
./scripts/proxy_fetch.sh --url "https://www.example.com"
# Fetch as plain text, no cache
./scripts/proxy_fetch.sh \
--url "https://docs.python.org/3/" \
--format text \
--no-cache
# Fetch with image and link summaries
./scripts/proxy_fetch.sh \
--url "https://news.example.com/article" \
--images-summary \
--links-summary
# Fetch without images, disable GFM
./scripts/proxy_fetch.sh \
--url "https://blog.example.com/post" \
--no-images \
--no-gfm
Authentication
No authentication required — the proxy reads API keys internally from the Manager's encrypted secrets store.
API Parameter Reference
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `url` | string | ✅ | - | URL of the web page to fetch |
| `timeout` | integer | - | 20 | Request timeout in seconds |
| `no_cache` | boolean | - | false | Disable caching (true/false) |
| `return_format` | string | - | markdown | Return format: markdown or text |
| `retain_images` | boolean | - | true | Retain images in output (true/false) |
| `no_gfm` | boolean | - | false | Disable GitHub Flavored Markdown (true/false) |
| `keep_img_data_url` | boolean | - | false | Keep image data URLs (true/false) |
| `with_images_summary` | boolean | - | false | Include images summary (true/false) |
| `with_links_summary` | boolean | - | false | Include links summary (true/false) |
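To show how several of the parameters above might be combined in one request body, here is a minimal sketch that builds the JSON payload in shell. The helper names `json_escape` and `build_payload` are hypothetical, only four of the listed parameters are covered, and the sketch assumes the proxy accepts them as top-level JSON fields with the types shown in the table. No validation is performed on the timeout or boolean arguments.

```shell
#!/usr/bin/env bash
# Escape backslashes and double quotes so a URL can sit inside a JSON string.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# build_payload URL [TIMEOUT] [NO_CACHE] [RETURN_FORMAT]
# Hypothetical helper; defaults mirror the parameter table
# (20 seconds, caching enabled, markdown output).
build_payload() {
  local url timeout no_cache format
  url="$(json_escape "$1")"
  timeout="${2:-20}"
  no_cache="${3:-false}"
  format="${4:-markdown}"
  printf '{"url":"%s","timeout":%s,"no_cache":%s,"return_format":"%s"}\n' \
    "$url" "$timeout" "$no_cache" "$format"
}

# Typical use (requires a running proxy):
#   curl --request POST \
#     --url "${WEB_FETCH_PROXY_URL}/" \
#     --header 'Content-Type: application/json' \
#     --data "$(build_payload "https://docs.python.org/3/" 60 true text)"
```

Building the body in a function keeps quoting in one place, which is easier to get right than hand-editing a multi-line `--data` string for each call.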
Response Structure
The proxy returns JSON with the parsed page content.
{
"id": "task-id",
"created": 1704067200,
"request_id": "request-id",
"model": "model-name",
"reader_result": {
"title": "Page Title",
"description": "Brief page description",
"url": "https://www.example.com",
"content": "Parsed page content (Markdown or text)",
"external": {
"stylesheet": {}
},
"metadata": {
"keywords": "page, keywords",
"viewport": "width=device-width",
"description": "Meta description",
"format-detection": "telephone=no"
}
}
}
Key Response Fields
| Field | Description |
|---|---|
| `reader_result.content` | Main parsed content (body text, images, links) |
| `reader_result.title` | Page title |
| `reader_result.description` | Brief page description |
| `reader_result.url` | Original page URL |
| `reader_result.metadata` | Page metadata (keywords, viewport, etc.) |
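The fields above can be pulled out of a response on the command line. The sketch below assumes `jq` is installed, which the skill itself does not require (it only lists `curl`), and uses a hypothetical helper name `reader_field`. It reads the JSON response from standard input and prints one field of `reader_result`, or nothing if the field is absent.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a single reader_result field from a proxy
# response supplied on stdin. Requires jq (an assumption, not a skill dependency).
reader_field() {
  jq -r --arg field "$1" '.reader_result[$field] // empty'
}

# Typical use (requires a running proxy):
#   curl --silent --request POST \
#     --url "${WEB_FETCH_PROXY_URL}/" \
#     --header 'Content-Type: application/json' \
#     --data '{"url": "https://www.example.com"}' \
#     | reader_field content > page.md
```

Asking for `metadata` prints the nested object as JSON rather than as plain text, since that field is an object in the response structure shown earlier.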
Common Use Cases
| Scenario | Command |
|---|---|
| Read a documentation page | --url |
| Extract text only (no images) | --url |
| Force fresh fetch (bypass cache) | --url |
| Get content with all summaries | --url |
| Long page with extended timeout | --url |
Environment Requirements
- OpenClaw Manager must be running with the Web Fetch Proxy enabled.
- `WEB_FETCH_PROXY_URL` environment variable must be set to the proxy URL (required, no default).
- `curl` command must be available in your system path.