Major update: v3.0 introduces lazy skill loading and refactors the optimization logic for maximum token savings. Adds native support and documentation for lazy skill loading (based on SKILLS.md), now the single biggest optimization. Removes the bundled cronjob guide and the multi-provider reference/config files; external strategies are now referenced but not included. Simplifies the docs to focus on core, local optimizations and integration patterns. Updates AGENTS.md and the heartbeat optimization template for compatibility with session pruning and newer OpenClaw features.
Comprehensive toolkit for reducing token usage and API costs in OpenClaw deployments. Combines smart model routing, optimized heartbeat intervals, usage tracking, and multi-provider strategies.
Quick Start
Immediate actions (no config changes needed):
- Generate optimized AGENTS.md (BIGGEST WIN!):
python3 scripts/context_optimizer.py generate-agents
# Creates AGENTS.md.optimized — review and replace your current AGENTS.md
- Check what context you ACTUALLY need:
python3 scripts/context_optimizer.py recommend "hi, how are you?"
# Shows: Only 2 files needed (not 50+!)
- Install optimized heartbeat:
cp assets/HEARTBEAT.template.md ~/.openclaw/workspace/HEARTBEAT.md
- Enforce cheaper models for casual chat:
python3 scripts/model_router.py "thanks!"
# Single-provider Anthropic setup: Use Sonnet, not Opus
# Multi-provider setup (OpenRouter/Together): Use Haiku for max savings
- Check the current token budget:
python3 scripts/token_tracker.py check
Expected savings: 50-80% reduction in token costs for typical workloads (context optimization is the biggest factor!).
Core Capabilities
0. Lazy Skill Loading (NEW in v3.0 — BIGGEST WIN!)
The single highest-impact optimization available. Most agents burn 3,000–15,000 tokens per session loading skill files they never use. Stop that first.
The pattern:
- Create a lightweight `SKILLS.md` catalog in the workspace (~300 tokens — a list of skills plus when to load them)
- Only load individual SKILL.md files when a task actually needs them
- Apply the same logic to memory files — load MEMORY.md at startup, daily logs only on demand
Token savings:
| Library size | Before (eager) | After (lazy) | Savings |
|---|---|---|---|
| 5 skills | ~3,000 tokens | ~600 tokens | 80% |
| 10 skills | ~6,500 tokens | ~750 tokens | 88% |
| 20 skills | ~13,000 tokens | ~900 tokens | 93% |
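The load-on-demand half of this pattern is small enough to sketch. The snippet below is illustrative, not the skill's actual code — the file layout (SKILLS.md at the workspace root, skills/&lt;name&gt;/SKILL.md) is an assumption:

```python
from pathlib import Path

def load_catalog(workspace: Path) -> str:
    """Session start: read only the lightweight SKILLS.md index (~300 tokens)."""
    return (workspace / "SKILLS.md").read_text()

def load_skill(workspace: Path, name: str) -> str:
    """On demand: read a single skill's SKILL.md only when the task needs it."""
    return (workspace / "skills" / name / "SKILL.md").read_text()
```

The agent reads the catalog once, then calls `load_skill` only for the entries the current task requires.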
```markdown
## Skills
At session start: Read SKILLS.md (the index only — ~300 tokens). Load individual skill files ONLY when a task requires them. Never load all skills upfront.
```
Full implementation (with catalog template + optimizer script):
clawhub install openclaw-skill-lazy-loader
The companion skill openclaw-skill-lazy-loader includes a SKILLS.md.template, an AGENTS.md.template lazy-loading section, and a context_optimizer.py CLI that recommends exactly which skills to load for any given task.
Lazy loading handles context-loading costs. The remaining capabilities below handle runtime costs. Together they cover the full token lifecycle.
1. Context Optimization (NEW!)
Biggest token saver — only load the files you actually need, not everything upfront.
Problem: By default, OpenClaw loads all context files every session:
- SOUL.md, AGENTS.md, USER.md, TOOLS.md, MEMORY.md
- docs/**/*.md (hundreds of files)
- memory/2026-*.md (daily logs)
- Total: Often 50K+ tokens before the user even speaks!
Solution: Lazy loading based on prompt complexity.
Usage:
python3 scripts/context_optimizer.py recommend "<prompt>"
Examples:

```bash
# Simple greeting → minimal context (2 files only!)
context_optimizer.py recommend "hi"
# → Load: SOUL.md, IDENTITY.md
# → Skip: Everything else
# → Savings: ~80% of context

# Standard work → selective loading
context_optimizer.py recommend "write a function"
# → Load: SOUL.md, IDENTITY.md, memory/TODAY.md
# → Skip: docs, old memory, knowledge base
# → Savings: ~50% of context

# Complex task → full context
context_optimizer.py recommend "analyze our entire architecture"
# → Load: SOUL.md, IDENTITY.md, MEMORY.md, memory/TODAY+YESTERDAY.md
# → Conditionally load: Relevant docs only
# → Savings: ~30% of context
```
Output format:

```json
{
  "complexity": "simple",
  "context_level": "minimal",
  "recommended_files": ["SOUL.md", "IDENTITY.md"],
  "file_count": 2,
  "savings_percent": 80,
  "skip_patterns": ["docs/**/*.md", "memory/20*.md"]
}
```
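For intuition, a toy classifier producing this shape might look like the following. The word lists and thresholds here are invented for illustration — the real logic lives in scripts/context_optimizer.py and is not reproduced here:

```python
# Invented word lists for illustration only
SIMPLE_WORDS = {"hi", "hey", "hello", "yo", "thanks", "thx", "ok", "sure",
                "yes", "no", "yep", "nope"}
COMPLEX_HINTS = ("architecture", "analyze", "design", "refactor")

def recommend_context_bundle(prompt: str) -> dict:
    """Toy heuristic: map prompt complexity to a context-loading level."""
    text = prompt.lower()
    words = [w.strip("!?.,") for w in text.split()]
    if any(hint in text for hint in COMPLEX_HINTS):
        return {"complexity": "complex", "context_level": "full",
                "savings_percent": 30}
    if words and all(w in SIMPLE_WORDS for w in words):
        return {"complexity": "simple", "context_level": "minimal",
                "recommended_files": ["SOUL.md", "IDENTITY.md"],
                "savings_percent": 80}
    return {"complexity": "standard", "context_level": "selective",
            "savings_percent": 50}
```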
Integration pattern: Before loading context for a new session:

```python
from context_optimizer import recommend_context_bundle

user_prompt = "thanks for your help"
recommendation = recommend_context_bundle(user_prompt)

if recommendation["context_level"] == "minimal":
    # Load only SOUL.md + IDENTITY.md
    # Skip everything else
    # Save ~80% tokens!
    ...
```
Generate optimized AGENTS.md:
context_optimizer.py generate-agents
# Creates AGENTS.md.optimized with lazy loading instructions
# Review and replace your current AGENTS.md
Expected savings: 50-80% reduction in context tokens.
2. Smart Model Routing (ENHANCED!)
Automatically classify tasks and route to appropriate model tiers.
NEW: Communication pattern enforcement — never waste Opus tokens on "hi" or "thanks"!
Usage:
python3 scripts/model_router.py "<prompt>" [current_model] [force_tier]
Examples:

```bash
# Communication (NEW!) → ALWAYS Haiku
python3 scripts/model_router.py "thanks!"
python3 scripts/model_router.py "hi"
python3 scripts/model_router.py "ok got it"
# → Enforced: Haiku (NEVER Sonnet/Opus for casual chat)

# Simple task → suggests Haiku
python3 scripts/model_router.py "read the log file"

# Medium task → suggests Sonnet
python3 scripts/model_router.py "write a function to parse JSON"

# Complex task → suggests Opus
python3 scripts/model_router.py "design a microservices architecture"
```
Patterns enforced to Haiku (NEVER Sonnet/Opus):
Communication:
- Greetings: hi, hey, hello, yo
- Thanks: thanks, thank you, thx
- Acknowledgments: ok, sure, got it, understood
- Short responses: yes, no, yep, nope
- Single words or very short phrases
Background tasks:
- Heartbeat checks: "check email", "monitor servers"
- Cronjobs: "scheduled task", "periodic check", "reminder"
- Document parsing: "parse CSV", "extract data from log", "read JSON"
- Log scanning: "scan error logs", "process logs"
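A minimal sketch of how such enforcement could be implemented. The regexes here are hypothetical stand-ins for the actual COMMUNICATION_PATTERNS in scripts/model_router.py:

```python
import re

# Hypothetical stand-ins for the real COMMUNICATION_PATTERNS list
COMMUNICATION_PATTERNS = [
    r"^(hi|hey|hello|yo)\b",                                  # greetings
    r"\b(thanks|thank you|thx)\b",                            # thanks
    r"^(ok|okay|sure|got it|understood|yes|no|yep|nope)\b",   # acknowledgments
]

def enforce_haiku(prompt: str) -> bool:
    """True when the prompt is casual chat that must never reach Sonnet/Opus."""
    text = prompt.strip().lower()
    is_short = len(text.split()) <= 3
    return is_short and any(re.search(p, text) for p in COMMUNICATION_PATTERNS)
```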
Integration pattern:

```python
from model_router import route_task

user_prompt = "show me the config"
routing = route_task(user_prompt)

if routing["should_switch"]:
    # Use routing["recommended_model"]
    # Save routing["cost_savings_percent"]
    ...
```
Customization:
Edit ROUTING_RULES or COMMUNICATION_PATTERNS in scripts/model_router.py to adjust patterns and keywords.
3. Heartbeat Optimization
Reduce API calls from heartbeat polling with smart interval tracking:
Setup:

```bash
# Copy template to workspace
cp assets/HEARTBEAT.template.md ~/.openclaw/workspace/HEARTBEAT.md

# Plan which checks should run
python3 scripts/heartbeat_optimizer.py plan
```

Commands:

```bash
# Check if a specific type should run now
heartbeat_optimizer.py check email
heartbeat_optimizer.py check calendar

# Record that a check was performed
heartbeat_optimizer.py record email

# Update check interval (seconds)
heartbeat_optimizer.py interval email 7200  # 2 hours

# Reset state
heartbeat_optimizer.py reset
```
How it works:
- Tracks the last check time for each type (email, calendar, weather, etc.)
- Enforces minimum intervals before re-checking
- Respects quiet hours (23:00-08:00) — skips all checks
- Returns `HEARTBEAT_OK` when nothing needs attention (saves tokens)
Default intervals:
- Email: 60 minutes
- Calendar: 2 hours
- Weather: 4 hours
- Social: 2 hours
- Monitoring: 30 minutes
Integration in HEARTBEAT.md:
## Email Check
Run only if: heartbeat_optimizer.py check email → should_check: true
After checking: heartbeat_optimizer.py record email
Expected savings: 50% reduction in heartbeat API calls.
Model enforcement: The heartbeat should ALWAYS use Haiku — see the updated HEARTBEAT.template.md for model override instructions.
4. Cronjob Optimization (NEW!)
Problem: Cronjobs often default to expensive models (Sonnet/Opus) even for routine tasks.
Solution: Always specify Haiku for the 90% of scheduled tasks that don't need more.
See: assets/cronjob-model-guide.md for a comprehensive guide with examples.
Quick reference:
| Task Type | Model | Example |
|---|---|---|
| Monitoring/alerts | Haiku | Check server health, disk space |
| Data parsing | Haiku | Extract CSV/JSON/logs |
| Reminders | Haiku | Daily standup, backup reminders |
| Simple reports | Haiku | Status summaries |
| Content generation | Sonnet | Blog summaries (quality matters) |
| Deep analysis | Sonnet | Weekly insights |
| Complex reasoning | — | Never use Opus for cronjobs |
Example (good):

```bash
# Parse daily logs with Haiku
cron add --schedule "0 2 * * *" \
  --payload '{
    "kind":"agentTurn",
    "message":"Parse yesterday error logs and summarize",
    "model":"anthropic/claude-haiku-4"
  }' \
  --sessionTarget isolated
```

Example (bad):

```bash
# ❌ Using Opus for a simple check (60x more expensive!)
cron add --schedule "*/15 * * * *" \
  --payload '{
    "kind":"agentTurn",
    "message":"Check email",
    "model":"anthropic/claude-opus-4"
  }' \
  --sessionTarget isolated
```
Savings: Using Haiku instead of Opus for 10 daily cronjobs = $17.70/month saved per agent.
Integration with model_router:
# Test if your cronjob should use Haiku
model_router.py "parse daily error logs"
# → Output: Haiku (background task pattern detected)
5. Token Budget Tracking
Monitor usage and alert when approaching limits:
Setup:

```bash
# Check current daily usage
python3 scripts/token_tracker.py check

# Get model suggestions
python3 scripts/token_tracker.py suggest general

# Reset daily tracking
python3 scripts/token_tracker.py reset
```
Output format:

```json
{
  "date": "2026-02-06",
  "cost": 2.50,
  "tokens": 50000,
  "limit": 5.00,
  "percent_used": 50,
  "status": "ok",
  "alert": null
}
```
Status levels:
- `ok`: below 80% of the daily limit
- `warning`: 80-99% of the daily limit
- `exceeded`: at or over the daily limit
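The thresholding behind these levels is straightforward. A sketch, assuming the default $5.00 daily limit and 80% warning threshold (the same parameters named in the customization note):

```python
def budget_status(cost_usd, daily_limit_usd=5.00, warn_threshold=0.80):
    """Map today's spend onto the ok/warning/exceeded status levels."""
    if cost_usd >= daily_limit_usd:
        status = "exceeded"
    elif cost_usd >= warn_threshold * daily_limit_usd:
        status = "warning"
    else:
        status = "ok"
    return {"cost": cost_usd, "limit": daily_limit_usd,
            "percent_used": round(100 * cost_usd / daily_limit_usd),
            "status": status}
```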
Integration pattern: Before starting expensive operations, check the budget:

```python
import json
import subprocess

result = subprocess.run(
    ["python3", "scripts/token_tracker.py", "check"],
    capture_output=True, text=True
)
budget = json.loads(result.stdout)

if budget["status"] == "exceeded":
    # Switch to cheaper model or defer non-urgent work
    use_model = "anthropic/claude-haiku-4"
elif budget["status"] == "warning":
    # Use balanced model
    use_model = "anthropic/claude-sonnet-4-5"
```
Customization:
Edit daily_limit_usd and warn_threshold parameters in function calls.
6. Multi-Provider Strategy
See references/PROVIDERS.md for a comprehensive guide on:
- Alternative providers (OpenRouter, Together.ai, Google AI Studio)
- Cost comparison tables
- Routing strategies by task complexity
- Fallback chains for rate-limited scenarios
- API key management
Quick reference:
| Provider | Model | Cost/MTok | Use Case |
|---|---|---|---|
| Anthropic | Haiku 4 | $0.25 | Simple tasks |
| Anthropic | Sonnet 4.5 | $3.00 | Balanced default |
| Anthropic | Opus 4 | $15.00 | Complex reasoning |
| OpenRouter | Gemini 2.5 Flash | $0.075 | Bulk operations |
| Google AI | Gemini 2.0 Flash Exp | FREE | Dev/testing |
| Together | Llama 3.3 70B | $0.18 | Open alternative |
Configuration Patches
See assets/config-patches.json for advanced optimizations:
Implemented by this skill:
- ✅ Heartbeat optimization (fully functional)
- ✅ Token budget tracking (fully functional)
- ✅ Model routing logic (fully functional)
Native in OpenClaw 2026.2.15 — apply directly:
- ✅ Session pruning (`contextPruning: cache-ttl`) — auto-trims old tool results after the Anthropic cache TTL expires
- ✅ Bootstrap size limits (`bootstrapMaxChars` / `bootstrapTotalMaxChars`) — caps workspace file injection size
- ✅ Long cache retention (`cacheRetention: "long"` for Opus) — amortizes cache write costs
Requires OpenClaw core support:
- ⏳ Prompt caching (Anthropic API feature — verify current status)
- ⏳ Lazy context loading (use the `context_optimizer.py` script today)
- ⏳ Multi-provider fallback (partially supported)
Apply config patches:

```bash
# Example: Enable multi-provider fallback
gateway config.patch --patch '{"providers": [...]}'
```
Native OpenClaw Diagnostics (2026.2.15+)
OpenClaw 2026.2.15 added built-in commands that complement this skill's Python scripts. Use these first for quick diagnostics before reaching for the scripts.
Context breakdown:
- `/context list` → token count per injected file (shows exactly what's eating your prompt)
- `/context detail` → full breakdown including tools, skills, and system prompt sections
Use before applying bootstrap size limits — see which files are oversized, then set `bootstrapMaxChars` accordingly.
Per-response usage tracking:
- `/usage tokens` → append token count to every reply
- `/usage full` → append tokens + cost estimate to every reply
- `/usage cost` → show cumulative cost summary from session logs
- `/usage off` → disable usage footer
Combine with token_tracker.py — `/usage cost` gives session totals; `token_tracker.py` tracks daily budget.
Session status:
- `/status` → model, context %, last response tokens, estimated cost
Cache TTL Heartbeat Alignment (NEW in v1.4.0)
Problem: Anthropic charges ~3.75x more for cache writes than cache reads. If the agent goes idle and the 1h cache TTL expires, the next request re-writes the entire prompt cache — expensive.
Fix: Set the heartbeat interval to 55min (just under the 1h TTL). The heartbeat keeps the cache warm, so every subsequent request pays cache-read rates instead.

```bash
# Get the optimal interval for your cache TTL
python3 scripts/heartbeat_optimizer.py cache-ttl
# → recommended_interval: 55min (3300s)
# → explanation: keeps 1h Anthropic cache warm

# Custom TTL (e.g., if you've configured a 2h cache)
python3 scripts/heartbeat_optimizer.py cache-ttl 7200
# → recommended_interval: 115min
```
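The arithmetic behind the recommendation is just TTL minus a safety margin. A sketch (the 5-minute margin is inferred from the 55min/1h and 115min/2h examples above, not taken from the script):

```python
def recommended_heartbeat_seconds(cache_ttl_seconds=3600, safety_margin=300):
    """Fire the heartbeat just before the cache TTL expires, keeping it warm."""
    return cache_ttl_seconds - safety_margin
```

With the default 1h TTL this yields 3300s (55min); with a 2h TTL, 6900s (115min).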
Apply to the OpenClaw config:

```json
{
  "agents": {
    "defaults": {
      "heartbeat": {
        "every": "55m"
      }
    }
  }
}
```
Who benefits: Anthropic API key users only. OAuth profiles already default to a 1h heartbeat (an OpenClaw smart default). API key profiles default to 30min — bumping to 55min is both cheaper (fewer calls) and cache-warm.
Deployment Patterns
For Personal Use
- Install the optimized HEARTBEAT.md
- Run budget checks before expensive operations
- Manually route complex tasks to Opus only when needed
Expected savings: 20-30%
For Managed Hosting (xCloud, etc.)
- Default all agents to Haiku
- Route user interactions to Sonnet
- Reserve Opus for explicitly complex requests
- Use Gemini Flash for background operations
- Implement daily budget caps per customer
Expected savings: 40-60%
For High-Volume Deployments
- Use multi-provider fallback (OpenRouter + Together.ai)
- Implement aggressive routing (80% Gemini, 15% Haiku, 5% Sonnet)
- Deploy local Ollama for offline/cheap operations
- Batch heartbeat checks (every 2-4 hours, not every 30 min)
Expected savings: 70-90%
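As a sketch, the 80/15/5 aggressive mix can be implemented as a simple weighted sample. The tier names are placeholders — map them to your actual provider/model IDs:

```python
import random

# Placeholder tier names; map to real provider/model IDs in your config
TIERS = [("gemini-flash", 0.80), ("haiku", 0.15), ("sonnet", 0.05)]

def pick_tier(rng=None):
    """Sample a model tier according to the 80/15/5 aggressive routing mix."""
    r = (rng or random).random()
    cumulative = 0.0
    for tier, weight in TIERS:
        cumulative += weight
        if r < cumulative:
            return tier
    return TIERS[-1][0]  # guard against floating-point rounding
```

In practice you would layer this under the model_router checks, so that casual chat and background tasks still hit the cheapest tier deterministically.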
Integration Examples
Workflow: Smart Task Handling

```bash
# 1. User sends message
user_msg="debug this error in the logs"

# 2. Route to the appropriate model
routing=$(python3 scripts/model_router.py "$user_msg")
model=$(echo "$routing" | jq -r .recommended_model)

# 3. Check budget before proceeding
budget=$(python3 scripts/token_tracker.py check)
status=$(echo "$budget" | jq -r .status)

if [ "$status" = "exceeded" ]; then
  # Use the cheapest model regardless of routing
  model="anthropic/claude-haiku-4"
fi

# 4. Process with the selected model
# (OpenClaw handles this via config or override)
```
Workflow: Optimized Heartbeat

```bash
## HEARTBEAT.md

# Plan what to check
result=$(python3 scripts/heartbeat_optimizer.py plan)
should_run=$(echo "$result" | jq -r .should_run)

if [ "$should_run" = "false" ]; then
  echo "HEARTBEAT_OK"
  exit 0
fi

# Run only the planned checks
planned=$(echo "$result" | jq -r '.planned[].type')

for check in $planned; do
  case $check in
    email) check_email ;;
    calendar) check_calendar ;;
  esac
  python3 scripts/heartbeat_optimizer.py record "$check"
done
```
Troubleshooting
Issue: Scripts fail with "module not found"
- Fix: Ensure Python 3.7+ is installed. The scripts use only the stdlib.
Issue: State files not persisting
- Fix: Check that the `~/.openclaw/workspace/memory/` directory exists and is writable.
Issue: Budget tracking shows $0.00
- Fix: `token_tracker.py` needs integration with OpenClaw's `session_status` tool. It currently tracks only manually recorded usage.
Issue: Routing suggests the wrong model tier
- Fix: Customize `ROUTING_RULES` in `model_router.py` for your specific patterns.
Maintenance
Daily:
- Check budget status: `token_tracker.py check`
Weekly:
- Review routing accuracy (are the suggestions correct?)
- Adjust heartbeat intervals based on activity
Monthly:
- Compare costs before/after optimization
- Review and update `PROVIDERS.md` with new options
Cost Estimation
Example: 100K tokens/day workload
Without this skill:
- 50K context tokens + 50K conversation tokens = 100K total
- All Sonnet: 100K × $3/MTok = $0.30/day = $9/month
| Strategy | Context | Model | Daily Cost | Monthly | Savings |
|---|---|---|---|---|---|
| Baseline (no optimization) | 50K | Sonnet | $0.30 | $9.00 | 0% |
| Context opt only | 10K (-80%) | Sonnet | $0.18 | $5.40 | 40% |
| Model routing only | 50K | Mixed | $0.18 | $5.40 | 40% |
| Both (this skill) | 10K | Mixed | $0.09 | $2.70 | 70% |
| Aggressive + Gemini | 10K | Gemini | $0.03 | $0.90 | 90% |
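The table's figures can be reproduced with a few lines of arithmetic (30-day month assumed, prices taken from the provider table in this document):

```python
PRICE_PER_MTOK = {"sonnet": 3.00, "haiku": 0.25}  # $/MTok, from the provider table

def monthly_cost(context_tokens, convo_tokens, price_per_mtok, days=30):
    """Daily token volume × price per million tokens, scaled to a month."""
    daily = (context_tokens + convo_tokens) / 1_000_000 * price_per_mtok
    return round(daily * days, 2)
```

For example, the baseline row is `monthly_cost(50_000, 50_000, 3.00)` and the context-optimization-only row is `monthly_cost(10_000, 50_000, 3.00)`.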
xCloud hosting scenario (100 customers, 50K tokens/customer/day):
- Baseline (all Sonnet, full context): $450/month
- With token-optimizer: $135/month
- Savings: $315/month per 100 customers (70%)
Resources
Scripts (4 total)
- `context_optimizer.py` — Context loading optimization and lazy loading (NEW!)
- `model_router.py` — Task classification, model suggestions, and communication enforcement (ENHANCED!)
- `heartbeat_optimizer.py` — Interval management and check scheduling
- `token_tracker.py` — Budget monitoring and alerts
References
- `PROVIDERS.md` — Alternative AI providers, pricing, and routing strategies
Assets (3 total)
- `HEARTBEAT.template.md` — Drop-in optimized heartbeat template with Haiku enforcement (ENHANCED!)
- `cronjob-model-guide.md` — Complete guide for choosing models in cronjobs (NEW!)
- `config-patches.json` — Advanced configuration examples
Future Enhancements
Ideas for extending this skill:
- Auto-routing integration — Hook into the OpenClaw message pipeline
- Real-time usage tracking — Parse `session_status` automatically
- Cost forecasting — Predict monthly spend based on recent usage
- Provider health monitoring — Track API latency and failures
- A/B testing — Compare quality across different routing strategies