Security Scan
OpenClaw
Safe
High confidence: The skill is an instruction-only Prometheus best-practices guide and its requirements and actions are consistent with that purpose.
Assessment and Recommendations
This skill is a documentation-style Prometheus best-practices guide and appears internally consistent. Because it is instruction-only and has no installs or credential requests, it poses low structural risk. However: 1) treat any concrete commands in the document (for example, the curl DELETE to remove Pushgateway metrics) as potentially destructive, and do not execute them in production without understanding the impact; 2) verify the skill's origin before trusting it in automated workflows; the h...
✓ Purpose and Capabilities
Name, description, and content are aligned: the SKILL.md is purely guidance about Prometheus (cardinality, PromQL, alerting, scrape config, Pushgateway, etc.). It requests no binaries, env vars, or installs, which is proportionate for a documentation-style skill.
ℹ Instruction Scope
SKILL.md contains only static operational guidance and examples. It does not instruct the agent to read local files, access unrelated environment variables, or exfiltrate data. One actionable example shows a curl DELETE to a Pushgateway endpoint (curl -X DELETE http://pushgateway/metrics/job/myjob) — this is a potentially destructive operation if executed blindly, so users/agents should not treat examples as safe to run without review.
✓ Install Mechanism
No install spec and no code files (instruction-only). This is low-risk: nothing will be written to disk or downloaded by the skill itself.
✓ Credential Requirements
The skill requires no environment variables, credentials, or config paths. There is no disproportionate credential access relative to the stated purpose.
✓ Persistence and Permissions
always is false and the skill does not request persistent presence or modify other skills or system settings. It does not request elevated privileges.
Security is layered; review the code before running it.
Runtime Dependencies
No special dependencies
Version
latest, v1.0.0, 2026/2/10
Initial release
● Benign
Install Command
Official: npx clawhub@latest install prom
Mirror: npx clawhub@latest install prom --registry https://cn.longxiaskill.com
Skill Documentation
Cardinality Explosions
- Every unique label combination creates a new time series; using user_id as a label kills Prometheus
- Avoid high-cardinality labels: user IDs, email addresses, request IDs, timestamps, UUIDs
- Check cardinality with the prometheus_tsdb_head_series metric; above 1M series needs attention
- Use histograms for latency, not per-request labels; buckets have fixed cardinality
- Relabeling can drop dangerous labels before ingestion: use the labeldrop action in the scrape config
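The multiplication behind a cardinality explosion can be sketched in a few lines of Python. This is only an illustrative upper-bound estimate, not something Prometheus runs; the function name series_upper_bound and the sample label values are hypothetical.

```python
def series_upper_bound(label_values: dict[str, list[str]]) -> int:
    """Upper bound on series for one metric: product of distinct values per label."""
    bound = 1
    for values in label_values.values():
        bound *= len(values)
    return bound


# Hypothetical label sets for a single request counter.
bounded = {"method": ["GET", "POST"], "status": ["200", "404", "500"]}
with_user_id = {**bounded, "user_id": [f"u{i}" for i in range(10_000)]}

print(series_upper_bound(bounded))       # 6
print(series_upper_bound(with_user_id))  # 60000
```

Adding one unbounded label multiplies the series count by the number of distinct values that label takes, which is why a single user_id label is enough to cause trouble.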
Histogram vs Summary
- Histograms: use for SLOs; aggregatable across instances; buckets are defined upfront
- Summaries: use when you need exact percentiles; cannot be aggregated across instances
- Histogram bucket boundaries must be defined before data arrives; wrong buckets mean wrong percentiles
- Default buckets (.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10) assume HTTP latency in seconds; adjust them for your use case
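To see why wrong buckets give wrong percentiles, here is a minimal Python sketch of bucket-based quantile estimation using linear interpolation inside a bucket. It is a simplification for illustration, not Prometheus's implementation; the helper names and the 3-second workload are assumptions.

```python
import math

DEFAULT_BOUNDS = [.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10]


def cumulative_buckets(observations, bounds):
    """Return (upper_bound, cumulative_count) pairs, ending with the +Inf bucket."""
    return [(b, sum(1 for o in observations if o <= b)) for b in [*bounds, math.inf]]


def estimate_quantile(q, buckets):
    """Linear interpolation inside the bucket that contains the q-th observation."""
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                return lower_bound  # past the last finite bucket: no better answer
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return math.nan


# Hypothetical workload: every observation takes exactly 3 seconds.
observations = [3.0] * 100
coarse = estimate_quantile(0.95, cumulative_buckets(observations, DEFAULT_BOUNDS))
fine = estimate_quantile(0.95, cumulative_buckets(observations, [1, 2, 2.5, 3, 3.5, 5]))
print(coarse, fine)  # about 4.875 vs about 2.975; the true p95 is 3.0
```

With the default boundaries every observation lands in the wide 2.5 to 5 bucket, so the estimate is far from the true value; boundaries placed near the real latency tighten it.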
Rate and Increase
- rate() requires a range selector of at least 4x the scrape interval; rate(metric[1m]) with a 30s scrape interval misses data
- rate() is per-second, increase() is the total over the range; don't confuse them
- Counters reset on restart; rate() handles this, a raw delta doesn't
- irate() uses only the last two samples; too spiky for alerting, so use rate() for alerts
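The difference between a raw delta, increase(), and rate() across a counter reset can be sketched as follows. This is a simplified model: it omits the extrapolation to the window edges that Prometheus performs, and the sample data is made up.

```python
def increase(samples):
    """Total counter growth over the window, compensating for resets.

    samples: list of (timestamp_seconds, counter_value), oldest first.
    """
    total = 0.0
    previous = samples[0][1]
    for _, value in samples[1:]:
        # A drop means the process restarted and the counter began again at 0.
        total += value if value < previous else value - previous
        previous = value
    return total


def rate(samples):
    """Per-second average growth between the first and last sample."""
    window = samples[-1][0] - samples[0][0]
    return increase(samples) / window


# 30s scrapes over a 90s window; the process restarts before t=60.
samples = [(0, 100), (30, 130), (60, 10), (90, 40)]
raw_delta = samples[-1][1] - samples[0][1]

print(raw_delta)          # -60
print(increase(samples))  # 70.0
print(rate(samples))      # 70 / 90, about 0.78 per second
```

The raw delta goes negative at the reset, while the reset-aware sum recovers the 70 events that actually happened.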
Alerting Mistakes
- Alert on symptoms, not causes: "high latency", not "high CPU"
- The for clause prevents flapping: for: 5m means the condition must hold for 5 minutes before firing
- A missing for clause fires immediately on the first match, which is noisy
- Alerts need a runbook_url annotation; on-call needs to know what to do, not just that something is wrong
- Test alerts with promtool check rules; syntax errors discovered at 3am are bad
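How the for clause suppresses flapping can be sketched as a small state machine. The function below is a hypothetical model of the pending-then-firing behaviour, assuming one rule evaluation every 60 seconds; it is not Prometheus code.

```python
def firing_times(evaluations, for_seconds):
    """Return the evaluation timestamps at which the alert is firing.

    evaluations: list of (timestamp_seconds, condition_is_true), oldest first.
    """
    firing = []
    active_since = None
    for timestamp, condition in evaluations:
        if not condition:
            active_since = None  # condition cleared: pending timer restarts
            continue
        if active_since is None:
            active_since = timestamp
        if timestamp - active_since >= for_seconds:
            firing.append(timestamp)
    return firing


# One brief blip at t=0, then a sustained problem from t=120 onward.
evaluations = [(0, True), (60, False)] + [(t, True) for t in range(120, 541, 60)]

print(firing_times(evaluations, 300))  # [420, 480, 540]
print(firing_times(evaluations, 0))    # [0, 120, 180, 240, 300, 360, 420, 480, 540]
```

With a 5-minute hold the blip at t=0 never pages anyone; without it, the alert fires on the very first match.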
PromQL Traps
- The and operator is an intersection by label set, not a boolean AND; results must have matching label sets
- The or operator fills in missing series; it doesn't do a boolean OR on values
- A {...} selector without a metric name is expensive; it scans all metrics
- offset goes back in time: metric offset 1h is the value from 1 hour ago
- Comparison operators filter series: http_requests > 100 drops series at or below 100; it doesn't return a boolean
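The set semantics of these operators can be modelled with plain dictionaries. In this sketch an instant vector is a dict from a label-set string to a value; the helper names and data are hypothetical, and label matching modifiers such as on() and ignoring() are left out.

```python
def vector_and(left, right):
    """Keep left-hand series whose label set also exists on the right."""
    return {labels: value for labels, value in left.items() if labels in right}


def vector_or(left, right):
    """All left-hand series, plus right-hand series whose label sets are missing on the left."""
    merged = dict(left)
    for labels, value in right.items():
        merged.setdefault(labels, value)
    return merged


def greater_than(vector, threshold):
    """Comparison as a filter: series that fail the test are dropped."""
    return {labels: value for labels, value in vector.items() if value > threshold}


requests = {'job="api"': 150.0, 'job="web"': 50.0}
errors = {'job="api"': 0.0, 'job="db"': 7.0}

print(vector_and(requests, errors))  # {'job="api"': 150.0}
print(vector_or(requests, errors))   # {'job="api"': 150.0, 'job="web"': 50.0, 'job="db"': 7.0}
print(greater_than(requests, 100))   # {'job="api"': 150.0}
```

Note that the api series survives the intersection even though its right-hand value is 0: only the presence of a matching label set matters, never the value.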
Scrape Configuration
- honor_labels: true trusts source labels; use it only when the source is authoritative (e.g., Pushgateway)
- scrape_timeout must not exceed scrape_interval; otherwise scrapes would overlap
- Changes to static configs take effect only on a config reload (SIGHUP, or the /-/reload endpoint when --web.enable-lifecycle is set); use file_sd or service discovery for dynamic targets
- Disabled TLS verification (insecure_skip_verify) should be temporary, never permanent
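These checks are easy to automate before a config ships. The sketch below lints one scrape job represented as a Python dict; the function names are hypothetical, the duration parser handles only single-unit values such as 30s or 1m, and the fallback values of 1m and 10s are assumed to mirror Prometheus's global defaults.

```python
import re

_UNITS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600}


def parse_duration(text):
    """Parse a single-unit duration such as '30s' or '1m' into seconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m|h)", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def lint_scrape_job(job):
    """Return a list of human-readable warnings for one scrape job dict."""
    problems = []
    interval = parse_duration(job.get("scrape_interval", "1m"))
    timeout = parse_duration(job.get("scrape_timeout", "10s"))
    if timeout > interval:
        problems.append("scrape_timeout exceeds scrape_interval")
    if job.get("tls_config", {}).get("insecure_skip_verify"):
        problems.append("TLS verification is disabled")
    if job.get("honor_labels"):
        problems.append("honor_labels is on: confirm the source is authoritative")
    return problems


risky_job = {
    "job_name": "api",
    "scrape_interval": "15s",
    "scrape_timeout": "30s",
    "tls_config": {"insecure_skip_verify": True},
}
clean_job = {"job_name": "node", "scrape_interval": "30s", "scrape_timeout": "10s"}

print(lint_scrape_job(risky_job))
print(lint_scrape_job(clean_job))  # []
```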
Pushgateway Pitfalls
- Pushgateway is for batch jobs, not services; services should expose /metrics
- Metrics persist until deleted; stale metrics from dead jobs confuse dashboards
- Add job and instance labels to distinguish sources; the default grouping hides failures
- Delete metrics when the job completes: curl -X DELETE http://pushgateway/metrics/job/myjob
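The same cleanup can be done from a Python batch job using only the standard library. This is a sketch under assumptions: the host http://pushgateway and job name myjob come from the curl example, the instance value host1 is invented, and label values containing a slash would need the Pushgateway's base64 path form, which is not handled here. Because the request is destructive, the example only builds and prints it.

```python
import urllib.parse
import urllib.request


def build_delete_request(base_url, job, grouping=None):
    """Build (but do not send) the DELETE request for one Pushgateway group."""
    path = "/metrics/job/" + urllib.parse.quote(job, safe="")
    for label, value in (grouping or {}).items():
        path += f"/{urllib.parse.quote(label, safe='')}/{urllib.parse.quote(value, safe='')}"
    return urllib.request.Request(base_url.rstrip("/") + path, method="DELETE")


def delete_group(base_url, job, grouping=None, timeout=5):
    """Send the DELETE; call this only as the last step of a finished job."""
    request = build_delete_request(base_url, job, grouping)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


request = build_delete_request("http://pushgateway", "myjob", {"instance": "host1"})
print(request.get_method(), request.full_url)
# DELETE http://pushgateway/metrics/job/myjob/instance/host1
```

Keeping the builder separate from the sender makes it possible to log or review the exact URL before anything is deleted.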
Recording Rules
- Pre-compute expensive queries: record: job:request_duration_seconds:rate5m
- Naming convention: level:metric:operations; it helps identify what rules produce
- Recording rules update every evaluation interval; not instant, so plan for a slight delay
- Reduce cardinality with recording rules: aggregate away labels you don't need for alerting
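A rule file can be checked against the level:metric:operations convention with a short helper. The function parse_rule_name below is hypothetical and enforces only the three-part shape, not whether the level or operations are meaningful.

```python
import re

# Each colon-separated part must look like a plain identifier.
_PART = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def parse_rule_name(name):
    """Split a recording rule name into its three parts, or return None if it does not fit."""
    parts = name.split(":")
    if len(parts) != 3 or not all(_PART.fullmatch(part) for part in parts):
        return None
    level, metric, operations = parts
    return {"level": level, "metric": metric, "operations": operations}


print(parse_rule_name("job:request_duration_seconds:rate5m"))
# {'level': 'job', 'metric': 'request_duration_seconds', 'operations': 'rate5m'}
print(parse_rule_name("request_rate"))  # None
```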
Federation and Remote Write
- Federation is for pulling from other Prometheus servers; use it sparingly, it adds latency
- Remote write is for long-term storage; Prometheus local storage is not durable
- Remote write can buffer during outages, but the buffer is finite; extended outages lose data
- Prometheus is not highly available by default; run two instances scraping the same targets
Common Operational Issues
- TSDB corruption can follow an unclean shutdown; use --storage.tsdb.wal-compression to shrink the WAL and monitor disk space
- Memory grows with series count; each series costs roughly 3KB of RAM
- Compaction pauses during high load; leave 40% disk headroom
- Scrape targets stuck in "Unknown": check the network, the firewall, and that the target actually exposes /metrics
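The two rules of thumb above translate into simple arithmetic. The sketch below treats 3KB as 3 KiB and uses hypothetical function names; real memory use varies with label sizes and churn, so read the result as a rough planning figure.

```python
def estimate_head_memory_gib(series_count, kib_per_series=3.0):
    """Rough RAM estimate for in-memory series, using the ~3KB-per-series rule of thumb."""
    return series_count * kib_per_series / (1024 * 1024)


def has_compaction_headroom(used_bytes, total_bytes, min_free_fraction=0.40):
    """True when at least 40% of the disk is still free."""
    return (total_bytes - used_bytes) / total_bytes >= min_free_fraction


print(round(estimate_head_memory_gib(1_000_000), 2))  # 2.86
print(has_compaction_headroom(700, 1000))             # False
print(has_compaction_headroom(500, 1000))             # True
```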
Label Best Practices
- Use labels for dimensions you'll filter or aggregate by: environment, service, instance
- Keep label values low-cardinality: tens or hundreds, not thousands
- Consistent naming: snake_case, prefixed with the domain: http_requests_total, node_cpu_seconds_total
- The le label is reserved for histogram buckets; don't use it for other purposes
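The naming rules lend themselves to a small lint step. The checker below is a hypothetical sketch that enforces snake_case and flags le on non-histogram metrics; it does not attempt to verify the domain prefix or measure cardinality.

```python
import re

SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*")


def lint_metric(name, label_names, is_histogram=False):
    """Return naming problems for one metric and its label names."""
    problems = []
    if not SNAKE_CASE.fullmatch(name):
        problems.append(f"metric name {name!r} is not snake_case")
    for label in label_names:
        if label == "le":
            if not is_histogram:
                problems.append("label 'le' is reserved for histogram buckets")
        elif not SNAKE_CASE.fullmatch(label):
            problems.append(f"label name {label!r} is not snake_case")
    return problems


print(lint_metric("http_requests_total", ["method", "status"]))  # []
print(lint_metric("httpRequests", ["le", "userId"]))
```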