📦 Capacity Planner
v1.0.0
Forecast infrastructure capacity needs using historical metrics, growth projections, and cost modeling. Identify bottlenecks before they cause outages and ri...
Capacity Planner
Forecast when your infrastructure will hit its limits. Analyze historical metrics (CPU, memory, disk, network, database connections), project growth curves, identify approaching bottlenecks, and recommend right-sizing, so that you scale proactively instead of reactively.
Use when: "when will we run out of space", "capacity forecast", "right-size our instances", "are we over-provisioned", "plan for traffic growth", "infrastructure scaling plan", "when do we need to upgrade", or before budget planning.
Commands
- forecast — Project Resource Exhaustion
Step 1: Collect Current and Historical Metrics
    # Disk usage right now
    df -h / /data /var 2>/dev/null

    # Historical disk growth (if monitoring is available)
    curl -s "$PROMETHEUS_URL/api/v1/query_range" \
      --data-urlencode 'query=node_filesystem_avail_bytes{mountpoint="/"}' \
      --data-urlencode "start=$(date -d '30 days ago' +%s)" \
      --data-urlencode "end=$(date +%s)" \
      --data-urlencode 'step=1d'

    # Memory usage
    free -h

    # Database connections (fraction of max_connections in use)
    curl -s "$PROMETHEUS_URL/api/v1/query" \
      --data-urlencode 'query=pg_stat_activity_count / pg_settings_max_connections'

If Prometheus is unavailable, use CloudWatch, Datadog, or system tools:

    # Last 30 days of CloudWatch CPU, one datapoint per day
    aws cloudwatch get-metric-statistics --namespace AWS/EC2 \
      --metric-name CPUUtilization --statistics Average Maximum \
      --dimensions Name=InstanceId,Value=i-0abc123 \
      --start-time "$(date -d '30 days ago' -u +%Y-%m-%dT%H:%M:%SZ)" \
      --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" --period 86400
Step 2: Fit Growth Model
For each resource, determine growth pattern:
- Linear: constant rate of increase (disk filling at 2 GB/day)
- Exponential: accelerating growth (user base doubling quarterly)
- Seasonal: cyclical patterns (weekend dips, end-of-month spikes)
- Flat: no growth (stable, well-bounded workload)
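For the linear case, the daily growth rate can be estimated as the least-squares slope of usage over time. The sketch below is one possible way to do that with awk; the function name `growth_per_day` and the input format (one "day usage" pair per line) are assumptions made for illustration, not part of this skill.

```shell
# growth_per_day: read "day usage" pairs on stdin and print the
# least-squares slope, i.e. the average change in usage per day.
# Hypothetical helper; the input format is an assumption.
growth_per_day() {
  awk '
    NF >= 2 { n++; sx += $1; sy += $2; sxx += $1 * $1; sxy += $1 * $2 }
    END {
      denom = n * sxx - sx * sx
      if (n < 2 || denom == 0) {
        print "need at least two samples on distinct days" > "/dev/stderr"
        exit 1
      }
      printf "%.3f\n", (n * sxy - sx * sy) / denom
    }'
}

# Example: disk usage in GB sampled on days 0..3
printf '%s\n' "0 40.0" "1 40.2" "2 40.4" "3 40.6" | growth_per_day
```

A slope close to zero suggests the flat pattern. Seasonal workloads usually need a window that covers at least one full cycle before the slope means much.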
Calculate days until exhaustion:
days_remaining = (capacity - current_usage) / daily_growth_rate
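As a rough illustration, the linear formula can be wrapped in a small shell function. `days_remaining` is a hypothetical helper name, and all three arguments are assumed to share one unit (for example GB and GB/day).

```shell
# days_remaining CAPACITY CURRENT_USAGE DAILY_GROWTH
# Prints whole days until usage reaches capacity under linear growth.
# Hypothetical helper; arguments must use the same unit.
days_remaining() {
  awk -v cap="$1" -v cur="$2" -v rate="$3" 'BEGIN {
    if (rate <= 0) { print "no exhaustion (flat or shrinking)"; exit }
    if (cur >= cap) { print 0; exit }
    printf "%.0f\n", (cap - cur) / rate
  }'
}

# 50 GB volume, 45 GB used, growing 0.18 GB/day
days_remaining 50 45 0.18
```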
For exponential growth, project from the doubling time instead: days_remaining = doubling_time_days * log2(capacity / current_usage).
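For exponential growth, usage reaches capacity after log2(capacity / current_usage) doublings. A minimal sketch of that projection follows, assuming the doubling time is already known in days; the function name `days_until_full_exp` is hypothetical.

```shell
# days_until_full_exp CAPACITY CURRENT_USAGE DOUBLING_TIME_DAYS
# Days until usage reaches capacity if usage doubles every
# DOUBLING_TIME_DAYS. Hypothetical helper.
days_until_full_exp() {
  awk -v cap="$1" -v cur="$2" -v dbl="$3" 'BEGIN {
    if (cur <= 0 || dbl <= 0) {
      print "usage and doubling time must be positive" > "/dev/stderr"
      exit 1
    }
    if (cur >= cap) { print 0; exit }
    # awk only has the natural log, so log2(x) = log(x) / log(2)
    printf "%.0f\n", dbl * log(cap / cur) / log(2)
  }'
}

# 1000 units of capacity, 250 in use, doubling every 90 days
days_until_full_exp 1000 250 90
```

With these inputs usage needs two doublings to reach capacity, so the projection is two doubling periods.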
Step 3: Generate Forecast Report
# Capacity Forecast Report — [date]
Critical (exhaustion < 30 days)
| Resource | Current | Capacity | Growth/day | Exhaustion | Action |
|---|---|---|---|---|---|
| Disk (/) | 45 GB | 50 GB | 180 MB/day | ~28 days | Expand volume or add cleanup cron |
| DB connections | 85/100 | 100 | ~0.6/day | ~25 days | Increase max_connections or add pgbouncer |
Warning (exhaustion 30-90 days)
| Resource | Current | Capacity | Growth/day | Exhaustion |
|---|---|---|---|---|
| Memory | 12/16 GB | 16 GB | 50 MB/day | ~82 days |
Healthy (>90 days or no growth)
- CPU: avg 35%, peak 72%, flat trend — no action needed
- Network: avg 200 Mbps of 1 Gbps — no concern
Over-Provisioned (wasting money)
| Resource | Used | Provisioned | Utilization | Savings |
|---|---|---|---|---|
| worker-pool-3 | 2 vCPU avg | 8 vCPU | 25% | Downsize to 4 vCPU, save ~$150/mo |
| Redis cluster | 512 MB | 8 GB | 6% | Downsize to 2 GB, save ~$80/mo |
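The report's three urgency buckets follow directly from the days-to-exhaustion figure: under 30 days is Critical, 30 to 90 is Warning, and anything beyond 90 is Healthy. The sketch below shows one way to assign them; `classify_exhaustion` is a hypothetical helper, and the resource names and day counts in the example are illustrative only.

```shell
# classify_exhaustion DAYS
# Maps days until exhaustion to the report section it belongs in.
# Hypothetical helper using the thresholds from the report headings.
classify_exhaustion() {
  awk -v d="$1" 'BEGIN {
    if (d < 30)       print "Critical"
    else if (d <= 90) print "Warning"
    else              print "Healthy"
  }'
}

# Example: bucket a few resources (name, days until exhaustion)
while read -r name days; do
  printf '%-8s %s\n' "$name" "$(classify_exhaustion "$days")"
done <<'EOF'
disk 28
memory 82
cpu 400
EOF
```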
- rightsize — Recommend Instance Sizes
Given current utilization and growth projections:
- Map workload to optimal instance family (compute, memory, storage-optimized)
- Factor in reserved instance / savings plan pricing
- Account for headroom (recommend 60-70% target utilization, not 95%)
- Compare across cloud providers if multi-cloud
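The headroom rule can be turned into a quick sizing calculation: divide observed (or projected) usage by the target utilization and round up. This is only a sketch under that assumption; `recommended_capacity` is a hypothetical name, and mapping the result onto a real instance size is left to the reader.

```shell
# recommended_capacity USAGE TARGET_UTILIZATION_PERCENT
# Smallest whole number of units (vCPUs, GB, ...) that keeps USAGE
# at or below the target utilization. Hypothetical helper.
recommended_capacity() {
  awk -v used="$1" -v target="$2" 'BEGIN {
    if (target <= 0 || target > 100) {
      print "target must be in (0, 100]" > "/dev/stderr"
      exit 1
    }
    need = used * 100 / target
    units = int(need)
    if (units < need) units++   # round up to a whole unit
    print units
  }'
}

# 2 vCPU average usage at a 65% utilization target
recommended_capacity 2 65
```

For a pool averaging 2 vCPU this suggests 4 vCPU, which lines up with the worker-pool-3 downsizing row in the sample report.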
- cost-model — Project Infrastructure Costs
Given the capacity forecast:
- Calculate current monthly spend
- Project spend at 3, 6, and 12 months based on growth
- Identify the biggest cost drivers
- Suggest cost optimization levers (spot instances, reserved pricing, auto-scaling, compression, archival)
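One simple way to project spend at the 3, 6, and 12 month horizons is to compound a monthly growth rate. That compounding model is an assumption made here for illustration, as is the helper name `project_spend`; real bills often grow in steps rather than smoothly.

```shell
# project_spend CURRENT_MONTHLY_SPEND MONTHLY_GROWTH_PERCENT
# Prints projected monthly spend at 3, 6 and 12 months, assuming
# spend compounds at a constant monthly rate. Hypothetical helper.
project_spend() {
  awk -v spend="$1" -v growth="$2" 'BEGIN {
    split("3 6 12", horizons, " ")
    for (i = 1; i <= 3; i++) {
      m = horizons[i]
      printf "%2d months: $%.2f/mo\n", m, spend * (1 + growth / 100) ^ m
    }
  }'
}

# $10,000/month today, growing 5% per month
project_spend 10000 5
```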
- bottleneck — Identify Scaling Bottlenecks
Analyze the system for the component that will fail first under load:
- Database (connections, IOPS, lock contention)
- Application (CPU-bound, memory-bound, thread pool exhaustion)
- Network (bandwidth, DNS resolution, TLS handshake overhead)
- External dependencies (rate limits, API quotas, third-party SLAs)
Rank bottlenecks by "time to impact" and recommend mitigation order.
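Ranking by time to impact amounts to a numeric sort on the estimated days until each component hits its limit. The sketch below assumes tab-separated "component, days" input; the function name `rank_bottlenecks` and the sample figures are hypothetical.

```shell
# rank_bottlenecks: read "component<TAB>days_to_impact" lines on stdin
# and print them soonest-first as a numbered mitigation order.
# Hypothetical helper; the input format is an assumption.
rank_bottlenecks() {
  sort -t "$(printf '\t')" -k2,2n |
    awk -F '\t' '{ printf "%d. %s (%s days)\n", NR, $1, $2 }'
}

# Example with illustrative figures
printf '%s\t%s\n' "Memory" 82 "Database connections" 35 "Disk" 28 |
  rank_bottlenecks
```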