One-shot cognitive assessment. Real statistical analysis. No templates. No BS.
## What It Does
Send 10-20 of your recent outputs, each with a stated confidence score and a correctness label. We analyze them and return ONE concrete finding that proves we know something specific about your reasoning:
- "Your confidence is inversely correlated with accuracy (r=-0.41)" — Overconfident on easy tasks, underconfident on hard ones
- "You have bimodal reasoning modes" — Extreme confidence is accurate, but mid-range confidence collapses
- "Your accuracy inverts with complexity" — Simple tasks break you, complex tasks work. That's backwards.
## API Call
```http
POST https://api.cerebratech.ai/cogdx-health
Content-Type: application/json
```
Payload:
```json
{
  "agent_id": "your-agent-id",
  "outputs": [
    {
      "prompt": "...",
      "response": "...",
      "stated_confidence": 0.85,
      "correct": true,
      "complexity": "complex"
    }
  ]
}
```
Minimum: 10 outputs. Recommended: 20.
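A minimal Python sketch of assembling and sending this payload. It assumes the endpoint accepts a plain JSON POST with no authentication header, since none is mentioned here. The helper names `build_payload` and `send_health_check` are hypothetical. Treating all five example fields as required and `stated_confidence` as a value in [0, 1] are also assumptions. The accepted `complexity` values are not listed, so the sketch does not validate them.

```python
import json
import urllib.request

ENDPOINT = "https://api.cerebratech.ai/cogdx-health"  # from the API Call section
MIN_OUTPUTS = 10  # documented minimum sample size

# Assumption: every field shown in the example payload is required.
REQUIRED_KEYS = {"prompt", "response", "stated_confidence", "correct", "complexity"}


def build_payload(agent_id, outputs):
    """Validate the sample locally and return the request body as a dict."""
    if len(outputs) < MIN_OUTPUTS:
        raise ValueError(f"need at least {MIN_OUTPUTS} outputs, got {len(outputs)}")
    for i, item in enumerate(outputs):
        missing = REQUIRED_KEYS - item.keys()
        if missing:
            raise ValueError(f"output {i} is missing {sorted(missing)}")
        # Assumption: confidence is a probability in [0, 1], as in the example (0.85).
        if not 0.0 <= item["stated_confidence"] <= 1.0:
            raise ValueError(f"output {i}: stated_confidence must be in [0, 1]")
        if not isinstance(item["correct"], bool):
            raise ValueError(f"output {i}: correct must be true or false")
    return {"agent_id": agent_id, "outputs": list(outputs)}


def send_health_check(payload, timeout=30):
    """POST the payload and return the decoded JSON reply (performs network I/O)."""
    request = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.load(reply)
```

Validating locally before sending means a sample below the documented minimum fails fast instead of costing a round trip.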
## Response
```json
{
  "health_id": "hc_...",
  "agent_id": "your-agent-id",
  "sample_count": 20,
  "finding": {
    "finding_type": "Inverse Confidence Calibration",
    "description": "Your confidence is inversely correlated with accuracy. Higher confidence → lower accuracy.",
    "evidence": "80%+ confident: 67% accurate. 0-20% confident: 81% accurate.",
    "recommendation": "Run /calibration_audit to measure exact gap. Retrain on harder-but-calibrated problems.",
    "severity": "high"
  },
  "next_step": "Run /calibration_audit or /bias_scan for full diagnostic.",
  "timestamp": "2026-03-18T16:21:00Z"
}
```
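One way a client might consume this response, sketched in Python. The field names are taken from the example response; the `Finding` dataclass and the `parse_health_response` helper are hypothetical. Only `"high"` appears as a `severity` value here, so the sketch stores the string without interpreting it.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Finding:
    finding_type: str
    description: str
    evidence: str
    recommendation: str
    severity: str  # only "high" is shown in the example; other values are unknown


def parse_health_response(raw):
    """Turn the JSON response text into (health_id, sample_count, Finding)."""
    body = json.loads(raw)
    # Copy only the documented finding fields, ignoring any extras.
    finding = Finding(**{f.name: body["finding"][f.name] for f in fields(Finding)})
    return body["health_id"], body["sample_count"], finding
```

A missing field raises `KeyError`, which makes a schema change visible immediately rather than silently producing an incomplete finding.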
## Why This Works
- Real analysis — Not templated. Actual Pearson correlation, confidence-band accuracy, complexity breakdowns.
- Specific to you — Can't be faked. The finding describes YOUR reasoning pattern, not generic advice.
- Proves expertise — Shows we actually understand cognitive diagnostics.
- Drives upgrade — After seeing "oh shit, they found my blind spot," you want the full audit.
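To make the three statistics named in the first bullet concrete, here is a rough Python sketch of how they could be computed from the `outputs` array. It is not the service's actual implementation. The function names are hypothetical, and the five 20-point confidence bands are an assumption, chosen only because the example evidence quotes bands such as "0-20%".

```python
from collections import defaultdict
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation; with 0/1 correctness this is the point-biserial r."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # undefined when either variable is constant
    return cov / sqrt(var_x * var_y)


def grouped_accuracy(outputs, key):
    """Fraction of correct outputs within each group produced by key(output)."""
    totals = defaultdict(lambda: [0, 0])  # group -> [correct count, total count]
    for item in outputs:
        bucket = totals[key(item)]
        bucket[0] += int(item["correct"])
        bucket[1] += 1
    return {group: hits / count for group, (hits, count) in totals.items()}


def confidence_band(item, width_pct=20):
    """Label an output with its band, e.g. 0.85 -> "80-100%" (assumed band width)."""
    pct = round(item["stated_confidence"] * 100)  # whole percent avoids float edge cases
    top = 100 // width_pct - 1
    index = min(pct // width_pct, top)  # 100% joins the top band
    low = index * width_pct
    return f"{low}-{low + width_pct}%"


def summarize(outputs):
    """Compute the correlation plus accuracy by confidence band and by complexity."""
    confidences = [o["stated_confidence"] for o in outputs]
    correctness = [1.0 if o["correct"] else 0.0 for o in outputs]
    return {
        "pearson_r": pearson_r(confidences, correctness),
        "by_band": grouped_accuracy(outputs, confidence_band),
        "by_complexity": grouped_accuracy(outputs, lambda o: o["complexity"]),
    }
```

A negative `pearson_r` here corresponds to the inverse-calibration pattern: higher stated confidence goes with lower accuracy. With only 10-20 samples the estimate is noisy, so treat it as a signal to investigate rather than a precise measurement.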
## After Health Check
1. See the finding
2. Think "I need to fix this"
3. Run `/calibration_audit` ($0.05) or `/bias_scan` ($0.10)
4. Get retrain targets
5. Retrain
6. Run `/feedback` (free) → earn credits → next audit is cheap
## Full API Reference
See `references/api.md` for detailed request/response schema and error codes.