"A code review that only checks style is a spell-check on a ransom note. The grammar is fine — the content is the problem."
它做什么
人工审查者擅长发现代码风格问题、明显的 bug 和高级设计问题。但他们不擅长发现:
- 每百万次请求才发生一次、但发生时会导致数据损坏的边界情况
- 隐藏在
<= 中应该是 < 的差一错误
- 两个"总是足够快"的操作之间的竞态条件
- 错误被捕获、记录后……函数继续执行仿佛什么都没发生的静默失败n- 输入总是排序的、总是非空的、总是 UTF-8 的隐式假设
Review Lens 通过七个专业透镜检查代码,发现人类容易忽略的问题。
七个透镜
透镜 1:边界分析
问题: 边缘情况下会发生什么?
检查项:
├── 空输入:如果数组是 [] 呢?字符串是 "" 呢?映射是 {} 呢?
├── 单元素:如果恰好有一个项目呢?
├── 最大输入:如果有数百万个项目呢?
├── Null/undefined:如果任何参数是 nil 呢?
├── 零值:如果数字是 0 呢?负数?NaN?Infinity?
├── Unicode:如果字符串包含表情符号、RTL 字符或零宽字符呢?
├── 并发:如果两个线程同时调用呢?
└── 时间:如果在午夜运行呢?2月29日?夏令时转换期间?
它能发现:
// 人工审查者:"看起来不错,正确计算平均值"
function average(numbers) {
return numbers.reduce((a, b) => a + b) / numbers.length;
}// Review Lens:"空数组 → reduce 抛出 TypeError。
// 单元素 → 正常工作。
// 非常大的数组 → 潜在的浮点累积错误。
// 包含 NaN 的数组 → 结果是 NaN(静默损坏)。"
透镜 2:失败路径分析
问题: 当这失败时,会发生什么?
检查项:
├── 每个错误都被捕获了吗?(不仅仅是预期的那些)
├── 当错误被捕获时,函数是否仍然正确行为?
│ └── 它是否返回合理的值?还是返回 undefined/null?
├── 错误消息有用吗?(包含上下文,而不仅仅是"出了点问题")
├── 错误是否正确传播?(不被吞掉,不被双重处理)
├── 失败时副作用是否被清理?(事务回滚、文件关闭、锁释放)
└── 调用者能否区分"无结果"和"错误"?
它能发现:
# 人工审查者:"好的,处理了错误情况"
try:
user = db.get_user(user_id)
except DatabaseError:
logger.error("Failed to get user")
return None# Review Lens:"调用者收到 None 对于'用户未找到'
# 和'数据库宕机'这两种情况。这是根本不同的
# 条件。调用者无法区分用户缺失和
# 系统故障。另外:如果 user_id 被记录且包含
# PII——日志目标是否合规?"
透镜 3:状态转换分析
问题: 这段代码能达到无效状态吗?
检查项:
├── 所有状态转换都有效吗?(没有无效的中间状态)
├── 状态是否原子更新?(没有半更新状态对其他可见)
├── 状态转换能乱序发生吗?
├── 有无法退出的状态吗?(死锁、无限循环)
├── 清理有保证吗?(finally 块、defer、析构函数)
└── 布尔标志使用正确吗?(不是多个布尔创建无效组合)
它能发现:
// 人工审查者:"订单状态机,看起来完整"
order.setStatus("processing");
payment.charge(order.getTotal());
order.setStatus("paid");
inventory.reserve(order.getItems());
order.setStatus("confirmed");// Review Lens:"如果 payment.charge() 成功但 inventory.reserve()
// 失败,订单将停留在'已支付'状态但没有库存。
// 客户被扣款但订单无法完成。没有回滚
// 机制。另外:在 setStatus('processing') 和 charge() 之间,
// 订单处于看起来在处理但尚未被扣款的状态——如果进程在这里崩溃,
// 它将永远停留在'processing'状态且没有重试机制。"
透镜 4:隐式假设分析
问题: 这段代码要正常工作必须满足什么?
检查项:
├── 排序假设:这是否假设输入已排序?
├── 唯一性假设:这是否假设没有重复?
├── 格式假设:这是否假设特定的编码、区域设置、时区?
├── 大小假设:这是否假设数据适合内存?
├── 时序假设:这是否假设操作"足够快"完成?
├── 环境假设:这是否假设特定的操作系统、权限或网络?
├── 依赖假设:这是否假设特定版本的库?
└── 业务假设:这是否假设可能改变的规则?
它能发现:
// 人工审查者:"干净的函数,结构良好"
func findUser(email string) (User, error) {
results, err := db.Query("SELECT FROM users WHERE email = ?", email)
if err != nil {
return nil, err
}
return results[0], nil // 返回第一个匹配
}// Review Lens:"假设 email 是唯一的(未检查 UNIQUE 约束)。
// 假设至少有一个结果(空 results[0] 会 panic)。
// 假设大小写敏感性与数据库排序规则匹配。
// 假设 email 已 trim(尾随空格可能导致不匹配)。
// SELECT 拉取所有列——模式变更会破坏结构体映射。"
透镜 5:性能悬崖分析
问题: 这从哪里变快到灾难性?
检查项:
├── N+1 查询:循环中每次迭代都进行数据库调用
├── 无界增长:无限增长的集合
├── 缺少分页:返回所有结果的查询
├── 二次(或更差)算法:对同一数据进行嵌套循环
├── 不必要的分配:热循环中创建对象
├── 缺少短路:可以提前退出的昂贵操作
├── 序列化:每个请求序列化大对象
└── 正则表达式:灾难性回溯模式
它能发现:
// 人工审查者:"获取所有活跃用户及其订单,看起来不错"
async function getActiveUsersWithOrders() {
const users = await db.query('SELECT FROM users WHERE active = true');
for (const user of users) {
user.orders = await db.query('SELECT * FROM orders WHERE user_id = ?', user.id);
}
return users;
}// Review Lens:"N+1 查询模式。100 个用户:101 次查询。
// 10,000 个用户:10,001 次查询。没有分页——将所有
// 活跃用户加载到内存中。两个表都选择所有列。
// 规模扩大:10 秒响应时间,可能 OOM,数据库连接
// 池耗尽。在开发环境用 12 个测试用户运行良好。"
透镜 6:安全面分析
问题: 这在哪里可以被滥用?
检查项:
├── 输入信任:用户输入在使用前是否经过验证?
├── 输出编码:输出是否为目标上下文进行了转义?
├── 身份验证:调用者的身份是否经过验证?
├── 授权:调用者是否被允许做这个特定操作?
├── 注入:用户输入能否更改查询、命令或模板?
├── 信息泄露:错误消息是否暴露内部细节?
├── 时序:响应时间能否泄露秘密?
└── TOCTOU:在检查权限和执行操作之间是否有间隙?
透镜 7:变更下的正确性
问题: 未来的变更有多容易破坏这段代码的假设?
检查项:
├── 魔数:有人会理解为什么这是 86400(秒/天)吗?
├── 隐式排序:如果有人重新排序这些行,会破坏吗?
├── 命名漂移:如果行为改变,名称还匹配吗?
├── 接口脆弱性:向结构体添加字段会破坏这吗?
├── 复制粘贴陷阱:这段代码与附近可能分叉的代码相似吗?
└── 删除安全:如果有人删除它依赖的函数,错误是否清晰?
审查输出格式
╔══════════════════════════════════════════════════════════════╗
║ REVIEW LENS ║
║ File: src/checkout/payment.ts ║
║ Lines changed: 47 (+32 / -15) ║
╠══════════════════════════════════════════════════════════════╣
║ ║
║ FINDINGS: 4 issues (1 critical, 1 high, 2 medium) ║
║ ║
║ 🔴 CRITICAL [State Transition] Line 34-38 ║
║ Payment charged before inventory reserved. If reservation ║
║ fails, customer is charged for unfulfillable order. ║
║ → Fix: Wrap in transaction. Reserve first, charge second. ║
║ ║
║ 🟠 HIGH [Performance Cliff] Line 22 ║
║ getOrderItems() inside loop = N+1 query pattern. ║
║ Fine with test data (5 orders), catastrophic in production ║
║ (5,000+ orders). Response time: O(n) DB calls. ║
║ → Fix: Batch query with WHERE order_id IN (...). ║
║ ║
║ 🟡 MEDIUM [Boundary] Line 15 ║
║ items.reduce() on potentially empty array throws. ║
║ → Fix: Add initialValue: items.reduce((a,b) => a+b, 0) ║
║ ║
║ 🟡 MEDIUM [Implicit Assumption] Line 41 ║
║ Assumes currency is always USD (hardcoded cents conversion).║
║ → Fix: Use order.currency to determine decimal places. ║
║ ║
║ ✅ PASSED: Failure paths, security surface, change safety ║
╚══════════════════════════════════════════════════════════════╝
何时调用
- 每次 PR 之前。 在请求人工审查之前,先在自己的代码上运行它。
- 审查不熟悉的代码时(透镜帮助你了解应该看哪里)
- 编写任何处理金钱、认证或用户数据的代码之后
- 审查在测试中"运行良好"但即将投入生产的代码时
- 当你有种感觉有问题但无法准确表达是什么时
为什么它重要
大多数代码审查只能发现 15-30% 的缺陷。漏掉的缺陷不是那些明显的——而是边界情况、竞态条件、性能悬崖和隐式假设,这些只有在生产条件下才会暴露。
Review Lens 不能替代人工审查者。它捕捉的是人类系统性不擅长发现的问题。
零外部依赖。零 API 调用。纯结构和逻辑分析。
"A code review that only checks style is a spell-check on a ransom note. The grammar is fine — the content is the problem."
What It Does
Human reviewers are good at catching style issues, obvious bugs, and high-level design problems. They're bad at catching:
- The edge case that happens once per million requests but corrupts data when it does
- The off-by-one hiding in a
<= that should be <
- The race condition between two operations that are "always fast enough"
- The silent failure where an error is caught, logged, and then... the function continues as if nothing happened
- The implicit assumption that the input is always sorted, always non-empty, always UTF-8
Review Lens examines code through seven specialized lenses that catch what humans skim over.
The Seven Lenses
Lens 1: Boundary Analysis
Question: What happens at the edges?
CHECKS:
├── Empty inputs: What if the array is []? The string is ""? The map is {}?
├── Single element: What if there's exactly one item?
├── Maximum inputs: What if there are millions of items?
├── Null/undefined: What if any parameter is nil?
├── Zero values: What if a number is 0? Negative? NaN? Infinity?
├── Unicode: What if the string contains emoji, RTL, or zero-width chars?
├── Concurrent: What if this is called simultaneously by two threads?
└── Time: What if this runs at midnight? On Feb 29? During DST transition?
What it catches:
// Human reviewer: "Looks good, calculates average correctly"
function average(numbers) {
return numbers.reduce((a, b) => a + b) / numbers.length;
}
// Review Lens: "Empty array → reduce throws TypeError.
// Single element → works.
// Very large array → potential floating point accumulation error.
// Array with NaN → result is NaN (silent corruption)."
Lens 2: Failure Path Analysis
Question: When this fails, what happens?
CHECKS:
├── Is every error caught? (not just the expected ones)
├── When an error IS caught, does the function still behave correctly?
│ └── Does it return a sensible value? Or does it return undefined/null?
├── Are error messages useful? (contain context, not just "something went wrong")
├── Is the error propagated correctly? (not swallowed, not double-handled)
├── Are side effects cleaned up on failure? (transactions rolled back, files closed, locks released)
└── Can the caller distinguish between "no result" and "error"?
What it catches:
# Human reviewer: "Good, handles the error case"
try:
user = db.get_user(user_id)
except DatabaseError:
logger.error("Failed to get user")
return None# Review Lens: "Caller receives None for BOTH 'user not found'
# AND 'database is down'. These are fundamentally different
# conditions. The caller can't distinguish a missing user from
# a system failure. Also: what if user_id is logged and contains
# PII — is the log destination PII-compliant?"
Lens 3: State Transition Analysis
Question: Can this code reach an invalid state?
CHECKS:
├── Are all state transitions valid? (no invalid intermediate states)
├── Is state updated atomically? (no half-updated state visible to others)
├── Can state transitions happen out of order?
├── Are there states that can never be exited? (deadlocks, infinite loops)
├── Is cleanup guaranteed? (finally blocks, defer, destructors)
└── Are boolean flags used correctly? (not multiple booleans creating invalid combinations)
What it catches:
// Human reviewer: "Order state machine, looks complete"
order.setStatus("processing");
payment.charge(order.getTotal());
order.setStatus("paid");
inventory.reserve(order.getItems());
order.setStatus("confirmed");// Review Lens: "If payment.charge() succeeds but inventory.reserve()
// fails, the order is stuck in 'paid' state with no inventory.
// Customer is charged but order can't be fulfilled. No rollback
// mechanism. Also: between setStatus('processing') and charge(),
// the order is in a state where it appears processing but hasn't
// been charged — if the process crashes here, it stays 'processing'
// forever with no retry mechanism."
Lens 4: Implicit Assumption Analysis
Question: What must be true for this code to work?
CHECKS:
├── Ordering assumptions: Does this assume input is sorted?
├── Uniqueness assumptions: Does this assume no duplicates?
├── Format assumptions: Does this assume a specific encoding, locale, timezone?
├── Size assumptions: Does this assume the data fits in memory?
├── Timing assumptions: Does this assume an operation completes "fast enough"?
├── Environment assumptions: Does this assume specific OS, permissions, or network?
├── Dependency assumptions: Does this assume a specific version of a library?
└── Business assumptions: Does this assume a rule that might change?
What it catches:
// Human reviewer: "Clean function, well-structured"
func findUser(email string) (User, error) {
results, err := db.Query("SELECT FROM users WHERE email = ?", email)
if err != nil {
return nil, err
}
return results[0], nil // Return first match
}// Review Lens: "Assumes email is unique (no UNIQUE constraint checked).
// Assumes at least one result exists (panics on empty results[0]).
// Assumes case-sensitivity matches DB collation.
// Assumes email is trimmed (trailing spaces could cause mismatch).
// SELECT pulls all columns — schema changes break the struct mapping."
Lens 5: Performance Cliff Analysis
Question: Where does this go from fast to catastrophic?
CHECKS:
├── N+1 queries: Loop that makes a DB call per iteration
├── Unbounded growth: Collections that grow without limit
├── Missing pagination: Queries that return ALL results
├── Quadratic (or worse) algorithms: Nested loops over the same data
├── Unnecessary allocations: Creating objects in hot loops
├── Missing short-circuits: Expensive operations that could bail early
├── Serialization: Serializing large objects for every request
└── Regex: Catastrophic backtracking patterns
What it catches:
// Human reviewer: "Gets all active users and their orders, looks fine"
async function getActiveUsersWithOrders() {
const users = await db.query('SELECT FROM users WHERE active = true');
for (const user of users) {
user.orders = await db.query('SELECT * FROM orders WHERE user_id = ?', user.id);
}
return users;
}// Review Lens: "N+1 query pattern. With 100 users: 101 queries.
// With 10,000 users: 10,001 queries. No pagination — loads ALL
// active users into memory. All columns selected for both tables.
// At scale: 10-second response time, potential OOM, DB connection
// pool exhaustion. Works fine in dev with 12 test users."
Lens 6: Security Surface Analysis
Question: Where can this be abused?
CHECKS:
├── Input trust: Is user input validated before use?
├── Output encoding: Is output escaped for the destination context?
├── Authentication: Is the caller's identity verified?
├── Authorization: Is the caller allowed to do this specific thing?
├── Injection: Can user input alter queries, commands, or templates?
├── Information leak: Do error messages expose internal details?
├── Timing: Can response times reveal secrets?
└── TOCTOU: Is there a gap between checking permission and performing action?
Lens 7: Correctness Under Change
Question: How easily can a future change break this code's assumptions?
CHECKS:
├── Magic numbers: Will someone understand why this is 86400 (seconds/day)?
├── Implicit ordering: If someone reorders these lines, does it break?
├── Naming drift: Do the names still match if the behavior changes?
├── Interface fragility: Does adding a field to a struct break this?
├── Copy-paste traps: Is this code similar to nearby code that could diverge?
└── Delete safety: If someone removes the function this depends on, is the error clear?
Review Output Format
╔══════════════════════════════════════════════════════════════╗
║ REVIEW LENS ║
║ File: src/checkout/payment.ts ║
║ Lines changed: 47 (+32 / -15) ║
╠══════════════════════════════════════════════════════════════╣
║ ║
║ FINDINGS: 4 issues (1 critical, 1 high, 2 medium) ║
║ ║
║ 🔴 CRITICAL [State Transition] Line 34-38 ║
║ Payment charged before inventory reserved. If reservation ║
║ fails, customer is charged for unfulfillable order. ║
║ → Fix: Wrap in transaction. Reserve first, charge second. ║
║ ║
║ 🟠 HIGH [Performance Cliff] Line 22 ║
║ getOrderItems() inside loop = N+1 query pattern. ║
║ Fine with test data (5 orders), catastrophic in production ║
║ (5,000+ orders). Response time: O(n) DB calls. ║
║ → Fix: Batch query with WHERE order_id IN (...). ║
║ ║
║ 🟡 MEDIUM [Boundary] Line 15 ║
║ items.reduce() on potentially empty array throws. ║
║ → Fix: Add initialValue: items.reduce((a,b) => a+b, 0) ║
║ ║
║ 🟡 MEDIUM [Implicit Assumption] Line 41 ║
║ Assumes currency is always USD (hardcoded cents conversion). ║
║ → Fix: Use order.currency to determine decimal places. ║
║ ║
║ ✅ PASSED: Failure paths, security surface, change safety ║
╚══════════════════════════════════════════════════════════════╝
When to Invoke
- Before every PR. Run it on your own code before asking humans to review.
- During review of unfamiliar code (lenses help you know where to look)
- After writing any code that handles money, auth, or user data
- When reviewing code that "works in testing" before it hits production
- When you have a nagging feeling something is wrong but can't articulate what
Why It Matters
Most code review catches 15-30% of defects. The defects that slip through aren't the obvious ones — they're the edge cases, the race conditions, the performance cliffs, and the implicit assumptions that only reveal themselves under production conditions.
Review Lens doesn't replace human reviewers. It catches what humans are systematically bad at seeing.
Zero external dependencies. Zero API calls. Pure structural and logical analysis.