Review code for implementation correctness, scalability, and production readiness. Use when reviewing PRs, auditing code quality, checking for performance issues, or evaluating production readiness. Triggers on "review", "audit", "check code", "production ready", "scalability".
A systematic code review skill emphasizing implementation correctness, scalability, and production readiness for high-performance Python/PyTorch codebases.
When asked to review code, evaluate these dimensions:

### Correctness
- **Algorithm correctness**: Does the implementation match the documented algorithm, formula, or paper?
- **Type safety**: Are dtypes and shapes consistent, with assumptions validated at API boundaries?
- **Memory correctness**: Are buffers allocated, indexed, and released correctly, with no leaks or aliasing bugs?

### Scalability
- **Memory complexity**: How does peak memory grow with input size? Are large intermediates materialized unnecessarily?
- **Compute complexity**: What is the asymptotic cost, and are there avoidable quadratic (or worse) hot paths?
- **Scaling characteristics**: Does behavior stay correct and efficient well beyond the test scale?

### Production readiness
- **Error handling**: Are failure modes caught and surfaced with actionable messages?
- **Robustness**: Does the code handle edge cases (empty inputs, extreme values, malformed data)?
- **Observability**: Is there enough logging and instrumentation to diagnose problems in production?

### Code quality
- **API design**: Are interfaces minimal, consistent, and hard to misuse?
- **Readability**: Can a new contributor follow the control flow and naming?
- **Documentation**: Are non-obvious invariants, units, and tensor shapes documented?
- **Testing implications**: Is the code structured so its critical paths can be tested?

### PyTorch-specific
- **Device handling**: `.to(device)` used correctly without unnecessary transfers?
- **Gradient flow**: `torch.no_grad()` contexts used appropriately?
- **Performance patterns**: `torch.compile` compatible (no graph breaks)?

## Severity Levels
- **Critical**: Will cause incorrect results, crashes, or data corruption
- **High**: Significant impact on performance or reliability at scale
- **Medium**: Affects maintainability or has minor performance impact
- **Low**: Style issues or minor improvements
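The `torch.compile` check above can be made concrete with a minimal sketch. The function names here are illustrative, not from the codebase: reading a tensor's value in Python control flow (`if x.sum() > 0:`) forces a graph break, while expressing the same branch with `torch.where` keeps the computation traceable.

```python
import torch

# BAD (illustrative): data-dependent Python control flow.
# `x.sum() > 0` pulls a value out of the tensor, which forces
# torch.compile to break the graph at the `if`.
def gate_with_branch(x: torch.Tensor) -> torch.Tensor:
    if x.sum() > 0:
        return x * 2.0
    return x * 0.5

# GOOD: express the branch as tensor ops so the graph stays whole.
def gate_with_where(x: torch.Tensor) -> torch.Tensor:
    scale = torch.where(x.sum() > 0, torch.tensor(2.0), torch.tensor(0.5))
    return x * scale
```

Both versions agree numerically in eager mode; only the second compiles without a break, so flag the first pattern on hot paths.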
Structure your review as follows:
## Summary
[1-2 sentence overview of the code's purpose and overall assessment]
## Critical Issues
### [Issue Title]
**Location**: `file.py:123`
**Problem**: [Description of the issue]
**Impact**: [What could go wrong]
**Fix**:
```python
# Before
problematic_code()

# After
correct_code()
```

## High Priority Issues
[Same format as Critical]

## Medium Priority Issues
[Same format, can be briefer]

## Low Priority Issues
[Bullet points acceptable]
## Common Patterns to Flag
### Memory Issues
```python
# BAD: Materializes O(T*K*C²) tensor
edge = torch.zeros(batch, T, K, C, C)
for t in range(T):
    for k in range(K):
        edge[:, t, k] = compute_edge(t, k)

# GOOD: Compute on-the-fly with O(KC) memory
for t in range(T):
    for k in range(K):
        edge_block = compute_edge(t, k)  # (batch, C, C)
        process_and_discard(edge_block)
```

### Numerical Stability
```python
# BAD: Overflow at large T
cumsum = torch.cumsum(x, dim=1)  # Values grow to O(T)

# GOOD: Zero-center to keep values O(√T)
x_centered = x - x.mean(dim=1, keepdim=True)
cumsum = torch.cumsum(x_centered, dim=1)
```

### Gradient Issues
```python
# BAD: Breaks gradient flow
if condition:
    return tensor.detach()  # Silent gradient death
return tensor

# GOOD: Explicit gradient handling
with torch.no_grad():
    # Operations that shouldn't have gradients
    ...
```

### Device Mismatches
```python
# BAD: Creates tensor on CPU, data on GPU
mask = torch.zeros(batch, T)  # CPU!
result = data * mask  # Error or silent transfer

# GOOD: Match device explicitly
mask = torch.zeros(batch, T, device=data.device)
```
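A related pattern worth recommending in reviews: `Tensor.new_zeros` inherits both device and dtype from an existing tensor, so helpers written this way cannot drift out of sync with their inputs. This is a generic sketch, not code from the repository under review:

```python
import torch

def make_mask(data: torch.Tensor, batch: int, T: int) -> torch.Tensor:
    # new_zeros allocates on data's device with data's dtype,
    # so the mask always lives alongside the data it multiplies.
    return data.new_zeros(batch, T)

data = torch.ones(2, 5)
mask = make_mask(data, 2, 5)
# mask matches data's device and dtype whether data is on CPU or GPU
```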
When reviewing code:

1. **Read thoroughly**: Read all files before commenting. Understand the full context.
2. **Check the math**: Verify algorithms against any documented formulas or papers.
3. **Think at scale**: Consider behavior at 10x, 100x, 1000x the test scale.
4. **Be specific**: Point to exact lines, provide concrete fixes.
5. **Prioritize**: Focus on correctness first, then scalability, then style.
6. **Be constructive**: Explain why something is problematic, not just that it is.
7. **Acknowledge good code**: Note well-implemented patterns and clever solutions.
8. **Consider context**: A research prototype has different standards than production code.
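The "think at scale" principle often reduces to a back-of-envelope memory estimate. A minimal sketch (the function and shapes are illustrative, reusing the `(batch, T, K, C, C)` edge tensor from the memory pattern above):

```python
def edge_tensor_gib(batch: int, T: int, K: int, C: int, itemsize: int = 4) -> float:
    """Rough memory footprint in GiB of a dense (batch, T, K, C, C)
    edge tensor; itemsize=4 assumes float32."""
    return batch * T * K * C * C * itemsize / 2**30

# At a small test scale the dense tensor looks harmless...
small = edge_tensor_gib(batch=8, T=1_000, K=32, C=16)
# ...but at T=400_000 it is exactly 400x larger and will not
# fit on any single GPU, so the streaming formulation is required.
large = edge_tensor_gib(batch=8, T=400_000, K=32, C=16)
```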
When reviewing Semi-Markov CRF code (streaming vs exact backends, edge tensors, loop bounds), consult the dedicated reference document. It covers:

- The `lengths + 1` convention when using SemiMarkov with streaming-style edge tensors
- Duration index clamping (`dur_idx = min(k, K - 1)`)

## Example Requests
- "Review the streaming.py implementation for production readiness"
- "Audit the backward pass for numerical stability issues"
- "Check if this code will scale to T=400K sequences"
- "Review this PR for memory leaks and gradient issues"
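The duration-clamping convention mentioned above (`dur_idx = min(k, K - 1)`) can be sketched in isolation; the helper name is illustrative, not from the codebase:

```python
def clamp_duration_index(k: int, K: int) -> int:
    # Segments longer than the maximum modeled duration all share
    # the last duration bucket: dur_idx = min(k, K - 1).
    return min(k, K - 1)

# With K = 4 duration buckets, durations 0..3 map to themselves
# and anything longer saturates at bucket 3.
[clamp_duration_index(k, 4) for k in range(6)]  # → [0, 1, 2, 3, 3, 3]
```

When auditing loop bounds, check that every duration lookup applies this clamp; an unclamped `k` indexes past the duration axis at long segments.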