Find similar vulnerabilities and bugs across codebases using pattern-based analysis. Use when hunting bug variants, building CodeQL/Semgrep queries, analyzing security vulnerabilities, or performing systematic code audits after finding an initial issue.
Not suited to understanding unfamiliar code; use audit-context-building for deep comprehension first.
The Five-Step Process
Step 1: Understand the Original Issue
Before searching, deeply understand the known bug:
What is the root cause? Not the symptom, but WHY it's vulnerable
What conditions are required? Control flow, data flow, state
What makes it exploitable? User control, missing validation, etc.
Step 2: Create an Exact Match
Start with a pattern that matches ONLY the known instance:
rg -n "exact_vulnerable_code_here"
Verify: Does it match exactly ONE location (the original)?
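The verification in Step 2 can also be scripted. The following is a minimal sketch, assuming a hypothetical in-memory `CODEBASE` mapping and a made-up vulnerable line; `find_matches` is an illustrative helper, not part of any tool named in this document.

```python
import re

# Hypothetical in-memory "codebase": path -> source text (illustration only).
CODEBASE = {
    "api/handlers/files.py": "def read(path):\n    return open(BASE + path).read()\n",
    "utils/auth.py": "def check(user):\n    return user.is_active\n",
}

def find_matches(pattern, codebase):
    """Return (path, line_number, line) for every line matching pattern."""
    regex = re.compile(pattern)
    hits = []
    for path, text in sorted(codebase.items()):
        for lineno, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((path, lineno, line.strip()))
    return hits

# re.escape turns the known vulnerable line into a literal, exact-match pattern.
exact_pattern = re.escape("open(BASE + path).read()")
exact_hits = find_matches(exact_pattern, CODEBASE)

# The exact pattern should match the original location and nothing else.
assert len(exact_hits) == 1
```

If the assertion fails, the starting pattern is already too general (or too narrow) and should be fixed before any generalization begins.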
Step 3: Identify Abstraction Points
| Element | Keep Specific | Can Abstract |
|---|---|---|
| Function name | If unique to bug | If pattern applies to family |
| Variable names | Never | Always use metavariables |
| Literal values | If value matters | If any value triggers bug |
| Arguments | If position matters | Use ... wildcards |
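As a rough illustration of abstracting one element, the sketch below replaces variable names with an identifier wildcard while keeping the function name specific. It uses plain regular expressions as a stand-in for real metavariables (such as Semgrep's `$X`); the sample lines and the `IDENT` helper are assumptions made for the example.

```python
import re

# Stand-in for a metavariable: matches any identifier.
IDENT = r"[A-Za-z_]\w*"

# Exact match: every element kept specific.
exact = re.compile(re.escape("open(BASE + path).read()"))

# Variable names abstracted; the function name `open` stays specific.
abstracted = re.compile(r"open\(" + IDENT + r"\s*\+\s*" + IDENT + r"\)")

# Hypothetical lines standing in for a codebase.
samples = [
    "open(BASE + path).read()",
    "open(ROOT + filename)",
    "open('config.json')",
]

exact_hits = [s for s in samples if exact.search(s)]
abstracted_hits = [s for s in samples if abstracted.search(s)]
```

Here the abstracted pattern picks up the second concatenated-path call that the exact pattern misses, while still ignoring the call with a constant argument.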
Step 4: Iteratively Generalize
Change ONE element at a time:
1. Run the pattern
2. Review ALL new matches
3. Classify: true positive or false positive?
4. If FP rate acceptable, generalize next element
5. If FP rate too high, revert and try different abstraction
Stop when false positive rate exceeds ~50%
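The generalize-and-check loop can be sketched as follows. This is an assumption-laden illustration: the match lists are precomputed, the `true_positives` set stands in for manual review, and the names `fp_rate` and `generalize` are hypothetical.

```python
def fp_rate(matches, true_positives):
    """Fraction of matches that review classified as false positives."""
    if not matches:
        return 0.0
    false_positives = [m for m in matches if m not in true_positives]
    return len(false_positives) / len(matches)

def generalize(steps, true_positives, max_fp_rate=0.5):
    """Try one generalization at a time; drop any whose FP rate is too high.

    steps: list of (description, matches) pairs, one per single-element change.
    Returns the descriptions of the changes that were kept.
    """
    kept = []
    for description, matches in steps:
        if fp_rate(matches, true_positives) > max_fp_rate:
            continue  # revert this change and try a different abstraction
        kept.append(description)
    return kept

# Hypothetical review outcome: only these two locations are real variants.
true_positives = {"a.py:10", "b.py:22"}
steps = [
    ("abstract variable names", ["a.py:10", "b.py:22"]),
    ("abstract function name", ["a.py:10", "b.py:22", "c.py:5", "d.py:9", "e.py:1"]),
    ("abstract literal value", ["a.py:10", "b.py:22", "f.py:3"]),
]
kept = generalize(steps, true_positives)
```

In this example the function-name abstraction yields three false positives out of five matches (60%), so it is reverted, while the other two changes stay.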
Step 5: Analyze and Triage Results
For each match, document:
Location: File, line, function
Confidence: High/Medium/Low
Exploitability: Reachable? Controllable inputs?
Priority: Based on impact and exploitability
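One possible way to record these fields is a small data class with a sort key. The sketch below is illustrative only: the `Finding` class, the 1 to 5 impact scale, the ordering rule, and the sample findings are all assumptions rather than a prescribed triage scheme.

```python
from dataclasses import dataclass

CONFIDENCE_RANK = {"High": 0, "Medium": 1, "Low": 2}

@dataclass
class Finding:
    file: str
    line: int
    function: str
    confidence: str           # "High", "Medium", or "Low"
    reachable: bool           # can an attacker reach this code path?
    controllable_input: bool  # does the attacker control the relevant input?
    impact: int               # hypothetical scale: 1 (low) to 5 (critical)

    def priority(self):
        """Sort key: exploitable first, then higher impact, then higher confidence."""
        exploitable = self.reachable and self.controllable_input
        return (not exploitable, -self.impact, CONFIDENCE_RANK[self.confidence])

# Hypothetical findings for illustration.
findings = [
    Finding("utils/auth.py", 42, "check_owner", "Medium", True, True, 4),
    Finding("api/handlers/files.py", 17, "read_file", "High", True, False, 5),
    Finding("api/handlers/users.py", 88, "update_user", "High", True, True, 4),
]
triaged = sorted(findings, key=Finding.priority)
```

With this ordering, a high-impact match that lacks controllable input sorts below exploitable matches of lower impact; a different weighting may suit other audits.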
For deeper strategic guidance, see METHODOLOGY.md.
Tool Selection
| Scenario | Tool | Why |
|---|---|---|
| Quick surface search | ripgrep | Fast, zero setup |
| Simple pattern matching | Semgrep | Easy syntax, no build needed |
| Data flow tracking | Semgrep taint / CodeQL | Follows values across functions |
| Cross-function analysis | CodeQL | Best interprocedural analysis |
| Non-building code | Semgrep | Works on incomplete code |
Key Principles
Root cause first: Understand WHY before searching for WHERE
Start specific: First pattern should match exactly the known bug
One change at a time: Generalize incrementally, verify after each change
Know when to stop: 50%+ FP rate means you've gone too generic
Search everywhere: Always search the ENTIRE codebase, not just the module where the bug was found
Expand vulnerability classes: One root cause often has multiple manifestations
Critical Pitfalls to Avoid
These common mistakes cause analysts to miss real vulnerabilities:
1. Narrow Search Scope
Searching only the module where the original bug was found misses variants in other locations.
Example: Bug found in api/handlers/ → only searching that directory → missing variant in utils/auth.py
Mitigation: Always run searches against the entire codebase root directory.
2. Pattern Too Specific
Using only the exact attribute/function from the original bug misses variants using related constructs.
Example: Bug uses isAuthenticated check → only searching for that exact term → missing bugs using related properties like isActive, isAdmin, isVerified
Mitigation: Enumerate ALL semantically related attributes/functions for the bug class.
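A simple way to apply this mitigation is to build one pattern from the whole attribute family. The sketch below assumes a hypothetical attribute list and sample lines; `attr_pattern` is an illustrative helper.

```python
import re

# Hypothetical family of attributes related to the original bug.
RELATED_ATTRS = ["isAuthenticated", "isActive", "isAdmin", "isVerified"]

def attr_pattern(attrs):
    """Build one regex matching an access to any attribute in the family."""
    alternation = "|".join(re.escape(a) for a in attrs)
    return re.compile(r"\.(" + alternation + r")\b")

# Hypothetical lines standing in for a codebase.
lines = [
    "if (user.isAuthenticated) { allow(); }",
    "if (account.isAdmin) { grantAll(); }",
    "if (user.name) { greet(); }",
]

narrow = attr_pattern(["isAuthenticated"])
broad = attr_pattern(RELATED_ATTRS)
narrow_hits = [line for line in lines if narrow.search(line)]
broad_hits = [line for line in lines if broad.search(line)]
```

The narrow pattern finds only the original construct, whereas the broad one also surfaces the `isAdmin` check for review.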
3. Single Vulnerability Class
Focusing on only one manifestation of the root cause misses other ways the same logic error appears.
Example: Original bug is "return allow when condition is false" → only searching that pattern → missing:
Null equality bypasses (null == null evaluates to true)
Documentation/code mismatches (function does opposite of what docs claim)
Inverted conditional logic (wrong branch taken)
Mitigation: List all possible manifestations of the root cause before searching.
4. Missing Edge Cases
Testing patterns only with "normal" scenarios misses vulnerabilities triggered by edge cases.
Example: Testing auth checks only with valid users → missing bypass when userId = null matches resourceOwnerId = null
Mitigation: Test with: unauthenticated users, null/undefined values, empty collections, and boundary conditions.
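The null-matching bypass is easy to reproduce. The following sketch uses Python's `None` in place of `null`; both functions and the edge-case list are hypothetical and exist only to show why edge cases belong in the test set.

```python
def is_owner_unsafe(user_id, resource_owner_id):
    """Vulnerable: None == None is True, so an anonymous user 'owns' unowned resources."""
    return user_id == resource_owner_id

def is_owner_safe(user_id, resource_owner_id):
    """Rejects missing identifiers before comparing."""
    if user_id is None or resource_owner_id is None:
        return False
    return user_id == resource_owner_id

# Edge cases: both missing, one missing, matching IDs, differing IDs.
edge_cases = [(None, None), (None, 7), (7, None), (7, 7), (7, 8)]
unsafe_results = [is_owner_unsafe(u, o) for u, o in edge_cases]
safe_results = [is_owner_safe(u, o) for u, o in edge_cases]
```

Only the `(None, None)` case distinguishes the two checks, which is why tests limited to valid users would not reveal the bypass.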