Run a comprehensive security audit of the current codebase. Use when the user asks to audit security, run a security scan, check for vulnerabilities, do a security review, or similar. Activates for phrases like "security audit", "check security", "find vulnerabilities", "security scan", "pentest this", "OWASP check".
A comprehensive security audit skill that covers all 14 OWASP Secure Coding
Practices categories, informed by STRIDE threat modeling, enhanced with
domain-specific profiles, and calibrated against CVSS severity standards.
How It Works
Focused passes per vulnerability class. Each subagent receives ONLY its
category's checklist (15-25 concrete detection patterns). This eliminates the
"boil the ocean" problem and reduces false positives.
Deployment-aware severity. Deployment context (PaaS, CDN, self-hosted)
is detected and injected into every subagent, so findings like "force_ssl
disabled" get the right severity for the actual infrastructure.
STRIDE-informed context injection. Before subagents launch, the parent
performs a lightweight threat model. Each subagent receives the relevant threat
categories.
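The threat-category injection could look like the mapping below. The category slugs and the STRIDE assignments shown are illustrative, not the skill's actual table:

```python
# Hypothetical mapping from audit category to relevant STRIDE threat classes;
# the parent injects only the matching entries into each subagent prompt.

STRIDE_BY_CATEGORY = {
    "authentication": ["Spoofing", "Elevation of Privilege"],
    "session-management": ["Spoofing", "Tampering"],
    "cryptographic-practices": ["Information Disclosure", "Tampering"],
    "error-handling-logging": ["Repudiation", "Information Disclosure"],
    "communication-security": ["Information Disclosure", "Tampering"],
}

def threat_context(category: str) -> str:
    """Render the STRIDE context line injected into a subagent prompt."""
    threats = STRIDE_BY_CATEGORY.get(category, [])
    if not threats:
        return "No specific STRIDE threats flagged for this category."
    return "Relevant STRIDE threats: " + ", ".join(threats)
```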
Tech stack awareness. Categories irrelevant to the detected stack are
skipped entirely.
Domain-specific amplification. Built-in profiles (fintech, healthcare,
e-commerce, SaaS/multi-tenant) inject per-category severity elevations and
additional checks into each subagent. Subagents must acknowledge which
elevations they applied.
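The profile injection might be structured like this sketch. The profile contents, field names, and `amplify_prompt` helper are assumptions; only the domain names and the acknowledgement requirement come from the skill text:

```python
# Illustrative domain profile: per-category severity elevations plus extra
# checks. The data shape here is assumed, not prescribed by the skill.

DOMAIN_PROFILES = {
    "fintech": {
        "elevations": {"cryptographic-practices": "critical",
                       "access-control": "critical"},
        "extra_checks": ["transaction idempotency", "audit-trail immutability"],
    },
    "healthcare": {
        "elevations": {"data-protection": "critical"},
        "extra_checks": ["PHI field-level encryption"],
    },
}

def amplify_prompt(base_prompt: str, domain: str, category: str) -> str:
    """Append domain elevations to a subagent prompt, requiring acknowledgement."""
    profile = DOMAIN_PROFILES.get(domain)
    if not profile:
        return base_prompt
    lines = [base_prompt]
    elevation = profile["elevations"].get(category)
    if elevation:
        lines.append(f"Domain elevation ({domain}): treat findings in "
                     f"{category} as at least {elevation}. Acknowledge "
                     f"which elevations you applied in your output.")
    return "\n".join(lines)
```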
Cross-finding severity calibration. After all findings are collected and
reviewed, a calibration pass normalizes severities using the rubric as the
single source of truth. This catches inter-subagent inconsistencies,
over-classifications (missing headers rated as HIGH), under-classifications
(domain elevations missed), and findings mitigated by controls that a
single-category subagent could not see.
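A minimal sketch of the calibration pass, using the over-classification example above (missing headers rated as HIGH). The rubric keys and finding shape are hypothetical:

```python
# Sketch of a calibration pass that normalizes severities across subagents
# against a single rubric. Rubric entries and finding fields are assumed.

RUBRIC_CAPS = {
    # Missing security headers are hardening items, never HIGH.
    "missing-security-header": "low",
}

ORDER = ["low", "medium", "high", "critical"]

def calibrate(findings: list[dict]) -> list[dict]:
    """Clamp each finding's severity to the rubric cap for its class."""
    out = []
    for f in findings:
        cap = RUBRIC_CAPS.get(f["class"])
        if cap and ORDER.index(f["severity"]) > ORDER.index(cap):
            f = {**f, "severity": cap, "calibrated": True}
        out.append(f)
    return out
```

Under-classifications (missed domain elevations) would be handled the same way with a floor instead of a cap.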
Remediation roadmap. A dedicated subagent groups all findings into three
waves (immediate/short-term/medium-term), maps dependencies between fixes,
and identifies root causes that resolve multiple findings at once.
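The three-wave grouping could be sketched as below. The wave names mirror the skill text; the severity-to-wave mapping and finding shape are assumptions:

```python
# Sketch: group findings into the three remediation waves by severity.
# The mapping itself is an illustrative assumption.

WAVE_BY_SEVERITY = {
    "critical": "immediate",
    "high": "immediate",
    "medium": "short-term",
    "low": "medium-term",
}

def build_roadmap(findings: list[dict]) -> dict:
    """Return {wave: [finding ids]}, preserving input order within each wave."""
    roadmap = {"immediate": [], "short-term": [], "medium-term": []}
    for f in findings:
        roadmap[WAVE_BY_SEVERITY[f["severity"]]].append(f["id"])
    return roadmap
```

Dependency mapping and root-cause grouping would layer on top of this, e.g. collapsing several findings that share one fix into a single roadmap entry.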
Invocation
The user invokes the skill naturally. Parse intent from the user's message:
"Run a security audit" -> full scan, all categories
"Security audit, focus on auth and crypto" -> category focus mode
"Security scan, only high and critical" -> severity threshold mode
"Security audit and fix what you find" -> scan-and-fix mode
"Security audit for a fintech app" -> full scan + fintech domain
Modes
Mode                 Trigger                              Behavior
scan (default)       No special keywords                  Scan all relevant categories, produce report
scan-and-fix         "fix", "remediate", "patch"          Scan, then create branch and commit fixes
category-focus       Names specific categories            Scan only named categories
severity-threshold   "only critical", "high and above"    Scan all but filter report to threshold
Argument Parsing
Extract these parameters from the user's message:
MODE: scan | scan-and-fix | category-focus | severity-threshold
CATEGORIES: list of category slugs (for category-focus mode)
THRESHOLD: critical | high | medium | low (for severity-threshold mode)
DOMAIN: fintech | healthcare | ecommerce | saas-multi-tenant | <custom>
TARGET_PATH: specific directory to audit (defaults to project root)
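The extraction above could be sketched as a simple keyword parser. The keyword lists and `parse_invocation` are illustrative assumptions; in practice the model parses intent directly:

```python
# Illustrative intent parser for the parameters listed above.
# Keyword triggers are assumed examples, not the skill's actual rules.
import re

def parse_invocation(message: str) -> dict:
    msg = message.lower()
    params = {"mode": "scan", "categories": [], "threshold": None,
              "domain": None, "target_path": "."}
    if any(w in msg for w in ("fix", "remediate", "patch")):
        params["mode"] = "scan-and-fix"
    m = re.search(r"only (critical|high|medium|low)", msg)
    if m:
        params["mode"] = "severity-threshold"
        params["threshold"] = m.group(1)
    for domain in ("fintech", "healthcare", "ecommerce", "saas-multi-tenant"):
        if domain in msg:
            params["domain"] = domain
    for cat in ("auth", "crypto", "session", "access-control"):
        if "focus on" in msg and cat in msg:
            params["mode"] = "category-focus"
            params["categories"].append(cat)
    return params
```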
Subagent Orchestration Principles
These principles govern ALL phases of the audit. Every phase description below
is subject to these rules.
Why Subagents
Every subagent starts with a clean context window. This is the single most
important quality control mechanism in the skill. When the orchestrating agent
searches dozens of files, reads thousands of lines, and tracks findings across
14 categories, its context fills with noise. Pattern matching degrades.
Confirmation bias accumulates -- once the orchestrator "believes" a finding is
real, it stops scrutinizing. Subagents break this cycle: each one sees only the
evidence it gathers and the instructions it receives.
The orchestrating agent is a coordinator, not an author. It detects the tech
stack, performs the STRIDE pre-pass, constructs subagent prompts, launches
subagents, validates their output, and assembles the final deliverable from
subagent-written sections. It MUST NOT write findings, report prose, remediation
advice, attack chain narratives, or executive summaries. If text requires
judgment, synthesis, or security expertise, a subagent writes it.
The Write-Then-Review Pattern
Every significant text output follows this pipeline:
Author Subagent --> Review Subagent --> Final Text
   (writes)           (validates)      (used in report)
Author subagent produces the text (findings, report section, remediation
roadmap, attack chain narrative).
Review subagent receives the author's output PLUS the evidence references
(file paths, line numbers, code snippets the author cited). The reviewer
validates the text against the evidence. It does NOT re-search the codebase
from scratch -- it spot-checks the specific claims.
Orchestrator incorporates the reviewed text into the final report. If the
reviewer flagged issues, the orchestrator uses the reviewer's corrected
version, not the original.
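The pipeline above could be sketched as a small orchestration helper. `launch_subagent` is a hypothetical stand-in for whatever mechanism actually runs a subagent; the reviewer's return shape is likewise assumed:

```python
# Sketch of the write-then-review pipeline. launch_subagent is a hypothetical
# callable standing in for the real subagent-launching mechanism.

def write_then_review(author_prompt: str, evidence: list[str],
                      launch_subagent) -> str:
    """Author drafts the text; a reviewer validates it against the evidence.

    The reviewer spot-checks the cited claims rather than re-searching the
    codebase; if it returns a corrected version, that version wins.
    """
    draft = launch_subagent(role="author", prompt=author_prompt)
    review = launch_subagent(
        role="reviewer",
        prompt="Validate this text against the cited evidence only.",
        draft=draft,
        evidence=evidence,
    )
    # Use the reviewer's corrected text when it flagged issues.
    return review.get("corrected", draft)
```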
What Counts as "Significant"
The write-then-review pattern applies to:
Category findings (Phase 2 subagent output) -- already written by
subagents; these get review subagents
Executive summary (Phase 4) -- written by a synthesis subagent, reviewed
Attack chain narratives -- if findings combine into multi-step attack
paths, a subagent writes the narrative, another reviews it
Remediation roadmap -- if scan-and-fix mode produces a prioritized fix
plan, a subagent writes it, another reviews it
Domain compliance analysis (Phase 3) -- written by a subagent, reviewed