Exploit-driven smart-contract audit skill for Solidity and Foundry repos. Adds preflight repair checks, semantic indexing, authority and dependency graphs, contradiction-based finding generation, source confirmation, proof planning, and proof-oriented test scaffolding.
Use this skill when the user asks to:
Do not use this skill for:
NO BLANK ANNOTATIONS (MANDATORY): NEVER leave TODO markers, empty headings, or placeholder text in completed artifacts. Scaffold templates are the ONLY exception: they are delivered to the user containing TODOs. All other `.audit_board/*.md` artifacts MUST be fully populated before the phase ends. If information is genuinely unavailable, write 'Not determined — [reason]' rather than TODO. MUST be filled: every heading in 01_threat_model.md, 02_static_analysis.md, 03_attack_vectors.md. All finding confirmation entries MUST include `rejection_reason` (or null), `confidence_score`, and `scope_status`. All proof plan entries MUST include `confirmability` (one of the three labels) and `false_positive_risk`.

SEPARATE ENTRIES PER SUBJECT (MANDATORY): NEVER merge multiple contracts or vulnerability classes into a single analysis entry. Each contract MUST have its own storage layout section if `forge inspect` succeeded. Each attack vector MUST be a separate numbered entry — do not group unrelated paths. When multiple Slither printers succeed, present each as its own labelled section. Correct: '### Attack Vector 1 — Reentrancy in withdraw()' and '### Attack Vector 2 — …'
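The no-blank-annotations rule above is mechanically checkable. A minimal sketch of such a check (this helper is illustrative, not one of the shipped `resources/` scripts, and the placeholder patterns are assumptions):

```python
import re

# Placeholder patterns: TODO/TBD markers and empty markdown headings.
# These patterns are an assumption of this sketch, not a documented list.
PLACEHOLDER = re.compile(r"\bTODO\b|\bTBD\b|^\s*#+\s*$", re.MULTILINE)

def placeholder_lines(text: str) -> list:
    """Return offending matches in an artifact; empty list means it is clean."""
    return [m.group(0) for m in PLACEHOLDER.finditer(text)]
```

Running it over each populated `.audit_board/*.md` artifact before ending a phase would catch both stray TODOs and empty headings.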
`init -> preflight_or_repair -> architecture -> semantic_index -> ast_semantic_index -> action_catalog -> authority_graph -> dependency_graph -> rule_scan -> invariant_candidates -> finding_candidates -> finding_confirmation -> proof_planning -> hypothesis_triage -> poc_design -> scaffold_tests -> submission_bundle`, with the later phases run only when they are needed.
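The pipeline order can be encoded as a simple gate. This is an illustrative sketch only (the skill does not ship such a helper), and it assumes a strict linear order even though the later phases are optional in practice:

```python
# Phase order copied from the pipeline above.
PIPELINE = [
    "init", "preflight_or_repair", "architecture", "semantic_index",
    "ast_semantic_index", "action_catalog", "authority_graph",
    "dependency_graph", "rule_scan", "invariant_candidates",
    "finding_candidates", "finding_confirmation", "proof_planning",
    "hypothesis_triage", "poc_design", "scaffold_tests", "submission_bundle",
]

def may_enter(phase: str, completed: set) -> bool:
    """A phase may start only when every earlier pipeline phase has completed."""
    return all(p in completed for p in PIPELINE[:PIPELINE.index(phase)])
```

A real gate would also allow skipping optional later phases; the one-line check is enough to stop out-of-order execution.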
- When both `semantic_index.json` and `ast_semantic_index.json` are present, downstream phases MUST prefer the AST-backed version for all structural data (function bodies, call edges, storage access, guard extraction).
- The `ast_semantic_index` phase is marked degraded when neither forge build-info nor `slither --json` output is available; the regex fallback index is still produced.
- Check `.audit_board/status.json` and confirm the prior phase produced usable artifacts. If a prior phase has `"ok": false` only because tooling degraded, inspect the artifacts and warnings first, note the limitation explicitly, and continue only if the output is still materially useful.
- Scripts write only into `.audit_board/` or `test/`. `scaffold_tests.py` must not overwrite existing tests unless you intentionally pass `--force`.
- Review `meta/improvement_hotspots.json` if it exists and use the top recurring hotspots as a preflight checklist.
- Inspect `status.json` plus the newly created artifacts, and append structured observations to the persistent skill log whenever the skill misbehaves, produces weak output, or exposes a clearly actionable improvement.
- `init` is rerunnable and should preserve existing blackboard notes; do not treat it as a destructive reset.
- If `forge` is installed in a standard user-local path but missing from PATH, prefer fixing the environment or invoking the discovered binary path rather than treating Foundry as unavailable.
- If `README.md`, `scope.txt`, or `out_of_scope.txt` exist, ingest them during architecture and preserve that metadata for later severity and scope reasoning.
- Reuse `test/Base.t.sol` when scaffolding proofs. `scaffold_tests.py` should only report success when the generated tests compile in the local repo.
- Proof maturity progresses `hypothesis -> source_confirmed -> locally_reproduced -> deterministic_poc -> submission_ready`. A sixth state, `rejected`, applies when false-positive rejection paths (guard equivalence, parent-guard shadow, out-of-scope disqualification, dead-code filtering) explicitly rule out a finding candidate.
- Record PoC designs in `poc_spec.md`. Submission templates: `minimal_with_poc`, `detailed_with_instructions`, `severity_argument_only`.
- Argue severity honestly in `severity_assessment.md` instead of overselling it.

Cerberus ships with a local `references/` corpus for focused checklists and taxonomy support:

- Use `references/sharp_edges.md` to pressure-test attack hypotheses and explain why a pattern is dangerous
- Use `references/validations.md` to sanity-check whether suspicious code patterns deserve explicit mention in `02_static_analysis.md`
- Use `references/patterns.md` only when proposing mitigations, safer rewrites, or test ideas

Treat the vendored references as support material, not as a replacement for repo-specific source review, generated artifacts, or deterministic tests. Do not cite every matching pattern mechanically; include only patterns that materially map to the current codebase.
- `target_contract_dir` (string): Directory that contains Solidity source files.
- `target_contract_name` (string): Solidity contract name for Foundry scaffolding.
- `target_contract_path` (string): Repo-relative `.sol` import path for that contract.
- `primary_contract_path` (string): Repo-relative `.sol` file to prioritize when the repo has multiple entrypoints or many interfaces.
- `primary_contract_name` (string): Contract name to prioritize when selecting the primary architecture target.
- `force_scaffold_overwrite` (boolean): Only use when replacing an existing scaffold is explicitly intended.

All scripts write into `.audit_board/` and print a single JSON object describing the phase result. Improvement feedback is stored persistently in this skill's own `meta/` directory so future runs can learn from it.
Status schema:
```json
{
  "phase": "architecture",
  "ok": true,
  "mode": "full",
  "artifacts": {
    "flattened": ".audit_board/context_flattened.sol",
    "topology": ".audit_board/topology_map.txt"
  },
  "warnings": [],
  "errors": [],
  "details": {
    "target_dir": "/abs/path/to/src"
  }
}
```
Interpretation:
- `ok`: the phase produced at least one usable artifact
- `mode`: `full` only when required external tools are available and they produced usable analysis output; otherwise `degraded`
- `warnings`: soft failures the agent can reason around
- `errors`: hard blockers that require correction before retrying

NO CONTEXT BLOAT (MANDATORY): NEVER dump entire files, logs, or tool outputs into artifacts or responses. Extract ONLY the targeted fields, sections, or entries requested. When a tool returns more than ~100 lines, summarize and reference — do not reproduce verbatim. Incorrect: pasting the full `phase1_tooling.log` as analysis output. Correct: extracting the 3-5 lines relevant to the current phase question.
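The interpretation rules above can be folded into one gate. A hedged sketch (the function name and logic are illustrative, not part of the skill's scripts): errors always block, and a degraded phase is still usable when it produced at least one artifact to reason around.

```python
import json

def phase_is_usable(entry: dict) -> bool:
    """Errors block outright; otherwise accept ok phases, or degraded
    phases that still produced at least one artifact."""
    if entry.get("errors"):
        return False
    if entry.get("ok"):
        return True
    return entry.get("mode") == "degraded" and bool(entry.get("artifacts"))

# Parse a status entry shaped like the schema above.
status = json.loads(
    '{"phase": "architecture", "ok": true, "mode": "full",'
    ' "artifacts": {"topology": ".audit_board/topology_map.txt"},'
    ' "warnings": [], "errors": [], "details": {}}'
)
```

This matches the operating rule that an `"ok": false` phase caused only by tooling degradation may still be continued after inspecting its artifacts.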
- `.audit_board/status.json`
- `.audit_board/skill_monitor_context.json`
- `.audit_board/01_threat_model.md`
- `.audit_board/02_static_analysis.md`
- `.audit_board/03_attack_vectors.md`
- `.audit_board/04_proofs.md`
- `.audit_board/contest_context.json`
- `.audit_board/privilege_map.md`
- `.audit_board/invariant_map.md`
- `.audit_board/rule_scan.json`
- `.audit_board/rule_scan.md`
- `.audit_board/exploit_rankings.md`
- `.audit_board/exploit_hypotheses.md`
- `.audit_board/proof_status.json`
- `.audit_board/poc_spec.md`
- `.audit_board/severity_assessment.md`
- `.audit_board/submission_notes.md`
- `.audit_board/context_flattened.sol`
- `.audit_board/topology_map.txt`
- `.audit_board/external_calls.txt`
- `.audit_board/storage_layout.json`
- `.audit_board/phase1_tooling.log`
- `.audit_board/phase2_tooling.log`
- `.audit_board/PoC/`

Persistent improvement memory:

- `meta/skill_improvements.jsonl`
- `meta/skill_improvements.md`
- `meta/improvement_hotspots.json`
- `meta/resolved_hotspots.json`

Before running Phase 0, set up a local monitoring pass for the skill execution.
Monitoring responsibilities:
- Check `.audit_board/status.json` after every phase
- Review `meta/improvement_hotspots.json` before starting so recurring weaknesses are checked first

When to log:

- Unintended behavior, low-quality output, or contradictions with `SKILL.md`

When not to log:
Logging command:
```shell
python3 resources/log_improvement.py \
  --run-id <run_id> \
  --phase <phase_name> \
  --severity medium \
  --category output_quality \
  --summary "Threat model omitted privileged roles section" \
  --suggested-fix "Add a post-generation validator that checks for required threat-model headings." \
  --evidence .audit_board/01_threat_model.md
```
Suggested monitoring checklist:
Review `meta/improvement_hotspots.json` before the run. After each phase, inspect status plus the newly generated artifacts. If you see unintended behavior, low-quality output, contradictions with SKILL.md, or clear opportunities to improve determinism, safety, or usefulness, append exactly one structured log entry with resources/log_improvement.py. The log will refresh the hotspot and summary artifacts automatically. Do not block the main execution unless the failure is severe enough that the skill should stop.
Before Phase 0:
- Review `meta/improvement_hotspots.json` if it exists

After any call to `resources/log_improvement.py`:

- `meta/skill_improvements.md` and `meta/improvement_hotspots.json` refresh automatically

When the same hotspot recurs in 2+ runs:
When a hotspot is fixed:
```shell
python3 resources/resolve_hotspot.py --action resolve --fingerprint <id> --resolution-note "<what changed>"
```

Use `--action reopen` to reopen a previously resolved hotspot.

Run:
```shell
python3 resources/init_workspace.py --target-dir <target_contract_dir>
```

Purpose:
- Create `.audit_board/`
- Check whether `forge` and `slither` are available
- Write `.audit_board/skill_monitor_context.json` with `run_id` and persistent log paths

Success criteria:
- `.audit_board/` exists
- `.audit_board/status.json` contains an `init` entry
- `details.dependencies` and `run_id` are populated

Run:
```shell
python3 resources/analyze_architecture.py --target-dir <target_contract_dir>
```

Behavior:
- Runs `forge flatten` when available
- Runs `slither --list-printers` and selects the first supported printer from the configured candidates for each section
- Runs `forge build --build-info` plus `slither . --ignore-compile --print <printer>` when forge is available
- Treats `No contract was analyzed` / `analyzed (0 contracts)` as unusable noise, not successful printer output
- Writes tool output to `.audit_board/phase1_tooling.log`

Artifacts:
- `.audit_board/context_flattened.sol`
- `.audit_board/topology_map.txt`
- `.audit_board/contest_context.json`
- `.audit_board/privilege_map.md`

Agent follow-up:
- Write `.audit_board/01_threat_model.md`
- Read `.audit_board/contest_context.json` before deciding what is in-scope, known, or likely severity-capped
- Use `.audit_board/privilege_map.md` to pressure-test stale-role and incomplete-rotation bugs
- Use `references/sharp_edges.md` to pressure-test upgrade, oracle, access-control, and composability assumptions before finalizing conclusions

Run:
```shell
python3 resources/extract_state_vectors.py --target-dir <target_contract_dir>
```

Behavior:
- Runs `slither --list-printers` and uses the first supported state-surface printer from `function-summary`, `entry-points`, and `vars-and-auth`
- Runs `forge build --build-info` plus `slither . --ignore-compile --print <printer>` when forge is available
- Treats `No contract was analyzed` / `analyzed (0 contracts)` as unusable noise, not successful state-surface output
- Scans for `delegatecall` usage directly in source files while ignoring comments
- Runs `forge inspect <path:Contract> storageLayout` for concrete non-test contracts when available
- Retries `forge inspect ... --force` when storage layout is missing from artifacts due to cache issues
- Reports `degraded` when `forge inspect` produces no usable storage layouts for the inspected contracts, even if delegatecall scanning still found matches
- Writes tool output to `.audit_board/phase2_tooling.log`

Artifacts:
- `.audit_board/external_calls.txt`
- `.audit_board/storage_layout.json`
- `.audit_board/invariant_map.md`

Agent follow-up:
- Write `.audit_board/02_static_analysis.md`
- Check `.audit_board/invariant_map.md` for setters that update address-linked state without role or permission rotation
- Use `references/validations.md` as a checklist for code patterns that deserve explicit confirmation or rejection

Run:
```shell
python3 resources/ast_semantic_index.py --target-dir <target_contract_dir>
```

Behavior:
- Consumes `slither --json` output via `slither --ignore-compile --json -`, falling back to `slither --json output.json`, then `slither` with auto-compile, then `slither --solc-remap`
- Extracts `Identifier` nodes for `state_reads`, resolved `Assignment` targets for writes, `true_writes`/`false_writes`, external/internal call edges, `member_calls`, auth guards (`onlyRole`, `onlyOwner`, `require(msg.sender == ...)`), and slot assignments
- Reports `mode: "full"` when Slither AST is consumed; `mode: "degraded"` when the regex fallback was used
- Sets `_ast_mode: true` and `_ast_source: "slither-json"|"regex-fallback"` in the output

Artifacts:
- `.audit_board/ast_semantic_index.json`

Downstream consumers:
- `extract_actions.py`, `build_authority_graph.py`, `generate_finding_candidates.py`, and `confirm_findings.py` MUST prefer `ast_semantic_index.json` over `semantic_index.json` when the former is present.

This phase is manual reasoning by the agent.
Inputs:
- `.audit_board/01_threat_model.md`
- `.audit_board/02_static_analysis.md`
- `.audit_board/contest_context.json`
- `.audit_board/privilege_map.md`
- `.audit_board/invariant_map.md`
- `.audit_board/rule_scan.md`
- `.audit_board/exploit_rankings.md`
- `.audit_board/exploit_hypotheses.md`
- `.audit_board/proof_status.json`

Write:
- `.audit_board/03_attack_vectors.md`

Each attack vector should include:
- A scope tag: `in-scope`, `known issue`, `severity-capped`, or `needs-PoC`

When relevant, use `references/sharp_edges.md` to avoid missing known exploit families, and use `references/patterns.md` only when proposing concrete mitigations or proof strategies.
Run:
```shell
python3 resources/rule_scan.py --target-dir <target_contract_dir>
```

Behavior:
- Compiles `references/validations.md` into native regex/context checks
- Compiles `references/sharp_edges.md` into exploit-family heuristics

Artifacts:
- `.audit_board/rule_scan.json`
- `.audit_board/rule_scan.md`
- `.audit_board/exploit_rankings.md`

Agent follow-up:
- Use `.audit_board/rule_scan.md` to seed `02_static_analysis.md` with concrete, file-backed checks instead of manual grep alone
- Use `.audit_board/exploit_rankings.md` to rank attack vectors before spending time on lower-signal hypotheses

Run:
```shell
python3 resources/triage_hypotheses.py --target-dir <target_contract_dir>
```

Behavior:
- Cross-references `privilege_map`, `invariant_map`, `rule_scan`, `exploit_rankings`, and existing attack-vector notes

Artifacts:
- `.audit_board/exploit_hypotheses.md`
- `.audit_board/proof_status.json`

Agent follow-up:
- Update `03_attack_vectors.md` with the triage results

Run:
```shell
python3 resources/design_poc.py --finding-title "<finding_title>" [--target-contract-name <name>] [--target-contract-path <path>]
```

Behavior:
Artifacts:
- `.audit_board/poc_spec.md`
- `.audit_board/severity_assessment.md`
- `.audit_board/submission_notes.md`

Agent follow-up:
Run:
```shell
python3 resources/scaffold_tests.py --contract-name <target_contract_name> --contract-path <target_contract_path>
```

- `--output-dir test/root-audit` to colocate scaffolds with audit-specific proofs
- `--force` only if replacing an existing scaffold is explicitly intended

Behavior:
- Writes scaffolds into `test/root-audit/` by default
- Reuses `test/Base.t.sol` automatically when present
- Pulls context from `.audit_board/` into the scaffold comments

Artifacts:
- `<output-dir>/<ContractName>ExploitPoC.t.sol`
- `<output-dir>/<ContractName>InvariantTest.t.sol`

Agent follow-up:
- Record each proof in `.audit_board/04_proofs.md` as confirmed, inconclusive, or rejected
- Update `.audit_board/proof_status.json` when a hypothesis changes maturity

Run:
```shell
python3 resources/build_submission_bundle.py --finding-id <slug> --title "<title>" --severity <severity> --template <minimal_with_poc|detailed_with_instructions|severity_argument_only> --poc-path <path>
```

Behavior:
- Creates `.audit_board/PoC/final_submission/` when missing

Artifacts:
- `.audit_board/PoC/final_submission/`
- `.audit_board/PoC/final_submission/MANIFEST.md`

Agent follow-up:
Proof maturity tracks each finding through a defined progression:
| Label | Meaning |
|---|---|
| `hypothesis` | Structural signal detected; not yet reviewed in source |
| `source_confirmed` | Guard drift or authority gap confirmed in source code |
| `locally_reproduced` | Deterministic test demonstrates the vulnerability |
| `deterministic_poc` | PoC is stable and auditable |
| `submission_ready` | Report written and peer-reviewed |
| `rejected` | Explicit false-positive rejection path fired (guard equivalence, parent-guard shadow, out-of-scope, dead-code) |
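The progression table can be read as a small state machine. A hedged sketch (the helper is illustrative, and the one-step-at-a-time rule is an assumption of this sketch, not a documented constraint):

```python
# Maturity order copied from the table above.
MATURITY = ["hypothesis", "source_confirmed", "locally_reproduced",
            "deterministic_poc", "submission_ready"]

def valid_transition(current: str, new: str) -> bool:
    """A finding advances one maturity step at a time, or drops to
    rejected from any state when a false-positive rejection path fires."""
    if new == "rejected":
        return True
    return MATURITY.index(new) == MATURITY.index(current) + 1
```

Enforcing this when updating `proof_status.json` would prevent a finding from jumping from `hypothesis` straight to `deterministic_poc` without source confirmation.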
Each proof plan entry produced by `plan_proofs.py` MUST include a `confirmability` field:
| Label | Criteria |
|---|---|
| `confirmable_and_reproducible` | `source_confirmed` status, confidence ≥ 0.75, no disqualifiers, concrete transaction sequence possible |
| `confirmable_but_weak` | `source_confirmed` status with confidence ≥ 0.5 or evidence present, but no concrete reproduction path or only soft assertions |
| `interesting_but_unconfirmed` | `weak_signal` or `rejected` status — still worth documenting with a "what would it take" note |
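The criteria table transcribes almost directly into a labelling helper. This sketch is illustrative (the function name and parameters are not part of `plan_proofs.py`), and it covers only the confidence path of `confirmable_but_weak`, omitting the "evidence present" alternative for brevity:

```python
def confirmability_label(status: str, confidence: float,
                         has_repro_path: bool, disqualifiers: list) -> str:
    """Assign one of the three labels using the documented 0.75 / 0.5 cut-offs."""
    if (status == "source_confirmed" and confidence >= 0.75
            and not disqualifiers and has_repro_path):
        return "confirmable_and_reproducible"
    if status == "source_confirmed" and confidence >= 0.5:
        return "confirmable_but_weak"
    return "interesting_but_unconfirmed"
```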
Each proof plan also MUST include:
- `false_positive_risk`: `low | medium | high | unknown` — derived from guard equivalence, scope status, and parent inheritance checks
- `blocking_assumptions`: explicit list of preconditions that must hold for the PoC to work
- `minimum_test_commands`: concrete `forge test` / `forge script` commands that would exercise the finding

Each entry in `finding_confirmations.json` MUST include:
| Field | Type | Description |
|---|---|---|
| `candidate_id` | string | Finding ID from `finding_candidates.json` |
| `status` | string | `source_confirmed` \| `weak_signal` \| `rejected` |
| `rejection_reason` | string \| null | Explicit reason if rejected; null otherwise |
| `scope_status` | string | `in_scope` \| `out_of_scope` \| `unknown` |
| `confidence_score` | float | Adjusted score in [0.0, 1.0] |
| `confidence_boost` | float | Delta applied to base confidence |
| `guard_analysis` | string | Human-readable guard-surface analysis |
| `disqualifiers` | string[] | Blocking unknowns or scope issues |
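The field table above can be enforced with a small validator. A sketch under the stated schema (the helper name is illustrative, not shipped code); note that `rejection_reason` may be `None` but must still be present as a key:

```python
def confirmation_entry_ok(entry: dict) -> bool:
    """Check one finding_confirmations.json entry against the field table."""
    statuses = {"source_confirmed", "weak_signal", "rejected"}
    scopes = {"in_scope", "out_of_scope", "unknown"}
    return (
        isinstance(entry.get("candidate_id"), str)
        and entry.get("status") in statuses
        and "rejection_reason" in entry  # may be None, must be present
        and entry.get("scope_status") in scopes
        and isinstance(entry.get("confidence_score"), float)
        and 0.0 <= entry["confidence_score"] <= 1.0
        and isinstance(entry.get("confidence_boost"), float)
        and isinstance(entry.get("guard_analysis"), str)
        and isinstance(entry.get("disqualifiers"), list)
    )
```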
Bias triage toward:
De-rank vectors that:
Severity coaching:
- High only when the proof demonstrates direct fund loss, permanent lockup, or irrecoverable liveness failure over protocol-owned assets
- Medium for concrete but bounded asset or trust-boundary failures
- Low for deterministic governance/recovery weaknesses that do not directly freeze or steal assets by themselves
- Document the severity rationale in `submission_notes.md`

Initialize:
```shell
python3 resources/init_workspace.py --target-dir src
```

Architecture analysis:

```shell
python3 resources/analyze_architecture.py --target-dir src
```

State extraction:

```shell
python3 resources/extract_state_vectors.py --target-dir src
```

Rule scan:

```shell
python3 resources/rule_scan.py --target-dir src
```

Hypothesis triage:

```shell
python3 resources/triage_hypotheses.py --target-dir src
```

PoC design:

```shell
python3 resources/design_poc.py --finding-title "Expired auction remains unsettled due to stuck filler" --target-contract-name DutchTrade --target-contract-path contracts/plugins/trading/DutchTrade.sol
```

Scaffold tests:

```shell
python3 resources/scaffold_tests.py --contract-name Vault --contract-path src/Vault.sol
```

Build submission bundle:

```shell
python3 resources/build_submission_bundle.py --finding-id vault-lockup --title "Vault funds can be permanently locked" --severity High --template minimal_with_poc --poc-path test/root-audit/VaultExploitPoC.t.sol
```
After changing the skill's contract-selection or architecture logic, run:

```shell
python3 resources/eval_regressions.py
```

This eval builds a minimal Foundry-style fixture with both `IEVault.sol` and `EVault.sol` and fails unless:

- `/test/`, `/interfaces/`, `/interface/`, `/mock/`, `/mocks/`, `/script/`, `/scripts/`, and `.t.sol` files are excluded from selection
- `src/` is preferred

Scaffold overwrite:

```shell
python3 resources/scaffold_tests.py --contract-name Vault --contract-path src/Vault.sol --force
```

Improvement logging:

```shell
python3 resources/log_improvement.py --run-id run-2026-03-25T12-00-00+00-00 --phase architecture --severity high --category runtime_error --summary "forge flatten failed in a configured Foundry repo" --suggested-fix "Capture and classify stderr so the agent can retry intelligently." --evidence .audit_board/phase1_tooling.log
```